date:20140312

[PATCH 0/8] sched: remove cpu_load array

2014-03-12 Thread Alex Shi

In the cpu_load decay usage, we mix the long-term and short-term load with
the balance bias, picking a big or small value from them depending on whether
the cpu is the balance destination or source. This mix is wrong: the balance
bias should be based on the cost of moving tasks between cpu groups, not on
arbitrary history or instant load. History load may diverge a lot from the
real load, which leads to an incorrect bias.

In fact, the cpu_load decay can be replaced by the sched_avg decay, which
also decays load over time. The balance bias part can fully use a fixed bias,
imbalance_pct, which is already used in newly idle, wake, fork/exec balancing
and numa balancing scenarios.

Currently the only working idx values are busy_idx and idle_idx.
As to busy_idx:
We mix history load decay and bias together. The ridiculous thing is that when
the load on all cpus is continuously stable, the long-term and short-term load
are the same. Then the bias loses its meaning, so any minimal imbalance may
cause unnecessary task moving. To prevent this, we have to reuse
imbalance_pct again in find_busiest_group(). But that clearly causes
over-bias in normal times. If there is some bursty load in the system, it is
even worse.
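
An illustrative aside on the busy_idx argument: the user-space model below is
only a sketch, not kernel code. It assumes the history-based bias takes the
smaller of (history, current) load for a source cpu and the larger for a
target cpu, and that the fixed bias is a percentage threshold such as 125.
The helper names biased_source_load, biased_target_load and
exceeds_imbalance_pct are hypothetical.

```c
#include <assert.h>

/* Low guess for a cpu we might pull from: min(history, now). */
static unsigned long biased_source_load(unsigned long hist, unsigned long now)
{
	return hist < now ? hist : now;
}

/* High guess for a cpu we might push to: max(history, now). */
static unsigned long biased_target_load(unsigned long hist, unsigned long now)
{
	return hist > now ? hist : now;
}

/*
 * Fixed bias: the source must be busier than the destination by more than
 * (imbalance_pct - 100) percent before a move is considered worthwhile.
 */
static int exceeds_imbalance_pct(unsigned long src, unsigned long dst,
				 unsigned int imbalance_pct)
{
	assert(imbalance_pct >= 100);
	return 100 * src > (unsigned long)imbalance_pct * dst;
}
```

With a stable load the history and current values coincide, so both biased
helpers return the same number and the history-based bias vanishes; the fixed
percentage check keeps working regardless of history.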

As to idle_idx:
I have some concerns about correcting its usage
(https://lkml.org/lkml/2014/3/12/247), but since we are working on moving cpu
idle into the scheduler, the problem will be reconsidered then. We don't
need to care about it now.

This patchset removes the cpu_load idx decay, since it can be replaced by
the sched_avg feature, and leaves the imbalance_pct bias untouched: only
idle_idx lacks it, which is fine for now and will be reconsidered soon.


V5,
1, remove unify bias patch and biased_load function. Thanks for PeterZ's 
comments!
2, remove get_sd_load_idx() in the 1st patch, as SrikarD suggested.
3, remove LB_BIAS feature, it is not needed now.

V4,
1, rebase on latest tip/master
2, replace target_load by biased_load, as Morten suggested

V3,
1, correct the wake_affine bias. Thanks for Morten's reminder!
2, replace source_load by weighted_cpuload for better function name meaning.

V2,
1, This version does some tuning on the load bias of the target load.
2, Go further and remove the cpu_load in rq.
3, Revert the patch 'Limit sd->*_idx range on sysctl' since it is not needed.

Any testing/comments are appreciated.

This patchset is rebased on the latest tip/master.
The git tree for this patchset is at:
 g...@github.com:alexshi/power-scheduling.git noload

Thanks
Alex

 [PATCH 1/8] sched: shortcut to remove load_idx
 [PATCH 2/8] sched: remove rq->cpu_load[load_idx] array
 [PATCH 3/8] sched: remove source_load and target_load
 [PATCH 4/8] sched: remove LB_BIAS
 [PATCH 5/8] sched: clean up cpu_load update
 [PATCH 6/8] sched: rewrite update_cpu_load_nohz
 [PATCH 7/8] sched: remove rq->cpu_load and rq->nr_load_updates
 [PATCH 8/8] sched: rename update_*_cpu_load
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] fs: fix i_writecount on shmem and friends

2014-03-12 Thread NeilBrown

On Thu, 13 Mar 2014 04:29:34 + Al Viro  wrote:

> On Thu, Mar 13, 2014 at 03:08:00PM +1100, NeilBrown wrote:
> > +   inode = mddev->bitmap_info.file->f_mapping->host;
> > +   if (!S_ISREG(inode->i_mode)) {
> > +   printk(KERN_ERR "%s: error: bitmap file must be a 
> > regular file\n",
> > +  mdname(mddev));
> > +   fput(mddev->bitmap_info.file);
> > +   mddev->bitmap_info.file = NULL;
> > +   return -EBADF;
> > +   }
> > +   if (atomic_read(&inode->i_writecount) != 1) {
> 
> Umm...  I think you ought to check more than that.  At the very least you
> want to check that you have it opened for write - you don't want e.g.
> a filesystem containing that puppy remounted r/o under you.  Another thing
> is, what happens if it's not a buffer cache backed one?  Hell, what happens
> if it's a file on NFS?  You are relying on bmap() working, right?  So it
> looks like you ought to check if ->bmap() is there.  And I really wonder
> how well does it play with journalling fs...

Can we do direct writes from kernel space yet?  If so I'll change the code to
do that so that it will work with any filesystem (which supports direct
writes).
(The documentation says that bitmap files should only be used on ext2 or
ext3.  Most people use bitmaps on the raw devices so hopefully the few who
have a need for files will read the documentation :-)

(and yes, I check for FMODE_WRITE)

Thanks,
NeilBrown
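
An illustrative aside: the checks discussed in this exchange (regular file,
opened for write, ->bmap() available, sole writer) can be modelled in user
space as below. This is a hedged sketch only. struct bitmap_file_model and
check_bitmap_file are hypothetical names, and every error code other than the
-EBADF for a non-regular file is an assumption, not something stated in the
thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the properties md would inspect. */
struct bitmap_file_model {
	bool regular_file;     /* S_ISREG(inode->i_mode) */
	bool opened_for_write; /* file->f_mode & FMODE_WRITE */
	bool has_bmap;         /* the mapping provides ->bmap() */
	int writecount;        /* inode->i_writecount */
};

/* Returns 0 if the file looks usable as a bitmap file, else a negative errno. */
static int check_bitmap_file(const struct bitmap_file_model *f)
{
	assert(f != NULL);
	if (!f->regular_file)
		return -EBADF;
	if (!f->opened_for_write)
		return -EBADF;   /* assumed error code */
	if (!f->has_bmap)
		return -EINVAL;  /* assumed error code */
	if (f->writecount != 1)
		return -EBUSY;   /* assumed error code: someone else can write */
	return 0;
}
```

The ordering puts the cheapest, most specific rejections first; whether md
should reject or fall back in each case is a policy question the thread leaves
open.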



[PATCH] staging: dgap: Fixed sparse error: same symbol redeclared with different type

2014-03-12 Thread Masood Mehmood

sparse reported that dgap_do_fep_load is redeclared with a different type. While
fixing this, I noticed the __user attribute is used incorrectly in the
declaration. There is no need to mark firmware->data as __user.

Replaced the 'uchar __user *' parameter with 'const uchar *' in
dgap_do_fep_load and did the same for dgap_do_bios_load.

Signed-off-by: Masood Mehmood 
---
 drivers/staging/dgap/dgap.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/dgap/dgap.c b/drivers/staging/dgap/dgap.c
index d00283a..ab81479 100644
--- a/drivers/staging/dgap/dgap.c
+++ b/drivers/staging/dgap/dgap.c
@@ -210,9 +210,8 @@ static uint dgap_config_get_useintr(struct board_t *bd);
 static uint dgap_config_get_altpin(struct board_t *bd);
 
 static int dgap_ms_sleep(ulong ms);
-static void dgap_do_bios_load(struct board_t *brd, uchar __user *ubios,
-   int len);
-static void dgap_do_fep_load(struct board_t *brd, uchar __user *ufep, int len);
+static void dgap_do_bios_load(struct board_t *brd, const uchar *ubios, int len);
+static void dgap_do_fep_load(struct board_t *brd, const uchar *ufep, int len);
 #ifdef DIGI_CONCENTRATORS_SUPPORTED
 static void dgap_do_conc_load(struct board_t *brd, uchar *uaddr, int len);
 #endif
@@ -935,7 +934,7 @@ static int dgap_firmware_load(struct pci_dev *pdev, int card_type)
fw_info[card_type].bios_name);
return ret;
}
-   dgap_do_bios_load(brd, (char *)fw->data, fw->size);
+   dgap_do_bios_load(brd, fw->data, fw->size);
release_firmware(fw);
 
/* Wait for BIOS to test board... */
@@ -953,7 +952,7 @@ static int dgap_firmware_load(struct pci_dev *pdev, int card_type)
fw_info[card_type].fep_name);
return ret;
}
-   dgap_do_fep_load(brd, (char *)fw->data, fw->size);
+   dgap_do_fep_load(brd, fw->data, fw->size);
release_firmware(fw);
 
/* Wait for FEP to load on board... */
@@ -4349,7 +4348,7 @@ static int dgap_tty_register_ports(struct board_t *brd)
  * Copies the BIOS code from the user to the board,
  * and starts the BIOS running.
  */
-static void dgap_do_bios_load(struct board_t *brd, uchar __user *ubios, int len)
+static void dgap_do_bios_load(struct board_t *brd, const uchar *ubios, int len)
 {
uchar *addr;
uint offset;
@@ -4425,7 +4424,7 @@ static void dgap_do_wait_for_bios(struct board_t *brd)
  * Copies the FEP code from the user to the board,
  * and starts the FEP running.
  */
-static void dgap_do_fep_load(struct board_t *brd, uchar *ufep, int len)
+static void dgap_do_fep_load(struct board_t *brd, const uchar *ufep, int len)
 {
uchar *addr;
uint offset;
-- 
1.8.3.1
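
An illustrative aside: the point of the change is that firmware data is a
kernel-side, read-only buffer, so a loader taking a const pointer can accept
it without any cast. The user-space sketch below models that. struct
firmware_model and load_image are hypothetical names, and plain memcpy stands
in for whatever I/O copy the driver actually performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char uchar;

/* Mirrors the two fields of the firmware object that the driver uses. */
struct firmware_model {
	size_t size;
	const uchar *data;
};

/* Copies at most cap bytes of the image; returns the number copied. */
static size_t load_image(uchar *board_mem, size_t cap,
			 const uchar *image, size_t len)
{
	size_t n = len < cap ? len : cap;

	assert(board_mem != NULL && image != NULL);
	memcpy(board_mem, image, n);
	return n;
}
```

Because the parameter is const-qualified, a call such as
load_image(mem, sizeof(mem), fw.data, fw.size) compiles cleanly with no cast,
which is the same effect the patch achieves for the two dgap loaders.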


[PATCH v2] backlight: add new LP8860 backlight driver

2014-03-12 Thread Daniel Jeong

This patch adds the LP8860 backlight device driver.
The LP8860 is a low-EMI, high-performance 4-channel LED driver from TI.
This device driver provides a way to control the brightness and current
of each channel and a way to change values in the EEPROM.

Signed-off-by: Daniel Jeong 
---
 drivers/video/backlight/Kconfig |7 +
 drivers/video/backlight/Makefile|1 +
 drivers/video/backlight/lp8860_bl.c |  525 +++
 include/linux/platform_data/lp8860_bl.h |   54 
 4 files changed, 587 insertions(+)
 create mode 100644 drivers/video/backlight/lp8860_bl.c
 create mode 100644 include/linux/platform_data/lp8860_bl.h

diff --git a/drivers/video/backlight/Kconfig b/drivers/video/backlight/Kconfig
index 5a3eb2e..908048f 100644
--- a/drivers/video/backlight/Kconfig
+++ b/drivers/video/backlight/Kconfig
@@ -397,6 +397,13 @@ config BACKLIGHT_LP8788
help
  This supports TI LP8788 backlight driver.
 
+config BACKLIGHT_LP8860
+   tristate "Backlight Driver for LP8860"
+   depends on BACKLIGHT_CLASS_DEVICE && I2C
+   select REGMAP_I2C
+   help
+ This supports TI LP8860 Backlight Driver
+
 config BACKLIGHT_OT200
tristate "Backlight driver for ot200 visualisation device"
depends on BACKLIGHT_CLASS_DEVICE && CS5535_MFGPT && GPIO_CS5535
diff --git a/drivers/video/backlight/Makefile b/drivers/video/backlight/Makefile
index bb82002..cbc5ac3 100644
--- a/drivers/video/backlight/Makefile
+++ b/drivers/video/backlight/Makefile
@@ -42,6 +42,7 @@ obj-$(CONFIG_BACKLIGHT_LM3639)+= lm3639_bl.o
 obj-$(CONFIG_BACKLIGHT_LOCOMO) += locomolcd.o
 obj-$(CONFIG_BACKLIGHT_LP855X) += lp855x_bl.o
 obj-$(CONFIG_BACKLIGHT_LP8788) += lp8788_bl.o
+obj-$(CONFIG_BACKLIGHT_LP8860) += lp8860_bl.o
 obj-$(CONFIG_BACKLIGHT_LV5207LP)   += lv5207lp.o
 obj-$(CONFIG_BACKLIGHT_MAX8925)+= max8925_bl.o
 obj-$(CONFIG_BACKLIGHT_OMAP1)  += omap1_bl.o
diff --git a/drivers/video/backlight/lp8860_bl.c b/drivers/video/backlight/lp8860_bl.c
new file mode 100644
index 000..d2be950
--- /dev/null
+++ b/drivers/video/backlight/lp8860_bl.c
@@ -0,0 +1,525 @@
+/*
+* Simple driver for Texas Instruments lp8860 Backlight driver chip
+*
+* Copyright (C) 2014 Texas Instruments
+* Author: Daniel Jeong  
+*Ldd Mlp 
+*
+* This program is free software; you can redistribute it and/or modify
+* it under the terms of the GNU General Public License version 2 as
+* published by the Free Software Foundation.
+*
+*/
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define REG_CL0_BRT_H  0x00
+#define REG_CL0_BRT_L  0x01
+#define REG_CL0_I_H0x02
+#define REG_CL0_I_L0x03
+
+#define REG_CL1_BRT_H  0x04
+#define REG_CL1_BRT_L  0x05
+#define REG_CL1_I  0x06
+
+#define REG_CL2_BRT_H  0x07
+#define REG_CL2_BRT_L  0x08
+#define REG_CL2_I  0x09
+
+#define REG_CL3_BRT_H  0x0a
+#define REG_CL3_BRT_L  0x0b
+#define REG_CL3_I  0x0c
+
+#define REG_CONF   0x0d
+#define REG_STATUS 0x0e
+#define REG_ID 0x12
+
+#define REG_ROM_CTRL   0x19
+#define REG_ROM_UNLOCK 0x1a
+#define REG_ROM_START  0x60
+#define REG_ROM_END0x78
+
+#define REG_EEPROM_START   0x60
+#define REG_EEPROM_END 0x78
+#define REG_MAX0xFF
+
+struct lp8860_chip {
+   struct device *dev;
+   struct lp8860_platform_data *pdata;
+   struct backlight_device *bled[LP8860_LED_MAX];
+   struct regmap *regmap;
+};
+
+/* brightness control */
+static int lp8860_bled_update_status(struct backlight_device *bl,
+enum lp8860_leds nsr)
+{
+   int ret = -EINVAL;
+   struct lp8860_chip *pchip = bl_get_data(bl);
+
+   if (pchip->pdata->mode)
+   return 0;
+
+   if (bl->props.state & (BL_CORE_SUSPENDED | BL_CORE_FBBLANK))
+   bl->props.brightness = 0;
+
+   switch (nsr) {
+   case LP8860_LED0:
+   ret = regmap_write(pchip->regmap,
+  REG_CL0_BRT_H, bl->props.brightness >> 8);
+   ret |= regmap_write(pchip->regmap,
+   REG_CL0_BRT_L, bl->props.brightness & 0xff);
+   break;
+   case LP8860_LED1:
+   ret = regmap_write(pchip->regmap,
+  REG_CL1_BRT_H,
+  (bl->props.brightness >> 8) & 0x1f);
+   ret |= regmap_write(pchip->regmap,
+   REG_CL1_BRT_L, bl->props.brightness & 0xff);
+   break;
+   case LP8860_LED2:
+   ret = regmap_write(pchip->regmap,
+  REG_CL2_BRT_H,
+  (bl->props.brightness >> 8) & 0x1f);
+   ret |= regmap_write(pchip->regmap,
+
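
An illustrative aside on the brightness path in the (truncated) patch above:
each channel's brightness is split across a high and a low register, and the
posted code masks the high byte with 0x1f for channels 1 to 3. The user-space
sketch below models only that arithmetic. The helper names are hypothetical,
and it assumes brightness fits in 16 bits. first_error is an assumed
alternative to combining two return codes with |=, since OR-ing two negative
errno values can produce a value that matches neither.

```c
#include <assert.h>
#include <stdint.h>

/* Channel 0: the high register takes the full upper byte. */
static uint8_t brt_high_cl0(unsigned int brightness)
{
	assert(brightness <= 0xffff);
	return (uint8_t)(brightness >> 8);
}

/* Channels 1-3: the posted code keeps only the low five bits. */
static uint8_t brt_high_cl1_3(unsigned int brightness)
{
	return (uint8_t)((brightness >> 8) & 0x1f);
}

/* All channels: the low register takes the lower byte. */
static uint8_t brt_low(unsigned int brightness)
{
	return (uint8_t)(brightness & 0xff);
}

/* Report the first failure instead of OR-ing two return codes together. */
static int first_error(int high_ret, int low_ret)
{
	return high_ret ? high_ret : low_ret;
}
```

For example, a brightness of 0x1234 splits into 0x12 and 0x34, while 0xffff on
channels 1 to 3 yields a high byte of 0x1f.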

Re: [PATCH] backlight: add new LP8860 backlight driver

2014-03-12 Thread Daniel Jeong


Thank you for your comments


On Monday, March 03, 2014 6:15 PM, Daniel Jeong wrote:
(+CC Bryan Wu, Lee Jones)

Please add Bryan Wu, Lee Jones to CC list, when you send
patches for backlight.


  This patch adds LP8860 backlight device driver.
LP8860 is a low EMI and High performance 4 channel LED Driver of TI.
This device driver provide the way to control brightness and currnet

(+CC Bryan Wu, Lee Jones)

s/provide/provides
s/currnet/current


of each channel and provide the way to write eeprom.

s/provide/provides


To support dt structure, another patch file will be sent.

Signed-off-by: Daniel Jeong 
---

'To support dt structure, another patch file will be sent.' is
NOT appropriate for the commit message. So, please move it as below.
Then, this message will not be included to the commit message, when
this patch will be merged to maintainer's tree.

Signed-off-by: Daniel Jeong 
---
To support dt structure, another patch file will be sent.



  drivers/video/backlight/Kconfig |7 +
  drivers/video/backlight/Makefile|1 +
  drivers/video/backlight/lp8860_bl.c |  528 +++
  include/linux/platform_data/lp8860_bl.h |   54 
  4 files changed, 590 insertions(+)
  create mode 100644 drivers/video/backlight/lp8860_bl.c
  create mode 100644 include/linux/platform_data/lp8860_bl.h

diff --git a/drivers/video/backlight/Kconfig b/drivers/video/backlight/Kconfig
index 5a3eb2e..908048f 100644
--- a/drivers/video/backlight/Kconfig
+++ b/drivers/video/backlight/Kconfig
@@ -397,6 +397,13 @@ config BACKLIGHT_LP8788
help
  This supports TI LP8788 backlight driver.

+config BACKLIGHT_LP8860
+   tristate "Backlight Driver for LP8860"
+   depends on BACKLIGHT_CLASS_DEVICE && I2C
+   select REGMAP_I2C
+   help
+ This supports TI LP8860 Backlight Driver
+
  config BACKLIGHT_OT200
tristate "Backlight driver for ot200 visualisation device"
depends on BACKLIGHT_CLASS_DEVICE && CS5535_MFGPT && GPIO_CS5535
diff --git a/drivers/video/backlight/Makefile b/drivers/video/backlight/Makefile
index bb82002..cbc5ac3 100644
--- a/drivers/video/backlight/Makefile
+++ b/drivers/video/backlight/Makefile
@@ -42,6 +42,7 @@ obj-$(CONFIG_BACKLIGHT_LM3639)+= lm3639_bl.o
  obj-$(CONFIG_BACKLIGHT_LOCOMO)+= locomolcd.o
  obj-$(CONFIG_BACKLIGHT_LP855X)+= lp855x_bl.o
  obj-$(CONFIG_BACKLIGHT_LP8788)+= lp8788_bl.o
+obj-$(CONFIG_BACKLIGHT_LP8860) += lp8860_bl.o
  obj-$(CONFIG_BACKLIGHT_LV5207LP)  += lv5207lp.o
  obj-$(CONFIG_BACKLIGHT_MAX8925)   += max8925_bl.o
  obj-$(CONFIG_BACKLIGHT_OMAP1) += omap1_bl.o
diff --git a/drivers/video/backlight/lp8860_bl.c b/drivers/video/backlight/lp8860_bl.c
new file mode 100644
index 000..4712e84
--- /dev/null
+++ b/drivers/video/backlight/lp8860_bl.c
@@ -0,0 +1,528 @@
+/*
+* Simple driver for Texas Instruments lp8860 Backlight driver chip
+*
+* Copyright (C) 2014 Texas Instruments
+* Author: Daniel Jeong  
+*Ldd Mlp 
+*
+* This program is free software; you can redistribute it and/or modify
+* it under the terms of the GNU General Public License version 2 as
+* published by the Free Software Foundation.
+*
+*/
+#include 

Please move this header in alphabetical order.


+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define REG_CL0_BRT_H  0x00
+#define REG_CL0_BRT_L  0x01
+#define REG_CL0_I_H0x02
+#define REG_CL0_I_L0x03
+
+#define REG_CL1_BRT_H  0x04
+#define REG_CL1_BRT_L  0x05
+#define REG_CL1_I  0x06
+
+#define REG_CL2_BRT_H  0x07
+#define REG_CL2_BRT_L  0x08
+#define REG_CL2_I  0x09
+
+#define REG_CL3_BRT_H  0x0a
+#define REG_CL3_BRT_L  0x0b
+#define REG_CL3_I  0x0c
+
+#define REG_CONF   0x0d
+#define REG_STATUS 0x0e
+#define REG_ID 0x12
+
+#define REG_ROM_CTRL   0x19
+#define REG_ROM_UNLOCK 0x1a
+#define REG_ROM_START  0x60
+#define REG_ROM_END0x78
+
+#define REG_EEPROM_START   0x60
+#define REG_EEPROM_END 0x78
+#define REG_MAX0xFF
+
+struct lp8860_chip {
+   struct device *dev;
+   struct lp8860_platform_data *pdata;
+   struct backlight_device *bled[LP8860_LED_MAX];
+   struct regmap *regmap;
+};
+
+/* brightness control */
+static int lp8860_bled_update_status(struct backlight_device *bl,
+enum lp8860_leds nsr)
+{
+   int ret = -EINVAL;
+   struct lp8860_chip *pchip = bl_get_data(bl);
+
+   if (pchip->pdata->mode)
+   return 0;
+
+   if (bl->props.state & (BL_CORE_SUSPENDED | BL_CORE_FBBLANK))
+   bl->props.brightness = 0;
+
+   switch (nsr) {
+   case LP8860_LED0:
+   ret = regmap_write(pchip->regmap,
+  REG_CL0_BRT_H, bl->props.brightness >> 8);
+   ret |=

Re: Trusted kernel patchset for Secure Boot lockdown

2014-03-12 Thread Matthew Garrett

On Fri, 2014-02-28 at 14:03 +1100, James Morris wrote:

> Ok, which tree should take this?  I'm happy to, although most of it is 
> outside security/ .

Should I be looking for someone else to take them instead? :)

-- 
Matthew Garrett

Re: [PATCH v2 1/5] gpiolib: Allow GPIO chips to request their own GPIOs

2014-03-12 Thread Alexandre Courbot

On Mon, Mar 10, 2014 at 9:54 PM, Mika Westerberg
 wrote:
> Sometimes it is useful to allow GPIO chips themselves to request GPIOs they
> own through gpiolib API. One use case is ACPI ASL code that should be able
> to toggle GPIOs through GPIO operation regions.
>
> We can't use gpio_request() because it will pin the module to the kernel
> forever (it calls try_module_get()). To solve this we move module refcount
> manipulation to gpiod_request() and let __gpiod_request() handle the actual
> request. This changes the sequence a bit as now try_module_get() is called
> outside of gpio_lock (I think this is safe, try_module_get() handles
> serialization it needs already).
>
> Then we provide gpiolib internal functions gpiochip_request/free_own_desc()
> that do the same as gpio_request() but don't manipulate module refrence
> count. This allows the GPIO chip driver to request and free descriptors it
> owns without being pinned to the kernel forever.
>
> Signed-off-by: Mika Westerberg 

Reviewed-by: Alexandre Courbot 

The change is clear and does not add too much complexity to the code,
so no reason to oppose it.
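
An illustrative aside: the split described in the quoted commit message
(module refcount handled in the outer request, the actual request done in an
inner helper, and an "own descriptor" path that skips the refcount) can be
modelled in user space as below. This is a sketch under assumptions; all
struct and function names here are hypothetical stand-ins, not the gpiolib
implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct module_model {
	int refcount;
};

struct gpio_desc_model {
	struct module_model *owner;
	bool requested;
};

/* Inner helper: marks the descriptor as requested, no refcounting. */
static int request_desc(struct gpio_desc_model *d)
{
	assert(d != NULL);
	if (d->requested)
		return -EBUSY;
	d->requested = true;
	return 0;
}

/* Consumer path: pins the owning module for as long as the GPIO is held. */
static int gpiod_request_model(struct gpio_desc_model *d)
{
	int ret;

	d->owner->refcount++;           /* models try_module_get() */
	ret = request_desc(d);
	if (ret)
		d->owner->refcount--;   /* models module_put() on failure */
	return ret;
}

/* Chip-internal path: the chip requests its own GPIO without pinning itself. */
static int gpiochip_request_own_desc_model(struct gpio_desc_model *d)
{
	return request_desc(d);
}
```

The design point is that only the consumer path changes the module reference
count, so a chip driver holding its own descriptors can still be unloaded.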

Re: [PATCH v2 2/2] kallsyms: handle special absolute symbols

2014-03-12 Thread Rusty Russell

Kees Cook  writes:
> Why not just do this with 0-base-address detection like my v2? That
> would mean we don't need to remember to add this flag in the future to
> imagined new architectures that might want this 0-based per_cpu
> feature.

Because future architectures will get this right and emit absolute
symbols.  I hope!

I'm swamped at the moment, but am hoping to investigate that for
x86-64.  This is a stop-gap.

Cheers,
Rusty.

Re: [PATCH] fs: fix i_writecount on shmem and friends

2014-03-12 Thread Al Viro

On Thu, Mar 13, 2014 at 03:08:00PM +1100, NeilBrown wrote:
> + inode = mddev->bitmap_info.file->f_mapping->host;
> + if (!S_ISREG(inode->i_mode)) {
> + printk(KERN_ERR "%s: error: bitmap file must be a 
> regular file\n",
> +mdname(mddev));
> + fput(mddev->bitmap_info.file);
> + mddev->bitmap_info.file = NULL;
> + return -EBADF;
> + }
> + if (atomic_read(&inode->i_writecount) != 1) {

Umm...  I think you ought to check more than that.  At the very least you
want to check that you have it opened for write - you don't want e.g.
a filesystem containing that puppy remounted r/o under you.  Another thing
is, what happens if it's not a buffer cache backed one?  Hell, what happens
if it's a file on NFS?  You are relying on bmap() working, right?  So it
looks like you ought to check if ->bmap() is there.  And I really wonder
how well does it play with journalling fs...

Re: [PATCH v3] zram: support REQ_DISCARD

2014-03-12 Thread Joonsoo Kim

On Wed, Mar 12, 2014 at 08:03:03PM -0700, Andrew Morton wrote:
> On Thu, 13 Mar 2014 11:46:17 +0900 Joonsoo Kim  wrote:
> 
> > +   while (n >= PAGE_SIZE) {
> > +   /*
> > +* discard request can be too large so that the zram can
> > +* be stucked for a long time if we handle the request
> > +* at once. So handle the request by PAGE_SIZE unit at a time.
> > +*/
> > +   write_lock(&zram->meta->tb_lock);
> > +   zram_free_page(zram, index);
> > +   write_unlock(&zram->meta->tb_lock);
> > +   index++;
> > +   n -= PAGE_SIZE;
> > +   }
> 
> Well, you could use something like
> 
>   if (need_resched()) {
>   unlock()
>   schedule()
>   lock()
>   }
> 
> here, or free 100 pages at a time or something silly like that.  I
> guess we retain these as options if/when that lock turns out to be
> contended.

Okay! I will postpone this until that lock turns out to be contended.
Here goes new one.

Thanks.
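
An illustrative aside on the trade-off discussed above (one lock hold per
page, one lock hold for the whole request, or something in between such as
100 pages at a time): the user-space sketch below only counts how often the
lock would be taken for a given batch size. struct lock_stats and
discard_in_batches are hypothetical names, and no real locking is performed.

```c
#include <assert.h>
#include <stddef.h>

struct lock_stats {
	size_t acquisitions; /* how many times the lock would be taken */
	size_t pages_freed;  /* how many pages would be freed in total */
};

/* Frees npages in chunks of at most batch pages per lock hold. */
static void discard_in_batches(size_t npages, size_t batch,
			       struct lock_stats *st)
{
	assert(batch > 0 && st != NULL);
	while (npages > 0) {
		size_t chunk = npages < batch ? npages : batch;

		st->acquisitions++;       /* write_lock() would go here */
		st->pages_freed += chunk; /* one zram_free_page() per page */
		npages -= chunk;          /* write_unlock() would go here */
	}
}
```

A batch of 1 corresponds to the per-page locking quoted above, and a batch at
least as large as the request corresponds to holding the lock once for the
whole discard.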

-->8---
From 81f1be2f095c175ad29505344f11eb86f51fdc93 Mon Sep 17 00:00:00 2001
From: Joonsoo Kim 
Date: Mon, 24 Feb 2014 14:30:43 +0900
Subject: [PATCH v5] zram: support REQ_DISCARD

zram is a ram-based block device and can be used as the backend of a
filesystem. When a filesystem deletes a file, it normally doesn't do anything
to the data blocks of that file. It just marks the metadata of that file. This
behavior is no problem on a disk-based block device, but is a problem on a
ram-based block device, since we can't free the memory used for the data
blocks. To overcome this disadvantage, there is the REQ_DISCARD functionality.
If the block device supports REQ_DISCARD and the filesystem is mounted with
the discard option, the filesystem sends REQ_DISCARD to the block device
whenever some data blocks are discarded. All we have to do is to handle this
request.

This patch sets QUEUE_FLAG_DISCARD and handles the REQ_DISCARD request. With
it, we can free memory used by zram if it isn't used.

v2: handle unaligned case commented by Jerome
v3: conditionally set zero to discard_zeroes_data commented by Minchan
reuse index, offset in __zram_make_request() commented by Sergey.
v4: replenish code comments suggested by Andrew.
v5: handle all range of discard request at once suggested by Andrew.

Signed-off-by: Joonsoo Kim 

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 7631ef0..1118086 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -541,6 +541,43 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
return ret;
 }
 
+/*
+ * zram_bio_discard - handler on discard request
+ * @index: physical block index by PAGE_SIZE unit
+ * @offset: offset within physical block
+ */
+static void zram_bio_discard(struct zram *zram, u32 index,
+int offset, struct bio *bio)
+{
+   size_t n = bio->bi_iter.bi_size;
+
+   /*
+* zram manages data by physical block size unit. Because logical block
+* size isn't identical with physical block size on some arch, we
+* could get discard request pointing to specific offset within certain
+* physical block. Although we can handle this request by reading that
+* physical block and decompressing and partially zeroing and
+* re-compressing and then re-storing it, it isn't reasonable because
+* our intention of handling discard request is to save memory.
+* So skipping this logical block is appropriate here.
+*/
+   if (offset) {
+   if (n < offset)
+   return;
+
+   n -= offset;
+   index++;
+   }
+
+   write_lock(&zram->meta->tb_lock);
+   while (n >= PAGE_SIZE) {
+   zram_free_page(zram, index);
+   index++;
+   n -= PAGE_SIZE;
+   }
+   write_unlock(&zram->meta->tb_lock);
+}
+
 static void zram_reset_device(struct zram *zram, bool reset_capacity)
 {
size_t index;
@@ -676,6 +713,12 @@ static void __zram_make_request(struct zram *zram, struct bio *bio)
offset = (bio->bi_iter.bi_sector &
  (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
 
+   if (unlikely(bio->bi_rw & REQ_DISCARD)) {
+   zram_bio_discard(zram, index, offset, bio);
+   bio_endio(bio, 0);
+   return;
+   }
+
bio_for_each_segment(bvec, bio, iter) {
int max_transfer_size = PAGE_SIZE - offset;
 
@@ -845,6 +888,20 @@ static int create_device(struct zram *zram, int device_id)
ZRAM_LOGICAL_BLOCK_SIZE);
blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
+   zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
+   zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
+   /*
+* zram_bio_discard() will

Re: [PATCH 3/4] devfreq: exynos4: Add ppmu's clock control and code clean about regulator control

2014-03-12 Thread Chanwoo Choi

Hi Bartlomiej,

On 03/13/2014 11:15 AM, Chanwoo Choi wrote:
> Hi Batlomiej,
> 
> On 03/13/2014 12:17 AM, Bartlomiej Zolnierkiewicz wrote:
>>
>> Hi,
>>
>> On Wednesday, March 12, 2014 08:48:01 PM Chanwoo Choi wrote:
>>> There are not the clock controller of ppmudmc0/1. This patch control the 
>>> clock
>>> of ppmudmc0/1 which is used for monitoring memory bus utilization.
>>>
>>> Also, this patch code clean about regulator control and free resource
>>> when calling exit/remove function.
>>>
>>> For example,
>>> busfreq@106A {
>>> compatible = "samsung,exynos4x12-busfreq";
>>>
>>> /* Clock for PPMUDMC0/1 */
>>> clocks = < CLK_PPMUDMC0>, < CLK_PPMUDMC1>;
>>> clock-names = "ppmudmc0", "ppmudmc1";
>>>
>>> /* Regulator for MIF/INT block */
>>> vdd_mif-supply = <_reg>;
>>> vdd_int-supply = <_reg>;
>>> };
>>
>> This should be in Documentation/devicetree/bindings/ documentation.
> 
> OK, I will add documentation about it.
> 
>>
>>> Signed-off-by: Chanwoo Choi 
>>> ---
>>>  drivers/devfreq/exynos/exynos4_bus.c | 107 
>>> ++-
>>>  1 file changed, 93 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/exynos/exynos4_bus.c 
>>> b/drivers/devfreq/exynos/exynos4_bus.c
>>> index 16fb3cb..0c5b99e 100644
>>> --- a/drivers/devfreq/exynos/exynos4_bus.c
>>> +++ b/drivers/devfreq/exynos/exynos4_bus.c
>>> @@ -62,6 +62,11 @@ enum exynos_ppmu_idx {
>>> PPMU_END,
>>>  };
>>>  
>>> +static const char *exynos_ppmu_clk_name[] = {
>>> +   [PPMU_DMC0] = "ppmudmc0",
>>> +   [PPMU_DMC1] = "ppmudmc1",
>>> +};
>>> +
>>>  #define EX4210_LV_MAX  LV_2
>>>  #define EX4x12_LV_MAX  LV_4
>>>  #define EX4210_LV_NUM  (LV_2 + 1)
>>> @@ -86,6 +91,7 @@ struct busfreq_data {
>>> struct regulator *vdd_mif; /* Exynos4412/4212 only */
>>> struct busfreq_opp_info curr_oppinfo;
>>> struct exynos_ppmu ppmu[PPMU_END];
>>> +   struct clk *clk_ppmu[PPMU_END];
>>>  
>>> struct notifier_block pm_notifier;
>>> struct mutex lock;
>>> @@ -722,8 +728,26 @@ static int exynos4_bus_get_dev_status(struct device 
>>> *dev,
>>>  static void exynos4_bus_exit(struct device *dev)
>>>  {
>>> struct busfreq_data *data = dev_get_drvdata(dev);
>>> +   int i;
>>>  
>>> -   devfreq_unregister_opp_notifier(dev, data->devfreq);
>>> +   /*
>>> +* Un-map memory man and disable regulator/clocks
>>> +* to prevent power leakage.
>>> +*/
>>> +   regulator_disable(data->vdd_int);
>>> +   if (data->type == TYPE_BUSF_EXYNOS4x12)
>>> +   regulator_disable(data->vdd_mif);
>>> +
>>> +   for (i = 0; i < PPMU_END; i++) {
>>> +   if (data->clk_ppmu[i])
>>> +   clk_disable_unprepare(data->clk_ppmu[i]);
>>> +   }
>>> +
>>> +   for (i = 0; i < PPMU_END; i++) {
>>> +   if (data->ppmu[i].hw_base)
>>> +   iounmap(data->ppmu[i].hw_base);
>>> +
>>> +   }
>>>  }
>>>  
>>>  static struct devfreq_dev_profile exynos4_devfreq_profile = {
>>> @@ -987,6 +1011,7 @@ static int exynos4_busfreq_parse_dt(struct 
>>> busfreq_data *data)
>>>  {
>>> struct device *dev = data->dev;
>>> struct device_node *np = dev->of_node;
>>> +   const char **clk_name = exynos_ppmu_clk_name;
>>> int i, ret;
>>>  
>>> if (!np) {
>>> @@ -1005,8 +1030,67 @@ static int exynos4_busfreq_parse_dt(struct 
>>> busfreq_data *data)
>>> }
>>> }
>>>  
>>> +   /*
>>> +* Get PPMU's clocks to control them. But, if PPMU's clocks
>>> +* is default 'pass' state, this driver don't need control
>>> +* PPMU's clock.
>>> +*/
>>> +   for (i = 0; i < PPMU_END; i++) {
>>> +   data->clk_ppmu[i] = devm_clk_get(dev, clk_name[i]);
>>> +   if (IS_ERR_OR_NULL(data->clk_ppmu[i])) {
>>> +   dev_warn(dev, "Cannot get %s clock\n", clk_name[i]);
>>> +   data->clk_ppmu[i] = NULL;
>>> +   }
>>> +
>>> +   ret = clk_prepare_enable(data->clk_ppmu[i]);
>>> +   if (ret < 0) {
>>> +   dev_warn(dev, "Cannot enable %s clock\n", clk_name[i]);
>>> +   data->clk_ppmu[i] = NULL;
>>> +   goto err_clocks;
>>> +   }
>>> +   }
>>> +
>>> +
>>> +   /* Get regulators to control voltage of int/mif block */
>>> +   data->vdd_int = devm_regulator_get(dev, "vdd_int");
>>> +   if (IS_ERR(data->vdd_int)) {
>>> +   dev_err(dev, "Failed to get the regulator of vdd_int\n");
>>> +   ret = PTR_ERR(data->vdd_int);
>>> +   goto err_clocks;
>>> +   }
>>> +   ret = regulator_enable(data->vdd_int);
>>> +   if (ret < 0) {
>>> +   dev_err(dev, "Failed to enable regulator of vdd_int\n");
>>> +   goto err_clocks;
>>> +   }
>>> +
>>> +   switch (data->type) {
>>> +   case TYPE_BUSF_EXYNOS4x12:
>>> +   data->vdd_mif = devm_regulator_get(dev, "vdd_mif");
>>> +   if (IS_ERR(data->vdd_mif)) {
>>> +   dev_err(dev, "Failed to get the regulator vdd_mif\n");
>>> +   ret

Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread joeyli

On Wed, 2014-03-12 at 20:59 -0700, H. Peter Anvin wrote:
> On 03/12/2014 08:55 PM, joeyli wrote:
> > 
> > So do not care "CMOS RTC Not Present", if TAD is present then we use it
> > instead of CMOS RTC in all kernel code? or we still can use CMOS RTC?
> > 
> 
> Why would we use *both*!?  How would that possibly make sense?
> 
>   -hpa
> 

Yes, it does not make sense to use both.

I switched the code in get_rtc_time() and set_rtc_time() to use TAD when it is
present; I just want to make sure I'm on the right path.


Thanks a lot!
Joey Lee


Re: [PATCH v2 0/2] block: Use pci_enable_msix_exact() instead of pci_enable_msix()

2014-03-12 Thread Bjorn Helgaas

On Wed, Feb 26, 2014 at 10:02:40AM +0100, Alexander Gordeev wrote:
> Changes since v1:
>   - cciss: patch #1: a weird 'goto' removed;
>   - cciss: patch #2: pci_enable_msix_exact() used, not 
> pci_enable_msix_range();
>   - rsxx: patch dropped - no need to change anything;
> 
> As result of deprecation of MSI-X/MSI enablement functions
> pci_enable_msix() and pci_enable_msi_block() all drivers
> using these two interfaces need to be updated to use the
> new pci_enable_msi_range()  or pci_enable_msi_exact()
> and pci_enable_msix_range() or pci_enable_msix_exact()
> interfaces.
> 
> This change updates 'cciss' only, but there is also 'nvme' update
> pending - I am waiting for Intel guys to clarify if they want to
> route it thru their tree. If not, I will post the 'nvme' patch as
> a follow-up to this series.

Hi Jens,

I'd like to get these merged during the v3.15 merge window.  I'd be glad to
review and apply them through my tree, unless you want to do it.  They do
depend on f7fc32c, which went in after the v3.14 merge window, which makes
it a bit of a hassle.

So let me know if you'd rather handle these; otherwise I'll review them and
put them in my tree next week.

I'll include the nvme update, since it has Keith's "Reviewed-by".

Bjorn

> Cc: Jens Axboe 
> Cc: Mike Miller 
> Cc: iss_storage...@hp.com
> Cc: linux-...@vger.kernel.org
> 
> Alexander Gordeev (2):
>   cciss: Fallback to MSI rather than to INTx if MSI-X failed
>   cciss: Use pci_enable_msix_exact() instead of pci_enable_msix()
> 
>  drivers/block/cciss.c |8 +---
>  1 files changed, 1 insertions(+), 7 deletions(-)
> 
> -- 
> 1.7.7.6
> 
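The first cciss patch title describes a fallback order: try MSI-X, then MSI, and only then legacy INTx. A minimal userspace sketch of that order follows. The mock_* helpers and struct mock_dev are hypothetical stand-ins so the logic can run outside the kernel; they are not the PCI API, and this is not the cciss code itself.

```c
#include <assert.h>

/* Interrupt modes, most preferred first. */
enum irq_mode { IRQ_MSIX, IRQ_MSI, IRQ_INTX };

/* Hypothetical stand-in for a device: which modes the mock hardware grants. */
struct mock_dev {
    int msix_ok;   /* nonzero if the exact MSI-X request would succeed */
    int msi_ok;    /* nonzero if enabling MSI would succeed */
};

/* Mock of an "exact" MSI-X request: 0 on success, negative on failure. */
static int mock_enable_msix_exact(const struct mock_dev *dev, int nvec)
{
    assert(nvec > 0);
    return dev->msix_ok ? 0 : -1;
}

/* Mock of an MSI request: 0 on success, negative on failure. */
static int mock_enable_msi(const struct mock_dev *dev)
{
    return dev->msi_ok ? 0 : -1;
}

/* Pick the best interrupt mode the mock device will grant. */
static enum irq_mode pick_irq_mode(const struct mock_dev *dev, int nvec)
{
    if (mock_enable_msix_exact(dev, nvec) == 0)
        return IRQ_MSIX;
    /* MSI-X failed: try MSI before settling for INTx. */
    if (mock_enable_msi(dev) == 0)
        return IRQ_MSI;
    return IRQ_INTX;
}
```

Calling pick_irq_mode() on a mock device whose MSI-X request fails but whose MSI request succeeds yields IRQ_MSI rather than IRQ_INTX, which is the behaviour the patch title asks for.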

Re: [PATCH] fs: fix i_writecount on shmem and friends

2014-03-12 Thread NeilBrown

On Wed, 12 Mar 2014 18:19:25 + Al Viro  wrote:

> On Tue, Mar 11, 2014 at 12:05:09PM -0700, Linus Torvalds wrote:
> > 
> > which returns ETXTBSY (most easily seen by just stracing it).
> > 
> > The patch would also seem to make sense, with the i_readcount_inc()
> > being immediately below for the FMODE_READ case.
> 
> I think it's trying to fix the problem in the wrong place.  The bug is real,
> all right, but it's not that alloc_file() for non-regulars doesn't grab
> writecount; it's that drop_file_write_access() drops it for those.
> 
> What the hell would we want to play with that counter for, anyway?  It's not
> as if they could be mmapped, so all it does is making pipe(2) and socket(2)
> more costly, for no visible reason.
> 
> I would prefer to flip
> 	put_write_access(inode);
> 
> 	if (special_file(inode->i_mode))
> 		return;
> in drop_file_write_access() instead.
> 
> 
> Oh, shit...
> 
> drivers/md/md.c:
> /* similar to deny_write_access, but accounts for our holding a reference
>  * to the file ourselves */
> static int deny_bitmap_write_access(struct file * file)
> {
> 	struct inode *inode = file->f_mapping->host;
> 
> 	spin_lock(&inode->i_lock);
> 	if (atomic_read(&inode->i_writecount) > 1) {
> 		spin_unlock(&inode->i_lock);
> 		return -ETXTBSY;
> 	}
> 	atomic_set(&inode->i_writecount, -1);
> 	spin_unlock(&inode->i_lock);
> 
> 	return 0;
> }
> 
> Broken.  get_write_access() will happily increment i_writecount e.g. from
> 1 to 2, without even looking at i_lock.

I guess someone changed exactly how i_writecount is used without checking all
users ... that isn't like you Al :-)


>  Moreover, it's paired with
> void restore_bitmap_write_access(struct file *file)
> {
> 	struct inode *inode = file->f_mapping->host;
> 
> 	spin_lock(&inode->i_lock);
> 	atomic_set(&inode->i_writecount, 1);
> 	spin_unlock(&inode->i_lock);
> }
> Just what will happen if we do denywrite mmap() of that file in between?
> Even worse, the caller takes the file straight from fget(), with no sanity
> checks whatsoever.  Just what will happen if I give it e.g. a directory?
> Or a procfs/sysfs/whatnot file, for that matter?  Neil?  I realize that
> it's root-only, but still...

But as you point out, even "fixing" it to match the current i_writecount
behaviour wouldn't really be a proper fix.
Probably best to stop messing with i_writecount and just use it to guard
against using the same bitmap twice.
It won't prevent some other process writing to the file but that shouldn't
happen anyway.

Maybe something like the following once I've tested it.

Thanks,
NeilBrown

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 4195a01b1535..9a8e66ae04f5 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1988,7 +1988,6 @@ location_store(struct mddev *mddev, const char *buf, 
size_t len)
if (mddev->bitmap_info.file) {
struct file *f = mddev->bitmap_info.file;
mddev->bitmap_info.file = NULL;
-   restore_bitmap_write_access(f);
fput(f);
}
} else {
diff --git a/drivers/md/md.c b/drivers/md/md.c
index e28c9d2a1166..223126046e02 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5181,32 +5181,6 @@ static int restart_array(struct mddev *mddev)
return 0;
 }
 
-/* similar to deny_write_access, but accounts for our holding a reference
- * to the file ourselves */
-static int deny_bitmap_write_access(struct file * file)
-{
-   struct inode *inode = file->f_mapping->host;
-
-   spin_lock(&inode->i_lock);
-   if (atomic_read(&inode->i_writecount) > 1) {
-       spin_unlock(&inode->i_lock);
-       return -ETXTBSY;
-   }
-   atomic_set(&inode->i_writecount, -1);
-   spin_unlock(&inode->i_lock);
-
-   return 0;
-}
-
-void restore_bitmap_write_access(struct file *file)
-{
-   struct inode *inode = file->f_mapping->host;
-
-   spin_lock(&inode->i_lock);
-   atomic_set(&inode->i_writecount, 1);
-   spin_unlock(&inode->i_lock);
-}
-
 static void md_clean(struct mddev *mddev)
 {
mddev->array_sectors = 0;
@@ -5427,7 +5401,6 @@ static int do_md_stop(struct mddev * mddev, int mode,
 
bitmap_destroy(mddev);
if (mddev->bitmap_info.file) {
-   restore_bitmap_write_access(mddev->bitmap_info.file);
fput(mddev->bitmap_info.file);
mddev->bitmap_info.file = NULL;
}
@@ -5991,6 +5964,7 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
 
 
if (fd >= 0) {
+   struct inode *inode;
if (mddev->bitmap)
return -EEXIST; /* cannot add when bitmap is present */
mddev->bitmap_info.file = fget(fd);
@@ -6001,13 +5975,20 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
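
The race Al describes (a check-then-set done under i_lock while get_write_access() bumps the counter without taking that lock) can be replayed deterministically in userspace. The sketch below is a toy model built on C11 atomics, assuming get_write_access() behaves like "increment unless negative"; the model_* and deny_* names are made up for illustration and none of this is the md or VFS code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model of inode->i_writecount: > 0 means writers, < 0 means writes denied. */
static atomic_int writecount;

/* Model of a lock-free get_write_access(): succeeds unless the count is negative. */
static bool model_get_write_access(void)
{
    int v = atomic_load(&writecount);

    while (v >= 0) {
        /* On failure the CAS reloads v, so the loop re-checks the sign. */
        if (atomic_compare_exchange_weak(&writecount, &v, v + 1))
            return true;
    }
    return false;
}

/* The two halves of the deny helper. In the original both run under i_lock,
 * but that lock does not stop model_get_write_access() from running between them. */
static bool deny_check(void)
{
    return atomic_load(&writecount) <= 1;
}

static void deny_commit(void)
{
    atomic_store(&writecount, -1);
}

/* Replays one bad interleaving; returns true if a writer that was granted
 * access is no longer reflected in the counter. */
static bool replay_lost_writer(void)
{
    atomic_store(&writecount, 1);              /* md holds the only write reference */

    bool check_passed = deny_check();          /* sees 1, decides to proceed */
    bool writer_in = model_get_write_access(); /* races in: 1 -> 2 */

    assert(check_passed);
    deny_commit();                             /* overwrites 2 with -1 */

    return writer_in && atomic_load(&writecount) == -1;
}
```

After replay_lost_writer() the counter says "writes denied" even though a writer was admitted in between, which is the inconsistency pointed out above.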

Re: [PATCH v2 00/23] scsi: Use pci_enable_msix_range() instead of pci_enable_msix()

2014-03-12 Thread Bjorn Helgaas

On Mon, Feb 24, 2014 at 09:02:00AM +0100, Alexander Gordeev wrote:
> Hello!
> 
> This series is against James Bottomley's SCSI tree [1], but it needs
> commit f7fc32c ("PCI/MSI: Add pci_enable_msi_exact() and
> pci_enable_msix_exact()") from Bjorn Helgaas's PCI tree [2]:
> 
> 1. git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
> 2. git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git pci/msi
> 
> Recently pci_enable_msix_exact() function has been accepted to
> the mainline. That is a variation of pci_enable_msix_range() which
> allows a device driver to request a particular number of MSI-Xs.
> 
> As result, most of the changes posted in version 1 of this series
> are invalidated and need to use pci_enable_msix_exact() instead of
> originally posted pci_enable_msix_range() usages.
> 
> I removed almost all ACKs, since unlike pci_enable_msix_range()
> function which returns the number of MSI-Xs allocated or negative
> errno, pci_enable_msix_exact() returns either zero success code or
> a negative errno. Although this change is simple, it still entails
> an updated error code analysis and would be better reviewed by
> driver maintainers.

Hi James,

I think Alexander sent these to linux-scsi hoping that you would handle
them, but I know it's a hassle because they depend on f7fc32c, which went
in after the merge window.

I'd be glad to review these and apply them through my tree, unless you want
to do it.  I'd like to get these merged in the v3.15 merge window so
Alexander can move on to something else.  I haven't checked for merge
conflicts with scsi.git yet, but I assume they'd be pretty trivial if there
are any.

Bjorn

> Cc: iss_storage...@hp.com
> Cc: intel-linux-...@intel.com
> Cc: supp...@lsi.com
> Cc: dl-mptfusionli...@lsi.com
> Cc: qla2xxx-upstr...@qlogic.com
> Cc: iscsi-dri...@qlogic.com
> Cc: pv-driv...@vmware.com
> Cc: linux-s...@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> 
> Alexander Gordeev (23):
>   be2iscsi: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   bfa: Do not call pci_enable_msix() after it failed once
>   bfa: Cleanup bfad_setup_intr() function
>   bfa: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   csiostor: Remove superfluous call to pci_disable_msix()
>   csiostor: Use pci_enable_msix_range() instead of pci_enable_msix()
>   fnic: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   hpsa: Fallback to MSI rather than to INTx if MSI-X failed
>   hpsa: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   isci: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   lpfc: Remove superfluous call to pci_disable_msix()
>   lpfc: Use pci_enable_msix_range() instead of pci_enable_msix()
>   megaraid: Fail resume if MSI-X re-initialization failed
>   megaraid: Use pci_enable_msix_range() instead of pci_enable_msix()
>   mpt2sas: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   mpt3sas: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   pm8001: Fix invalid return when request_irq() failed
>   pm8001: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   pmcraid: Get rid of a redundant assignment
>   pmcraid: Use pci_enable_msix_range() instead of pci_enable_msix()
>   qla2xxx: Use pci_enable_msix_range() instead of pci_enable_msix()
>   qla4xxx: Use pci_enable_msix_exact() instead of pci_enable_msix()
>   vmw_pvscsi: Use pci_enable_msix_exact() instead of pci_enable_msix()
> 
>  drivers/scsi/be2iscsi/be_main.c   |6 +--
>  drivers/scsi/bfa/bfad.c   |   62 
> -
>  drivers/scsi/csiostor/csio_hw.h   |2 +-
>  drivers/scsi/csiostor/csio_isr.c  |   24 ---
>  drivers/scsi/fnic/fnic_isr.c  |4 +-
>  drivers/scsi/hpsa.c   |   12 +-
>  drivers/scsi/isci/init.c  |2 +-
>  drivers/scsi/lpfc/lpfc_init.c |   54 -
>  drivers/scsi/megaraid/megaraid_sas_base.c |   24 +--
>  drivers/scsi/mpt2sas/mpt2sas_base.c   |6 +-
>  drivers/scsi/mpt3sas/mpt3sas_base.c   |4 +-
>  drivers/scsi/pm8001/pm8001_init.c |   44 +++--
>  drivers/scsi/pmcraid.c|   14 +--
>  drivers/scsi/qla2xxx/qla_isr.c|   27 +---
>  drivers/scsi/qla4xxx/ql4_nx.c |2 +-
>  drivers/scsi/vmw_pvscsi.c |2 +-
>  16 files changed, 121 insertions(+), 168 deletions(-)
> 
> -- 
> 1.7.7.6
> 
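The cover letter's point about return values is easy to trip over when converting callers: the range variant reports how many vectors were granted, while the exact variant reports only zero or a negative error. A small userspace sketch of the two conventions follows. The mock_* functions and struct mock_pci are hypothetical stand-ins, and modelling "exact" as a range request with min == max is an assumption made for this sketch.

```c
#include <assert.h>

/* Hypothetical device model: how many vectors the pretend hardware can supply. */
struct mock_pci {
    int avail;
};

/* "Range" convention: returns the number of vectors granted (>= minvec),
 * or a negative value if even minvec cannot be satisfied. */
static int mock_msix_range(const struct mock_pci *d, int minvec, int maxvec)
{
    assert(minvec > 0 && minvec <= maxvec);
    if (d->avail < minvec)
        return -1;
    return d->avail < maxvec ? d->avail : maxvec;
}

/* "Exact" convention: 0 on success, negative on failure. */
static int mock_msix_exact(const struct mock_pci *d, int nvec)
{
    int rc = mock_msix_range(d, nvec, nvec);

    return rc < 0 ? rc : 0;
}

/* Caller written for the range convention: a positive count means success. */
static int setup_range(const struct mock_pci *d, int want)
{
    int got = mock_msix_range(d, 1, want);

    return got < 0 ? 0 : got;    /* number of usable vectors */
}

/* Caller written for the exact convention: zero means success. */
static int setup_exact(const struct mock_pci *d, int want)
{
    return mock_msix_exact(d, want) == 0 ? want : 0;
}
```

A caller that keeps testing "rc > 0" after switching to the exact variant would treat every success as a failure, which is why the error handling in each converted driver needs a fresh look.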

Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread H. Peter Anvin

On 03/12/2014 08:55 PM, joeyli wrote:
> 
> So should we ignore "CMOS RTC Not Present"? If TAD is present, do we use it
> instead of the CMOS RTC in all kernel code, or can we still use the CMOS RTC?
> 

Why would we use *both*!?  How would that possibly make sense?

-hpa



Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread joeyli

On Wed, 2014-03-12 at 20:11 -0700, H. Peter Anvin wrote:
> On 03/12/2014 07:38 PM, joeyli wrote:
> > 
> > I sent the rtc-acpitad driver for the RTC subsystem last month; I will send
> > a second version.
> > 
> > The reason for using TAD to set the wall clock is that the ACPI 5.0 spec has
> > a "CMOS RTC Not Present" flag in the FADT to indicate that OSPM should use
> > TAD when this flag is set:
> > 
> > CMOS RTC Not Present
> > 
> 
> Bullsh*t.  The TAD should be used if it is present, it has nothing to do
> with this flag.
> 
>   -hpa
> 
> 

So should we ignore "CMOS RTC Not Present"? If TAD is present, do we use it
instead of the CMOS RTC in all kernel code, or can we still use the CMOS RTC?


Regards
Joey Lee



Re: [PATCH v2 RESEND 2/4] portdrv: Use pci_enable_msix_exact() instead of pci_enable_msix()

2014-03-12 Thread Bjorn Helgaas

On Thu, Mar 06, 2014 at 09:11:22PM +0100, Alexander Gordeev wrote:
> As result of deprecation of MSI-X/MSI enablement functions
> pci_enable_msix() and pci_enable_msi_block() all drivers
> using these two interfaces need to be updated to use the
> new pci_enable_msi_range()  or pci_enable_msi_exact()
> and pci_enable_msix_range() or pci_enable_msix_exact()
> interfaces.
> 
> Signed-off-by: Alexander Gordeev 
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org

Applied to pci/msi for v3.15, thanks!

> ---
>  drivers/pci/pcie/portdrv_core.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> index 986f8ea..0b1efb2 100644
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -99,7 +99,7 @@ static int pcie_port_enable_msix(struct pci_dev *dev, int *vectors, int mask)
>   for (i = 0; i < nr_entries; i++)
>   msix_entries[i].entry = i;
>  
> - status = pci_enable_msix(dev, msix_entries, nr_entries);
> + status = pci_enable_msix_exact(dev, msix_entries, nr_entries);
>   if (status)
>   goto Exit;
>  
> @@ -171,7 +171,7 @@ static int pcie_port_enable_msix(struct pci_dev *dev, int *vectors, int mask)
>   pci_disable_msix(dev);
>  
>   /* Now allocate the MSI-X vectors for real */
> - status = pci_enable_msix(dev, msix_entries, nvec);
> + status = pci_enable_msix_exact(dev, msix_entries, nvec);
>   if (status)
>   goto Exit;
>   }
> -- 
> 1.7.7.6
> 

Re: [PATCH 0/3] Improve 32 bit vDSO time

2014-03-12 Thread H. Peter Anvin

On 03/12/2014 04:11 PM, stef...@seibold.net wrote:
> 
> I will do this when your patch is pulled into tip. For now we have the
> choice, but I prefer our solution of removing the compat vdso.
> 

Sorry, that didn't parse for me.

Also, if you state a preference, could you please motivate it?

-hpa



Re: [PATCH 2/2] pstore: fix memory leak when decompress using big_oops_buf

2014-03-12 Thread Kees Cook

On Wed, Mar 12, 2014 at 6:34 AM, Liu Shuo  wrote:
> From: Liu ShuoX 
>
> After successful decompression, the buffer pointed to by 'buf' is lost,
> because 'buf' is overwritten with 'big_oops_buf', and it is never freed.
> Signed-off-by: Liu ShuoX 

Thanks again!

Acked-by: Kees Cook 

-Kees

> ---
> fs/pstore/platform.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index 78c3c20..46d269e 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -497,6 +497,7 @@ void pstore_get_records(int quiet)
> big_oops_buf_sz);
>
> if (unzipped_len > 0) {
> +   kfree(buf);
> buf = big_oops_buf;
> size = unzipped_len;
> compressed = false;
> --
> 1.8.3.2
>



-- 
Kees Cook
Chrome OS Security
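
The leak fixed above follows a common pattern: a pointer to heap memory is overwritten with a pointer to a long-lived buffer before the heap memory is released. A minimal userspace sketch of the pattern and the one-line fix follows. The tracked_alloc/tracked_free counter, big_buf and fake_unzip are all hypothetical helpers invented for the demo; this is not the pstore code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_allocs;          /* crude allocation counter for the sketch */

static void *tracked_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

static char big_buf[64];         /* stands in for big_oops_buf */

/* Hypothetical "decompressor": copies src into big_buf, returns the length or -1. */
static int fake_unzip(const char *src, size_t len)
{
    if (len > sizeof(big_buf))
        return -1;
    memcpy(big_buf, src, len);
    return (int)len;
}

/* Handles one record and returns how many allocations it left behind. */
static int process_record(int free_before_overwrite)
{
    int before = live_allocs;
    char *heap = tracked_alloc(8);   /* kept only so the demo can clean up */
    char *buf = heap;

    assert(buf != NULL);
    memset(buf, 'x', 8);

    if (fake_unzip(buf, 8) > 0) {
        if (free_before_overwrite)
            tracked_free(buf);       /* the one-line fix */
        buf = big_buf;               /* the old pointer value is gone after this */
    }

    /* End-of-record cleanup only frees buf if it is still heap memory. */
    if (buf != big_buf)
        tracked_free(buf);

    int leaked = live_allocs - before;
    if (leaked)
        tracked_free(heap);          /* demo-only: reclaim what the buggy path lost */
    return leaked;
}
```

process_record(0) models the code before the patch and reports one lost allocation; process_record(1) models the patched code and reports none.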

Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread H. Peter Anvin

On 03/12/2014 05:54 PM, Thomas Gleixner wrote:
> 
> From the timekeeping POV there is absolutely no need to set the wall
> clock time early. The kernel boot phase does not care about wall time
> at all. We should have it done before we hit userspace, but not even
> that is a hard requirement.
> 

This is a key observation, I believe.

-hpa


[PATCH net-next v2 1/2] r8152: add RTL8152_EARLY_AGG_TIMEOUT_SUPER

2014-03-12 Thread Hayes Wang

On slow CPUs, frequent bulk transfers cause poor throughput. One
solution is to increase the aggregation timeout. That lets the hw
complete each bulk transfer later and fill more packets into the
buffer, which reduces the frequency of bulk transfers and improves
performance.

However, the optimal timeout depends on the capability of the
hardware, especially the CPU. For example, in our experiments a
timeout of 164 us works better than the default value on a
chromebook with an ARM CPU.

Add RTL8152_EARLY_AGG_TIMEOUT_SUPER so that users can choose the
timeout value that gives them the best performance.

Signed-off-by: Hayes Wang 
---
 drivers/net/usb/Kconfig | 12 
 drivers/net/usb/r8152.c |  7 +--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig
index 7e7269f..a8639b8 100644
--- a/drivers/net/usb/Kconfig
+++ b/drivers/net/usb/Kconfig
@@ -102,6 +102,18 @@ config USB_RTL8152
  To compile this driver as a module, choose M here: the
  module will be called r8152.
 
+   menu "Aggregation Settings"
+   depends on USB_RTL8152
+
+   config RTL8152_EARLY_AGG_TIMEOUT_SUPER
+   int "rx early agg timeout for super speed (unit: us)"
+   default 85
+   help
+ This is the rx early agg timeout for USB super speed.
+ The vaild value is 1 ~ 525 us.
+
+   endmenu
+
 config USB_USBNET
tristate "Multi-purpose USB Networking Framework"
select MII
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index aa1d5b2..293b4d8 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -316,7 +316,10 @@
 #define PCUT_STATUS		0x0001
 
 /* USB_RX_EARLY_AGG */
-#define EARLY_AGG_SUPPER	0x0e832981
+#define EARLY_AGG_SUPER	((((rx_buf_sz - 1522) / 4) << 16) | \
+	(u32)(CONFIG_RTL8152_EARLY_AGG_TIMEOUT_SUPER <= 0 ? 0x2981 : \
+	((CONFIG_RTL8152_EARLY_AGG_TIMEOUT_SUPER * 125) < 0xffff ? \
+	CONFIG_RTL8152_EARLY_AGG_TIMEOUT_SUPER * 125 : 0xffff)))
 #define EARLY_AGG_HIGH		0x0e837a12
 #define EARLY_AGG_SLOW		0x0e83ffff
 
@@ -1978,7 +1981,7 @@ static void r8153_set_rx_agg(struct r8152 *tp)
ocp_write_dword(tp, MCU_TYPE_USB, USB_RX_BUF_TH,
RX_THR_SUPPER);
ocp_write_dword(tp, MCU_TYPE_USB, USB_RX_EARLY_AGG,
-   EARLY_AGG_SUPPER);
+   EARLY_AGG_SUPER);
} else {
ocp_write_dword(tp, MCU_TYPE_USB, USB_RX_BUF_TH,
RX_THR_HIGH);
-- 
1.8.4.2
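
To see what the new macro computes, here is the same arithmetic as a plain function that can be run in userspace. It is a sketch under two assumptions that are not stated in the mail: the low half is clamped to the 16-bit maximum 0xffff, and rx_buf_sz is 16384 (a value that makes the upper half come out as 0x0e83, matching the other EARLY_AGG constants). The function name early_agg_super is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Builds the 32-bit super-speed early-aggregation value:
 * upper 16 bits from the rx buffer size, lower 16 bits from the timeout. */
static uint32_t early_agg_super(uint32_t rx_buf_sz, int timeout_us)
{
    uint32_t high, low;

    assert(rx_buf_sz >= 1522);
    high = ((rx_buf_sz - 1522) / 4) << 16;

    if (timeout_us <= 0)
        low = 0x2981;                       /* fall back to the old constant */
    else if ((uint32_t)timeout_us * 125 < 0xffff)
        low = (uint32_t)timeout_us * 125;   /* 125 register units per microsecond */
    else
        low = 0xffff;                       /* assumed clamp to the 16-bit maximum */

    return high | low;
}
```

With these assumptions the Kconfig default of 85 us reproduces the old hard-coded value (85 * 125 = 10625 = 0x2981), and the 164 us setting mentioned in the changelog gives a low half of 0x5014.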


[PATCH net-next v2 0/2] parameter modification

2014-03-12 Thread Hayes Wang

Add an option to change the default setting, and reduce the number of
tx/rx buffers.

v2: modify patch #1 to make the value readable.

Hayes Wang (2):
  r8152: add RTL8152_EARLY_AGG_TIMEOUT_SUPER
  r8152: reduce the numbers of the bulks

 drivers/net/usb/Kconfig | 11 +++
 drivers/net/usb/r8152.c | 10 ++
 2 files changed, 17 insertions(+), 4 deletions(-)

-- 
1.8.4.2


[PATCH net-next v2 2/2] r8152: reduce the numbers of the bulks

2014-03-12 Thread Hayes Wang

It is not necessary to have many transfer buffers. Reduce the number
from 10 to 4.

Signed-off-by: Hayes Wang 
---
 drivers/net/usb/r8152.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 293b4d8..1826fcf 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -422,8 +422,8 @@ enum rtl_register_content {
FULL_DUP= 0x01,
 };
 
-#define RTL8152_MAX_TX 10
-#define RTL8152_MAX_RX 10
+#define RTL8152_MAX_TX 4
+#define RTL8152_MAX_RX 4
 #define INTBUFSIZE 2
 #define CRC_SIZE   4
 #define TX_ALIGN   4
-- 
1.8.4.2


Re: [PATCH RFC 0/9] socket filtering using nf_tables

2014-03-12 Thread Alexei Starovoitov

On Wed, Mar 12, 2014 at 2:15 AM, Pablo Neira Ayuso  wrote:
> Hi!
>
> I'm going to reply to Daniel and you in the same email, see below.
>
>  struct sk_filter
>  {
>         atomic_t                refcnt;
> -       unsigned int            len;    /* Number of filter blocks */
> +       /* len - number of insns in sock_filter program
> +        * len_ext - number of insns in socket_filter_ext program
> +        * jited - true if either original or extended program was JITed
> +        * orig_prog - original sock_filter program if not NULL
> +        */
> +       unsigned int            len;
> +       unsigned int            len_ext;
> +       unsigned int            jited:1;
> +       struct sock_filter      *orig_prog;
>         struct rcu_head         rcu;
> -       unsigned int            (*bpf_func)(const struct sk_buff *skb,
> -                                           const struct sock_filter *filter);
> +       union {
> +               unsigned int (*bpf_func)(const struct sk_buff *skb,
> +                                        const struct sock_filter *fp);
> +               unsigned int (*bpf_func_ext)(const struct sk_buff *skb,
> +                                            const struct sock_filter_ext *fp);
> +       };
>         union {
>                 struct sock_filter      insns[0];
> +               struct sock_filter_ext  insns_ext[0];
>                 struct work_struct      work;
>         };
>  };
>
> I think we have to generalise this to make it flexible to accommodate
> any socket filtering infrastructure. For example, instead of having
> bpf_func and bpf_func_ext, I think it would be good to generalise it
> so that we pass some void *filter. I also think that other private

well, David indicated that using 'void*' for such cases is undesirable,
since we want to rely on compiler to do type verification as much
as we can. My patches are preserving type safety.
'void * filter' would mean - open the door for anything.
I don't think we want that type of 'generality'.
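
To make the type-safety argument concrete, here is a small userspace sketch of a filter object that keeps one typed function pointer per instruction format in a union, instead of a single callback taking void *. All names (toy_filter, insn_old, insn_ext, is_ext and so on) are hypothetical and only loosely mirror the struct quoted above; this is not the posted patch.

```c
#include <assert.h>
#include <stddef.h>

struct insn_old { int code; };            /* stand-in for struct sock_filter */
struct insn_ext { int code; int imm; };   /* stand-in for struct sock_filter_ext */

typedef unsigned int (*run_old_fn)(const struct insn_old *fp);
typedef unsigned int (*run_ext_fn)(const struct insn_ext *fp);

struct toy_filter {
    unsigned int is_ext : 1;              /* hypothetical discriminator */
    union {
        run_old_fn fn_old;
        run_ext_fn fn_ext;
    };
    union {
        const struct insn_old *insns;
        const struct insn_ext *insns_ext;
    };
};

/* Trivial "interpreters" so the sketch has something to call. */
static unsigned int interp_old(const struct insn_old *fp)
{
    return (unsigned int)fp->code;
}

static unsigned int interp_ext(const struct insn_ext *fp)
{
    return (unsigned int)(fp->code + fp->imm);
}

static unsigned int toy_run(const struct toy_filter *f)
{
    assert(f != NULL);
    /* Each branch pairs a function pointer with the matching insn type.
     * Mixing them up, e.g. f->fn_old(f->insns_ext), draws a compiler
     * diagnostic; with a void * parameter it would compile silently. */
    return f->is_ext ? f->fn_ext(f->insns_ext) : f->fn_old(f->insns);
}
```

The cost is one union member per format; the gain is that the compiler checks every call site against the instruction type it is supposed to receive.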

> information to the filtering approach should be put after the
> filtering code, in some variable length area.
>
> This change looks quite ad-hoc. My 1-3 patches were more going to the
> direction of making this in some generic way to accommodate any socket
> filtering infrastructure.

They may look ad-hoc, but they're preserving type checking.

>> Could you share what performance you're getting when doing nft
>> filter equivalent to 'tcpdump port 22' ?
>> Meaning your filter needs to parse eth->proto, ip or ipv6 header and
>> check both ports. How will it compare with JITed bpf/ebpf ?
>
> We already have plans to add jit to nf_tables.

The patches don't explain the reasons to do nft socket filtering.
I can only guess, and my guess is that this is to show
that nft sort of can do what bpf can.
tc can be made to do socket filtering too.
The differentiation is speed and ease of use.
Both have big question marks in sock_filter+nft approach.

I think to consider nft to be one and only classifier, some
benchmarking needs to be done first:
nft vs bpf, nft vs tc, nft vs ovs, ...

It can be done the other way too. nft can run on top of tc.
ovs can run on top of tc and so on.
I'm not advocating any of that.

Having one interpreter underneath doesn't mean that all
components will be easier to maintain or have less code around.
Code that converts from one to another counts as well.
Simplicity and performance should be the deciding factor.
imo nft+sock_filter example is not simple.

I've posted patches to compile restricted C into ebpf.
Theoretically I can make 'universal kernel module' out of ebpf.
Like, compile any C code into ebpf and jit it.
Such 'universal kernel module' will be runnable on all architectures.
One compiler to rule them all... one ebpf to run them all... NO!
That may be cool thing for university research, but no good
reason to do it in practice.
Same way nft for socket filtering is a cool research, but what
is the strong reason to have it in kernel and maintain forever?

>> here are some comments about patches:
>> 1/9:
>> -   if (fp->bpf_func != sk_run_filter)
>> -   module_free(NULL, fp->bpf_func);
>> +   if (fp->run_filter != sk_run_filter)
>> +   module_free(NULL, fp->run_filter);
>>
>> David suggested that these comparisons in all jits are ugly.
>> I've fixed it in my patches. When they're in, you wouldn't need to
>> mess with this.
>
> I see you have added fp->jited for this. I think we can make this more
> generic if we have some enum so fp->type will tell us what kind of
> filter we have, ie. bpf, bpf-jitted, nft, etc.

Such enum will have a problem of explosion when flags
start to cross-multiply.
fp->jited flag just says jitted or not. Easier to check.

>> 2/9:
>> -   atomic_sub(sk_filter_size(fp->len), &sk->sk_omem_alloc);
>> +   atomic_sub(fp->size, &sk->sk_omem_alloc);
>>
>> that's a big change in socket memory accounting.

Re: [PATCH] Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE

2014-03-12 Thread Rusty Russell

Steven Rostedt  writes:
> Mathieu, you should have added a v2 to the subject ie: [PATCH V2]
>
> Rusty,
>
> If you want to take this, please add my
> Acked-by: Steven Rostedt 

Thanks, I updated my copy and have pushed this into modules-next.

Cheers,
Rusty.

Re: [PATCH] MAINTAINERS: virtio-dev is subscribers only

2014-03-12 Thread Rusty Russell

Randy Dunlap  writes:
> From: Randy Dunlap 
>
> virtio-dev mailing list is for subscribers only according to the
> returned message after trying to send to it.

Thanks, applied.

Cheers,
Rusty.

> Signed-off-by: Randy Dunlap 
> ---
>  MAINTAINERS |6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> --- mmotm-2014-0310-1535.orig/MAINTAINERS
> +++ mmotm-2014-0310-1535/MAINTAINERS
> @@ -9438,7 +9438,7 @@ F:  include/media/videobuf2-*
>  
>  VIRTIO CONSOLE DRIVER
>  M:   Amit Shah 
> -L:   virtio-...@lists.oasis-open.org
> +L:   virtio-...@lists.oasis-open.org (subscribers-only)
>  L:   virtualizat...@lists.linux-foundation.org
>  S:   Maintained
>  F:   drivers/char/virtio_console.c
> @@ -9448,7 +9448,7 @@ F:  include/uapi/linux/virtio_console.h
>  VIRTIO CORE, NET AND BLOCK DRIVERS
>  M:   Rusty Russell 
>  M:   "Michael S. Tsirkin" 
> -L:   virtio-...@lists.oasis-open.org
> +L:   virtio-...@lists.oasis-open.org (subscribers-only)
>  L:   virtualizat...@lists.linux-foundation.org
>  S:   Maintained
>  F:   drivers/virtio/
> @@ -9461,7 +9461,7 @@ F:  include/uapi/linux/virtio_*.h
>  VIRTIO HOST (VHOST)
>  M:   "Michael S. Tsirkin" 
>  L:   k...@vger.kernel.org
> -L:   virtio-...@lists.oasis-open.org
> +L:   virtio-...@lists.oasis-open.org (subscribers-only)
>  L:   virtualizat...@lists.linux-foundation.org
>  L:   net...@vger.kernel.org
>  S:   Maintained

Re: [for-next][PATCH 08/20] tracing: Warn if a tracepoint is not set via debugfs

2014-03-12 Thread Mathieu Desnoyers

- Original Message -
> From: "Andi Kleen"
> To: "Mathieu Desnoyers"
> Cc: "Andi Kleen", "Steven Rostedt", "Frank Ch. Eigler",
> linux-kernel@vger.kernel.org, "Ingo Molnar", "Frederic Weisbecker",
> "Andrew Morton", "Johannes Berg", "Linus Torvalds", "Peter Zijlstra",
> "Thomas Gleixner", "Greg Kroah-Hartman", "lttng-dev", "Rusty Russell"
> Sent: Wednesday, March 12, 2014 11:15:01 PM
> Subject: Re: [for-next][PATCH 08/20] tracing: Warn if a tracepoint is not set via debugfs
> 
> On Wed, Mar 12, 2014 at 08:47:07PM +, Mathieu Desnoyers wrote:
> > - Original Message -
> > > From: "Andi Kleen"
> > > To: "Mathieu Desnoyers"
> > > Cc: "Steven Rostedt", "Frank Ch. Eigler", linux-kernel@vger.kernel.org,
> > > "Ingo Molnar", "Frederic Weisbecker", "Andrew Morton", "Johannes Berg",
> > > "Linus Torvalds", "Peter Zijlstra", "Thomas Gleixner",
> > > "Greg Kroah-Hartman", "lttng-dev", "Rusty Russell", "Andi Kleen"
> > > Sent: Wednesday, March 12, 2014 4:35:15 PM
> > > Subject: Re: [for-next][PATCH 08/20] tracing: Warn if a tracepoint is not set via debugfs
> > > 
> > > > So I understand that you wish to banish tracepoints from static inline
> > > > functions within headers to ensure they only appear within a single
> > > > module. This seems to be a step backward, but let's assume we stick to
> > > > that rule. Then how do you envision dealing with Link-Time
> > > > Optimisations (LTO) ?
> > > 
> > > I assume it uses the file name defines set by Kbuild?
> > 
> > Just to make sure I understand your question: I understand that you are
> > asking whether tracepoints use file name defines at all in the naming of
> > a tracepoint. The answer to this question is: No, they do not.
> 
> Ok. It uses kallsyms? That can change of course.

As I just replied to Steven, I now see that I mixed up concerns about
static keys, and the prior kernel markers, with tracepoint concerns.
The way they are implemented is very different (Hey! I should know,
I wrote that code some 6 years ago!) ;)

> > 
> > > These don't change with
> > > LTO. It's whatever was specified at compile time. Also LTO doesn't
> > > inline over module boundaries (if the module is not built in)
> > 
> > Good to know. Can it inline core kernel functions into a module ?
> 
> Each module and the main kernel are currently LTO'ed separately.
> 
> In theory it would be possible to change this, but likely at some
> compile time cost.

OK, thanks for the explanations!

Mathieu

> 
> -Andi
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

[PATCH 0/3] bridge: few enhancements and small fixes

2014-03-12 Thread Luis R. Rodriguez

Here's a few fixes I've been carrying around. I've now tested them
on as many systems / environments as I can. They should be ready.

Luis R. Rodriguez (3):
  bridge: preserve random init MAC address
  bridge: trigger a bridge calculation upon port changes
  bridge: fix bridge root block on designated port

 net/bridge/br_device.c  |  1 +
 net/bridge/br_netlink.c | 24 
 net/bridge/br_private.h |  2 ++
 net/bridge/br_stp.c | 73 +
 net/bridge/br_stp_if.c  |  6 +++-
 5 files changed, 99 insertions(+), 7 deletions(-)

-- 
1.8.5.3


[PATCH 2/3] bridge: trigger a bridge calculation upon port changes

2014-03-12 Thread Luis R. Rodriguez

From: "Luis R. Rodriguez" 

If netlink is used to tune a port we currently don't trigger a
recalculation of the bridge id. Ensure that happens, just as it
does when we add a new net_device to the bridge.

Cc: Stephen Hemminger 
Cc: bri...@lists.linux-foundation.org
Cc: net...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-de...@lists.xenproject.org
Cc: k...@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 net/bridge/br_netlink.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index e74b6d53..6f1b26d 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -364,6 +364,7 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
struct net_bridge_port *p;
struct nlattr *tb[IFLA_BRPORT_MAX + 1];
int err = 0;
+   bool changed;
 
 protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_PROTINFO);
 afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
@@ -386,7 +387,12 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
 
 spin_lock_bh(&p->br->lock);
 err = br_setport(p, tb);
+   changed = br_stp_recalculate_bridge_id(p->br);
 spin_unlock_bh(&p->br->lock);
+   if (changed)
+   call_netdevice_notifiers(NETDEV_CHANGEADDR,
+p->br->dev);
+   netdev_update_features(p->br->dev);
} else {
/* Binary compatibility with old RSTP */
if (nla_len(protinfo) < sizeof(u8))
-- 
1.8.5.3


[PATCH 3/3] bridge: fix bridge root block on designated port

2014-03-12 Thread Luis R. Rodriguez

From: "Luis R. Rodriguez" 

Root port blocking was designed so that a bridge port can opt
out of becoming the designated root port for a bridge. However,
if a port first becomes the designated root port and we then toggle
the root port block on it, we currently don't kick that port out of
the designated root port role. This fixes that. This is particularly
important for net_devices that should never become a root port
from the start: currently, toggling that off will enable the root
port flag, but it won't really kick the bridge and do what you'd
expect. The MAC address of the toggled port is kept on the bridge
if it was the designated port.

To catch a port that has the root port block preference set, we
need to move our check for root block ahead of the root selection,
so check for root-blocked ports upon every br_configuration_update().
We also simply exclude root-blocked ports from consideration
as root port candidates in br_should_become_root_port() and
br_stp_recalculate_bridge_id().

The issue that this patch is trying to address and fix can be tested
easily before and after this patch is applied using 2 TAP net_devices
and then toggling at will with the root_block knob.

ip tuntap add dev tap0 mode tap
ip tuntap add dev tap1 mode tap
ip link add dev br0 type bridge
ip link show br0
echo ---
ip link set dev tap0 master br0
ip link
echo ---
ip link set dev tap1 master br0
ip link
echo ---

After reviewing the above results, you can toggle root_block
on each port and check that you see the results you expect.

bridge link set dev tap0 root_block on
ip link
bridge link set dev tap1 root_block on
ip link

Toggling off root_block on any port should bring the port back
as a candidate for designated root port:

bridge link set dev tap1 root_block off
ip link
bridge link set dev tap0 root_block off
ip link

To nuke:

ip tuntap del tap0 mode tap
ip tuntap del tap1 mode tap

Cc: Stephen Hemminger 
Cc: bri...@lists.linux-foundation.org
Cc: net...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-de...@lists.xenproject.org
Cc: k...@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 net/bridge/br_netlink.c | 18 
 net/bridge/br_private.h |  1 +
 net/bridge/br_stp.c | 73 +
 net/bridge/br_stp_if.c  |  3 +-
 4 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 6f1b26d..fbec354 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -324,6 +324,21 @@ static void br_set_port_flag(struct net_bridge_port *p, 
struct nlattr *tb[],
}
 }
 
+static void br_kick_bridge_port(struct net_bridge_port *p)
+{
+   struct net_bridge *br = p->br;
+   bool wasroot;
+
+   wasroot = br_is_root_bridge(br);
+   br_become_designated_port(p);
+
+   br_configuration_update(br);
+   br_port_state_selection(br);
+
+   if (br_is_root_bridge(br) && !wasroot)
+   br_become_root_bridge(br);
+}
+
 /* Process bridge protocol info on port */
 static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
 {
@@ -353,6 +368,9 @@ static int br_setport(struct net_bridge_port *p, struct 
nlattr *tb[])
if (err)
return err;
}
+
+   br_kick_bridge_port(p);
+
return 0;
 }
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 32a06da..45d7917 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -150,6 +150,7 @@ struct net_bridge_port
u8  priority;
u8  state;
u16 port_no;
+   bool    root_block_enabled;
unsigned char   topology_change_ack;
unsigned char   config_pending;
port_id port_id;
diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 3c86f05..f5741f3 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -59,6 +59,7 @@ static int br_should_become_root_port(const struct 
net_bridge_port *p,
 
br = p->br;
if (p->state == BR_STATE_DISABLED ||
+  (p->flags & BR_ROOT_BLOCK) ||
br_is_designated_port(p))
return 0;
 
@@ -104,7 +105,7 @@ static void br_root_port_block(const struct net_bridge *br,
   struct net_bridge_port *p)
 {
 
-   br_notice(br, "port %u(%s) tried to become root port (blocked)",
+   br_notice(br, "port %u (%s) is now root blocked",
  (unsigned int) p->port_no, p->dev->name);
 
p->state = BR_STATE_LISTENING;
@@ -124,11 +125,7 @@ static void br_root_selection(struct net_bridge *br)
list_for_each_entry(p, &br->port_list, list) {
if
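
A minimal sketch of the selection rule the commit message describes, where
root-blocked ports are excluded from root port candidacy. It is a standalone
simplification, not the kernel code: struct port_sketch, select_root_port()
and the single path_cost comparison are hypothetical stand-ins (real STP
compares a longer priority vector), and returning 0 for "no root port" is an
assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a bridge port. */
struct port_sketch {
	unsigned int port_no;    /* 1-based port number; 0 means "none" here */
	bool disabled;           /* port is not participating at all */
	bool root_block;         /* port opted out of becoming root port */
	unsigned int path_cost;  /* stand-in for the full STP priority vector */
};

/* Return the port_no of the best eligible port, or 0 if none qualifies. */
static unsigned int select_root_port(const struct port_sketch *ports, size_t n)
{
	const struct port_sketch *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct port_sketch *p = &ports[i];

		/* Disabled and root-blocked ports are never candidates. */
		if (p->disabled || p->root_block)
			continue;
		if (!best || p->path_cost < best->path_cost)
			best = p;
	}
	return best ? best->port_no : 0;
}
```

Toggling root_block on the current best port makes the next eligible port
win, and blocking every port leaves the bridge without a root port.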

Re: [for-next][PATCH 08/20] tracing: Warn if a tracepoint is not set via debugfs

2014-03-12 Thread Andi Kleen

On Wed, Mar 12, 2014 at 08:47:07PM +0000, Mathieu Desnoyers wrote:
> - Original Message -
> > From: "Andi Kleen" 
> > To: "Mathieu Desnoyers" 
> > Cc: "Steven Rostedt" , "Frank Ch. Eigler" 
> > , linux-kernel@vger.kernel.org, "Ingo
> > Molnar" , "Frederic Weisbecker" , 
> > "Andrew Morton" ,
> > "Johannes Berg" , "Linus Torvalds" 
> > , "Peter Zijlstra"
> > , "Thomas Gleixner" , "Greg 
> > Kroah-Hartman" ,
> > "lttng-dev" , "Rusty Russell" 
> > , "Andi Kleen" 
> > Sent: Wednesday, March 12, 2014 4:35:15 PM
> > Subject: Re: [for-next][PATCH 08/20] tracing: Warn if a tracepoint is not 
> > set via debugfs
> > 
> > > So I understand that you wish to banish tracepoints from static inline
> > > functions within headers to ensure they only appear within a single 
> > > module.
> > > This seems to be a step backward, but let's assume we stick to that rule.
> > > Then how do you envision dealing with Link-Time Optimisations (LTO) ?
> > 
> > I assume it uses the file name defines set by Kbuild?
> 
> Just to make sure I understand your question: I understand that you are asking
> whether tracepoints use file name defines at all in the naming of a 
> tracepoint.
> The answer to this question is: No, they do not.

Ok. It uses kallsyms? That can change of course.
> 
> > These don't change with
> > LTO. It's whatever was specified at compile time. Also LTO doesn't
> > inline over module boundaries (if the module is not built in)
> 
> Good to know. Can it inline core kernel functions into a module ?

Each module and the main kernel are currently LTO'ed separately.

In theory it would be possible to change this, but likely at some
compile time cost.

-Andi

[PATCH 1/3] bridge: preserve random init MAC address

2014-03-12 Thread Luis R. Rodriguez

From: "Luis R. Rodriguez" 

As it is now, if you create a bridge it gets started
with a random MAC address, and if you then add a net_device
as a slave but later kick it out, you end up with a zero
MAC address. Instead, preserve the original random MAC
address and use it.

If you manually set the bridge address that will always
be respected. This change only takes effect if at the time
of computing the new root port we determine we have found
no candidates.

Cc: Stephen Hemminger 
Cc: bri...@lists.linux-foundation.org
Cc: net...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-de...@lists.xenproject.org
Cc: k...@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 net/bridge/br_device.c  | 1 +
 net/bridge/br_private.h | 1 +
 net/bridge/br_stp_if.c  | 3 +++
 3 files changed, 5 insertions(+)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index b063050..5f13eac 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -368,6 +368,7 @@ void br_dev_setup(struct net_device *dev)
br->bridge_id.prio[1] = 0x00;
 
ether_addr_copy(br->group_addr, eth_reserved_addr_base);
+   ether_addr_copy(br->random_init_addr, dev->dev_addr);
 
br->stp_enabled = BR_NO_STP;
br->group_fwd_mask = BR_GROUPFWD_DEFAULT;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index e1ca1dc..32a06da 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -240,6 +240,7 @@ struct net_bridge
unsigned long   bridge_hello_time;
unsigned long   bridge_forward_delay;
 
+   u8  random_init_addr[ETH_ALEN];
u8  group_addr[ETH_ALEN];
u16 root_port;
 
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 189ba1e..4c9ad45 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -239,6 +239,9 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br)
if (ether_addr_equal(br->bridge_id.addr, addr))
return false;   /* no change */
 
+   if (ether_addr_equal(addr, br_mac_zero))
+   addr = br->random_init_addr;
+
br_stp_change_bridge_id(br, addr);
return true;
 }
-- 
1.8.5.3
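
A minimal standalone sketch of the fallback this patch describes: pick the
lowest port MAC, and if there are no candidates, fall back to the address the
bridge was created with. struct bridge_sketch and recalc_bridge_addr() are
hypothetical stand-ins rather than the kernel structures, and applying the
fallback before the "no change" comparison is a choice of this sketch, not
necessarily the order used in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for the bridge state touched by the patch. */
struct bridge_sketch {
	uint8_t addr[ETH_ALEN];             /* address currently in use */
	uint8_t random_init_addr[ETH_ALEN]; /* address chosen at creation */
};

/* Recompute the bridge address; return true if it changed. */
static bool recalc_bridge_addr(struct bridge_sketch *br,
			       const uint8_t (*port_addrs)[ETH_ALEN],
			       size_t nports)
{
	static const uint8_t zero[ETH_ALEN];
	const uint8_t *best = zero;
	size_t i;

	/* The lowest port MAC wins, as in the existing recalculation. */
	for (i = 0; i < nports; i++) {
		if (best == zero || memcmp(port_addrs[i], best, ETH_ALEN) < 0)
			best = port_addrs[i];
	}

	/* No candidate found: reuse the original random address. */
	if (memcmp(best, zero, ETH_ALEN) == 0)
		best = br->random_init_addr;

	if (memcmp(br->addr, best, ETH_ALEN) == 0)
		return false;	/* no change */

	memcpy(br->addr, best, ETH_ALEN);
	return true;
}
```

With two ports enslaved the bridge takes the lower of the two MACs; once both
are removed it returns to random_init_addr instead of all zeroes.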


Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread H. Peter Anvin

On 03/12/2014 07:38 PM, joeyli wrote:
> 
> I sent an rtc-acpitad driver for the RTC subsystem last month; I will
> send a second version.
> 
> The reason for using TAD to set the wall clock is that the ACPI 5.0 spec
> defines a "CMOS RTC Not Present" flag in the FADT to indicate that OSPM
> should use TAD when this flag is set:
> 
> CMOS RTC Not Present
> 

Bullsh*t.  The TAD should be used if it is present, it has nothing to do
with this flag.

-hpa


Re: [for-next][PATCH 08/20] tracing: Warn if a tracepoint is not set via debugfs

2014-03-12 Thread Mathieu Desnoyers

- Original Message -
> From: "Steven Rostedt" 
> To: "Mathieu Desnoyers" 
> Cc: "Frank Ch. Eigler" , linux-kernel@vger.kernel.org, "Ingo 
> Molnar" , "Frederic
> Weisbecker" , "Andrew Morton" 
> , "Johannes Berg"
> , "Linus Torvalds" , 
> "Peter Zijlstra"
> , "Thomas Gleixner" , "Greg 
> Kroah-Hartman" ,
> "lttng-dev" , "Rusty Russell" 
> , "Andi Kleen" 
> Sent: Wednesday, March 12, 2014 8:49:07 PM
> Subject: Re: [for-next][PATCH 08/20] tracing: Warn if a tracepoint is not set 
> via debugfs
> 
> On Wed, 12 Mar 2014 19:51:01 +0000 (UTC)
> Mathieu Desnoyers  wrote:
> 
> > This only leaves tracepoints in header files and the impact of LTO as
> > requirements for having tracepoint callsites with the same name across
> > modules.
> 
> The only thing that needs to be unique is the struct tracepoint
> __tracepoint_##name. There should not be any duplicates of those. I
> can't see how the LTO would duplicate a data structure without screwing
> everything (not just tracepoints) up.
> 
> We can still have more than one trace_##name() called, as that is
> handled by the static key.

Hrm, I seem to have mixed up the concerns regarding compiler
optimisations for static keys with those related to tracepoints
(including the fact that their predecessors, "kernel markers", worked
more like static keys than tracepoints). I did work in both areas pretty
much at the same time back in 2007-2008.

Having gotten my head back up straight, I now see the point in your
proposal. Thanks for bearing with me.

Even if we ever want to have tracepoints within header files, as long
as we have the DECLARE_TRACE() within the header, and a DEFINE_TRACE()
in one location in the loaded kernel image (or a loaded module we depend
on), it should be possible too. All tracepoint.c cares about is the site
defined by DEFINE_TRACE().

> 
> Note, I'm scrambling to get ready for my trip tomorrow. Thus, I'm not
> as much at the computer. I may work on some patches in my 6 hour
> layover though.

All right. Have a good trip!

Thanks,

Mathieu

> 
> -- Steve
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Re: [PATCH v3] zram: support REQ_DISCARD

2014-03-12 Thread Andrew Morton

On Thu, 13 Mar 2014 11:46:17 +0900 Joonsoo Kim  wrote:

> + while (n >= PAGE_SIZE) {
> + /*
> +  * discard request can be too large so that the zram can
> +  * be stucked for a long time if we handle the request
> +  * at once. So handle the request by PAGE_SIZE unit at a time.
> +  */
> + write_lock(&zram->meta->tb_lock);
> + zram_free_page(zram, index);
> + write_unlock(&zram->meta->tb_lock);
> + index++;
> + n -= PAGE_SIZE;
> + }

Well, you could use something like

if (need_resched()) {
unlock()
schedule()
lock()
}

here, or free 100 pages at a time or something silly like that.  I
guess we retain these as options if/when that lock turns out to be
contended.
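
The "free 100 pages at a time" option can be sketched in standalone C as
follows. This is only an illustration: struct table_sketch, table_lock(),
table_unlock() and free_one_page() are hypothetical stand-ins that count
calls, not the zram code or a real lock, and the batch size is a parameter
rather than a recommendation.

```c
#include <stddef.h>

/* Hypothetical table that only records what was done to it. */
struct table_sketch {
	unsigned long lock_acquisitions;
	unsigned long pages_freed;
};

static void table_lock(struct table_sketch *t)
{
	t->lock_acquisitions++;
}

static void table_unlock(struct table_sketch *t)
{
	(void)t;
}

static void free_one_page(struct table_sketch *t, size_t index)
{
	(void)index;
	t->pages_freed++;
}

/* Free npages starting at index, holding the lock for at most batch pages. */
static void discard_batched(struct table_sketch *t, size_t index,
			    size_t npages, size_t batch)
{
	if (batch == 0)
		batch = 1;	/* avoid a loop that makes no progress */

	while (npages > 0) {
		size_t chunk = npages < batch ? npages : batch;
		size_t i;

		table_lock(t);
		for (i = 0; i < chunk; i++)
			free_one_page(t, index + i);
		table_unlock(t);

		/* Other lock users get a chance between batches. */
		index += chunk;
		npages -= chunk;
	}
}
```

A batch of 1 reproduces the per-page locking of the quoted patch, while a
larger batch trades lock round trips for longer hold times.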



Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread joeyli

On Thu, 2014-03-13 at 00:49 +0100, Thomas Gleixner wrote:
> On Thu, 13 Mar 2014, Rafael J. Wysocki wrote:
> > I agree, and we need to fix that for 3.14.  Patch is appended.
> 
> You beat me by a few minutes. Was about to send out the same, just
> with a more spicy changelog :)
> 
> > ---
> > From: Rafael J. Wysocki 
> > Subject: ACPI / init: Invoke early ACPI initialization later
> > 
> > Commit 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before
> > timekeeping_init()) optimistically moved the early ACPI initialization
> > before timekeeping_init(), but that didn't work, because it broke fast
> > TSC calibration for Julian Wollrath on Thinkpad x121e (and most likely
> > for others too).  The reason is that acpi_early_init() enables the SCI
> > and that interferes with the fast TSC calibration mechanism.
> > 
> > Thus follow the original idea to execute acpi_early_init() before
> > efi_enter_virtual_mode() to help the EFI people for now and we can
> > revisit the other problem that commit 73f7d1ca3263 attempted to
> > address in the future (if really necessary).
> 
> Reviewed-by: Thomas Gleixner 
>  
> 

Acked-by: Lee, Chun-Yi 

Thanks a lot!
Joey Lee


Re: [GIT PULL] generic early_ioremap support

2014-03-12 Thread Andrew Morton

On Wed, 12 Mar 2014 22:29:48 -0400 Mark Salter  wrote:

> Hi Andrew,
> 
> Could you add this series into the -mm tree for v3.15?
> 
> The following changes since commit c3bebc71c4bcdafa24b506adf0c1de3c1f77e2e0:
> 
>   Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2014-03-04 
> 08:44:32 -0800)
> 
> are available in the git repository at:
> 
> 
>   git://github.com/mosalter/linux.git tags/for-v3.15
> 
> for you to fetch changes up to b27e0658d90c63dc2696eca44f7701a903cb13c5:
> 
>   doc/kernel-parameters.txt: add early_ioremap_debug (2014-03-09 12:53:50 
> -0400)
> 
> 
> generic early_ioremap support

Spose so.  I was hoping the x86 and arm people might do it.  Has there
been sufficient feedback from those parties?

I don't actually do git pulls - I'll grab the patches off the mailing
list, if they're up to date?  (Sorry, it doesn't come up very often
so it isn't worth getting set up for).

Re: [PATCH v3] zram: support REQ_DISCARD

2014-03-12 Thread Joonsoo Kim

On Wed, Mar 12, 2014 at 01:33:18PM -0700, Andrew Morton wrote:
> On Wed, 12 Mar 2014 17:01:09 +0900 Joonsoo Kim  wrote:
> 
> > zram is a RAM-based block device and can be used as the backend of a
> > filesystem. When a filesystem deletes a file, it normally doesn't do
> > anything to the data blocks of that file; it just updates the file's
> > metadata. This behavior is no problem on a disk-based block device, but
> > it is a problem on a RAM-based block device, since we can't free the
> > memory used for the data blocks. To overcome this disadvantage, there is
> > the REQ_DISCARD functionality. If the block device supports REQ_DISCARD
> > and the filesystem is mounted with the discard option, the filesystem
> > sends REQ_DISCARD to the block device whenever some data blocks are
> > discarded. All we have to do is handle this request.
> > 
> > This patch sets QUEUE_FLAG_DISCARD and handles the REQ_DISCARD
> > request. With it, we can free memory used by zram when it is no longer
> > used.
> > 
> > ...
> >
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -541,6 +541,33 @@ static int zram_bvec_rw(struct zram *zram, struct 
> > bio_vec *bvec, u32 index,
> > return ret;
> >  }
> >  
> > +static void zram_bio_discard(struct zram *zram, u32 index,
> > +int offset, struct bio *bio)
> 
> A little bit of documentation here wouldn't hurt.  "index" and "offset"
> are pretty vague identifiers.  What do these args represent and what
> are their units.
> 
> > +{
> > +   size_t n = bio->bi_iter.bi_size;
> > +
> > +   /*
> > +* On some arch, logical block (4096) aligned request couldn't be
> > +* aligned to PAGE_SIZE, since their PAGE_SIZE aren't 4096.
> > +* Therefore we should handle this misaligned case here.
> > +*/
> > +   if (offset) {
> > +   if (n < offset)
> > +   return;
> > +
> > +   n -= offset;
> > +   index++;
> > +   }
> > +
> > +   while (n >= PAGE_SIZE) {
> > +   write_lock(&zram->meta->tb_lock);
> > +   zram_free_page(zram, index);
> > +   write_unlock(&zram->meta->tb_lock);
> > +   index++;
> > +   n -= PAGE_SIZE;
> > +   }
> 
> We could take the lock a single time rather than once per page.  Was
> there a reason for doing it this way?  If so, that should be documented
> as well please - there is no way a reader can know the reason from this
> code.
> 
> 
> > +}
> > +
> >  static void zram_reset_device(struct zram *zram, bool reset_capacity)
> >  {
> > size_t index;
> > @@ -676,6 +703,12 @@ static void __zram_make_request(struct zram *zram, 
> > struct bio *bio)
> > offset = (bio->bi_iter.bi_sector &
> >   (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
> >  
> > +   if (unlikely(bio->bi_rw & REQ_DISCARD)) {
> > +   zram_bio_discard(zram, index, offset, bio);
> > +   bio_endio(bio, 0);
> > +   return;
> > +   }
> > +
> > bio_for_each_segment(bvec, bio, iter) {
> > int max_transfer_size = PAGE_SIZE - offset;
> >  
> > @@ -845,6 +878,17 @@ static int create_device(struct zram *zram, int 
> > device_id)
> > ZRAM_LOGICAL_BLOCK_SIZE);
> > blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
> > blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
> > +   zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
> > +   zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
> > +   /*
> > +* We will skip to discard mis-aligned range, so we can't ensure
> > +* whether discarded region is zero or not.
> > +*/
> 
> That's a bit hard to follow.  What is it that is misaligned, relative
> to what?
> 
> And where does this skipping occur?  zram_bio_discard() avoids
> discarding partial pages at the start and end of the bio (I think).  Is
> that what we're referring to here?  If so, what about the complete
> pages between the two partial pages - they are zeroed on read.  Will
> the code end up having to rezero those?
> 
> As you can tell, I'm struggling to understand what's going on here ;)
> Some additional description of how it all works would be nice.  Preferably
> as code comments so the information is permanent.

Hello, Andrew.

I applied all your comments in the patch below. :)
Thanks for the comments.
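
A standalone sketch of the partial-page question raised in the review: given
a discard that starts offset bytes into page index and spans n bytes, which
pages are covered in full? The names struct discard_span and whole_pages()
are hypothetical. Note one assumption: the sketch skips the distance to the
next page boundary (page_size - offset) at the head, whereas the quoted v3
hunk subtracts offset itself, so this shows one reading of the intent rather
than the patch's exact arithmetic. offset is assumed to be smaller than
page_size.

```c
#include <stddef.h>
#include <stdint.h>

struct discard_span {
	uint32_t first_index; /* first page that is fully covered */
	size_t npages;        /* number of fully covered pages */
};

static struct discard_span whole_pages(uint32_t index, size_t offset,
				       size_t n, size_t page_size)
{
	struct discard_span span = { index, 0 };

	if (offset) {
		/* Bytes from the start of the request to the next boundary. */
		size_t head = page_size - offset;

		if (n <= head)
			return span;	/* request ends inside the first page */
		n -= head;
		span.first_index = index + 1;
	}

	/* Any remainder smaller than a page is a partial tail and is skipped. */
	span.npages = n / page_size;
	return span;
}
```

With 64 KiB pages, a 128 KiB discard starting 4 KiB into page 10 fully covers
only page 11; the partial head and the 4 KiB tail are left alone.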

->8---
From f77b0a5ad9bc27d5b3bc0b21ed1e98de51c62f1f Mon Sep 17 00:00:00 2001
From: Joonsoo Kim 
Date: Mon, 24 Feb 2014 14:30:43 +0900
Subject: [PATCH v4] zram: support REQ_DISCARD

zram is a RAM-based block device and can be used as the backend of a
filesystem. When a filesystem deletes a file, it normally doesn't do
anything to the data blocks of that file; it just updates the file's
metadata. This behavior is no problem on a disk-based block device, but
it is a problem on a RAM-based block device, since we can't free the
memory used for the data blocks. To overcome this disadvantage, there is
the REQ_DISCARD functionality. If the block device supports REQ_DISCARD
and the filesystem is mounted with the discard option, the
filesystem

Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread joeyli

On Thu, 2014-03-13 at 01:54 +0100, Thomas Gleixner wrote:
> On Thu, 13 Mar 2014, Rafael J. Wysocki wrote:
> > Thus follow the original idea to execute acpi_early_init() before
> > efi_enter_virtual_mode() to help the EFI people for now and we can
> > revisit the other problem that commit 73f7d1ca3263 attempted to
> > address in the future (if really necessary).
> 
> It's not necessary at all. In fact we really want to get rid of the
> arch specific cmos stuff which is an historical leftover.
> 
> I talked to John Stultz earlier today and he agrees that there are
> only a few trivial things to add to the RTC subsystem to make this
> work.
> 

I sent an rtc-acpitad driver for the RTC subsystem last month; I will
send a second version.

The reason for using TAD to set the wall clock is that the ACPI 5.0 spec
defines a "CMOS RTC Not Present" flag in the FADT to indicate that OSPM
should use TAD when this flag is set:

CMOS RTC Not Present

If set, indicates that the CMOS RTC is either not implemented, or
does not exist at the legacy addresses. OSPM uses the Control
Method Time and Alarm Namespace device instead.

So the original idea of the patch was to use TAD in place of the CMOS
interface for accessing the RTC in the kernel. timekeeping_init() is the
earliest function to access the CMOS RTC, which is why TAD was moved
before it.

I would like to discuss the "CMOS RTC Not Present" flag. If a hardware
vendor sets this flag in the FADT, should we avoid accessing the CMOS
RTC interface in any kernel code?

> From the timekeeping POV there is absolutely no need to set the wall
> clock time early. The kernel boot phase does not care about wall time
> at all. We should have it done before we hit userspace, but not even
> that is a hard requirement.

I agree! If the kernel boot phase does not care about wall time, then we
don't need to parse the DSDT to access TAD so early.

> 
> That TAD/EFI time mess is not going to happen before that is solved.
> 
> Thanks,
> 
>   tglx
> 

ACPI TAD returns local time and timezone information, so the kernel can
adjust the wall time without involving userspace.

Thanks a lot!
Joey Lee


[PATCH v3 0/2] i2c: add DMA support for freescale i2c driver

2014-03-12 Thread Yuan Yao


Changed in v3:
- fix a bug when the DMA request fails.
- some minor fixes for coding style.
- other minor fixes.

Changed in v2:
- remove has_dma_support property
- unify i2c_imx_dma_rx and i2c_imx_dma_tx
- unify i2c_imx_dma_read and i2c_imx_pio_read
- unify i2c_imx_dma_write and i2c_imx_pio_write

Added in v1:
- Enable DMA if it is supported and the transfer size is bigger than the threshold.
- Add device tree bindings for i2c eDMA support.
- Add eDMA support for i2c driver.


[PATCH v3 2/2] Documentation:add DMA support for freescale i2c driver

2014-03-12 Thread Yuan Yao

Add i2c dts node properties for eDMA support; they depend on the eDMA driver.

Signed-off-by: Yuan Yao 
---
 Documentation/devicetree/bindings/i2c/i2c-imx.txt | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-imx.txt 
b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
index 4a8513e..52d37fd 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-imx.txt
+++ b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
@@ -11,6 +11,8 @@ Required properties:
 Optional properties:
 - clock-frequency : Constains desired I2C/HS-I2C bus clock frequency in Hz.
   The absence of the propoerty indicates the default frequency 100 kHz.
+- dmas: A list of two dma specifiers, one for each entry in dma-names.
+- dma-names: should contain "tx" and "rx".
 
 Examples:
 
@@ -26,3 +28,12 @@ i2c@70038000 { /* HS-I2C on i.MX51 */
interrupts = <64>;
clock-frequency = <400000>;
 };
+
+i2c0: i2c@40066000 { /* i2c0 on vf610 */
+   compatible = "fsl,vf610-i2c";
+   reg = <0x40066000 0x1000>;
+   interrupts =<0 71 0x04>;
+   dmas = < 0 50>,
+   < 0 51>;
+   dma-names = "rx","tx";
+};
-- 
1.8.4



[PATCH v3 1/2] i2c: add DMA support for freescale i2c driver

2014-03-12 Thread Yuan Yao

Add DMA support for i2c. This feature depends on the DMA driver.
You can turn it on by writing both the dmas and dma-names properties in the dts node.

Signed-off-by: Yuan Yao 
---
 drivers/i2c/busses/i2c-imx.c | 354 +--
 1 file changed, 306 insertions(+), 48 deletions(-)

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index db895fb..6bfe23c 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -37,22 +37,27 @@
 /** Includes 
***
 
***/
 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
-#include 
 #include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
+#include 
 
 /** Defines 

 
***/
@@ -63,6 +68,10 @@
 /* Default value */
 #define IMX_I2C_BIT_RATE   100000  /* 100kHz */
 
+/* enable DMA if transfer byte size is bigger than this threshold */
+#define IMX_I2C_DMA_THRESHOLD  16
+#define IMX_I2C_DMA_TIMEOUT    1000
+
 /* IMX I2C registers:
  * the I2C register offset is different between SoCs,
  * to provid support for all these chips, split the
@@ -88,6 +97,7 @@
 #define I2SR_IBB   0x20
 #define I2SR_IAAS  0x40
 #define I2SR_ICF   0x80
+#define I2CR_DMAEN 0x02
 #define I2CR_RSTA  0x04
 #define I2CR_TXAK  0x08
 #define I2CR_MTX   0x10
@@ -174,6 +184,17 @@ struct imx_i2c_hwdata {
unsigned        i2cr_ien_opcode;
 };
 
+struct imx_i2c_dma {
+   struct dma_chan     *chan_tx;
+   struct dma_chan     *chan_rx;
+   struct dma_chan     *chan_using;
+   struct completion   cmd_complete;
+   dma_addr_t          dma_buf;
+   unsigned int        dma_len;
+   unsigned int        dma_transfer_dir;
+   unsigned int        dma_data_dir;
+};
+
 struct imx_i2c_struct {
struct i2c_adapter  adapter;
struct clk  *clk;
@@ -184,6 +205,9 @@ struct imx_i2c_struct {
int stopped;
unsigned int    ifdr; /* IMX_I2C_IFDR */
const struct imx_i2c_hwdata *hwdata;
+
+   struct imx_i2c_dma  *dma;
+   bool                use_dma;
 };
 
 static const struct imx_i2c_hwdata imx1_i2c_hwdata  = {
@@ -254,9 +278,121 @@ static inline unsigned char imx_i2c_read_reg(struct 
imx_i2c_struct *i2c_imx,
return readb(i2c_imx->base + (reg << i2c_imx->hwdata->regshift));
 }
 
+/* Functions for DMA support */
+static int i2c_imx_dma_request(struct imx_i2c_struct *i2c_imx,
+   dma_addr_t phy_addr)
+{
+   struct imx_i2c_dma *dma = i2c_imx->dma;
+   struct dma_slave_config dma_sconfig;
+   struct device *dev = &i2c_imx->adapter.dev;
+   int ret;
+
+   dma->chan_tx = dma_request_slave_channel(dev, "tx");
+   if (!dma->chan_tx) {
+   dev_err(dev, "Dma tx channel request failed!\n");
+   return -ENODEV;
+   }
+
+   dma_sconfig.dst_addr = phy_addr +
+   (IMX_I2C_I2DR << i2c_imx->hwdata->regshift);
+   dma_sconfig.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+   dma_sconfig.dst_maxburst = 1;
+   dma_sconfig.direction = DMA_MEM_TO_DEV;
+   ret = dmaengine_slave_config(dma->chan_tx, &dma_sconfig);
+   if (ret < 0) {
+   dev_err(dev, "Dma slave config failed, err = %d\n", ret);
+   goto fail_tx;
+   }
+
+   dma->chan_rx = dma_request_slave_channel(dev, "rx");
+   if (!dma->chan_rx) {
+   dev_err(dev, "Dma rx channel request failed!\n");
+   ret = -ENODEV;
+   goto fail_tx;
+   }
+
+   dma_sconfig.src_addr = phy_addr +
+   (IMX_I2C_I2DR << i2c_imx->hwdata->regshift);
+   dma_sconfig.src_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+   dma_sconfig.src_maxburst = 1;
+   dma_sconfig.direction = DMA_DEV_TO_MEM;
+   ret = dmaengine_slave_config(dma->chan_rx, &dma_sconfig);
+   if (ret < 0) {
+   dev_err(dev, "Dma slave config failed, err = %d\n", ret);
+   goto fail_rx;
+   }
+
+   init_completion(&dma->cmd_complete);
+
+   return 0;
+
+fail_rx:
+   dma_release_channel(dma->chan_rx);
+fail_tx:
+   dma_release_channel(dma->chan_tx);
+   return ret;
+}
+
+static void i2c_imx_dma_callback(void *arg)
+{
+   struct imx_i2c_struct *i2c_imx = (struct imx_i2c_struct *)arg;
+   struct imx_i2c_dma *dma = i2c_imx->dma;
+
+   dma_unmap_single(dma->chan_using->device->dev, dma->dma_buf,
+

[GIT PULL] generic early_ioremap support

2014-03-12 Thread Mark Salter

Hi Andrew,

Could you add this series into the -mm tree for v3.15?

The following changes since commit c3bebc71c4bcdafa24b506adf0c1de3c1f77e2e0:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2014-03-04 
08:44:32 -0800)

are available in the git repository at:


  git://github.com/mosalter/linux.git tags/for-v3.15

for you to fetch changes up to b27e0658d90c63dc2696eca44f7701a903cb13c5:

  doc/kernel-parameters.txt: add early_ioremap_debug (2014-03-09 12:53:50 -0400)


generic early_ioremap support


Dave Young (1):
  x86/mm: sparse warning fix for early_memremap

Mark Salter (5):
  mm: create generic early_ioremap() support
  x86: use generic early_ioremap
  arm64: initialize pgprot info earlier in boot
  arm64: add early_ioremap support
  doc/kernel-parameters.txt: add early_ioremap_debug

 Documentation/arm64/memory.txt  |   4 +-
 Documentation/kernel-parameters.txt |   5 +
 arch/arm64/Kconfig  |   1 +
 arch/arm64/include/asm/Kbuild   |   1 +
 arch/arm64/include/asm/fixmap.h |  67 ++
 arch/arm64/include/asm/io.h |   1 +
 arch/arm64/include/asm/memory.h |   2 +-
 arch/arm64/include/asm/mmu.h|   1 +
 arch/arm64/kernel/early_printk.c|   8 +-
 arch/arm64/kernel/head.S|   9 +-
 arch/arm64/kernel/setup.c   |   4 +
 arch/arm64/mm/ioremap.c |  85 +
 arch/arm64/mm/mmu.c |  44 +--
 arch/x86/Kconfig|   1 +
 arch/x86/include/asm/Kbuild |   1 +
 arch/x86/include/asm/fixmap.h   |   6 +
 arch/x86/include/asm/io.h   |  14 +--
 arch/x86/mm/ioremap.c   | 224 +
 arch/x86/mm/pgtable_32.c|   2 +-
 include/asm-generic/early_ioremap.h |  42 +++
 mm/Kconfig  |   3 +
 mm/Makefile |   1 +
 mm/early_ioremap.c  | 245 
 23 files changed, 482 insertions(+), 289 deletions(-)
 create mode 100644 arch/arm64/include/asm/fixmap.h
 create mode 100644 include/asm-generic/early_ioremap.h
 create mode 100644 mm/early_ioremap.c



Re: [PATCH v3] net: phy: fix uninitalized WOL parameters in phy_ethtool_get_wol

2014-03-12 Thread Ben Hutchings

On Wed, 2014-03-12 at 00:02 +0100, Sebastian Hesselbarth wrote:
> phy_ethtool_get_wol is a helper to get current WOL settings from
> a phy device. When using this helper on a PHY without .get_wol
> callback, struct ethtool_wolinfo is never set-up correctly and
> may contain misleading information about WOL status.
> 
> To fix this, always zero relevant fields of struct ethtool_wolinfo
> regardless of .get_wol callback availability.

Sorry, I still disagree with this.

You're trying to make phy_ethtool_get_wol() do two subtly different
things:
- Provide an implementation of ethtool_ops::get_wol, leaving the net
  driver only to look up phy_device
- Provide a standalone function for executing ETHTOOL_GWOL on a
  phy_device

You may notice that phy_suspend() already sets wol.cmd = ETHTOOL_GWOL.
So it seems to me like it's taking responsibility for initialising the
structure like ethtool_get_wol() does.  The bug is then that
phy_suspend() doesn't clear the rest of the structure.  That is not the
responsibility of phy_ethtool_get_wol().
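
To illustrate that split of responsibility, here is a minimal user-space sketch. The struct definitions are simplified stand-ins rather than the kernel's, and phy_wol_enabled() and demo_get_wol() are hypothetical names used only for this example. The caller zeroes the whole structure and sets cmd before calling the helper, so the helper itself can stay a thin wrapper around the driver callback.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel definitions (assumptions for this sketch). */
#define ETHTOOL_GWOL	0x00000005
#define WAKE_MAGIC	(1 << 5)

struct ethtool_wolinfo {
	unsigned int cmd;
	unsigned int supported;
	unsigned int wolopts;
};

struct phy_device;

struct phy_driver {
	void (*get_wol)(struct phy_device *phydev, struct ethtool_wolinfo *wol);
};

struct phy_device {
	struct phy_driver *drv;
};

/* Thin helper: only forwards to the driver callback when one exists. */
static void phy_ethtool_get_wol(struct phy_device *phydev,
				struct ethtool_wolinfo *wol)
{
	if (phydev->drv->get_wol)
		phydev->drv->get_wol(phydev, wol);
}

/* Hypothetical driver callback that reports magic-packet wake-up as enabled. */
static void demo_get_wol(struct phy_device *phydev, struct ethtool_wolinfo *wol)
{
	(void)phydev;
	wol->supported = WAKE_MAGIC;
	wol->wolopts = WAKE_MAGIC;
}

/*
 * Hypothetical caller: it owns the structure, so it clears all of it and
 * sets cmd before asking the PHY. Returns 1 if any WOL option is enabled.
 */
static int phy_wol_enabled(struct phy_device *phydev)
{
	struct ethtool_wolinfo wol;

	memset(&wol, 0, sizeof(wol));
	wol.cmd = ETHTOOL_GWOL;
	phy_ethtool_get_wol(phydev, &wol);

	return wol.wolopts != 0;
}
```

With the caller doing the initialisation, a PHY without a .get_wol callback reports no WOL options instead of stack garbage.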

Ben.

> Signed-off-by: Sebastian Hesselbarth 
> Reviewed-by: Florian Fainelli 
> ---
> Changelog:
> v1->v2:
> - clear whole struct ethtool_wolinfo
> - check for non-NULL phy_device
> v2->v3:
> - only clear ->supported and ->wolopts (Suggested by Ben Hutchings)
> 
> Cc: David Miller 
> Cc: Florian Fainelli 
> Cc: Ben Hutchings 
> Cc: net...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/phy/phy.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 19c9eca0ef26..94234a91a50f 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -1092,7 +1092,9 @@ EXPORT_SYMBOL(phy_ethtool_set_wol);
>  
>  void phy_ethtool_get_wol(struct phy_device *phydev, struct ethtool_wolinfo 
> *wol)
>  {
> - if (phydev->drv->get_wol)
> + wol->supported = wol->wolopts = 0;
> +
> + if (phydev && phydev->drv->get_wol)
>   phydev->drv->get_wol(phydev, wol);
>  }
>  EXPORT_SYMBOL(phy_ethtool_get_wol);

-- 
Ben Hutchings
Experience is directly proportional to the value of equipment destroyed.
 - Carolyn Scheppner



Re: [PATCH 1/2] pstore: fix buffer overflow while write offset equal to buffer size

2014-03-12 Thread Shuo Liu

2014-03-13 0:50 GMT+08:00 Kees Cook :
> On Wed, Mar 12, 2014 at 6:24 AM, Liu Shuo  wrote:
>> From: Liu ShuoX 
>>
>> In case new offset is equal to prz->buffer_size, it won't wrap at this
>> time and will return old(overflow) value next time.
>>
>> Signed-off-by: Liu ShuoX 
>
> This seems correct; good catch. Have you seen this problem happen, or
> is this just from reading the code?
Thanks.
We indeed hit it when we enhanced the ramoops tracing.
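
The off-by-one can be reproduced outside the kernel. The sketch below is a simplified model, not the pstore code: it drops the atomics, and the function names ring_advance() and ring_advance_buggy() are made up for this example. It assumes size is non-zero.

```c
#include <stddef.h>

/*
 * Advance a ring-buffer start offset by 'a' bytes and wrap at 'size'.
 * Valid offsets are 0 .. size - 1, so an offset equal to 'size' must wrap.
 */
static size_t ring_advance(size_t old, size_t a, size_t size)
{
	size_t new_off = old + a;

	while (new_off >= size)
		new_off -= size;
	return new_off;
}

/* The pre-patch comparison, kept for contrast: 'size' itself slips through. */
static size_t ring_advance_buggy(size_t old, size_t a, size_t size)
{
	size_t new_off = old + a;

	while (new_off > size)
		new_off -= size;
	return new_off;
}
```

For an 8-byte buffer with old = 6 and a = 2, the first version returns 0, while the second returns 8, which is one past the last valid offset.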

>
> Acked-by: Kees Cook 
>
> -Kees
>
>> ---
>> fs/pstore/ram_core.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
>> index de272d4..ff7e3d4 100644
>> --- a/fs/pstore/ram_core.c
>> +++ b/fs/pstore/ram_core.c
>> @@ -54,7 +54,7 @@ static size_t buffer_start_add_atomic(struct
>> persistent_ram_zone *prz, size_t a)
>> do {
>> old = atomic_read(>buffer->start);
>> new = old + a;
>> -   while (unlikely(new > prz->buffer_size))
>> +   while (unlikely(new >= prz->buffer_size))
>> new -= prz->buffer_size;
>> } while (atomic_cmpxchg(>buffer->start, old, new) != old);
>>
>> @@ -91,7 +91,7 @@ static size_t buffer_start_add_locked(struct
>> persistent_ram_zone *prz, size_t a)
>>
>> old = atomic_read(>buffer->start);
>> new = old + a;
>> -   while (unlikely(new > prz->buffer_size))
>> +   while (unlikely(new >= prz->buffer_size))
>> new -= prz->buffer_size;
>> atomic_set(>buffer->start, new);
>>
>> --
>> 1.8.3.2
>>
>>
>>
>
>
>
> --
> Kees Cook
> Chrome OS Security

Re: [PATCH 1/4] devfreq: exynos4: Support devicetree to get device id of Exynos4 SoC

2014-03-12 Thread Chanwoo Choi

Hi Bartlomiej,

On 03/12/2014 11:37 PM, Bartlomiej Zolnierkiewicz wrote:
> 
> Hi Chanwoo,
> 
> On Wednesday, March 12, 2014 08:47:59 PM Chanwoo Choi wrote:
>> This patch support DT(DeviceTree) method to probe exynos4_bus and get device
>> id of each Exynos4 SoC by using dt helper function.
>>
>> Signed-off-by: Chanwoo Choi 
>> ---
>>  drivers/devfreq/exynos/exynos4_bus.c | 26 +-
>>  1 file changed, 25 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/devfreq/exynos/exynos4_bus.c 
>> b/drivers/devfreq/exynos/exynos4_bus.c
>> index e07b0c6..168a7c6 100644
>> --- a/drivers/devfreq/exynos/exynos4_bus.c
>> +++ b/drivers/devfreq/exynos/exynos4_bus.c
>> @@ -23,6 +23,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  
>>  /* Exynos4 ASV has been in the mailing list, but not upstreamed, yet. */
>> @@ -1017,6 +1018,28 @@ unlock:
>>  return NOTIFY_DONE;
>>  }
>>  
>> +static struct of_device_id exynos4_busfreq_id_match[] = {
>> +{
>> +.compatible = "samsung,exynos4210-busfreq",
>> +.data = (void *)TYPE_BUSF_EXYNOS4210,
>> +}, {
>> +.compatible = "samsung,exynos4x12-busfreq",
>> +.data = (void *)TYPE_BUSF_EXYNOS4x12,
>> +},
>> +};
>> +
>> +static int exynos4_busfreq_get_driver_data(struct platform_device *pdev)
>> +{
>> +struct device *dev = >dev;
>> +const struct of_device_id *match;
>> +
>> +match = of_match_node(exynos4_busfreq_id_match, dev->of_node);
>> +if (!match)
>> +return -ENODEV;
>> +
>> +return (int) match->data;
>> +}
>> +
>>  static int exynos4_busfreq_probe(struct platform_device *pdev)
>>  {
>>  struct busfreq_data *data;
>> @@ -1030,7 +1053,7 @@ static int exynos4_busfreq_probe(struct 
>> platform_device *pdev)
>>  return -ENOMEM;
>>  }
>>  
>> -data->type = pdev->id_entry->driver_data;
>> +data->type = exynos4_busfreq_get_driver_data(pdev);
>>  data->dmc[0].hw_base = S5P_VA_DMC0;
>>  data->dmc[1].hw_base = S5P_VA_DMC1;
>>  data->pm_notifier.notifier_call = exynos4_busfreq_pm_notifier_event;
>> @@ -1135,6 +1158,7 @@ static struct platform_driver exynos4_busfreq_driver = 
>> {
>>  .name   = "exynos4-busfreq",
>>  .owner  = THIS_MODULE,
>>  .pm = _busfreq_pm,
>> +.of_match_table = exynos4_busfreq_id_match,
>>  },
>>  };
> 
> It looks OK but it would be good to also add bindings documentation file,
> i.e. Documentation/devicetree/bindings/devfreq/exynos4_bus.txt.

OK, I'll add documentation for exynos4_bus.c.

Best Regards,
Chanwoo Choi


Re: [PATCH 3/4] devfreq: exynos4: Add ppmu's clock control and code clean about regulator control

2014-03-12 Thread Chanwoo Choi

Hi Bartlomiej,

On 03/13/2014 12:17 AM, Bartlomiej Zolnierkiewicz wrote:
> 
> Hi,
> 
> On Wednesday, March 12, 2014 08:48:01 PM Chanwoo Choi wrote:
>> There are not the clock controller of ppmudmc0/1. This patch control the 
>> clock
>> of ppmudmc0/1 which is used for monitoring memory bus utilization.
>>
>> Also, this patch code clean about regulator control and free resource
>> when calling exit/remove function.
>>
>> For example,
>> busfreq@106A {
>>  compatible = "samsung,exynos4x12-busfreq";
>>
>>  /* Clock for PPMUDMC0/1 */
>>  clocks = < CLK_PPMUDMC0>, < CLK_PPMUDMC1>;
>>  clock-names = "ppmudmc0", "ppmudmc1";
>>
>>  /* Regulator for MIF/INT block */
>>  vdd_mif-supply = <_reg>;
>>  vdd_int-supply = <_reg>;
>> };
> 
> This should be in Documentation/devicetree/bindings/ documentation.

OK, I will add documentation about it.

> 
>> Signed-off-by: Chanwoo Choi 
>> ---
>>  drivers/devfreq/exynos/exynos4_bus.c | 107 
>> ++-
>>  1 file changed, 93 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/devfreq/exynos/exynos4_bus.c 
>> b/drivers/devfreq/exynos/exynos4_bus.c
>> index 16fb3cb..0c5b99e 100644
>> --- a/drivers/devfreq/exynos/exynos4_bus.c
>> +++ b/drivers/devfreq/exynos/exynos4_bus.c
>> @@ -62,6 +62,11 @@ enum exynos_ppmu_idx {
>>  PPMU_END,
>>  };
>>  
>> +static const char *exynos_ppmu_clk_name[] = {
>> +[PPMU_DMC0] = "ppmudmc0",
>> +[PPMU_DMC1] = "ppmudmc1",
>> +};
>> +
>>  #define EX4210_LV_MAX   LV_2
>>  #define EX4x12_LV_MAX   LV_4
>>  #define EX4210_LV_NUM   (LV_2 + 1)
>> @@ -86,6 +91,7 @@ struct busfreq_data {
>>  struct regulator *vdd_mif; /* Exynos4412/4212 only */
>>  struct busfreq_opp_info curr_oppinfo;
>>  struct exynos_ppmu ppmu[PPMU_END];
>> +struct clk *clk_ppmu[PPMU_END];
>>  
>>  struct notifier_block pm_notifier;
>>  struct mutex lock;
>> @@ -722,8 +728,26 @@ static int exynos4_bus_get_dev_status(struct device 
>> *dev,
>>  static void exynos4_bus_exit(struct device *dev)
>>  {
>>  struct busfreq_data *data = dev_get_drvdata(dev);
>> +int i;
>>  
>> -devfreq_unregister_opp_notifier(dev, data->devfreq);
>> +/*
>> + * Un-map memory man and disable regulator/clocks
>> + * to prevent power leakage.
>> + */
>> +regulator_disable(data->vdd_int);
>> +if (data->type == TYPE_BUSF_EXYNOS4x12)
>> +regulator_disable(data->vdd_mif);
>> +
>> +for (i = 0; i < PPMU_END; i++) {
>> +if (data->clk_ppmu[i])
>> +clk_disable_unprepare(data->clk_ppmu[i]);
>> +}
>> +
>> +for (i = 0; i < PPMU_END; i++) {
>> +if (data->ppmu[i].hw_base)
>> +iounmap(data->ppmu[i].hw_base);
>> +
>> +}
>>  }
>>  
>>  static struct devfreq_dev_profile exynos4_devfreq_profile = {
>> @@ -987,6 +1011,7 @@ static int exynos4_busfreq_parse_dt(struct busfreq_data 
>> *data)
>>  {
>>  struct device *dev = data->dev;
>>  struct device_node *np = dev->of_node;
>> +const char **clk_name = exynos_ppmu_clk_name;
>>  int i, ret;
>>  
>>  if (!np) {
>> @@ -1005,8 +1030,67 @@ static int exynos4_busfreq_parse_dt(struct 
>> busfreq_data *data)
>>  }
>>  }
>>  
>> +/*
>> + * Get PPMU's clocks to control them. But, if PPMU's clocks
>> + * is default 'pass' state, this driver don't need control
>> + * PPMU's clock.
>> + */
>> +for (i = 0; i < PPMU_END; i++) {
>> +data->clk_ppmu[i] = devm_clk_get(dev, clk_name[i]);
>> +if (IS_ERR_OR_NULL(data->clk_ppmu[i])) {
>> +dev_warn(dev, "Cannot get %s clock\n", clk_name[i]);
>> +data->clk_ppmu[i] = NULL;
>> +}
>> +
>> +ret = clk_prepare_enable(data->clk_ppmu[i]);
>> +if (ret < 0) {
>> +dev_warn(dev, "Cannot enable %s clock\n", clk_name[i]);
>> +data->clk_ppmu[i] = NULL;
>> +goto err_clocks;
>> +}
>> +}
>> +
>> +
>> +/* Get regulators to control voltage of int/mif block */
>> +data->vdd_int = devm_regulator_get(dev, "vdd_int");
>> +if (IS_ERR(data->vdd_int)) {
>> +dev_err(dev, "Failed to get the regulator of vdd_int\n");
>> +ret = PTR_ERR(data->vdd_int);
>> +goto err_clocks;
>> +}
>> +ret = regulator_enable(data->vdd_int);
>> +if (ret < 0) {
>> +dev_err(dev, "Failed to enable regulator of vdd_int\n");
>> +goto err_clocks;
>> +}
>> +
>> +switch (data->type) {
>> +case TYPE_BUSF_EXYNOS4x12:
>> +data->vdd_mif = devm_regulator_get(dev, "vdd_mif");
>> +if (IS_ERR(data->vdd_mif)) {
>> +dev_err(dev, "Failed to get the regulator vdd_mif\n");
>> +ret = PTR_ERR(data->vdd_mif);
>> +goto err_clocks;
> 
> This won't disable vdd_int regulator.

I don't

Re: [PATCH 2/2] net: Implement SO_PEERCGROUP

2014-03-12 Thread Andy Lutomirski

On Wed, Mar 12, 2014 at 6:43 PM, Simo Sorce  wrote:
> On Wed, 2014-03-12 at 18:21 -0700, Andy Lutomirski wrote:
>> On Wed, Mar 12, 2014 at 6:17 PM, Simo Sorce  wrote:
>> > On Wed, 2014-03-12 at 14:19 -0700, Andy Lutomirski wrote:
>> >> On Wed, Mar 12, 2014 at 2:16 PM, Simo Sorce  wrote:
>> >>
>> >> >
>> >> > Connection time is all we do and can care about.
>> >>
>> >> You have not answered why.
>> >
>> > We are going to disclose information to the peer based on policy that
>> > depends on the cgroup the peer is part of. All we care for is who opened
>> > the connection, if the peer wants to pass on that information after it
>> > has obtained it there is nothing we can do, so connection time is all we
>> > really care about.
>>
>> Can you give a realistic example?
>>
>> I could say that I'd like to disclose information to processes based
>> on their rlimits at the time they connected, but I don't think that
>> would carry much weight.
>
> We want to be able to show different user's list from SSSD based on the
> docker container that is asking for it.
>
> This works by having libnsss_sss.so from the containerized application
> connect to an SSSD daemon running on the host or in another container.
>
> The only way to distinguish between containers "from the outside" is to
> lookup the cgroup of the requesting process. It has a unique container
> ID, and can therefore be mapped to the appropriate policy that will let
> us decide which 'user domain' to serve to the container.
>

I can think of at least three other ways to do this.

1. Fix Docker to use user namespaces and use the uid of the requesting
process via SCM_CREDENTIALS.

2. Docker is a container system, so use the "container" (aka
namespace) APIs.  There are probably several clever things that could
be done with /proc//ns.

3. Given that Docker uses network namespaces, I assume that the socket
connection between the two sssd instances either comes from Docker
itself or uses socket inodes.  In either case, the same mechanism
should be usable for authentication.

On an unrelated note, since you seem to have found a way to get unix
sockets to connect the inside and outside of a Docker container, it
would be awesome if Docker could use the same mechanism to pass TCP
sockets around rather than playing awful games with virtual networks.

--Andy

Re: [PATCH v8 2/2] iio: Add AS3935 lightning sensor support

2014-03-12 Thread Marek Vasut

On Wednesday, March 12, 2014 at 01:53:14 PM, Matt Ranostay wrote:
> AS3935 chipset can detect lightning strikes and reports those back as
> events and the estimated distance to the storm.
> 
> Signed-off-by: Matt Ranostay 

Reviewed-by: Marek Vasut 

Best regards,
Marek Vasut

Re: [PATCH v2 2/3] mfd: max8997: handle IRQs using regmap

2014-03-12 Thread Chanwoo Choi

Hi Robert,

On 03/12/2014 10:37 PM, Robert Baldyga wrote:
> This patch modifies mfd driver to use regmap for handling interrupts.
> It allows to simplify irq handling process. This modifications needed
> to make small changes in function drivers, which use interrupts.
> 
> Signed-off-by: Robert Baldyga 
> ---
>  drivers/extcon/extcon-max8997.c |   35 ++--
>  drivers/mfd/Kconfig |2 +-
>  drivers/mfd/Makefile|2 +-
>  drivers/mfd/max8997-irq.c   |  373 
> ---
>  drivers/mfd/max8997.c   |  113 ++-
>  drivers/rtc/rtc-max8997.c   |2 +-
>  include/linux/mfd/max8997-private.h |   65 +-
>  7 files changed, 183 insertions(+), 409 deletions(-)
>  delete mode 100644 drivers/mfd/max8997-irq.c
> 
> diff --git a/drivers/extcon/extcon-max8997.c b/drivers/extcon/extcon-max8997.c
> index f258c08..15fc5c0 100644
> --- a/drivers/extcon/extcon-max8997.c
> +++ b/drivers/extcon/extcon-max8997.c
> @@ -46,15 +46,15 @@ struct max8997_muic_irq {
>  };
>  
>  static struct max8997_muic_irq muic_irqs[] = {
> - { MAX8997_MUICIRQ_ADCError, "muic-ADCERROR" },
> - { MAX8997_MUICIRQ_ADCLow,   "muic-ADCLOW" },
> - { MAX8997_MUICIRQ_ADC,  "muic-ADC" },
> - { MAX8997_MUICIRQ_VBVolt,   "muic-VBVOLT" },
> - { MAX8997_MUICIRQ_DBChg,"muic-DBCHG" },
> - { MAX8997_MUICIRQ_DCDTmr,   "muic-DCDTMR" },
> - { MAX8997_MUICIRQ_ChgDetRun,"muic-CHGDETRUN" },
> - { MAX8997_MUICIRQ_ChgTyp,   "muic-CHGTYP" },
> - { MAX8997_MUICIRQ_OVP,  "muic-OVP" },
> + { MAX8997_MUICIRQ_ADCERROR, "MUIC-ADCERROR" },
> + { MAX8997_MUICIRQ_ADCLOW,   "MUIC-ADCLOW" },
> + { MAX8997_MUICIRQ_ADC,  "MUIC-ADC" },
> + { MAX8997_MUICIRQ_VBVOLT,   "MUIC-VBVOLT" },
> + { MAX8997_MUICIRQ_DBCHG,"MUIC-DBCHG" },
> + { MAX8997_MUICIRQ_DCDTMR,   "MUIC-DCDTMR" },
> + { MAX8997_MUICIRQ_CHGDETRUN,"MUIC-CHGDETRUN" },
> + { MAX8997_MUICIRQ_CHGTYP,   "MUIC-CHGTYP" },
> + { MAX8997_MUICIRQ_OVP,  "MUIC-OVP" },
>  };


Why did you modify the interrupt names? Did you have some reason?
I don't think this modification is needed.

>  
>  /* Define supported cable type */
> @@ -553,17 +553,17 @@ static void max8997_muic_irq_work(struct work_struct 
> *work)
>   }
>  
>   switch (irq_type) {
> - case MAX8997_MUICIRQ_ADCError:
> - case MAX8997_MUICIRQ_ADCLow:
> + case MAX8997_MUICIRQ_ADCERROR:
> + case MAX8997_MUICIRQ_ADCLOW:
>   case MAX8997_MUICIRQ_ADC:
>   /* Handle all of cable except for charger cable */
>   ret = max8997_muic_adc_handler(info);
>   break;
> - case MAX8997_MUICIRQ_VBVolt:
> - case MAX8997_MUICIRQ_DBChg:
> - case MAX8997_MUICIRQ_DCDTmr:
> - case MAX8997_MUICIRQ_ChgDetRun:
> - case MAX8997_MUICIRQ_ChgTyp:
> + case MAX8997_MUICIRQ_VBVOLT:
> + case MAX8997_MUICIRQ_DBCHG:
> + case MAX8997_MUICIRQ_DCDTMR:
> + case MAX8997_MUICIRQ_CHGDETRUN:
> + case MAX8997_MUICIRQ_CHGTYP:

I don't agree with the modification of the MUIC interrupt names.

>   /* Handle charger cable */
>   ret = max8997_muic_chg_handler(info);
>   break;
> @@ -679,7 +679,8 @@ static int max8997_muic_probe(struct platform_device 
> *pdev)
>   struct max8997_muic_irq *muic_irq = _irqs[i];
>   unsigned int virq = 0;
>  
> - virq = irq_create_mapping(max8997->irq_domain, muic_irq->irq);
> + virq = regmap_irq_get_virq(max8997->irq_data_muic,
> + muic_irq->irq);
>   if (!virq) {
>   ret = -EINVAL;
>   goto err_irq;
> diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> index 2871a65..2273574 100644
> --- a/drivers/mfd/Kconfig
> +++ b/drivers/mfd/Kconfig
> @@ -388,7 +388,7 @@ config MFD_MAX8997
>   depends on I2C=y
>   select MFD_CORE
>   select REGMAP_I2C
> - select IRQ_DOMAIN
> + select REGMAP_IRQ
>   help
> Say yes here to add support for Maxim Semiconductor MAX8997/8966.
> This is a Power Management IC with RTC, Flash, Fuel Gauge, Haptic,
> diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
> index f5a7b2c..4cec8ad 100644
> --- a/drivers/mfd/Makefile
> +++ b/drivers/mfd/Makefile
> @@ -119,7 +119,7 @@ obj-$(CONFIG_MFD_MAX77693)+= max77693.o 
> max77693-irq.o
>  obj-$(CONFIG_MFD_MAX8907)+= max8907.o
>  max8925-objs := max8925-core.o max8925-i2c.o
>  obj-$(CONFIG_MFD_MAX8925)+= max8925.o
> -obj-$(CONFIG_MFD_MAX8997)+= max8997.o max8997-irq.o
> +obj-$(CONFIG_MFD_MAX8997)+= max8997.o
>  obj-$(CONFIG_MFD_MAX8998)+= max8998.o max8998-irq.o
>  
>  pcf50633-objs:= pcf50633-core.o pcf50633-irq.o
> diff --git a/drivers/mfd/max8997-irq.c b/drivers/mfd/max8997-irq.c
> deleted file mode 100644
>

[PATCH v3] drivers: mfd: silence compiler warning in sec-core.c

2014-03-12 Thread Pankaj Dubey

When building with a 64-bit compiler, GCC warns:
drivers/mfd/sec-core.c:199:10: warning:
cast from pointer to integer of different size [-Wpointer-to-int-cast]

Signed-off-by: Pankaj Dubey 
---
 drivers/mfd/sec-core.c   |6 +++---
 include/linux/mfd/samsung/core.h |2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/mfd/sec-core.c b/drivers/mfd/sec-core.c
index ce32538..c6088c5 100644
--- a/drivers/mfd/sec-core.c
+++ b/drivers/mfd/sec-core.c
@@ -189,17 +189,17 @@ static struct sec_platform_data 
*sec_pmic_i2c_parse_dt_pdata(
 }
 #endif
 
-static inline int sec_i2c_get_driver_data(struct i2c_client *i2c,
+static inline unsigned long sec_i2c_get_driver_data(struct i2c_client *i2c,
const struct i2c_device_id *id)
 {
 #ifdef CONFIG_OF
if (i2c->dev.of_node) {
const struct of_device_id *match;
match = of_match_node(sec_dt_match, i2c->dev.of_node);
-   return (int)match->data;
+   return (unsigned long)match->data;
}
 #endif
-   return (int)id->driver_data;
+   return id->driver_data;
 }
 
 static int sec_pmic_probe(struct i2c_client *i2c,
diff --git a/include/linux/mfd/samsung/core.h b/include/linux/mfd/samsung/core.h
index a30b53c..05d584b 100644
--- a/include/linux/mfd/samsung/core.h
+++ b/include/linux/mfd/samsung/core.h
@@ -51,7 +51,7 @@ struct sec_pmic_dev {
struct regmap_irq_chip_data *irq_data;
 
int ono;
-   int type;
+   unsigned long type;
bool wakeup;
bool wtsr_smpl;
 };
-- 
1.7.10.4
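
The warning and the fix can be shown in a small user-space sketch. Everything named demo_* below is hypothetical and only mirrors the shape of an of_device_id table. The sketch goes through uintptr_t because user-space C does not guarantee that unsigned long is pointer-sized, whereas the kernel relies on that property.

```c
#include <stdint.h>

/* Hypothetical device types stored in a match table's 'data' pointer. */
enum demo_type { DEMO_TYPE_A = 1, DEMO_TYPE_B = 2 };

struct demo_match {
	const char *compatible;
	const void *data;
};

static const struct demo_match demo_matches[] = {
	{ "vendor,demo-a", (const void *)(uintptr_t)DEMO_TYPE_A },
	{ "vendor,demo-b", (const void *)(uintptr_t)DEMO_TYPE_B },
};

/*
 * Recover the integer from the pointer. Casting to a pointer-sized integer
 * first avoids -Wpointer-to-int-cast, which fires for (int)match->data on
 * targets where int is narrower than a pointer.
 */
static unsigned long demo_get_driver_data(const struct demo_match *match)
{
	return (unsigned long)(uintptr_t)match->data;
}
```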


Re: [PATCHv2 0/5] ARM: dts: exynos: Add missing dt data to bring kernel of Exynos4x12

2014-03-12 Thread Chanwoo Choi

Dear Kukjin,

On 03/12/2014 08:21 PM, Tomasz Figa wrote:
> Hi Chanwoo,
> 
> On 12.03.2014 07:19, Chanwoo Choi wrote:
>> This patch add missing dt data of Exynos4x12 to bring up kernel feature and
>> code clean. This patchset is based on 'v3.15-next/dt-clk-exynos' branch.
>> - git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung.git
>>
>> exynos4x12/exynos4412/exynos4212.dtsi
>> - Add ADC (Analog and Digital Converter) to get raw data
>> - Add PMU (Performance Monitoring Unit) for perf event
>> - Add gps_alive power domain to remove power leakage when gps-alive isn't 
>> used
>> - Remove duplicate dt data of interrput combiner controller
>>
>> exynos4412-trats.dts
>> - Add ADC dt data with ntc thermistor child to read temperature
>>
>> Changes from v1:
>> - Use clock macro name for Exynos4 instead of constant for ADC
>> - Remove unnecessary description about patch content
>> - Move gps-alive power domain's dt data from exynos4x12.dts to exynos4.dts
>> - Move thermistor dt node outside of ADC dt node and modify node name of 
>> thermistor
>>
>> Chanwoo Choi (5):
>>ARM: dts: exynos4x12: Add ADC's dt data to read raw data
>>ARM: dts: exynos4x12: Add PMU dt data to support PMU(Perforamnce 
>> Monitoring Unit)
>>ARM: dts: exynos4x12: Add GPS_ALIVE power domain
>>ARM: dts: exynos: Move common dt data for interrupt combiner controller
>>ARM: dts: exynos4412-trats2: Add ADC/themistor dt data to get temperature 
>> of SoC/battery
>>
>>   arch/arm/boot/dts/exynos4.dtsi  |  5 +
>>   arch/arm/boot/dts/exynos4212.dtsi   | 13 -
>>   arch/arm/boot/dts/exynos4412-trats2.dts | 21 +
>>   arch/arm/boot/dts/exynos4412.dtsi   | 14 --
>>   arch/arm/boot/dts/exynos4x12.dtsi   | 26 ++
>>   5 files changed, 60 insertions(+), 19 deletions(-)
>>
> 
> Reviewed-by: Tomasz Figa 
> 

Please review or comment on this patchset.

Best Regards,
Chanwoo Choi



Re: [PATCH 1/5] r8a7790.dtsi: add vin[0-3] nodes

2014-03-12 Thread Simon Horman

On Fri, Mar 07, 2014 at 01:01:35PM +, Ben Dooks wrote:
> Add nodes for the four video input channels on the R8A7790.

Please update the prefix of the subject of this patch to:
ARM: shmobile: r8a7790: 

> 
> Signed-off-by: Ben Dooks 
> ---
>  arch/arm/boot/dts/r8a7790.dtsi | 32 
>  1 file changed, 32 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/r8a7790.dtsi b/arch/arm/boot/dts/r8a7790.dtsi
> index a1e7c39..4c3eafb 100644
> --- a/arch/arm/boot/dts/r8a7790.dtsi
> +++ b/arch/arm/boot/dts/r8a7790.dtsi
> @@ -395,6 +395,38 @@
>   status = "disabled";
>   };
>  
> + vin0: vin@0xe6ef {
> + compatible = "renesas,vin-r8a7790";
> + clocks = <_clks R8A7790_CLK_VIN0>;
> + reg = <0 0xe6ef 0 0x1000>;
> + interrupts = <0 188 IRQ_TYPE_LEVEL_HIGH>;
> + status = "disabled";
> + };
> +
> + vin1: vin@0xe6ef1000 {
> + compatible = "renesas,vin-r8a7790";
> + clocks = <_clks R8A7790_CLK_VIN1>;
> + reg = <0 0xe6ef1000 0 0x1000>;
> + interrupts = <0 189 IRQ_TYPE_LEVEL_HIGH>;
> + status = "disabled";
> + };
> +
> + vin2: vin@0xe6ef2000 {
> + compatible = "renesas,vin-r8a7790";
> + clocks = <_clks R8A7790_CLK_VIN2>;
> + reg = <0 0xe6ef2000 0 0x1000>;
> + interrupts = <0 190 IRQ_TYPE_LEVEL_HIGH>;
> + status = "disabled";
> + };
> +
> + vin3: vin@0xe6ef3000 {
> + compatible = "renesas,vin-r8a7790";
> + clocks = <_clks R8A7790_CLK_VIN3>;
> + reg = <0 0xe6ef3000 0 0x1000>;
> + interrupts = <0 191 IRQ_TYPE_LEVEL_HIGH>;
> + status = "disabled";
> + };
> +
>   clocks {
>   #address-cells = <2>;
>   #size-cells = <2>;
> -- 
> 1.9.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sh" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Re: [PATCH 2/5] ARM: lager: add vin1 node

2014-03-12 Thread Simon Horman

On Fri, Mar 07, 2014 at 01:01:36PM +, Ben Dooks wrote:
> Add device-tree for vin1 (composite video in) on the
> lager board.

Please update the prefix of the subject of this patch to:
ARM: shmobile: lager: 

> 
> Signed-off-by: Ben Dooks 
> ---
>  arch/arm/boot/dts/r8a7790-lager.dts | 38 
> +
>  1 file changed, 38 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/r8a7790-lager.dts 
> b/arch/arm/boot/dts/r8a7790-lager.dts
> index a087421..7528cfc 100644
> --- a/arch/arm/boot/dts/r8a7790-lager.dts
> +++ b/arch/arm/boot/dts/r8a7790-lager.dts
> @@ -158,6 +158,11 @@
>   renesas,groups = "i2c2";
>   renesas,function = "i2c2";
>   };
> +
> + vin1_pins: vin {
> + renesas,groups = "vin1_data8", "vin1_clk";
> + renesas,function = "vin1";
> + };
>  };
>  
>   {
> @@ -239,8 +244,41 @@
>   status = "ok";
>   pinctrl-0 = <_pins>;
>   pinctrl-names = "default";
> +
> + adv7180: adv7180@0x20 {
> + compatible = "adi,adv7180";
> + reg = <0x20>;
> + remote = <>;
> +
> + port {
> + adv7180_1: endpoint {
> + bus-width = <8>;
> + remote-endpoint = <>;
> + };
> + };
> + };
> +
>  };
>  
>  {
>   status = "ok";
>  };
> +
> +/* composite video input */
> + {
> + pinctrl-0 = <_pins>;
> + pinctrl-names = "default";
> +
> + status = "ok";
> +
> + port {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + vin1ep0: endpoint {
> + remote-endpoint = <_1>;
> + bus-width = <8>;
> + };
> + };
> +};
> +
> -- 
> 1.9.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sh" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Re: [PATCH] block: Force sector and nr_sects to device alignment and granularity.

2014-03-12 Thread Martin K. Petersen

> "Jeff" == Jeff Moyer  writes:

>> The case that we were seeing was with an SSD that required TRIM on 8k
>> boundaries and with an 8k granularity.  Since the file system was
>> trying to do discards based on 4k alignment the driver complained
>> mightily.

Jeff> but you managed to read my mind well enough.  The question is how
Jeff> high up the stack do you put the logic for this?  Is it worth it
Jeff> to duplicate the checks in the OS that are already done on the
Jeff> device?  I don't know.  Martin, do you have an opinion on this?

I'm no big fan of dropping information.

My original intent with the discard granularity and alignment was to
allow filesystems to use them to influence block allocation and layout.
Not to affect how we issue commands at runtime.

Since a storage device is free to ignore all or parts of any discard
request I'd consider it somewhat broken if it actually complained.
Especially so since the relevant knobs in the standard that we key off
of are performance recommendations and not requirements that commands
must adhere to.
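
For reference, the kind of adjustment under discussion can be sketched as follows. This is an illustration of rounding a discard range inward to granularity boundaries, not code from the patch. The struct and function names are invented for the example, all values are in sectors, and it assumes granularity is non-zero, alignment is smaller than granularity, and the sums do not overflow.

```c
#include <stdint.h>

struct discard_range {
	uint64_t sector;	/* first sector to discard */
	uint64_t nr_sects;	/* number of sectors, 0 if nothing is left */
};

/*
 * Shrink [sector, sector + nr_sects) so that both ends fall on boundaries
 * b with b % granularity == alignment. Partial chunks at either end are
 * dropped, which is the information loss the reply above objects to.
 */
static struct discard_range discard_round_inward(uint64_t sector,
						 uint64_t nr_sects,
						 uint64_t granularity,
						 uint64_t alignment)
{
	struct discard_range out = { sector, 0 };
	uint64_t end = sector + nr_sects;
	uint64_t rem;

	/* Round the start up to the next boundary. */
	rem = (sector + granularity - alignment) % granularity;
	if (rem)
		sector += granularity - rem;

	/* Round the end down to the previous boundary. */
	rem = (end + granularity - alignment) % granularity;
	if (rem > end)
		return out;	/* no boundary at or below 'end' */
	end -= rem;

	if (end > sector) {
		out.sector = sector;
		out.nr_sects = end - sector;
	}
	return out;
}
```

With 512-byte sectors, an 8k granularity is 16 sectors. A 4k-aligned request for sectors 8 to 39 shrinks to sectors 16 to 31, and an 8-sector request starting at sector 8 shrinks to nothing.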

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] serial: sh-sci: Neaten dev_ uses

2014-03-12 Thread Simon Horman

On Wed, Mar 12, 2014 at 07:55:50PM +0100, Geert Uytterhoeven wrote:
> Hi Joe,
> 
> On Tue, Mar 11, 2014 at 6:10 PM, Joe Perches  wrote:
> > Add missing newlines and coalesce formats.
> > Realign arguments.
> 
> Thanks!
> 
> > Signed-off-by: Joe Perches 
> 
> Acked-by: Geert Uytterhoeven 

Acked-by: Simon Horman 

Greg, could you pick up this one too?


Re: [PATCH 1/4] serial: sh-sci: Replace printk() by pr_*()

2014-03-12 Thread Simon Horman

On Tue, Mar 11, 2014 at 11:30:52AM +0100, Laurent Pinchart wrote:
> Hi Geert,
> 
> Thank you for the patches.
> 
> I had patches similar to 3/4 and 4/4 in my tree, that's a sign you're going 
> in 
> the right direction (or at least the direction I consider to be right :-)). 
> For the whole series,
> 
> Acked-by: Laurent Pinchart 

Acked-by: Simon Horman 

Greg, could you pick up this series?

> On Tuesday 11 March 2014 11:11:17 Geert Uytterhoeven wrote:
> > From: Geert Uytterhoeven 
> > 
> > Make banner const while we're at it
> > 
> > Signed-off-by: Geert Uytterhoeven 
> > ---
> >  drivers/tty/serial/sh-sci.c |7 +++
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
> > index 7958115e6a51..24ec6ef67984 100644
> > --- a/drivers/tty/serial/sh-sci.c
> > +++ b/drivers/tty/serial/sh-sci.c
> > @@ -428,7 +428,7 @@ static int sci_probe_regmap(struct plat_sci_port *cfg)
> > cfg->regtype = SCIx_HSCIF_REGTYPE;
> > break;
> > default:
> > -   printk(KERN_ERR "Can't probe register map for given port\n");
> > +   pr_err("Can't probe register map for given port\n");
> > return -EINVAL;
> > }
> > 
> > @@ -2389,8 +2389,7 @@ static inline int sci_probe_earlyprintk(struct
> > platform_device *pdev)
> > 
> >  #endif /* CONFIG_SERIAL_SH_SCI_CONSOLE */
> > 
> > -static char banner[] __initdata =
> > -   KERN_INFO "SuperH (H)SCI(F) driver initialized\n";
> > +static const char banner[] __initconst = "SuperH (H)SCI(F) driver
> > initialized";
> > 
> >  static struct uart_driver sci_uart_driver = {
> > .owner  = THIS_MODULE,
> > @@ -2616,7 +2615,7 @@ static int __init sci_init(void)
> >  {
> > int ret;
> > 
> > -   printk(banner);
> > +   pr_info("%s\n", banner);
> > 
> > ret = uart_register_driver(_uart_driver);
> > if (likely(ret == 0)) {
> 
> -- 
> Regards,
> 
> Laurent Pinchart
> 

Re: [locking/mutexes] WARNING: CPU: 1 PID: 77 at kernel/locking/mutex-debug.c:82 debug_mutex_unlock()

2014-03-12 Thread Jason Low

Hi Fengguang,

Can you try out this patch?

https://lkml.org/lkml/2014/3/12/243


Re: [PATCH 2/2] net: Implement SO_PEERCGROUP

2014-03-12 Thread Simo Sorce

On Wed, 2014-03-12 at 18:21 -0700, Andy Lutomirski wrote:
> On Wed, Mar 12, 2014 at 6:17 PM, Simo Sorce  wrote:
> > On Wed, 2014-03-12 at 14:19 -0700, Andy Lutomirski wrote:
> >> On Wed, Mar 12, 2014 at 2:16 PM, Simo Sorce  wrote:
> >>
> >> >
> >> > Connection time is all we do and can care about.
> >>
> >> You have not answered why.
> >
> > We are going to disclose information to the peer based on policy that
> > depends on the cgroup the peer is part of. All we care for is who opened
> > the connection, if the peer wants to pass on that information after it
> > has obtained it there is nothing we can do, so connection time is all we
> > really care about.
> 
> Can you give a realistic example?
> 
> I could say that I'd like to disclose information to processes based
> on their rlimits at the time they connected, but I don't think that
> would carry much weight.

We want to be able to show a different user list from SSSD based on the
Docker container that is asking for it.

This works by having libnsss_sss.so from the containerized application
connect to an SSSD daemon running on the host or in another container.

The only way to distinguish between containers "from the outside" is to
look up the cgroup of the requesting process. It has a unique container
ID, and can therefore be mapped to the appropriate policy that will let
us decide which 'user domain' to serve to the container.
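
As a rough illustration of the lookup described above, the following userspace sketch uses the existing SO_PEERCRED option (which SO_PEERCGROUP is modeled on) to obtain the peer's PID and then reads /proc/<pid>/cgroup. It is not the proposed SO_PEERCGROUP API; the helper names peer_pid() and read_cgroup_line() are made up for this example, and the /proc read is exactly the racy step the proposal is trying to avoid.

```c
#define _GNU_SOURCE           /* for struct ucred */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the PID recorded for the peer of a connected AF_UNIX socket,
 * or -1 on error. The kernel captures these credentials when the
 * connection (or socketpair) is set up, not at send time.
 */
pid_t peer_pid(int fd)
{
	struct ucred cred;
	socklen_t len = sizeof(cred);

	if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cred, &len) < 0)
		return -1;
	return cred.pid;
}

/*
 * Copy the first line of /proc/<pid>/cgroup into buf.
 * Returns 0 on success, -1 on failure. The process may have exited
 * or moved to another cgroup since it connected, so the answer can
 * be stale.
 */
int read_cgroup_line(pid_t pid, char *buf, size_t len)
{
	char path[64];
	FILE *f;

	assert(buf && len > 0);
	snprintf(path, sizeof(path), "/proc/%ld/cgroup", (long)pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
	return 0;
}
```

A server would call peer_pid() on the accepted socket and map the resulting cgroup line to a policy; how the container ID is parsed out of that line depends on the container runtime and is left out here.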

Simo.


Re: [PATCH] watchdog: fix ARCH_BCM_MOBILE dependency

2014-03-12 Thread Alex Elder

On 03/12/2014 07:00 PM, Guenter Roeck wrote:
> On 03/12/2014 03:12 PM, Alex Elder wrote:
>> Starting with this commit:
>>  047ef2fa rename ARCH_BCM to ARCH_BCM_MOBILE (clocksource)
>> the meaning of the ARCH_BCM config option is changed to represent
>> all Broadcom chips with code in the mach-bcm directory.
>>
>> Configuration options related to specific Broadcom platforms should
>> now use another symbol (currently ARCH_BCM_MOBILE, ARCH_BCM2835, or
>> ARCH_BCM_5301X).
>>
>> The BCM_KONA_WDT config option indicates a dependency on ARCH_BCM,
>> but it should be ARCH_BCM_MOBILE instead.  Fix that.
>>
>> Signed-off-by: Alex Elder 
> 
> Hi Alex,
> 
> Markus Mayer already submitted a similar patch:
> http://patchwork.roeck-us.net/patch/1263/

Yes, someone else pointed that out to me shortly
after I sent mine.  I wasn't aware he had done it.
I discovered the problem independently today and
just sent out the fix.  Thanks.

-Alex


Re: [PATCH 0/3] amd/pci: Add AMD hostbridge supports for newer AMD systems

2014-03-12 Thread Myron Stowe

Based on Bjorn's latest response it looks as if pci_acpi_scan_root()
is only used as a fallback when the platform's BIOS has not put a
proper ACPI SRAT table and/or _PXM method in place.  It will be
interesting to see if the 'dmesg' log and 'acpidump' information ends
up proving this out.
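
The fallback order in question (see the pci_acpi_scan_root() snippet quoted further down) can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: bus_to_node[] stands in for the kernel's mp_bus_to_node[] table, pxm_to_node_sketch() is a made-up identity mapping, and resolve_node() is a hypothetical name, not a kernel function.

```c
#include <assert.h>

#define MAX_BUS  256
#define NO_NODE  (-1)

/* Stand-in for the kernel's mp_bus_to_node[] table (filled by amd_bus.c). */
static int bus_to_node[MAX_BUS];

/* Made-up mapping for illustration: a negative pxm means _PXM is absent. */
int pxm_to_node_sketch(int pxm)
{
	return pxm >= 0 ? pxm : NO_NODE;
}

/* Mark every bus as "node unknown". */
void bus_table_init(void)
{
	for (int i = 0; i < MAX_BUS; i++)
		bus_to_node[i] = NO_NODE;
}

/*
 * Prefer the firmware's answer (_PXM); only when it is missing fall
 * back to whatever the host bridge probing recorded for this bus.
 */
int resolve_node(int busnum, int pxm)
{
	int node = NO_NODE;

	assert(busnum >= 0 && busnum < MAX_BUS);
	if (pxm >= 0)
		node = pxm_to_node_sketch(pxm);
	if (node != NO_NODE)
		bus_to_node[busnum] = node;   /* firmware answer wins and is recorded */
	else
		node = bus_to_node[busnum];   /* fall back to the probed table */
	return node;
}
```

With no _PXM and an unpopulated table the result is -1, which matches the numa_node value reported in the thread for family 15h systems.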

Regardless of the above, if I'm understanding your patches -
specifically 1/3 - it looks as if the arch/x86/pci/amd_bus.c changes
remove a lot of the CPU family specifics and yield a generally more
generic approach that won't have to be constantly updated as new AMD
CPU families are introduced going forward (exactly the situation we
are now in).  Today, before patch 1/3 here, amd_bus.c only handles AMD
CPU families K8, 0x10, and 0x11 (ref: static struct
pci_hostbridge_probe pci_probes[] __initdata = { ...) and as such,
when the 0x15 family was introduced, the response supplied upstream
was to add to an existing quirk - commit f62ef5f.  I believe with the
approach supplied here - patch 1/3 - the 'quirk_amd_nb_node()' quirk
becomes obsolete and can be removed.  Do you agree?

Myron

On Wed, Mar 12, 2014 at 3:13 PM, Bjorn Helgaas  wrote:
> On Tue, Mar 11, 2014 at 12:12 PM, Bjorn Helgaas  wrote:
>> On Thu, Mar 6, 2014 at 1:03 PM, Suravee Suthikulpanit
>>  wrote:
>>> On 3/6/2014 11:40 AM, Bjorn Helgaas wrote:

>>>> [+cc Yinghai, sorry I didn't think of it before]
>>>>
>>>> On Wed, Mar 5, 2014 at 11:30 PM, Suravee Suthikulpanit
>>>>  wrote:
>>>>>
>>>>> On 3/5/2014 8:13 PM, Suravee Suthikulanit wrote:
>>>>>>
>>>>>>
>>>>>> On 3/5/2014 3:24 PM, Bjorn Helgaas wrote:
>>>>>>>
>>>>>>>
>>>>>>> [+cc linux-acpi]
>>>>>>>
>>>>>>> On Wed, Mar 5, 2014 at 2:06 PM,   wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> From: Suravee Suthikulpanit 
>>>>>>>>
>>>>>>>> The current code only supports upto AMD hostbridge for family11h.
>>>>>>>> This causes PCI numa_node information to be reported incorrectly
>>>>>>>> for newer family with multi sockets.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Where is the incorrect reporting?  In ACPI tables?  Is this patch a
>>>>>>> way to cover up firmware defects in the ACPI description?  Or is this
>>>>>>> for machines without ACPI (it seems unlikely that machines with new
>>>>>>> AMD processors would not have ACPI)?
>>>>>>
>>>>>>
>>>>>>
>>>>>> This is incorrectly reported in the sysfs for each PCI device (e.g.
>>>>>> /devices/pci0000:50/0000:50:00.2/numa_node). Without the patch, they
>>>>>> return -1.
>>>>>>
>>>>>> In file arch/x86/pci/acpi.c, in function pci_acpi_scan_root(), it is
>>>>>> queries the node information as following:
>>>>>>
>>>>>> #ifdef CONFIG_ACPI_NUMA
>>>>>>   pxm = acpi_get_pxm(device->handle);
>>>>>>   if (pxm >= 0)
>>>>>>   node = pxm_to_node(pxm);
>>>>>>   if (node != -1)
>>>>>>   set_mp_bus_to_node(busnum, node);
>>>>>>   else
>>>>>> #endif
>>>>>>   node = get_mp_bus_to_node(busnum);
>>>>>>
>>>>>> In this case, I see that the acpi_get_pxm() returns -1.  Therefore, it
>>>>>> falls back to using the node information in mp_bus_to_node[].  So,
>>>>>> without this patch, it would also returning -1.
>>>>>>
>>>>>> Also, the spec mentioned that the _PXM is optional, so I am not sure if
>>>>>> this is a firmware bug.
>>>>>
>>>>>
>>>>> I am not quite familiar with the ACPI for this part.  However, after
>>>>> taking
>>>>> a look at the code (in driver/acpi/pci_root.c: acpi_pci_root_add()), I
>>>>> believe it's trying to locate _PXM method in the DSDT table, in which I
>>>>> don't see any _PXM methods.
>>>>
>>>>
>>>> This sure looks like a firmware bug.  True, _PXM is optional, but if
>>>> the firmware doesn't provide it, nobody should be surprised that the
>>>> OS thinks everything is in the same proximity domain.
>>>>
>>>> I would not endorse extending amd_bus.c for new CPUs.  That just
>>>> covers up firmware problems like this, and if you ever run a different
>>>> OS on the box, you'll trip over them again.  And I don't think a patch
>>>> like this will even be a possibility for Windows.
>>>
>>> I understand and am trying to verify this with the BIOS engineers. However,
>>> this is currently affecting family15h servers out in the field.  We can try
>>> to fix ACPI for newer generation of machines, but it won't be practical to
>>> push this BIOS fix to all the BIOS vendors and system vendors for older
>>> platforms, as they tend to.
>>>
>>> What if I localize the extension to the changes to access node information
>>> in the hostbridge for just the family15h which is mostly used in our main
>>> server products? Would that be acceptable?
>>
>> I assume the system is fully functional even without these patches,
>> right?  The only effect of these changes should be a performance
>> improvement.
>>
>> So the choices are:
>>
>>   1) Change the BIOS to provide _PXM
>>   2) Change Linux with your patches
>>
>> Either way, the customer has to upgrade something.  Choice 1) gets you
>> the performance improvement on all Linux and Windows releases, even
>> the ones that are already in the

Re: [PATCH 2/2] net: Implement SO_PEERCGROUP

2014-03-12 Thread Andy Lutomirski

On Wed, Mar 12, 2014 at 6:17 PM, Simo Sorce  wrote:
> On Wed, 2014-03-12 at 14:19 -0700, Andy Lutomirski wrote:
>> On Wed, Mar 12, 2014 at 2:16 PM, Simo Sorce  wrote:
>>
>> >
>> > Connection time is all we do and can care about.
>>
>> You have not answered why.
>
> We are going to disclose information to the peer based on policy that
> depends on the cgroup the peer is part of. All we care for is who opened
> the connection, if the peer wants to pass on that information after it
> has obtained it there is nothing we can do, so connection time is all we
> really care about.

Can you give a realistic example?

I could say that I'd like to disclose information to processes based
on their rlimits at the time they connected, but I don't think that
would carry much weight.

--Andy

Re: [PATCH 2/2] net: Implement SO_PEERCGROUP

2014-03-12 Thread Simo Sorce

On Wed, 2014-03-12 at 14:19 -0700, Andy Lutomirski wrote:
> On Wed, Mar 12, 2014 at 2:16 PM, Simo Sorce  wrote:
> > On Wed, 2014-03-12 at 14:12 -0700, Andy Lutomirski wrote:
> >> On Wed, Mar 12, 2014 at 2:00 PM, Andy Lutomirski  
> >> wrote:
> >> > On 03/12/2014 01:46 PM, Vivek Goyal wrote:
> >> >> Implement SO_PEERCGROUP along the lines of SO_PEERCRED. This returns the
> >> >> cgroup of first mounted hierarchy of the task. For the case of client,
> >> >> it represents the cgroup of client at the time of opening the 
> >> >> connection.
> >> >> After that client cgroup might change.
> >> >
> >> > Even if people decide that sending cgroups over a unix socket is a good
> >> > idea, this API has my NAK in the strongest possible sense, for whatever
> >> > my NAK is worth.
> >> >
> >> > IMO SO_PEERCRED is a disaster.  Calling send(2) or write(2) should
> >> > *never* imply the use of a credential.  A program should always have to
> >> > *explicitly* request use of a credential.  What you want is SCM_CGROUP.
> >> >
> >> > (I've found privilege escalations before based on this observation, and
> >> > I suspect I'll find them again.)
> >> >
> >> >
> >> > Note that I think that you really want SCM_SOMETHING_ELSE and not
> >> > SCM_CGROUP, but I don't know what the use case is yet.
> >>
> >> This might not be quite as awful as I thought.  At least you're
> >> looking up the cgroup at connection time instead of at send time.
> >>
> >> OTOH, this is still racy -- the socket could easily outlive the cgroup
> >> that created it.
> >
> > I think you do not understand how this whole problem space works.
> >
> > The problem is exactly the same as with SO_PEERCRED, so we are taking
> > the same proven solution.
> 
> You mean the same proven crappy solution?
> 
> >
> > Connection time is all we do and can care about.
> 
> You have not answered why.

We are going to disclose information to the peer based on policy that
depends on the cgroup the peer is part of. All we care about is who opened
the connection; if the peer wants to pass on that information after it
has obtained it, there is nothing we can do, so connection time is all we
really care about.

Simo.


Re: [PATCH] tracing: Fix array size mismatch in format string

2014-03-12 Thread Vaibhav Nagarnaik

On Wed, Mar 12, 2014 at 5:53 PM, Steven Rostedt  wrote:
> Your timing here isn't that great either, because I leave tomorrow for
> another conference, and I'm currently trying to get everything ready
> for that trip. Could you ping me again on Tuesday?

Will do.

Vaibhav

Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread Thomas Gleixner

On Thu, 13 Mar 2014, Thomas Gleixner wrote:

> On Thu, 13 Mar 2014, Rafael J. Wysocki wrote:
> > Thus follow the original idea to execute acpi_early_init() before
> > efi_enter_virtual_mode() to help the EFI people for now and we can
> > revisit the other problem that commit 73f7d1ca3263 attempted to
> > address in the future (if really necessary).
> 
> It's not necessary at all. In fact we really want to get rid of the
> arch specific cmos stuff which is an historical leftover.
> 
> I talked to John Stultz earlier today and he agrees that there are
> only a few trivial things to add to the RTC subsystem to make this
> work.
> 
> From the timekeeping POV there is absolutely no need to set the wall
> clock time early. The kernel boot phase does not care about wall time
> at all. We should have it done before we hit userspace, but not even
> that is a hard requirement.
> 
> That TAD/EFI time mess is not going to happen before that is solved.

Though there was one odd request versus randomness to take the RTC
into account very early on boot. I really can't make any sense of it.

How helpful is entropy which is added by something which can be
reevaluated by looking at the boot time? The randomness factor of the
standard 1 sec resolution is not that amazing either.

Why don't we make use of the inherent randomness of today's CPUs, which
will help ALL architectures and systems independent of early RTC
availability? Something along these lines will add a way better initial
entropy than any RTC can provide:

u64 random_init(void)
{
	u64 i = 0, tmp = SEED, t = sched_clock();
	u64 rnd = (long) t;

	for (; sched_clock() < (t + X) && i < MINLOOPS; tmp += SOMETHING, i++)
		rnd = some_useful_rng_algo(rnd, tmp, sched_clock());

	return rnd;
}

Tune X and MINLOOPS to your needs and place the call of random_init()
to a point where most sane systems have reached the point where a high
resolution sched_clock is available and the system is
preemptible. That's the case quite early in the boot process. Add a
synchronization point to finalize that before we have any serious
user.
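
A userspace approximation of the loop above, for experimentation only. Everything concrete here is an assumption: clock_ns() stands in for sched_clock(), the splitmix64 finalizer stands in for some_useful_rng_algo(), and BUDGET_NS and MAXLOOPS are arbitrary values for X and MINLOOPS. This is not a cryptographic generator; it only shows how timing jitter gets folded into the state.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define BUDGET_NS  1000000ULL   /* "X": time budget, 1 ms here (arbitrary) */
#define MAXLOOPS   4096ULL      /* loop bound, in the role of MINLOOPS */

/* Userspace stand-in for sched_clock(): monotonic time in nanoseconds. */
uint64_t clock_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* splitmix64 finalizer, used here only as a placeholder mixing step. */
uint64_t mix64(uint64_t x)
{
	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/*
 * Mix clock readings into rnd until the time budget expires or the
 * loop bound is hit. The iteration count is reported through *loops
 * (if non-NULL), since where the loop breaks out is itself variable.
 */
uint64_t random_init_sketch(uint64_t seed, uint64_t *loops)
{
	uint64_t t = clock_ns();
	uint64_t rnd = t, tmp = seed, i = 0;

	for (; clock_ns() < t + BUDGET_NS && i < MAXLOOPS;
	     tmp += 0x9e3779b97f4a7c15ULL, i++)
		rnd = mix64(rnd ^ tmp ^ clock_ns());

	if (loops)
		*loops = i;
	return rnd;
}
```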

Even if no high resolution sched clock is available at this point the
randomness of the call versus the breakout of the loop will be unique
per boot process and way better than anything we have right now all
across the architecture space.

Add some randomness injection from the sched_clock() based runtime of
the various initcalls to it and we have a way better baseline than we
have now for all of our architectures.

Enforcing some early RTC availability for the sake of randomness does
not make sense when we do not exploit better sources which are
available on all systems in the first place.

Thanks,

tglx

Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread Thomas Gleixner

On Thu, 13 Mar 2014, Rafael J. Wysocki wrote:
> Thus follow the original idea to execute acpi_early_init() before
> efi_enter_virtual_mode() to help the EFI people for now and we can
> revisit the other problem that commit 73f7d1ca3263 attempted to
> address in the future (if really necessary).

It's not necessary at all. In fact we really want to get rid of the
arch specific cmos stuff which is an historical leftover.

I talked to John Stultz earlier today and he agrees that there are
only a few trivial things to add to the RTC subsystem to make this
work.

From the timekeeping POV there is absolutely no need to set the wall
clock time early. The kernel boot phase does not care about wall time
at all. We should have it done before we hit userspace, but not even
that is a hard requirement.

That TAD/EFI time mess is not going to happen before that is solved.

Thanks,

tglx

Re: [PATCH] tracing: Fix array size mismatch in format string

2014-03-12 Thread Steven Rostedt

On Wed, 12 Mar 2014 17:16:20 -0700
Vaibhav Nagarnaik  wrote:

> Hi Steven,
> 
> Any chance you can take a look at this patch?
> 

Grumble, I thought I pulled this in already. I may have been working on
it and then got distracted by my day job, and it got lost in the
shuffle.

Your timing here isn't that great either, because I leave tomorrow for
another conference, and I'm currently trying to get everything ready
for that trip. Could you ping me again on Tuesday?

Thanks,

-- Steve

linux-next: manual merge of the staging tree with the tree

2014-03-12 Thread Mark Brown

Hi Greg,

Today's linux-next merge of the staging tree got a conflict in
drivers/media/v4l2-core/v4l2-of.c between commit b9db140c1e4644d
("[media] v4l: of: Support empty port nodes") from the v4l tree and
commit fd9fdb78a9bf ("[media] of: move graph helpers from
drivers/media/v4l2-core to drivers/of") from the staging tree.

I fixed it up by essentially dropping the support for empty port nodes
since there were more context differences than I was comfortable with
in the changes in the new code.



Re: [for-next][PATCH 08/20] tracing: Warn if a tracepoint is not set via debugfs

2014-03-12 Thread Steven Rostedt

On Wed, 12 Mar 2014 19:51:01 +0000 (UTC)
Mathieu Desnoyers  wrote:

> This only leaves tracepoints in header files and the impact of LTO as
> requirements for having tracepoint callsites with the same name across
> modules.

The only thing that needs to be unique is the struct tracepoint
__tracepoint_##name. There should not be any duplicates of those. I
can't see how the LTO would duplicate a data structure without screwing
everything (not just tracepoints) up.

We can still have more than one trace_##name() called, as that is
handled by the static key.
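
A toy userspace model of that split, assuming nothing about the real tracepoint macros beyond the naming pattern mentioned above: one data structure per name, any number of call sites. The struct layout, the _SKETCH macros and the boolean standing in for the static key are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct tracepoint. */
struct tracepoint_sketch {
	const char *name;
	bool enabled;          /* plays the role of the static key */
	unsigned long hits;
};

/* Exactly one definition per tracepoint name, in one translation unit. */
#define DEFINE_TRACE_SKETCH(name) \
	struct tracepoint_sketch __tracepoint_##name = { #name, false, 0 }

/* Usable from many places; every call site only tests the shared key. */
#define DECLARE_TRACE_SKETCH(name)                          \
	extern struct tracepoint_sketch __tracepoint_##name; \
	static inline void trace_##name(void)               \
	{                                                   \
		if (__tracepoint_##name.enabled)            \
			__tracepoint_##name.hits++;         \
	}

DEFINE_TRACE_SKETCH(sched_demo);
DECLARE_TRACE_SKETCH(sched_demo)

/* Two independent call sites sharing the single __tracepoint_sched_demo. */
void call_site_a(void) { trace_sched_demo(); }
void call_site_b(void) { trace_sched_demo(); }
```

Duplicating DEFINE_TRACE_SKETCH(sched_demo) in a second file would produce a duplicate-symbol link error, which is the uniqueness requirement described above.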

Note, I'm scrambling to get ready for my trip tomorrow. Thus, I'm not
as much at the computer. I may work on some patches in my 6 hour
layover though.

-- Steve

Re: [RESEND PATCH] slub: fix high order page allocation problem with __GFP_NOFAIL

2014-03-12 Thread David Rientjes

On Wed, 12 Mar 2014, Joonsoo Kim wrote:

> SLUB already try to allocate high order page with clearing __GFP_NOFAIL.
> But, when allocating shadow page for kmemcheck, it missed clearing
> the flag. This trigger WARN_ON_ONCE() reported by Christian Casteyde.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=65991
> https://lkml.org/lkml/2013/12/3/764
> 
> This patch fix this situation by using same allocation flag as original
> allocation.
> 
> Reported-by: Christian Casteyde 
> Signed-off-by: Joonsoo Kim 

Acked-by: David Rientjes 
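
For readers unfamiliar with the flag handling, here is a minimal sketch of the masking the commit message refers to. The SK_GFP_* bit values are illustrative only and do not match the kernel's real __GFP_* values; opportunistic_flags() is a hypothetical helper name.

```c
#include <assert.h>

typedef unsigned int gfp_sketch_t;

/* Illustrative bit values only; the kernel's real values differ. */
#define SK_GFP_NOWARN   0x1u
#define SK_GFP_NORETRY  0x2u
#define SK_GFP_NOFAIL   0x4u
#define SK_GFP_KERNEL   0x8u

/*
 * Flags for an opportunistic high-order attempt: do not warn, do not
 * retry hard, and never insist on success, because the caller can
 * still fall back to a smaller order.
 */
gfp_sketch_t opportunistic_flags(gfp_sketch_t flags)
{
	return (flags | SK_GFP_NOWARN | SK_GFP_NORETRY) & ~SK_GFP_NOFAIL;
}
```

The fix described above amounts to passing these already-masked flags to the kmemcheck shadow allocation as well, instead of the caller's original flags.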

Re: [PATCH v2 2/2] kallsyms: handle special absolute symbols

2014-03-12 Thread Kees Cook

On Wed, Mar 12, 2014 at 5:30 PM, Rusty Russell  wrote:
> Kees Cook  writes:
>> On Fri, Mar 7, 2014 at 5:00 PM, Kees Cook  wrote:
>>> This forces the entire per_cpu range to be reported as absolute
>>> without losing their linker symbol types, when the per_cpu area is
>>> 0-based. Without this, the variables are incorrectly shown as relocated
>>> under kASLR on x86_64.
>>>
>>> Several kallsyms output in different boot states for comparison of
>>> various symbols:
>>>
>>> $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.nokaslr
>>> 0000000000000000 D __per_cpu_start
>>> 0000000000004000 D gdt_page
>>> 0000000000014280 D __per_cpu_end
>>> ffffffff810001c8 T _stext
>>> ffffffff81ee53c0 D __per_cpu_offset
>>> $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.kaslr1
>>> 000000001f200000 D __per_cpu_start
>>> 000000001f204000 D gdt_page
>>> 000000001f214280 D __per_cpu_end
>>> ffffffffa02001c8 T _stext
>>> ffffffffa10e53c0 D __per_cpu_offset
>>> $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.kaslr2
>>> 000000000d400000 D __per_cpu_start
>>> 000000000d404000 D gdt_page
>>> 000000000d414280 D __per_cpu_end
>>> ffffffff8e4001c8 T _stext
>>> ffffffff8f2e53c0 D __per_cpu_offset
>>> $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.kaslr-fixed
>>> 0000000000000000 D __per_cpu_start
>>> 0000000000004000 D gdt_page
>>> 0000000000014280 D __per_cpu_end
>>> ffffffffadc001c8 T _stext
>>> ffffffffaeae53c0 D __per_cpu_offset
>>>
>>> Signed-off-by: Kees Cook 
>>> ---
>>> v2:
>>>  - only force absolute when per_cpu starts at 0.
>>> ---
>>>  scripts/kallsyms.c |   20 +++-
>>>  1 file changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
>>> index 08f30ac5b07d..d3f93b8eb277 100644
>>> --- a/scripts/kallsyms.c
>>> +++ b/scripts/kallsyms.c
>>> @@ -34,6 +34,7 @@ struct sym_entry {
>>> unsigned int len;
>>> unsigned int start_pos;
>>> unsigned char *sym;
>>> +   int force_absolute;
>>>  };
>>>
>>>  struct addr_range {
>>> @@ -51,6 +52,14 @@ static struct addr_range text_ranges[] = {
>>>  #define text_range_text (&text_ranges[0])
>>>  #define text_range_inittext (&text_ranges[1])
>>>
>>> +/*
>>> + * Variables in these ranges, when the start is 0 based, will be forced to
>>> + * be handled as absolute addresses.
>>> + */
>>> +static struct addr_range abs_ranges[] = {
>>> +   { "__per_cpu_start","__per_cpu_end", -1ULL, 0 },
>>> +};
>>> +
>>>  static struct sym_entry *table;
>>>  static unsigned int table_size, table_cnt;
>>>  static int all_symbols = 0;
>>> @@ -165,6 +174,10 @@ static int read_symbol(FILE *in, struct sym_entry *s)
>>> }
>>> strcpy((char *)s->sym + 1, str);
>>> s->sym[0] = stype;
>>> +   s->force_absolute = 0;
>>> +
>>> +   /* Check if we've found special absolute symbol range. */
>>> +   check_symbol_range(sym, s->addr, abs_ranges, 
>>> ARRAY_SIZE(abs_ranges));
>>>
>>> return 0;
>>>  }
>>> @@ -211,6 +224,11 @@ static int symbol_valid(struct sym_entry *s)
>>> if (s->addr < kernel_start_addr)
>>> return 0;
>>>
>>> +   /* Force zero-based range special symbols into being absolute. */
>>> +   i = symbol_in_range(s, abs_ranges, ARRAY_SIZE(abs_ranges));
>>> +   if (i >= 0 && abs_ranges[i].start == 0)
>>> +   s->force_absolute = 1;
>>
>> Rusty, is this 0-detection workable for you? If so, should you or akpm
>> carry this series for 3.15?
>
> Damn, sorry, I wrote this patch and seems like I didn't actually send it
> out.  No wonder you didn't respond :)

Ah-ha, I was wondering. :)

> This applies on top of your first cleanup patch:
>
> kallsyms: fix percpu vars on x86-64 with relocation.
>
> x86-64 has a problem: per-cpu variables are actually represented by
> their absolute offsets within the per-cpu area, but the symbols are
> not emitted as absolute.  Thus kallsyms naively creates them as offsets
> from _text, meaning their values change if the kernel is relocated
> (especially noticeable with CONFIG_RANDOMIZE_BASE):
>
>  $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.nokaslr
>  0000000000000000 D __per_cpu_start
>  0000000000004000 D gdt_page
>  0000000000014280 D __per_cpu_end
>  ffffffff810001c8 T _stext
>  ffffffff81ee53c0 D __per_cpu_offset
>  $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.kaslr1
>  000000001f200000 D __per_cpu_start
>  000000001f204000 D gdt_page
>  000000001f214280 D __per_cpu_end
>  ffffffffa02001c8 T _stext
>  ffffffffa10e53c0 D __per_cpu_offset
>
> Making them absolute symbols is the Right Thing, but requires fixes to
> the relocs tool.  So for the moment, we add a --absolute-percpu option
> which makes them absolute from a kallsyms perspective:

Why not just do this with 0-base-address detection like my v2? That
would mean we don't need to remember to add this flag in the future to
imagined new architectures that might want this 0-based per_cpu
feature.
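
The 0-base detection being argued for can be sketched in isolation. This is a simplified model, not the scripts/kallsyms.c code: the _sketch struct names, range_index() and mark_if_zero_based() are invented for the example, and the range bounds are passed in directly instead of being discovered from the __per_cpu_start/__per_cpu_end symbols.

```c
#include <assert.h>
#include <stddef.h>

struct addr_range_sketch {
	unsigned long long start, end;
};

struct sym_sketch {
	unsigned long long addr;
	int force_absolute;
};

/* Index of the first range containing addr, or -1 if none does. */
int range_index(unsigned long long addr,
		const struct addr_range_sketch *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= r[i].start && addr <= r[i].end)
			return (int)i;
	return -1;
}

/* Mark a symbol absolute only when its enclosing range is 0-based. */
void mark_if_zero_based(struct sym_sketch *s,
			const struct addr_range_sketch *r, size_t n)
{
	int i = range_index(s->addr, r, n);

	s->force_absolute = (i >= 0 && r[i].start == 0);
}
```

The design point is that the decision falls out of the symbol table itself, so an architecture with a 0-based per-cpu area gets the behavior without anyone passing an extra command-line flag.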

-Kees

>
>  $ egrep '

Re: linux-next: build failure after merge of the driver-core tree

2014-03-12 Thread Benjamin Herrenschmidt

On Wed, 2014-03-12 at 16:21 -0400, Tejun Heo wrote:
> It's a series of rather complex patches.  I really don't think
> duplicating them is a good idea.  We can either resurrect the old API
> to kill it again or set up a merge branch which I don't think is too
> unusual in situations like this.

Right, a topic branch that gets merged in both driver-core-next and
powerpc-next.

Cheers,
Ben.



Re: [PATCH] fs: fix i_writecount on shmem and friends

2014-03-12 Thread Al Viro

On Wed, Mar 12, 2014 at 11:30:09PM +0100, David Herrmann wrote:
> Hi
> 
> > I think it's trying to fix the problem in the wrong place.  The bug is real,
> > all right, but it's not that alloc_file() for non-regulars doesn't grab
> > writecount; it's that drop_file_write_access() drops it for those.
> >
> > What the hell would we want to play with that counter for, anyway?  It's not
> > as if they could be mmapped, so all it does is making pipe(2) and socket(2)
> > more costly, for no visible reason.
> 
> Please see:
>   shmem_zero_setup()
> shmem_file_setup()
>   __shmem_file_setup()
> alloc_file()
> 
> shmem_zero_setup() is used by /dev/zero (drivers/char/mem.c) and
> mmap(MAP_ANON). So yes, we do call mmap() on these files.

We do, but how do you get anything to even attempt deny_write_access() on
those?  And what would the semantics of that be, anyway?

> I also disagree on "for no visible reason". The reason to do this is
> uniformity. We make i_writecount work on all inodes regardless how
> they got created. Breaking consistent behavior just to save an
> atomic_inc_unless_negative() in the alloc_file() path seems
> unreasonable to me.

i_writecount is a hack used to implement MAP_DENYWRITE (ignored since way
back) and to provide historical execve() behaviour (note, BTW, that it's
not consistent - e.g. it does apply to binary itself, but not to shared
libraries; scripts don't get much protection either - if the sucker was
opened for write during execve(), you get ETXTBSY, but as soon as execve()
has opened the interpreter, script itself can be opened for write just fine).

Maintaining it for pipes and sockets is just plain nuts.
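
For reference, the counter semantics under discussion can be modeled as follows: a positive count means writers exist, a negative count means writes are denied, and each side fails with -ETXTBSY when the other holds the inode. This is a single-threaded sketch with made-up _sketch names; the kernel uses atomic operations on inode->i_writecount.

```c
#include <assert.h>
#include <errno.h>

/* Minimal stand-in for the one inode field that matters here. */
struct inode_sketch {
	int writecount;   /* > 0: writers present, < 0: writes denied */
};

/* Called when opening for write; fails while writes are denied. */
int get_write_access_sketch(struct inode_sketch *inode)
{
	if (inode->writecount < 0)
		return -ETXTBSY;
	inode->writecount++;
	return 0;
}

/* Drop one writer reference. */
void put_write_access_sketch(struct inode_sketch *inode)
{
	inode->writecount--;
}

/* Called by execve() on the binary; fails while writers exist. */
int deny_write_access_sketch(struct inode_sketch *inode)
{
	if (inode->writecount > 0)
		return -ETXTBSY;
	inode->writecount--;
	return 0;
}

/* Undo one deny_write_access_sketch(). */
void allow_write_access_sketch(struct inode_sketch *inode)
{
	inode->writecount++;
}
```

Nothing ever calls the deny side on a pipe or socket inode, which is why bumping the counter for them buys nothing.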

> Anyhow, I haven't found any bug if we follow your recommendation.
> Every path using alloc_file() either prevents write access on the
> underlying inode (eg., anon-inode) or prevents user-space from getting
> an FD (all the shmem_file_setup() paths). So it's up to you. Feel free
> to fix it yourself, otherwise I will send a second patch following
> your idea tomorrow.

We want the same in __get_file_write_access() as well (i.e. doing nothing
if the file isn't regular).

file_take_write(), f_mnt_write_state and the rest of CONFIG_DEBUG_WRITECOUNT
stuff should probably just disappear.

TBH, I'm none too happy about all those if (!special_file()) in that logic.
I would rather have do_dentry_open() do something like

	if ((f->f_mode & FMODE_WRITE) && !special_file(inode->i_mode)) {
		error = get_write_access(inode);
		if (error)
			goto cleanup_file;
		error = __mnt_want_write(f->f_path.mnt);
		if (error) {
			put_write_access(inode);
			goto cleanup_file;
		}
		f->f_mode |= FMODE_WRITER;
	}

cleanup_all:
	fops_put(f->f_op);
	if (f->f_mode & FMODE_WRITER) {
		put_write_access(inode);
		__mnt_drop_write(f->f_path.mnt);
	}
cleanup_file:

fput() would just do

	if (file->f_mode & FMODE_WRITER) {
		put_write_access(inode);
		__mnt_drop_write(mnt);
	}

and alloc_file() would stop playing these games:
	/*
	 * These mounts don't really matter in practice
	 * for r/o bind mounts.  They aren't userspace-
	 * visible.  We do this for consistency, and so
	 * that we can do debugging checks at __fput()
	 */
	if ((mode & FMODE_WRITE) &&
	    !special_file(path->dentry->d_inode->i_mode)) {
		file_take_write(file);
		WARN_ON(mnt_clone_write(path->mnt));
	}

And to hell with __get_file_write_access()/drop_file_write_access().
IOW, FMODE_WRITER == "we have bumped i_writecount and writer count on
vfsmount".  Makes life much simpler...

IOW, how about the following (completely untested):

diff --git a/fs/file_table.c b/fs/file_table.c
index 5b24008..ce1504f 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -52,7 +52,6 @@ static void file_free_rcu(struct rcu_head *head)
 static inline void file_free(struct file *f)
 {
 	percpu_counter_dec(&nr_files);
-	file_check_state(f);
 	call_rcu(&f->f_u.fu_rcuhead, file_free_rcu);
 }

@@ -178,47 +177,12 @@ struct file *alloc_file(struct path *path, fmode_t mode,
file->f_mapping = path->dentry->d_inode->i_mapping;
file->f_mode = mode;
file->f_op = fop;
-
-   /*
-* These mounts don't really matter in practice
-* for r/o bind mounts.  They aren't userspace-
-* visible.  We do this for consistency, and so
-* that we can do debugging checks at __fput()
-*/
-   if ((mode & FMODE_WRITE) && 
!special_file(path->dentry->d_inode->i_mode)) {
-   file_take_write(file);
-   WARN_ON(mnt_clone_write(path->mnt));
-   }
if ((mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)

Re: [PATCH 2/2] mm: Changed pr_warning() to pr_warn()

2014-03-12 Thread Choi

Okay, I'll practice on files in drivers/staging.
Thank you for your help :)

Re: [PATCH v2 2/2] kallsyms: handle special absolute symbols

2014-03-12 Thread Rusty Russell

Kees Cook  writes:
> On Fri, Mar 7, 2014 at 5:00 PM, Kees Cook  wrote:
>> This forces the entire per_cpu range to be reported as absolute
>> without losing their linker symbol types, when the per_cpu area is
>> 0-based. Without this, the variables are incorrectly shown as relocated
>> under kASLR on x86_64.
>>
>> Several kallsyms output in different boot states for comparison of
>> various symbols:
>>
>> $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.nokaslr
>> 0000000000000000 D __per_cpu_start
>> 0000000000004000 D gdt_page
>> 0000000000014280 D __per_cpu_end
>> ffffffff810001c8 T _stext
>> ffffffff81ee53c0 D __per_cpu_offset
>> $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.kaslr1
>> 000000001f200000 D __per_cpu_start
>> 000000001f204000 D gdt_page
>> 000000001f214280 D __per_cpu_end
>> ffffffffa02001c8 T _stext
>> ffffffffa10e53c0 D __per_cpu_offset
>> $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.kaslr2
>> 000000000d400000 D __per_cpu_start
>> 000000000d404000 D gdt_page
>> 000000000d414280 D __per_cpu_end
>> ffffffff8e4001c8 T _stext
>> ffffffff8f2e53c0 D __per_cpu_offset
>> $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.kaslr-fixed
>> 0000000000000000 D __per_cpu_start
>> 0000000000004000 D gdt_page
>> 0000000000014280 D __per_cpu_end
>> ffffffffadc001c8 T _stext
>> ffffffffaeae53c0 D __per_cpu_offset
>>
>> Signed-off-by: Kees Cook 
>> ---
>> v2:
>>  - only force absolute when per_cpu starts at 0.
>> ---
>>  scripts/kallsyms.c |   20 +++-
>>  1 file changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
>> index 08f30ac5b07d..d3f93b8eb277 100644
>> --- a/scripts/kallsyms.c
>> +++ b/scripts/kallsyms.c
>> @@ -34,6 +34,7 @@ struct sym_entry {
>> unsigned int len;
>> unsigned int start_pos;
>> unsigned char *sym;
>> +   int force_absolute;
>>  };
>>
>>  struct addr_range {
>> @@ -51,6 +52,14 @@ static struct addr_range text_ranges[] = {
>>  #define text_range_text (&text_ranges[0])
>>  #define text_range_inittext (&text_ranges[1])
>>
>> +/*
>> + * Variables in these ranges, when the start is 0 based, will be forced to
>> + * be handled as absolute addresses.
>> + */
>> +static struct addr_range abs_ranges[] = {
>> +   { "__per_cpu_start","__per_cpu_end", -1ULL, 0 },
>> +};
>> +
>>  static struct sym_entry *table;
>>  static unsigned int table_size, table_cnt;
>>  static int all_symbols = 0;
>> @@ -165,6 +174,10 @@ static int read_symbol(FILE *in, struct sym_entry *s)
>> }
>> strcpy((char *)s->sym + 1, str);
>> s->sym[0] = stype;
>> +   s->force_absolute = 0;
>> +
>> +   /* Check if we've found special absolute symbol range. */
>> +   check_symbol_range(sym, s->addr, abs_ranges, ARRAY_SIZE(abs_ranges));
>>
>> return 0;
>>  }
>> @@ -211,6 +224,11 @@ static int symbol_valid(struct sym_entry *s)
>> if (s->addr < kernel_start_addr)
>> return 0;
>>
>> +   /* Force zero-based range special symbols into being absolute. */
>> +   i = symbol_in_range(s, abs_ranges, ARRAY_SIZE(abs_ranges));
>> +   if (i >= 0 && abs_ranges[i].start == 0)
>> +   s->force_absolute = 1;
>
> Rusty, is this 0-detection workable for you? If so, should you or akpm
> carry this series for 3.15?

Damn, sorry, I wrote this patch and seems like I didn't actually send it
out.  No wonder you didn't respond :)

This applies on top of your first cleanup patch:

kallsyms: fix percpu vars on x86-64 with relocation.

x86-64 has a problem: per-cpu variables are actually represented by
their absolute offsets within the per-cpu area, but the symbols are
not emitted as absolute.  Thus kallsyms naively creates them as offsets
from _text, meaning their values change if the kernel is relocated
(especially noticeable with CONFIG_RANDOMIZE_BASE):

 $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.nokaslr
  D __per_cpu_start
 4000 D gdt_page
 00014280 D __per_cpu_end
 810001c8 T _stext
 81ee53c0 D __per_cpu_offset
 $ egrep ' (gdt_|_(stext|_per_cpu_))' /root/kallsyms.kaslr1
 1f20 D __per_cpu_start
 1f204000 D gdt_page
 1f214280 D __per_cpu_end
 a02001c8 T _stext
 a10e53c0 D __per_cpu_offset

Making them absolute symbols is the Right Thing, but requires fixes to
the relocs tool.  So for the moment, we add a --absolute-percpu option
which makes them absolute from a kallsyms perspective:

 $ egrep ' (gdt_|_(stext|_per_cpu_))' /proc/kallsyms # no KASLR
  A __per_cpu_start
 a000 A gdt_page
 00013040 A __per_cpu_end
 802001c8 T _stext
 8099b180 D __per_cpu_offset
 809a3000 D __per_cpu_load
 $ egrep ' (gdt_|_(stext|_per_cpu_))' /proc/kallsyms # With KASLR
  A __per_cpu_start
 a000 A gdt_page
 00013040 A __per_cpu_end
 89c001c8 T _stext
 8a39d180 D
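
The effect described in the changelog above can be modelled in a few lines
of userspace C. This is only a sketch: struct sym_model and reported_addr()
are hypothetical names, not the real kallsyms data layout, and the full
64-bit _stext value in the usage note is an assumption, since the listings
above appear to have lost their leading digits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a symbol table entry; not the real kallsyms layout. */
struct sym_model {
	const char *name;
	uint64_t link_addr;  /* value assigned at link time */
	bool absolute;       /* true: value must not move with the kernel image */
};

/* Value reported for a symbol after the image is shifted by `slide` bytes. */
uint64_t reported_addr(const struct sym_model *s, uint64_t slide)
{
	/* Absolute symbols (e.g. per-cpu offsets) keep their link-time value. */
	return s->absolute ? s->link_addr : s->link_addr + slide;
}
```

With a slide of 0x1f200000, a non-absolute gdt_page at 0x4000 is reported as
0x1f204000, which matches the kallsyms.kaslr1 listing; marked absolute, it
stays at 0x4000 regardless of where the kernel lands.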

RE: [RFC PATCH] mmc: core: Invoke sdio func driver's PM callbacks from the sdio bus

2014-03-12 Thread Dong, Chuanxiao



> -Original Message-
> From: Ulf Hansson [mailto:ulf.hans...@linaro.org]
> Sent: Wednesday, March 12, 2014 8:41 PM
> To: Chris Ball; Lu, Aaron; Wang, Xiaoming; Dong, Chuanxiao
> Cc: linux-...@vger.kernel.org; Liu, Chuansheng; linux-kernel@vger.kernel.org;
> NeilBrown; Rafael J. Wysocki
> Subject: Re: [RFC PATCH] mmc: core: Invoke sdio func driver's PM callbacks 
> from the
> sdio bus
> 
> On 12 March 2014 07:26, Aaron Lu  wrote:
> > On 03/12/2014 11:44 AM, Dong, Chuanxiao wrote:
> >> Hi Aaron,
> >>
> >> This patch is tested on Intel platform, and SDIO function driver's
> suspend/resume callback will only be called once, which fixed this issue. 
> Previously,
> they can be called twice.
> >>
> >> Here is the tested-by:
> >>
> >> Tested-by: xiaoming wang 
> >> Tested-by: Chuanxiao Dong 
> >
> > Thanks a lot for the test!
> >
> > -Aaron
> >
> 
> Thanks for helping out testing!
> 
> Just out of curiosity, which sdio func driver did you use (or maybe it hasn't 
> been
> upstreamed yet)?
> 
> Anyway, I suppose it's ->suspend callback don't return -ENOSYS with the
> expectation of the card to be removed?
> 
> So I assume you want this to go to stable as well, right?

Hi Uffe,

We are testing with a BRCM WiFi driver, which is not upstream. In fact,
regardless of which SDIO function driver is used, if the SDIO host device
is an ACPI device that has the SDIO device as its child, the SDIO function
driver's suspend callback will be called twice.

And you are right, no -ENOSYS is returned if the card is removed. It would
be better for this patch to go to stable as well.

Thanks
Chuanxiao

> 
> Kind regards
> Uffe
> 
> >>
> >> Thanks
> >> Chuanxiao
> >>
> >>> -Original Message-
> >>> From: Lu, Aaron
> >>> Sent: Wednesday, March 12, 2014 10:36 AM
> >>> To: Ulf Hansson; linux-...@vger.kernel.org; Chris Ball; Liu,
> >>> Chuansheng; Dong, Chuanxiao
> >>> Cc: linux-kernel@vger.kernel.org; NeilBrown; Rafael J. Wysocki
> >>> Subject: Re: [RFC PATCH] mmc: core: Invoke sdio func driver's PM
> >>> callbacks from the sdio bus
> >>>
> >>> Hi Chuansheng & Chuanxiao,
> >>>
> >>> Can you please help us testing this patch on your platform and let
> >>> us know the test result? Thanks.
> >>>
> >>> -Aaron
> >>>
> >>> On 02/28/2014 07:49 PM, Ulf Hansson wrote:
>  The sdio func device is added to the driver model after the card
>  device.
> 
>  This means the sdio func device will be suspend before the card
>  device and thus resumed after. The consequence are the mmc core
>  don't explicity need to protect itself from receiving sdio requests
>  in suspended state. Instead that can be handled from the sdio bus,
>  which is thus invokes the PM callbacks instead of old dummy function.
> 
>  In the case were the sdio func driver don't implement the PM
>  callbacks the mmc core will in the early phase of system suspend,
>  remove the card from the driver model and thus power off it.
> 
>  Cc: Aaron Lu 
>  Cc: NeilBrown 
>  Cc: Rafael J. Wysocki 
>  Signed-off-by: Ulf Hansson 
>  ---
> 
>  Note, this patch has only been compile tested. Would appreciate if
>  some with SDIO and a sdio func driver could help out to test this.
>  Especially the libertas driver would be nice.
> 
>  ---
>   drivers/mmc/core/sdio.c |   45 
>  ---
>   drivers/mmc/core/sdio_bus.c |   14 +-
>   2 files changed, 5 insertions(+), 54 deletions(-)
> 
>  diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c
>  index
>  4d721c6..9933e42 100644
>  --- a/drivers/mmc/core/sdio.c
>  +++ b/drivers/mmc/core/sdio.c
>  @@ -943,40 +943,21 @@ static int mmc_sdio_pre_suspend(struct
>  mmc_host
> >>> *host)
>    */
>   static int mmc_sdio_suspend(struct mmc_host *host)  {
>  -   int i, err = 0;
>  -
>  -   for (i = 0; i < host->card->sdio_funcs; i++) {
>  -   struct sdio_func *func = host->card->sdio_func[i];
>  -   if (func && sdio_func_present(func) && func->dev.driver) {
>  -   const struct dev_pm_ops *pmops =
> func->dev.driver->pm;
>  -   err = pmops->suspend(&func->dev);
>  -   if (err)
>  -   break;
>  -   }
>  -   }
>  -   while (err && --i >= 0) {
>  -   struct sdio_func *func = host->card->sdio_func[i];
>  -   if (func && sdio_func_present(func) && func->dev.driver) {
>  -   const struct dev_pm_ops *pmops =
> func->dev.driver->pm;
>  -   pmops->resume(&func->dev);
>  -   }
>  -   }
>  -
>  -   if (!err && mmc_card_keep_power(host) &&
> mmc_card_wake_sdio_irq(host))
> >>> {
>  +   if (mmc_card_keep_power(host) && mmc_card_wake_sdio_irq(host))
>  + {
>  mmc_claim_host(host);
>

Re: [PATCH] tracing: Fix array size mismatch in format string

2014-03-12 Thread Vaibhav Nagarnaik

Hi Steven,

Any chance you can take a look at this patch?


Thanks

Vaibhav


On Thu, Feb 13, 2014 at 7:51 PM, Vaibhav Nagarnaik
 wrote:
> In event format strings, the array size is reported in two locations.
> One in array subscript and then via the "size:" attribute. The values
> reported there have a mismatch.
>
> For e.g., in sched:sched_switch the prev_comm and next_comm character
> arrays have subscript values as [32] where as the actual field size is
> 16.
>
> name: sched_switch
> ID: 301
> format:
> field:unsigned short common_type;          offset:0;   size:2;   signed:0;
> field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
> field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
> field:int common_pid;                      offset:4;   size:4;   signed:1;
>
> field:char prev_comm[32];                  offset:8;   size:16;  signed:1;
> field:pid_t prev_pid;                      offset:24;  size:4;   signed:1;
> field:int prev_prio;                       offset:28;  size:4;   signed:1;
> field:long prev_state;                     offset:32;  size:8;   signed:1;
> field:char next_comm[32];                  offset:40;  size:16;  signed:1;
> field:pid_t next_pid;                      offset:56;  size:4;   signed:1;
> field:int next_prio;                       offset:60;  size:4;   signed:1;
>
> After bisection, the following commit was blamed:
> 92edca0 tracing: Use direct field, type and system names
>
> This commit removes the duplication of strings for field->name and
> field->type assuming that all the strings passed in
> __trace_define_field() are immutable. This is not true for arrays, where
> the type string is created in event_storage variable and field->type for
> all array fields points to event_storage.
>
> Use __stringify() to create a string constant for the type string.
>
> Also, get rid of event_storage and event_storage_mutex that are not
> needed anymore.
>
> Signed-off-by: Vaibhav Nagarnaik 
> ---
> * Fix warning from previous version about mixed declarations.
>
>  include/linux/ftrace_event.h | 4 
>  include/trace/ftrace.h   | 7 ++-
>  kernel/trace/trace_events.c  | 6 --
>  kernel/trace/trace_export.c  | 7 ++-
>  4 files changed, 4 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 4e4cc28..4cdb3a1 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -495,10 +495,6 @@ enum {
> FILTER_TRACE_FN,
>  };
>
> -#define EVENT_STORAGE_SIZE 128
> -extern struct mutex event_storage_mutex;
> -extern char event_storage[EVENT_STORAGE_SIZE];
> -
>  extern int trace_event_raw_init(struct ftrace_event_call *call);
>  extern int trace_define_field(struct ftrace_event_call *call, const char 
> *type,
>   const char *name, int offset, int size,
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 1a8b28d..1ee19a2 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -310,15 +310,12 @@ static struct trace_event_functions 
> ftrace_event_type_funcs_##call = {\
>  #undef __array
>  #define __array(type, item, len)   \
> do {\
> -   mutex_lock(&event_storage_mutex);   \
> +   char *type_str = #type"["__stringify(len)"]";   \
> BUILD_BUG_ON(len > MAX_FILTER_STR_VAL); \
> -   snprintf(event_storage, sizeof(event_storage),  \
> -"%s[%d]", #type, len); \
> -   ret = trace_define_field(event_call, event_storage, #item, \
> +   ret = trace_define_field(event_call, type_str, #item,   \
>  offsetof(typeof(field), item), \
>  sizeof(field.item),\
>  is_signed_type(type), FILTER_OTHER);   \
> -   mutex_unlock(&event_storage_mutex); \
> if (ret)\
> return ret; \
> } while (0);
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index e71ffd4..22826c7 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -27,12 +27,6 @@
>
>  DEFINE_MUTEX(event_mutex);
>
> -DEFINE_MUTEX(event_storage_mutex);
> -EXPORT_SYMBOL_GPL(event_storage_mutex);
> -
> -char event_storage[EVENT_STORAGE_SIZE];
> -EXPORT_SYMBOL_GPL(event_storage);
> -
>  LIST_HEAD(ftrace_events);
>  static LIST_HEAD(ftrace_common_fields);
>
> diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
> index 7c3e3e7..ee0a509 100644
> --- a/kernel/trace/trace_export.c
> +++ b/kernel/trace/trace_export.c
> @@ -95,15
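
The aliasing problem described in the changelog above can be reproduced
outside the kernel. The sketch below is an illustration only:
shared_storage, type_str_shared(), STR() and ARRAY_TYPE_STR() are
hypothetical stand-ins for event_storage, the old snprintf() path,
__stringify() and the new type_str expression.

```c
#include <stdio.h>
#include <string.h>

#define STR_1(x) #x
#define STR(x) STR_1(x)   /* stand-in for the kernel's __stringify() */

/* Builds "type[len]" at compile time; every use is its own string literal. */
#define ARRAY_TYPE_STR(type, len) #type "[" STR(len) "]"

static char shared_storage[128];   /* stand-in for event_storage */

/* Old pattern: every caller receives a pointer to the same buffer, so a
 * stored pointer ends up showing whatever was formatted last. */
const char *type_str_shared(const char *type, int len)
{
	snprintf(shared_storage, sizeof(shared_storage), "%s[%d]", type, len);
	return shared_storage;
}
```

If a field keeps the pointer returned for ("char", 16) and a later field
asks for ("char", 32), both fields read back "char[32]", which would
explain a subscript that disagrees with size:16. ARRAY_TYPE_STR(char, 16)
yields a separate constant "char[16]" that nothing can overwrite.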

Re: SuSE O_DIRECT|O_NONBLOCK overload

2014-03-12 Thread NeilBrown

On Wed, 12 Mar 2014 04:00:15 -0700 Christoph Hellwig 
wrote:

> The SLES12 tree has various patches to implement special
> O_DIRECT|O_NONBLOCK semantics for block devices:
> 
>   
> https://gitorious.org/opensuse/kernel-source/source/806eab3e4b02e798c1ae942440051f81c822ca35:patches.suse/block-nonblock-causes-failfast
> 
> this seems genuinely useful and I'd be really happy if people would do
> this work upstream for two reasons:
> 
>  a) implementing different semantics only in a vendor kernel is a
> nightmare.  No proper way to document it in the man pages for
> example, and silent breakage of applications that expect it to be
> present, or even more nasty not present.
>  b) Which brings us to: we had various issues with adding O_NONBLOCK to
> files that didn't support it before.  How well was this whole feature
> tested?

This "feature" was really just a hack because a particular customer needed
something in a particular situation.

At the core of this in my thinking is the 'failfast' BIO flag ... or 'flags'
really because there are now three of them.

They don't seem to be documented or uniformly supported or used much at
all.  dm-multipath uses one, and btrfs uses another.  There could be value in
using one or more or something in md but as they aren't documented and could
mean almost anything I have stayed away.
I tried adding some sort of 'failfast' support to md once and I would get
occasional failures from regular sata devices which otherwise appeared to be
working perfectly well.  So it seemed that "fast" was altogether *too* fast.

For a particular customer with some particular hardware there were issues
where that hardware could choose not to respond for extended periods.  So we
modified the driver to accept a 'timeout' module parameter and to cause
REQ_FAILFAST_DEV (I think) requests to fail with -ETIMEDOUT if they could not
be serviced in that time.

We then modified md to cope with that particular well-defined semantic. And
hacked "O_NONBLOCK" support in so that mdadm could access the device without
the risk of hanging indefinitely.

I would be happy to bring at least some of this functionality into mainline,
but I would need a "FAILFAST" flag that actually meant something useful and
was sufficiently well documented so that if some driver got it wrong, I would
be justified in blaming the driver for not meeting the expectations that I
encoded into md.

I think that the FAILFAST flag that I need would do some error recovery but
would be time limited.  Maybe a software TLER (Time Limited Error Recovery).

I also think there should probably be just one FAILFAST flag.  Whether it was
the DEV or the TRANSPORT or the DRIVER that failed could be returned in the
error code for any caller that cared.  But as I don't know why the one became
three I could well be missing something important.

As for testing, only basic "does it function as expected" testing.
Part of the reason for only modifying O_NONBLOCK behaviour where O_DIRECT was
also set was to make it extremely unlikely that any code would use this
feature except code that specifically needed it.

NeilBrown
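
The "software TLER" idea in the message above (keep attempting recovery,
but only within a time budget) can be sketched as a small userspace
simulation. Everything here is hypothetical: struct sim_dev,
sim_attempt() and submit_time_limited() are invented for illustration and
do not correspond to any block-layer or md interface.

```c
#include <errno.h>

/* Simulated device with its own clock, so the sketch is deterministic. */
struct sim_dev {
	long now_ms;           /* simulated current time */
	long attempt_cost_ms;  /* time consumed by each attempt */
	int failures_left;     /* attempts that fail before one succeeds */
};

/* One I/O attempt: advances the clock, fails while failures remain. */
static int sim_attempt(struct sim_dev *d)
{
	d->now_ms += d->attempt_cost_ms;
	if (d->failures_left > 0) {
		d->failures_left--;
		return -EIO;
	}
	return 0;
}

/* Retry until success or until the time budget is used up. */
int submit_time_limited(struct sim_dev *d, long budget_ms)
{
	long deadline = d->now_ms + budget_ms;

	do {
		if (sim_attempt(d) == 0)
			return 0;
	} while (d->now_ms < deadline);

	return -ETIMEDOUT;   /* recovery was attempted, but gave up on time */
}
```

The point of the sketch is the contract rather than the mechanism: the
caller is told "failed within N milliseconds" instead of waiting
indefinitely, which is the kind of well-defined semantic the message asks
for.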


Re: [PATCH v6 3/3] arm64: Add architecture support for PCI

2014-03-12 Thread Liviu Dudau

On Wed, Mar 12, 2014 at 05:41:33PM +0900, Jingoo Han wrote:
> On Wednesday, March 05, 2014 8:49 PM, Liviu Dudau wrote:
> > 
> > Use the generic host bridge functions to provide support for
> > PCI Express on arm64. There is no support for ISA memory.
> > 
> > Signed-off-by: Liviu Dudau 
> > Tested-by: Tanmay Inamdar 
> > ---
> >  arch/arm64/Kconfig|  19 +++-
> >  arch/arm64/include/asm/Kbuild |   1 +
> >  arch/arm64/include/asm/io.h   |   3 +-
> >  arch/arm64/include/asm/pci.h  |  49 +
> >  arch/arm64/kernel/Makefile|   1 +
> >  arch/arm64/kernel/pci.c   | 173 
> >  6 files changed, 244 insertions(+), 2 deletions(-)
> >  create mode 100644 arch/arm64/include/asm/pci.h
> >  create mode 100644 arch/arm64/kernel/pci.c
> 
> [.]
> 
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/pci.h
> 
> [.]
> 
> > +
> > +static inline int pci_domain_nr(struct pci_bus *bus)
> > +{
> > +   struct pci_host_bridge *bridge = find_pci_host_bridge(bus);
> > +
> > +   if (bridge)
> > +   return bridge->domain_nr;
> > +
> > +   return 0;
> > +}
> 
> Hi Liviu Dudau,
> 
> When CONFIG_PCI=n, the following build errors happen. :-(
> Would you confirm this?
> 
> In file included from include/linux/pci.h:1393:0,
>  from drivers/scsi/scsi_lib.c:19:
> arch/arm64/include/asm/pci.h:31:19: error: redefinition of 'pci_domain_nr'
>  static inline int pci_domain_nr(struct pci_bus *bus)
>^
> In file included from drivers/scsi/scsi_lib.c:19:0:
> include/linux/pci.h:1383:19: note: previous definition of 'pci_domain_nr' was 
> here
>  static inline int pci_domain_nr(struct pci_bus *bus) { return 0; }
>^
>  .

Hi Jingoo,

I confirm the build error. Sorry for missing out on this test. I've now got out
of some more pressing tasks and I will post v7 tomorrow.

Best regards,
Liviu
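
The redefinition reported above arises when a generic header supplies a
stub for the CONFIG_PCI=n case and an arch header then defines the same
inline function unconditionally. One common way to avoid that is to guard
the arch definition, as in the sketch below. This is an assumption about
the shape of a fix, not necessarily what v7 does, and struct pci_bus here
is a simplified stand-in.

```c
/* Simplified stand-in for the real structure. */
struct pci_bus {
	int domain_nr;
};

/* Generic header: stub used when PCI support is compiled out. */
#ifndef CONFIG_PCI
static inline int pci_domain_nr(struct pci_bus *bus)
{
	(void)bus;
	return 0;
}
#endif

/* Arch header: provide the real helper only when CONFIG_PCI is set,
 * so it can never collide with the stub above. */
#ifdef CONFIG_PCI
static inline int pci_domain_nr(struct pci_bus *bus)
{
	return bus ? bus->domain_nr : 0;
}
#endif
```

Compiled without CONFIG_PCI defined, only the stub is visible and the
translation unit builds; with -DCONFIG_PCI only the arch version is.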

> 
> Best regards,
> Jingoo Han
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
---
   .oooO
   (   )
\ (  Oooo.
 \_) (   )
  ) /
 (_/

 One small step
   for me ...

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: WARN_ON() and X session lost from i915 on 3.14-rc6

2014-03-12 Thread Chris Wilson

On Thu, Mar 13, 2014 at 12:30:39AM +0100, Pavel Machek wrote:
> Hi!
> 
> This cost me two half-written mails...
> 
> So far it happened once, so it may be very infrequent; but I do not
> think I have seen a similar failure from i915 before, so it may be a
> regression. Well...

It's a userspace use-after-free bug. Please file a bug on
bugs.freedesktop.org.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

linux-next: manual merge of the driver-core tree with the tip tree

2014-03-12 Thread Mark Brown

Hi Greg,

Today's linux-next merge of the driver-core tree got a conflict in 
arch/x86/Kconfig between commit b4df597ae51f ("audit: Add 
CONFIG_HAVE_ARCH_AUDITSYSCALL") from the tip tree and commit 2b9c1f03278 ("x86: 
align x86 arch with generic CPU modalias handling") from the driver-core tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

diff --cc arch/x86/Kconfig
index 81f8485bb6b7,7fab7e0b1a72..
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@@ -127,7 -127,7 +127,8 @@@ config X8
select HAVE_DEBUG_STACKOVERFLOW
select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_64
select HAVE_CC_STACKPROTECTOR
 +  select HAVE_ARCH_AUDITSYSCALL
+   select GENERIC_CPU_AUTOPROBE
  
  config INSTRUCTION_DECODER
def_bool y



Re: [PATCH] watchdog: fix ARCH_BCM_MOBILE dependency

2014-03-12 Thread Guenter Roeck


On 03/12/2014 03:12 PM, Alex Elder wrote:

Starting with this commit:
 047ef2fa rename ARCH_BCM to ARCH_BCM_MOBILE (clocksource)
the meaning of the ARCH_BCM config option is changed to represent
all Broadcom chips with code in the mach-bcm directory.

Configuration options related to specific Broadcom platforms should
now use another symbol (currently ARCH_BCM_MOBILE, ARCH_BCM2835, or
ARCH_BCM_5301X).

The BCM_KONA_WDT config option indicates a dependency on ARCH_BCM,
but it should be ARCH_BCM_MOBILE instead.  Fix that.

Signed-off-by: Alex Elder 


Hi Alex,

Markus Mayer already submitted a similar patch:
http://patchwork.roeck-us.net/patch/1263/

Guenter




Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread Rafael J. Wysocki

On Thursday, March 13, 2014 12:49:07 AM Thomas Gleixner wrote:
> On Thu, 13 Mar 2014, Rafael J. Wysocki wrote:
> > I agree, and we need to fix that for 3.14.  Patch is appended.
> 
> You beat me by a few minutes. Was about to send out the same, just
> with a more spicy changelog :)
> 
> > ---
> > From: Rafael J. Wysocki 
> > Subject: ACPI / init: Invoke early ACPI initialization later
> > 
> > Commit 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before
> > timekeeping_init()) optimistically moved the early ACPI initialization
> > before timekeeping_init(), but that didn't work, because it broke fast
> > TSC calibration for Julian Wollrath on Thinkpad x121e (and most likely
> > for others too).  The reason is that acpi_early_init() enables the SCI
> > and that interferes with the fast TSC calibration mechanism.
> > 
> > Thus follow the original idea to execute acpi_early_init() before
> > efi_enter_virtual_mode() to help the EFI people for now and we can
> > revisit the other problem that commit 73f7d1ca3263 attempted to
> > address in the future (if really necessary).
> 
> Reviewed-by: Thomas Gleixner 

Thanks!

I have a plan to push this to Linus for -rc7.

Rafael


Re: [RFC][PATCH 0/7] locking: qspinlock

2014-03-12 Thread Dave Chinner

On Wed, Mar 12, 2014 at 07:15:03AM +0100, Peter Zijlstra wrote:
> On Wed, Mar 12, 2014 at 01:31:53PM +1100, Dave Chinner wrote:
> > With the queuing spinlock, I expected to see somewhat better
> > results, but I didn't at first. Turns out if you have any sort of
> > lock debugging turned on, then the code doesn't ever go into the
> > lock slow path and hence does not ever enter the "lock failed" slow
> > path where all the contention fixes are supposed to be.
> 
> Yeah; its a 'feature' of the spinlock debugging to turn all spinlocks
> into test-and-set thingies.
> 
> > Anyway, with all lock debugging turned off, the system hangs
> > the instant I start the multithreaded bulkstat workload. Even the
> > console is unresponsive.
> 
> Oops, I only briefly tested this series in userspace and that seemed to
> work. I'll go prod at it. Thanks for having a look though.
> 
> Is that bstat test any easier/faster to setup/run than the aim7 crap?

Depends. I've got a VM setup with a sparse 100TB block device hosted
on SSDs where I can create 50M inodes using fsmark in about 3
and a half minutes. I also have a hacked xfstests::src/bstat.c that is
multithreaded that I then run and it triggers it straight away.

Quite frankly, you don't need bulkstat to produce this lock
contention - you'll see it running this on a wide directory
structure on XFS and an SSD:

$ cat ~/tests/walk-scratch.sh
#!/bin/bash

echo Walking via find -ctime
echo 3 > /proc/sys/vm/drop_caches
time (
    for d in /mnt/scratch/[0-9]* ; do

        for i in $d/*; do
            (
                echo $i
                find $i -ctime 1 > /dev/null
            ) > /dev/null 2>&1
        done &
    done
    wait
)

echo Walking via ls -R
echo 3 > /proc/sys/vm/drop_caches
time (
    for d in /mnt/scratch/[0-9]* ; do

        for i in $d/*; do
            (
                echo $i
                ls -R $i
            ) > /dev/null 2>&1
        done &
    done
    wait
)
$

The directory structure I create has 16 top level directories (0-15)
each with 30-40 subdirectories containing 100,000 files each.
There's a thread per top level directory, and running it on a 16p VM
and an SSD that can do 30,000 IOPS will generate sufficient inode
cache pressure to trigger severe lock contention.

My usual test script for this workload runs mkfs, fsmark,
xfs_repair, bstat, walk-scratch, and finally a multi-threaded rm to
clean up. Usual inode numbers are in the 50-100 million range for zero
length file workloads, 10-20 million for single block (small) files,
and 100,000-1 million for larger files. It's great for stressing VFS,
FS and IO level scalability...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

Re: [PATCH V2] Change ACPI IPMI support to "default y"

2014-03-12 Thread Matthew Garrett

On Thu, 2014-03-13 at 00:00 +0100, Pavel Machek wrote:
> On Tue 2014-02-18 23:15:08, Matthew Garrett wrote:
> > For example, if you load the ACPI power meter driver before you've
> > installed the ACPI IPMI driver you'll typically get failures (most
> > vendors implement it via IPMI).
> 
> Would the right solution be to implement a dependency between power
> meter and IPMI?

No. The power meter driver knows nothing about IPMI. It makes no IPMI
calls. There's no requirement that a vendor implement it via IPMI.

-- 
Matthew Garrett

Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread Thomas Gleixner

On Thu, 13 Mar 2014, Rafael J. Wysocki wrote:
> I agree, and we need to fix that for 3.14.  Patch is appended.

You beat me by a few minutes. Was about to send out the same, just
with a more spicy changelog :)

> ---
> From: Rafael J. Wysocki 
> Subject: ACPI / init: Invoke early ACPI initialization later
> 
> Commit 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before
> timekeeping_init()) optimistically moved the early ACPI initialization
> before timekeeping_init(), but that didn't work, because it broke fast
> TSC calibration for Julian Wollrath on Thinkpad x121e (and most likely
> for others too).  The reason is that acpi_early_init() enables the SCI
> and that interferes with the fast TSC calibration mechanism.
> 
> Thus follow the original idea to execute acpi_early_init() before
> efi_enter_virtual_mode() to help the EFI people for now and we can
> revisit the other problem that commit 73f7d1ca3263 attempted to
> address in the future (if really necessary).

Reviewed-by: Thomas Gleixner 
 

Re: [PATCH v2] x86: Remove compat vdso support

2014-03-12 Thread H. Peter Anvin

On 03/12/2014 04:43 PM, Andy Lutomirski wrote:
>>
>> I do hear vsyscall=native still being used as a workaround for problems,
>> but yes, just making it call the kernel is fine, of course.
> 
> Next time you hear that, can you let me know?  I haven't heard of any
> issues since 3.4 IIRC.
> 

Will do.

-hpa



Re: [PATCH v3 10/52] arm, kvm: Fix CPU hotplug callback registration

2014-03-12 Thread Christoffer Dall

On Tue, Mar 11, 2014 at 02:05:38AM +0530, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
> 
>   get_online_cpus();
> 
>   for_each_online_cpu(cpu)
>   init_cpu(cpu);
> 
>   register_cpu_notifier(&foobar_cpu_notifier);
> 
>   put_online_cpus();
> 
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
> 
> Instead, the correct and race-free way of performing the callback
> registration is:
> 
>   cpu_notifier_register_begin();
> 
>   for_each_online_cpu(cpu)
>   init_cpu(cpu);
> 
>   /* Note the use of the double underscored version of the API */
>   __register_cpu_notifier(&foobar_cpu_notifier);
> 
>   cpu_notifier_register_done();
> 
> 
> Fix the kvm code in arm by using this latter form of callback registration.
> 
> Cc: Christoffer Dall 
> Cc: Gleb Natapov 
> Cc: Russell King 
> Cc: Ingo Molnar 
> Cc: kvm...@lists.cs.columbia.edu
> Cc: k...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Acked-by: Paolo Bonzini 
> Signed-off-by: Srivatsa S. Bhat 
> ---
> 
>  arch/arm/kvm/arm.c |7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index bd18bb8..f0e50a0 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -1051,21 +1051,26 @@ int kvm_arch_init(void *opaque)
>   }
>   }
>  
> + cpu_notifier_register_begin();
> +
>   err = init_hyp_mode();
>   if (err)
>   goto out_err;
>  
> - err = register_cpu_notifier(&hyp_init_cpu_nb);
> + err = __register_cpu_notifier(&hyp_init_cpu_nb);
>   if (err) {
>   kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
>   goto out_err;
>   }
>  
> + cpu_notifier_register_done();
> +
>   hyp_cpu_pm_init();
>  
>   kvm_coproc_table_init();
>   return 0;
>  out_err:
> + cpu_notifier_register_done();
>   return err;
>  }
>  
> 

Just so we're clear, the existing code was simply racy and not prone to
deadlocks, right?

This makes it clear that the test above for compatible CPUs can be quite
easily evaded by using CPU hotplug, but we don't really have a good
solution for handling that yet...  Hmmm, grumble grumble, I guess if you
hotplug unsupported CPUs on a KVM/ARM system for now, stuff will break.

In any case:
Acked-by: Christoffer Dall 
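
The ABBA pattern named in the quoted changelog comes down to two code
paths taking the same pair of locks in opposite orders. The sketch below
is a toy order checker, loosely in the spirit of lockdep. The names
lock_id, order_seen and record_lock_order() are hypothetical, and which
path takes which lock first (see the usage note) is an assumption for
illustration, not something stated in the changelog.

```c
enum lock_id { HOTPLUG_LOCK, ADD_REMOVE_LOCK, NR_LOCKS };

/* order_seen[a][b] != 0 means: some path took lock b while holding a. */
static int order_seen[NR_LOCKS][NR_LOCKS];

/*
 * Record that `taken` was acquired while `held` was held.
 * Returns 1 if the opposite order was recorded earlier, i.e. the two
 * paths could deadlock against each other when run concurrently.
 */
int record_lock_order(enum lock_id held, enum lock_id taken)
{
	order_seen[held][taken] = 1;
	return order_seen[taken][held];
}
```

Recording (HOTPLUG_LOCK, ADD_REMOVE_LOCK) for the registration path and
then (ADD_REMOVE_LOCK, HOTPLUG_LOCK) for a hotplug operation makes the
second call report an inversion. Taking the add/remove lock first on both
paths, as cpu_notifier_register_begin() is described as doing, removes it.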

Re: [PATCH v2] x86: Remove compat vdso support

2014-03-12 Thread Andy Lutomirski

On Wed, Mar 12, 2014 at 4:06 PM, H. Peter Anvin  wrote:
> On 03/12/2014 02:49 PM, Andy Lutomirski wrote:
>> On Wed, Mar 12, 2014 at 2:46 PM, Linus Torvalds
>>  wrote:
>>> On Wed, Mar 12, 2014 at 2:37 PM, H. Peter Anvin  
>>> wrote:

 How would that deal with the legacy vsyscall case for x86-64?  Just rely
 on the "legacy vsyscall emulation" (which seems to have its own class of
 problems...)?
>>>
>>> It does?
>>>
>>> We *default* to emulation, and have for over two years now (since
>>> v3.4).  If there are problems with it, we need to fix those.
>>
>> Even in the non-default "vsyscall=native" case, the vsyscall pages
>> just contains syscalls.  It does not need to access the vvar page, the
>> hpet, or anything else that the vdso uses.
>>
>
> Ah, right.  I let that detail slip the mind.
>
> I do hear vsyscall=native still being used as a workaround for problems,
> but yes, just making it call the kernel is fine, of course.

Next time you hear that, can you let me know?  I haven't heard of any
issues since 3.4 IIRC.

--Andy

>
> So yes, this does make it all better.
>
> -hpa
>
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

WARN_ON() and X session lost from i915 on 3.14-rc6

2014-03-12 Thread Pavel Machek

Hi!

This cost me two half-written mails...

So far it happened once, so it may be very infrequent; but I do not
think I have seen a similar failure from i915 before, so it may be a
regression. Well...

-22 should be EINVAL afaict.

Any ideas?
Pavel

[drm:i915_gem_object_get_pages] *ERROR* Attempting to obtain a
purgeable object
[drm:i915_gem_object_get_pages] *ERROR* Attempting to obtain a
purgeable object
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2945 at drivers/gpu/drm/i915/i915_gem.c:1459
i915_gem_fault+0x10c/0x240()
unhandled error in i915_gem_fault: -22
Modules linked in:
CPU: 0 PID: 2945 Comm: Xorg Tainted: GW3.14.0-rc6+ #325
Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 )
03/31/2011
 05b3 f00ade08 c47db94b c4a318f4 f00ade38 c4037f9a c4a31ed0
 f00ade64
 0b81 c4a318f4 05b3 c436cb9c c436cb9c 0002 ea35af00
 f00adebc
 f00ade50 c403803e 0009 f00ade48 c4a31ed0 f00ade64 f00ade90
 c436cb9c
Call Trace:
 [] dump_stack+0x41/0x52
 [] warn_slowpath_common+0x7a/0xa0
 [] ? i915_gem_fault+0x10c/0x240
 [] ? i915_gem_fault+0x10c/0x240
 [] warn_slowpath_fmt+0x2e/0x30
 [] i915_gem_fault+0x10c/0x240
 [] __do_fault+0x57/0x490
 [] ? __lock_acquire+0x3ae/0xc80
 [] ? i915_gem_pwrite_ioctl+0x8a0/0x8a0
 [] handle_mm_fault+0x14b/0x750
 [] ? __do_page_fault+0xaf/0x410
 [] ? __do_page_fault+0x410/0x410
 [] ? __do_page_fault+0x410/0x410
 [] __do_page_fault+0xfc/0x410
 [] ? SyS_setitimer+0x45/0xd0
 [] ? SyS_ioctl+0x45/0x70
 [] ? __do_page_fault+0x410/0x410
 [] do_page_fault+0xb/0x10
 [] error_code+0x67/0x6c
---[ end trace c124d19552341c51 ]---
wlan0: deauthenticating from 00:33:13:01:31:45 by local choice (reason=3)
cfg80211: Calling CRDA to update world regulatory domain
Chrome_IOThread[20585]: segfault at 0 ip b459e086 sp ad7f9da0 error 4 in chromium-browser[b3ecd000+390f000]
wlan0: authenticate with 00:33:13:01:31:45
wlan0: send auth to 00:33:13:01:31:45 (try 1/3)
wlan0: authenticated
wlan0: associate with 00:33:13:01:31:45 (try 1/3)


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[PATCH] dcache: fix dpath buffer corruption for too small buffers

2014-03-12 Thread Imre Deak

During dentry path lookups we can end up corrupting memory if the
destination path buffer is too small. This is because prepend_path()
and prepend() adjust the passed buffer length unconditionally, allowing
for the buffer length to go negative. Then a later prepend_name() call
will receive a negative length and convert this to unsigned before
comparing it against the source string length leading to a possible
memory corruption preceding the destination buffer.

This fixes the following core dump for me:

evolution[16162]: segfault at 0 ip 7faadd0e7609 sp 7d63e780 error 4 in libevolution-mail.so.0.0.0[7faadd05a000+c7000]
BUG: unable to handle kernel paging request at c90f5000
IP: [] memmove+0x4a/0x1a0
PGD 41d80e067 PUD 41d80f067 PMD 41d820067 PTE 0
Oops: 0002 [#1] PREEMPT SMP
Modules linked in: nbd vfat fat nls_utf8 fuse rfcomm bnep bluetooth rfkill 
snd_hda_codec_hdmi snd_hda_codec_realtek kvm_intel kvm crc32c_intel 
snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss ftdi_sio 
usbserial snd_pcm cdc_acm snd_seq_dummy snd_page_alloc snd_seq_oss microcode 
serio_raw i915 snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq 
snd_seq_device snd_timer snd cfbfillrect soundcore cfbimgblt i2c_algo_bit 
cfbcopyarea drm_kms_helper lpc_ich drm mfd_core usb_storage usbhid
CPU: 1 PID: 16162 Comm: evolution Tainted: G        W    3.13.0-imre+ #327
Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A03 02/15/2011
task: 8804157a8000 ti: 8803e0afa000 task.ti: 8803e0afa000
RIP: 0010:[]  [] memmove+0x4a/0x1a0
RSP: :8803e0afba70  EFLAGS: 00010283
RAX: c90f5000 RBX: ffdd RCX: c90f5000
RDX: ffe3 RSI: c90f4ffd RDI: c90f5000
RBP: 8803e0afbc30 R08: 312e302e6f732e30 R09: 2e332d6b74677469
R10: 6b62657762696c2f R11: 62696c2f7273752f R12: 0023
R13: 880415cb7790 R14: c90f5023 R15: c90e7370
FS:  7faafb750a40() GS:88042fa4() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: c90f5000 CR3: 43388000 CR4: 000407e0
Stack:
 811e4a3d 811e4757 8804157a8000 7558
 04e504e5 880400014000 88017558 8801668bfb00
 81c226a0 8803e0afbce0 c90e8558 c90e1000
Call Trace:
 [] ? elf_core_dump+0xb3d/0x1400
 [] ? elf_core_dump+0x857/0x1400
 [] do_coredump+0xbda/0xf90
 [] ? trace_hardirqs_off+0xd/0x10
 [] get_signal_to_deliver+0x5c6/0x6b0
 [] do_signal+0x48/0x560
 [] ? fsnotify+0x8f/0x340
 [] ? retint_signal+0x11/0x90
 [] do_notify_resume+0x35/0x80
 [] retint_signal+0x46/0x90
Code: 00 00 48 81 fa a8 02 00 00 72 05 40 38 fe 74 41 48 83 ea 20 48 83 ea 20 
4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 48 8d 76 20 <4c> 89 1f 4c 89 57 08 
4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4
RIP  [] memmove+0x4a/0x1a0
 RSP 
CR2: c90f5000
---[ end trace cc7a046285294005 ]---

Signed-off-by: Imre Deak 
---
 fs/dcache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 265e0ce..4015fd9 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2833,7 +2833,8 @@ static int prepend_name(char **buffer, int *buflen, struct qstr *name)
 	u32 dlen = ACCESS_ONCE(name->len);
 	char *p;
 
-	if (*buflen < dlen + 1)
+	/* make sure we don't convert a negative value to unsigned int */
+	if (*buflen < 0 || *buflen < dlen + 1)
 		return -ENAMETOOLONG;
 	*buflen -= dlen + 1;
 	p = *buffer -= dlen + 1;
-- 
1.8.3.2
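
For illustration, the signed-to-unsigned conversion described in the commit
message above can be reproduced in user space. This is only a sketch under
stated assumptions: check_unpatched() and check_patched() are hypothetical
helpers that copy just the comparison from prepend_name(), uint32_t stands in
for the kernel's u32, int is assumed to be 32 bits wide, and no buffer is
touched at all.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: the comparison as it stood before the patch.
 * Because dlen + 1 is unsigned, buflen is converted to unsigned int
 * before the comparison, so a negative buflen looks like a huge value.
 */
int check_unpatched(int buflen, uint32_t dlen)
{
	if (buflen < dlen + 1)
		return -ENAMETOOLONG;
	return 0;
}

/*
 * Hypothetical helper: the comparison with the extra sign test from the
 * patch. A negative buflen is rejected before any conversion happens.
 */
int check_patched(int buflen, uint32_t dlen)
{
	if (buflen < 0 || buflen < dlen + 1)
		return -ENAMETOOLONG;
	return 0;
}
```

With buflen = -3 and dlen = 5, check_unpatched() returns 0, since -3 becomes
4294967293 after conversion and is not less than 6, while check_patched()
returns -ENAMETOOLONG. For non-negative lengths both helpers agree.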


mmotm 2014-03-12-16-04 uploaded

2014-03-12 Thread akpm

The mm-of-the-moment snapshot 2014-03-12-16-04 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (3.x
or 3.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.

A git tree which contains the memory management portion of this tree is
maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
by Michal Hocko.  It contains the patches which are between the
"#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
file, http://www.ozlabs.org/~akpm/mmotm/series.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/?p=linux-mmotm.git;a=summary

To develop on top of mmotm git:

  $ git remote add mmotm git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
  $ git remote update mmotm
  $ git checkout -b topic mmotm/master
  
  $ git send-email mmotm/master.. [...]

To rebase a branch with older patches to a new mmotm release:

  $ git remote update mmotm
  $ git rebase --onto mmotm/master  topic




The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/?p=linux-mmots.git;a=summary

and use of this tree is similar to
http://git.cmpxchg.org/?p=linux-mmotm.git, described above.


This mmotm tree contains the following patches against 3.14-rc6:
(patches marked "*" will be included in linux-next)

  origin.patch
  arch-alpha-kernel-systblss-remove-debug-check.patch
  i-need-old-gcc.patch
  maintainers-akpm-maintenance.patch
* backing_dev-fix-hung-task-on-sync.patch
* backing_dev-fix-hung-task-on-sync-fix.patch
* sh-fix-format-string-bug-in-stack-tracer.patch
* bdi-avoid-oops-on-device-removal.patch
* kthread-ensure-locality-of-task_struct-allocations.patch
* arch-x86-mm-kmemcheck-kmemcheckc-use-kstrtoint-instead-of-sscanf.patch
* arm-use-generic-fixmaph.patch
* fs-cifs-cifsfsc-add-__init-to-cifs_init_inodecache.patch
* fanotify-remove-useless-bypass_perm-check.patch
* fanotify-use-fanotify-event-structure-for-permission-response-processing.patch
* fanotify-convert-access_mutex-to-spinlock.patch
* fanotify-reorganize-loop-in-fanotify_read.patch
* fanotify-move-unrelated-handling-from-copy_event_to_user.patch
* sched_clock-document-4mhz-vs-1mhz-decision.patch
* input-route-kbd-leds-through-the-generic-leds-layer.patch
* genksyms-fix-typeof-handling.patch
* score-remove-unused-cpu_score7-kconfig-parameter.patch
* sh-push-extra-copy-of-r0-r2-for-syscall-parameters.patch
* sh-remove-unused-do_fpu_error.patch
* sh-dont-pass-saved-userspace-state-to-exception-handlers.patch
* arch-sh-boards-board-sh7757lcrc-fixup-sdhi-register-size.patch
* sh-sh7757-switch-rspi-clock-to-dev-id-match.patch
* drivers-net-irda-donauboe-convert-to-module_pci_driver.patch
* net-core-rtnetlinkc-copy-paste-error-in-rtnl_bridge_notify.patch
* ocfs2-fix-null-pointer-dereference-when-access-dlm_state-before-launching-dlm-thread.patch
* ocfs2-change-ip_unaligned_aio-to-of-type-mutex-from-atomit_t.patch
* ocfs2-remove-unused-variable-uuid_net_key-in-ocfs2_initialize_super.patch
* ocfs2-improve-fsync-efficiency-and-fix-deadlock-between-aio_write-and-sync_file.patch
* ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one.patch
* ocfs2-dlm-fix-lock-migration-crash.patch
* ocfs2-dlm-fix-recovery-hung.patch
* ocfs2-add-dlm_recover_callback_support-in-sysfs.patch
* ocfs2-add-dlm_recover_callback_support-in-sysfs-fix.patch
* ocfs2-fix-a-tiny-race-when-running-dirop_fileop_racer.patch
* ocfs2-o2net-o2net_listen_data_ready-should-do-nothing-if-socket-state-is-not-tcp_listen.patch
* ocfs2-remove-ocfs2_inode_skip_delete-flag.patch
* ocfs2-move-dquot_initialize-in-ocfs2_delete_inode-somewhat-later.patch
* quota-provide-function-to-grab-quota-structure-reference.patch
* ocfs2-implement-delayed-dropping-of-last-dquot-reference.patch
*

Re: [RESEND] Fast TSC calibration fails with v3.14-rc1 and later

2014-03-12 Thread Rafael J. Wysocki

On Wednesday, March 12, 2014 05:39:15 PM Thomas Gleixner wrote:
> On Wed, 12 Mar 2014, Thomas Gleixner wrote:
> > On Wed, 12 Mar 2014, Thomas Gleixner wrote:
> > 
> > > On Wed, 12 Mar 2014, joeyli wrote:
> > > > I think we could still use ACPI_FADT_NO_CMOS_RTC to check whether
> > > > acpi_early_init() needs to run before timekeeping_init().
> > > > If any future machine uses ACPI TAD but "Fast TSC calibration"
> > > > fails, at least the alternate TSC calibration can work around
> > > > the issue.
> > > 
> > > Well, it can work around, but it sucks as it's way slower than the
> > > fast one. And we really don't want to pay that price for some
> > > half-baked ACPI nonsense.
> > > 
> > > Why exactly do you need that ACPI stuff before timekeeping_init()?
> > 
> > According to the changelog:
> > 
> > And, we want accessing ACPI TAD device to set system clock, so move
> > acpi_early_init() before timekeeping_init(). This final position is
> > also before efi_enter_virtual_mode().
> > 
> > Why do we need to access that TAD thing (whatever newfangled that is)
> > at this point?
> 
> And we have no support for that nonsense in tree, so why do we need to
> disturb functionality which does not need that at all?
> 
> We can revisit the issue when we actually have reached a conclusion
> how to deal with that and when we are merging something which supports
> TAD.
> 
> Up to then we really can live without that and put the call just
> before efi_enter_virtual_mode().

I agree, and we need to fix that for 3.14.  Patch is appended.

Thanks,
Rafael


---
From: Rafael J. Wysocki 
Subject: ACPI / init: Invoke early ACPI initialization later

Commit 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before
timekeeping_init()) optimistically moved the early ACPI initialization
before timekeeping_init(), but that didn't work, because it broke fast
TSC calibration for Julian Wollrath on Thinkpad x121e (and most likely
for others too).  The reason is that acpi_early_init() enables the SCI
and that interferes with the fast TSC calibration mechanism.

Thus follow the original idea to execute acpi_early_init() before
efi_enter_virtual_mode() to help the EFI people for now and we can
revisit the other problem that commit 73f7d1ca3263 attempted to
address in the future (if really necessary).

Fixes: 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before timekeeping_init())
Reported-by: Julian Wollrath 
Signed-off-by: Rafael J. Wysocki 
---
 init/main.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-pm/init/main.c
===================================================================
--- linux-pm.orig/init/main.c
+++ linux-pm/init/main.c
@@ -561,7 +561,6 @@ asmlinkage void __init start_kernel(void
 	init_timers();
 	hrtimers_init();
 	softirq_init();
-	acpi_early_init();
 	timekeeping_init();
 	time_init();
 	sched_clock_postinit();
@@ -613,6 +612,7 @@ asmlinkage void __init start_kernel(void
 	calibrate_delay();
 	pidmap_init();
 	anon_vma_init();
+	acpi_early_init();
 #ifdef CONFIG_X86
 	if (efi_enabled(EFI_RUNTIME_SERVICES))
 		efi_enter_virtual_mode();


Re: [PATCH 0/3] Improve 32 bit vDSO time

2014-03-12 Thread stefani



Quoting Andy Lutomirski:

> Since no one seems to object to the latest patch I sent out, wouldn't
> it make more sense to base this on top of my patch?



I will do this when your patch is pulled into tip. For now we have the choice,
but I prefer our solution removing the compat vdso.

- Stefani


Re: [PATCH v2] x86: Remove compat vdso support

2014-03-12 Thread H. Peter Anvin

On 03/12/2014 02:49 PM, Andy Lutomirski wrote:
> On Wed, Mar 12, 2014 at 2:46 PM, Linus Torvalds
>  wrote:
>> On Wed, Mar 12, 2014 at 2:37 PM, H. Peter Anvin  wrote:
>>>
>>> How would that deal with the legacy vsyscall case for x86-64?  Just rely
>>> on the "legacy vsyscall emulation" (which seems to have its own class of
>>> problems...)?
>>
>> It does?
>>
>> We *default* to emulation, and have for over two years now (since
>> v3.4).  If there are problems with it, we need to fix those.
> 
> Even in the non-default "vsyscall=native" case, the vsyscall pages
> just contain syscalls.  They do not need to access the vvar page, the
> hpet, or anything else that the vdso uses.
> 

Ah, right.  I let that detail slip the mind.

I do hear vsyscall=native still being used as a workaround for problems,
but yes, just making it call the kernel is fine, of course.

So yes, this does make it all better.

-hpa



Re: [PATCH RESEND 2/2] MIPS: fpu: fix conflict of register usage

2014-03-12 Thread Guenter Roeck


On 03/12/2014 03:41 PM, Aaro Koskinen wrote:

From: Huacai Chen 

In _restore_fp_context/_restore_fp_context32, t0 is used for both
CP0_Status and CP1_FCSR. This is a mistake and causes an FP exception on
boot, so fix it.

Signed-off-by: Huacai Chen 
Tested-by: Aaro Koskinen 
Tested-by: Andreas Barth 
Signed-off-by: Aaro Koskinen 


With qemu-system-mips64:

Tested-by: Guenter Roeck 



Re: [PATCH RESEND 1/2] MIPS: Replace CONFIG_MIPS64 and CONFIG_MIPS32_R2

2014-03-12 Thread Guenter Roeck


On 03/12/2014 03:41 PM, Aaro Koskinen wrote:

From: Paul Bolle 

Commit 597ce1723e0f ("MIPS: Support for 64-bit FP with O32 binaries")
introduced references to two undefined Kconfig macros. CONFIG_MIPS32_R2
should clearly be replaced with CONFIG_CPU_MIPS32_R2. And CONFIG_MIPS64
should apparently be replaced with CONFIG_64BIT.

Signed-off-by: Paul Bolle 
Signed-off-by: Huacai Chen 
Tested-by: Aaro Koskinen 
Signed-off-by: Aaro Koskinen 


With qemu-system-mips64:

Tested-by: Guenter Roeck 
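
As a side note on why such undefined Kconfig macros go unnoticed: the
preprocessor treats an undefined name in #ifdef as simply "not defined", so
the guarded code is dropped without any diagnostic. The sketch below is a
user-space illustration only; the two functions are hypothetical and the
CONFIG_64BIT define merely imitates what a 64-bit kernel configuration would
provide.

```c
/* Imitates a 64-bit configuration; in the kernel this comes from Kconfig. */
#define CONFIG_64BIT 1

/*
 * Hypothetical example: guarded by a symbol that no Kconfig file defines.
 * The first branch is silently compiled out, with no warning or error.
 */
int guarded_by_undefined_symbol(void)
{
#ifdef CONFIG_MIPS64
	return 1;
#else
	return 0;
#endif
}

/* Hypothetical example: guarded by the symbol that actually exists. */
int guarded_by_defined_symbol(void)
{
#ifdef CONFIG_64BIT
	return 1;
#else
	return 0;
#endif
}
```

Here guarded_by_undefined_symbol() returns 0 even though the build is
"64-bit", while guarded_by_defined_symbol() returns 1.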


Re: [PATCH 3/3] x86, vdso32: handle 32 bit vDSO larger one page

2014-03-12 Thread Andy Lutomirski

On Wed, Mar 12, 2014 at 3:58 PM, H. Peter Anvin  wrote:
> On 03/12/2014 03:51 PM, Stefani Seibold wrote:
>> This patch enables 32 bit vDSO which are larger than a page. Currently
>> two pages are reserved, this should be enough for future improvements.
>>
>> Signed-off-by: Stefani Seibold 
>> ---
>>  arch/x86/include/asm/fixmap.h |  4 +++-
>>  arch/x86/vdso/vdso32-setup.c  | 29 +++--
>>  2 files changed, 22 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
>> index 094d0cc..f513f14 100644
>> --- a/arch/x86/include/asm/fixmap.h
>> +++ b/arch/x86/include/asm/fixmap.h
>> @@ -43,6 +43,8 @@ extern unsigned long __FIXADDR_TOP;
>>
>>  #define FIXADDR_USER_START __fix_to_virt(FIX_VDSO)
>>  #define FIXADDR_USER_END   __fix_to_virt(FIX_VDSO - 1)
>> +
>> +#define MAX_VDSO_PAGES   2
>>  #else
>>  #define FIXADDR_TOP  (VSYSCALL_END-PAGE_SIZE)
>>
>> @@ -74,7 +76,7 @@ extern unsigned long __FIXADDR_TOP;
>>  enum fixed_addresses {
>>  #ifdef CONFIG_X86_32
>>   FIX_HOLE,
>> - FIX_VDSO,
>> + FIX_VDSO = MAX_VDSO_PAGES,
>>   VVAR_PAGE,
>>   VSYSCALL_HPET,
>
> Can we make this FIX_HOLE + MAX_VDSO_PAGES or at least add a comment
> explaining the constraint here... otherwise I fear random breakage.

Note that this code is completely unnecessary if either of my patch
sets is accepted.  Since you're the maintainer, can you give an
opinion? :)

--Andy
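
To make the breakage concern quoted above concrete, here is a small sketch of
the difference between the literal and the relative enumerator value. It
assumes the constraint is that FIX_VDSO must sit MAX_VDSO_PAGES slots after
FIX_HOLE; the enums and the LIT_/REL_ names are hypothetical stand-ins, not
the kernel's fixmap.h, and each has one made-up entry added in front of the
hole to show what happens when the layout changes.

```c
#define MAX_VDSO_PAGES 2

/* Literal form, as in the patch, after someone adds an entry in front. */
enum literal_layout {
	LIT_NEW_ENTRY,                  /* hypothetical later addition: 0 */
	LIT_FIX_HOLE,                   /* 1 */
	LIT_FIX_VDSO = MAX_VDSO_PAGES,  /* still 2 */
	LIT_VVAR_PAGE,                  /* 3 */
};

/* Relative form, as suggested in the review, with the same addition. */
enum relative_layout {
	REL_NEW_ENTRY,                                 /* 0 */
	REL_FIX_HOLE,                                  /* 1 */
	REL_FIX_VDSO = REL_FIX_HOLE + MAX_VDSO_PAGES,  /* 3 */
	REL_VVAR_PAGE,                                 /* 4 */
};

/* Number of slots between the hole and the vDSO entry in each layout. */
int literal_gap(void)
{
	return LIT_FIX_VDSO - LIT_FIX_HOLE;
}

int relative_gap(void)
{
	return REL_FIX_VDSO - REL_FIX_HOLE;
}
```

In this sketch literal_gap() shrinks to 1 once an entry is added before the
hole, whereas relative_gap() stays at MAX_VDSO_PAGES.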

Re: [PATCH V2] Change ACPI IPMI support to "default y"

2014-03-12 Thread Pavel Machek

On Tue 2014-02-18 23:15:08, Matthew Garrett wrote:
> On Wed, 2014-02-19 at 00:26 +0100, Rafael J. Wysocki wrote:
> > On Tuesday, February 18, 2014 11:28:29 AM Matthew Garrett wrote:
> > > 
> > > The ACPI IPMI driver implements IPMI operation region support for the ACPI
> > > core. Systems that declare ACPI operation regions may reference them at 
> > > any
> > > time, including during kernel initialisation. These accesses will fail
> > > unless the ACPI IPMI driver is present, and undesirable system behaviour
> > > may result. Set the default to Y in order to encourage distributions and
> > > users to configure kernels to avoid awkward surprises.
> > 
> > Do you have any examples of problems caused by that or is this just 
> > theoretical
> > at the moment?
> 
> For example, if you load the ACPI power meter driver before you've
> installed the ACPI IPMI driver you'll typically get failures (most
> vendors implement it via IPMI).

Would the right solution be to implement a dependency between the power
meter and IPMI?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
