date:20120925

Re: [PATCH 1/1] pinctrl: Samsung: Fix return value

2012-09-25 Thread Sachin Kamat

Hi Thomas,

Please provide your review comments.

On 14 September 2012 19:37, Linus Walleij  wrote:
> On Fri, Sep 14, 2012 at 2:02 PM, Sachin Kamat  wrote:
>
>> Return the value obtained from of_property_count_strings()
>> instead of -EINVAL.
>>
>> Silences the following smatch warning:
>> drivers/pinctrl/pinctrl-samsung.c:529 samsung_pinctrl_parse_dt_pins()
>> info: why not propagate '*npins' from of_property_count_strings()
>> instead of -22?
>>
>> Cc: Thomas Abraham 
>> Signed-off-by: Sachin Kamat 
>> ---
>>  drivers/pinctrl/pinctrl-samsung.c |2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/pinctrl/pinctrl-samsung.c 
>> b/drivers/pinctrl/pinctrl-samsung.c
>> index 8a24223..824fda9 100644
>> --- a/drivers/pinctrl/pinctrl-samsung.c
>> +++ b/drivers/pinctrl/pinctrl-samsung.c
>> @@ -526,7 +526,7 @@ static int __init samsung_pinctrl_parse_dt_pins(struct 
>> platform_device *pdev,
>> *npins = of_property_count_strings(cfg_np, "samsung,pins");
>> if (*npins < 0) {
>> dev_err(dev, "invalid pin list in %s node", cfg_np->name);
>> -   return -EINVAL;
>> +   return *npins;
>> }
>
> Thomas, please check this ...
>
> Yours,
> Linus Walleij
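
For illustration, a minimal userspace sketch of the change under review: pass the negative count back to the caller instead of flattening it to -EINVAL. mock_count_strings() and parse_pins() are hypothetical stand-ins, not the kernel functions, and the error codes the mock returns are modeled on those documented for of_property_count_strings(), which is an assumption here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for of_property_count_strings(): counts the
 * NUL-terminated strings packed into val[0..len-1], or returns a
 * negative errno describing why the property is unusable.
 */
int mock_count_strings(const char *val, int len)
{
	int i, count = 0;

	if (!val)
		return -EINVAL;		/* property does not exist */
	if (len <= 0)
		return -ENODATA;	/* property has no value */
	if (val[len - 1] != '\0')
		return -EILSEQ;		/* last string is not terminated */

	for (i = 0; i < len; i++)
		if (val[i] == '\0')
			count++;
	return count;
}

/* Mirrors the patched code path: a negative count is returned as-is. */
int parse_pins(const char *val, int len, int *npins)
{
	assert(npins != NULL);

	*npins = mock_count_strings(val, len);
	if (*npins < 0)
		return *npins;	/* propagate the specific error, not -EINVAL */
	return 0;
}
```

With this shape the caller can tell a missing property apart from a malformed one, which is the information the smatch warning says is being thrown away.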



-- 
With warm regards,
Sachin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [QUESTION] Can uprobe_event support @ADDR, $retval, offs(FETCHARG)?

2012-09-25 Thread Srikar Dronamraju

> 
> Perhaps it is not such a small thing, but at least we can try.
> In userspace, memory (pages) can be paged out to swap or
> files. In that case, the memory dereference function needs to track
> down the data on the disk, and that causes I/O. This means we will
> see visible performance degradation with tracing.
> Also, sometimes a pointer value (address) is broken; in that
> case we have to ensure the address is actually valid before
> accessing it.
> 
> Of course, without tracking paged-out data, it is easy
> to support, because that is already done in kprobe event.
> I'm not sure how useful it is, because sometimes it will
> fail to gather the data.
> However, it is good for the first step, I think.
> 
> Srikar, what would you think?

I think we should do it on a best-effort basis first, i.e. support
tracking data that's not paged out.
Most times the data that is requested tends to be the hot data.

We could look at supporting data that is paged out later.
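
A rough userspace model of that best-effort policy, for illustration only: the fetch succeeds when every page it touches is resident and fails otherwise, without ever trying to page data in. struct mock_page, MOCK_PAGE_SIZE and fetch_best_effort() are hypothetical names invented for this sketch and do not correspond to any kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_PAGE_SIZE 16u	/* tiny page size, for illustration only */

/* Hypothetical stand-in for one user page; not a kernel structure. */
struct mock_page {
	bool resident;		/* false models a paged-out page */
	unsigned char data[MOCK_PAGE_SIZE];
};

/*
 * Best-effort fetch: copy len bytes starting at addr only if the whole
 * range is valid and resident. Nothing is ever paged in; a paged-out or
 * out-of-range address simply yields -EFAULT.
 */
int fetch_best_effort(const struct mock_page *pages, size_t npages,
		      size_t addr, void *dst, size_t len)
{
	unsigned char *out = dst;
	size_t i;

	assert(pages != NULL && dst != NULL);

	if (len == 0)
		return 0;
	if (addr > (size_t)-1 - len)
		return -EFAULT;		/* address range wraps around */
	if (addr + len > npages * MOCK_PAGE_SIZE)
		return -EFAULT;		/* broken pointer: outside the mapping */

	/* First pass: refuse the whole fetch if any page is paged out. */
	for (i = addr / MOCK_PAGE_SIZE; i <= (addr + len - 1) / MOCK_PAGE_SIZE; i++)
		if (!pages[i].resident)
			return -EFAULT;

	/* Second pass: every page is resident, so the copy cannot "fault". */
	for (i = 0; i < len; i++) {
		size_t a = addr + i;

		out[i] = pages[a / MOCK_PAGE_SIZE].data[a % MOCK_PAGE_SIZE];
	}
	return 0;
}
```

The trade-off is the one discussed above: hot data is fetched cheaply, and a fetch that would need I/O reports failure rather than slowing down the traced task.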

> 
> BTW, if we can support offs(FETCHARGS), $stack and $stackN
> are also available. ;)
> 

-- 
Thanks and Regards
Srikar Dronamraju


Re: [PATCH] perf record: add meta-data support for pipe-mode

2012-09-25 Thread David Ahern

I like the idea, but I can't check out the patch: it does not apply to
Arnaldo's latest perf/core branch. Mind rebasing?


David

Re: [PATCH 1/2] Fix a crash when block device is read and block size is changed at the same time

2012-09-25 Thread Jens Axboe

On 2012-09-26 00:49, Mikulas Patocka wrote:
> Here I'm resending it as two patches. The first one uses existing 
> semaphore, the second converts it to RCU-based percpu semaphore.

Thanks, applied. In the future, please send new patch 'series' as a new
thread instead of replying to some email in the middle of an existing
thread. It all becomes very messy pretty quickly. Patch #2 had a botched
subject line, so that didn't help either :-)

-- 
Jens Axboe


Re: RTL8101E/RTL8102E PCI Express Fast Ethernet controller (rev 02)

2012-09-25 Thread Thanasis

on 09/25/2012 11:53 PM Francois Romieu wrote the following:
> Thanasis  :
> [...]
>> Ping failed in the following step:
>>
>> HEAD is now at 3c6ad46 r8169: move rtl_set_rx_mode before its
>> rtl_hw_start callers.
> 
> *spleen*
> 
> It's a genuine code move without any real change. Imho it's more a
> matter of sleeping a few seconds for the link to settle after the
> device is brought up.
> 
> The differences between the top-most r8169 driver you tried and the
> real v3.5.4 r8169 driver are minor : mostly Ben Grear's corrupted
> frames rx work (default: disabled) and a skb_timestamp which comes
> too late in your setup.
> 
> So, either your problem lacks reproducibility with 3.5.4 - cold reboot,
> driver which does not fail the first time - or it needs something else
> in the kernel to happen.
> 
> The "PME# disabled" messages have disappeared between 2.6 and 3.5.4 in your
> dmesg. It's probably due to a dev_dbg/dev_printk + CONFIG_DYNAMIC_DEBUG
> change. It's still worth checking runtime pm settings though
> 

Sorry but I don't understand much of what you said above...

> Can you check the content of /sys/class/pci_bus/:02/power, set it
> to "on" if it contains "auto" and plug the cable again (with 3.5.4) ?
> 

I changed /sys/class/pci_bus/\:02/power/control from auto to on,
unplugged and plugged the cable again, and also manually assigned the IP
address to the NIC, but that did not make it work.

Here is the situation in /sys/class/pci_bus/\:02/power:

atom ~ # ls /sys/class/pci_bus/\:02/power
autosuspend_delay_ms  control  runtime_active_time  runtime_status
runtime_suspended_time
atom ~ # cat /sys/class/pci_bus/\:02/power/*
cat: /sys/class/pci_bus/:02/power/autosuspend_delay_ms: Input/output
error
on
0
unsupported
0
atom ~ # cat /sys/class/pci_bus/\:02/power/runtime_status
unsupported
atom ~ # cat /sys/class/pci_bus/\:02/power/control
on
atom ~ # cat /sys/class/pci_bus/\:02/power/runtime_active_time
0
atom ~ # cat /sys/class/pci_bus/\:02/power/runtime_suspended_time
0
atom ~ #

I attach the dmesg where it can be seen that the card every few seconds
reports:
r8169 :02:00.0: eth0: link up
r8169 :02:00.0: eth0: link up
r8169 :02:00.0: eth0: link up
r8169 :02:00.0: eth0: link up
...

dmesg.gz
Description: GNU Zip compressed data

Re: [PATCH 4/4] usb: phy: omap-usb2: enable 960Mhz clock for omap5

2012-09-25 Thread ABRAHAM, KISHON VIJAY

Hi,

On Wed, Sep 19, 2012 at 5:26 PM, Felipe Balbi  wrote:
> On Wed, Sep 19, 2012 at 05:00:29PM +0530, Kishon Vijay Abraham I wrote:
>> "usb_otg_ss_refclk960m" is needed by usb2 phy present in omap5. For
>> omap4, the clk_get of this clock will fail since it does not have this
>> clock.
>>
>> Signed-off-by: Kishon Vijay Abraham I 
>> ---
>>  Documentation/devicetree/bindings/usb/usb-phy.txt |3 +++
>>  drivers/usb/phy/omap-usb2.c   |   28 
>> -
>>  2 files changed, 30 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/devicetree/bindings/usb/usb-phy.txt 
>> b/Documentation/devicetree/bindings/usb/usb-phy.txt
>> index 7c5fd89..d5626de 100644
>> --- a/Documentation/devicetree/bindings/usb/usb-phy.txt
>> +++ b/Documentation/devicetree/bindings/usb/usb-phy.txt
>> @@ -24,6 +24,9 @@ Required properties:
>>  add the address of control module phy power register until a driver for
>>  control module is added
>>
>> +Optional properties:
>> + - has960mhzclk: should be added if the phy needs 960mhz clock
>> +
>>  This is usually a subnode of ocp2scp to which it is connected.
>>
>>  usb3phy@4a084400 {
>> diff --git a/drivers/usb/phy/omap-usb2.c b/drivers/usb/phy/omap-usb2.c
>> index d36c282..d6612ba 100644
>> --- a/drivers/usb/phy/omap-usb2.c
>> +++ b/drivers/usb/phy/omap-usb2.c
>> @@ -146,6 +146,7 @@ static int __devinit omap_usb2_probe(struct 
>> platform_device *pdev)
>>   struct omap_usb *phy;
>>   struct usb_otg  *otg;
>>   struct resource *res;
>> + struct device_node  *np = pdev->dev.of_node;
>>
>>   phy = devm_kzalloc(&pdev->dev, sizeof(*phy), GFP_KERNEL);
>>   if (!phy) {
>> @@ -190,6 +191,15 @@ static int __devinit omap_usb2_probe(struct 
>> platform_device *pdev)
>>   }
>>   clk_prepare(phy->wkupclk);
>>
>> + if (of_property_read_bool(np, "has960mhzclk")) {
>> + phy->optclk = devm_clk_get(phy->dev, "usb_otg_ss_refclk960m");
>> + if (IS_ERR(phy->optclk)) {
>> + dev_err(&pdev->dev, "unable to get refclk960m\n");
>> + return PTR_ERR(phy->optclk);
>> + }
>> + clk_prepare(phy->optclk);
>> + }
>
> instead, can't you just always try to get the clock but ignore the error
> if it fails ?

This clock is needed for usb2 to work in dwc3 (omap5). So we have to
report the error in case we don't get the clock, no?

Thanks
Kishon
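
To make the two options in this exchange concrete, here is a small userspace sketch. struct mock_soc, mock_clk_get() and both probe functions are hypothetical stand-ins written for illustration; they are not the omap-usb2 driver or the clk API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a SoC, standing in for the DT node. */
struct mock_soc {
	bool has960mhzclk;	/* models the "has960mhzclk" property */
	bool clk_available;	/* whether the clock lookup would succeed */
};

/* Hypothetical clock lookup: 0 on success, negative errno on failure. */
int mock_clk_get(const struct mock_soc *soc)
{
	assert(soc != NULL);
	return soc->clk_available ? 0 : -ENOENT;
}

/*
 * Approach in the patch: look the clock up only when the property says
 * it is needed, and fail the probe if it cannot be obtained.
 */
int probe_with_property(const struct mock_soc *soc, bool *have_optclk)
{
	*have_optclk = false;
	if (soc->has960mhzclk) {
		int ret = mock_clk_get(soc);

		if (ret < 0)
			return ret;	/* required clock is missing */
		*have_optclk = true;
	}
	return 0;
}

/*
 * Suggested alternative: always try the lookup and ignore a failure.
 * The probe never fails, even when a needed clock is missing.
 */
int probe_ignore_error(const struct mock_soc *soc, bool *have_optclk)
{
	*have_optclk = (mock_clk_get(soc) == 0);
	return 0;
}
```

In this model the two variants behave the same on an omap4-like part (no property, no clock) and differ only when the property is set but the clock lookup fails, which is the case the reply above is concerned about.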

[PATCH v4 4/5] IIO : ADC: tiadc: Add support of TI's ADC driver

2012-09-25 Thread Patil, Rachna

This patch adds support for TI's ADC driver.
This is a multifunctional device.
Analog input lines are provided on which
voltage measurements can be carried out.
You can have up to 8 input lines.

Signed-off-by: Patil, Rachna 
---
Changes in v2:
Addressed review comments from Matthias Kaehlcke

Changes in v3:
Addressed review comments from Jonathan Cameron.
Added comments, new line appropriately.

Changes in v4:
Removed extra comments and variables.
Renamed idev to indio_dev throughout the driver.
Renamed structs for better readability.

 drivers/iio/adc/Kconfig |7 +
 drivers/iio/adc/Makefile|1 +
 drivers/iio/adc/ti_am335x_adc.c |  216 +++
 drivers/mfd/ti_am335x_tscadc.c  |   18 ++-
 include/linux/mfd/ti_am335x_tscadc.h|9 +-
 include/linux/platform_data/ti_am335x_adc.h |   14 ++
 6 files changed, 263 insertions(+), 2 deletions(-)
 create mode 100644 drivers/iio/adc/ti_am335x_adc.c
 create mode 100644 include/linux/platform_data/ti_am335x_adc.h

diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig
index 8a78b4f..59db45f 100644
--- a/drivers/iio/adc/Kconfig
+++ b/drivers/iio/adc/Kconfig
@@ -22,4 +22,11 @@ config AT91_ADC
help
  Say yes here to build support for Atmel AT91 ADC.
 
+config TI_AM335X_ADC
+   tristate "TI's ADC driver"
+   depends on ARCH_OMAP2PLUS
+   help
+ Say yes here to build support for Texas Instruments ADC
+ driver which is also a MFD client.
+
 endmenu
diff --git a/drivers/iio/adc/Makefile b/drivers/iio/adc/Makefile
index 52eec25..e716588 100644
--- a/drivers/iio/adc/Makefile
+++ b/drivers/iio/adc/Makefile
@@ -4,3 +4,4 @@
 
 obj-$(CONFIG_AD7266) += ad7266.o
 obj-$(CONFIG_AT91_ADC) += at91_adc.o
+obj-$(CONFIG_TI_AM335X_ADC) += ti_am335x_adc.o
diff --git a/drivers/iio/adc/ti_am335x_adc.c b/drivers/iio/adc/ti_am335x_adc.c
new file mode 100644
index 000..9e1b3ac
--- /dev/null
+++ b/drivers/iio/adc/ti_am335x_adc.c
@@ -0,0 +1,216 @@
+/*
+ * TI ADC MFD driver
+ *
+ * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+struct tiadc_device {
+   struct ti_tscadc_dev *mfd_tscadc;
+   int channels;
+};
+
+static unsigned int adc_readl(struct tiadc_device *adc, unsigned int reg)
+{
+   return readl(adc->mfd_tscadc->tscadc_base + reg);
+}
+
+static void adc_writel(struct tiadc_device *adc, unsigned int reg,
+   unsigned int val)
+{
+   writel(val, adc->mfd_tscadc->tscadc_base + reg);
+}
+
+static void adc_step_config(struct tiadc_device *adc_dev)
+{
+   unsigned int stepconfig;
+   int i, channels = 0, steps;
+
+   /*
+* There are 16 configurable steps and 8 analog input
+* lines available which are shared between Touchscreen and ADC.
+*
+* Steps backwards i.e. from 16 towards 0 are used by ADC
+* depending on number of input lines needed.
+* Channel would represent which analog input
+* needs to be given to ADC to digitalize data.
+*/
+
+   steps = TOTAL_STEPS - adc_dev->channels;
+   channels = TOTAL_CHANNELS - adc_dev->channels;
+
+   stepconfig = STEPCONFIG_AVG_16 | STEPCONFIG_FIFO1;
+
+   for (i = (steps + 1); i <= TOTAL_STEPS; i++) {
+   adc_writel(adc_dev, REG_STEPCONFIG(i),
+   stepconfig | STEPCONFIG_INP(channels));
+   adc_writel(adc_dev, REG_STEPDELAY(i),
+   STEPCONFIG_OPENDLY);
+   channels++;
+   }
+   adc_writel(adc_dev, REG_SE, STPENB_STEPENB);
+}
+
+static int tiadc_channel_init(struct iio_dev *indio_dev, int channels)
+{
+   struct iio_chan_spec *chan_array;
+   int i;
+
+   indio_dev->num_channels = channels;
+   chan_array = kcalloc(indio_dev->num_channels,
+   sizeof(struct iio_chan_spec), GFP_KERNEL);
+
+   if (chan_array == NULL)
+   return -ENOMEM;
+
+   for (i = 0; i < (indio_dev->num_channels); i++) {
+   struct iio_chan_spec *chan = chan_array + i;
+   chan->type = IIO_VOLTAGE;
+   chan->indexed = 1;
+   chan->channel = i;
+   chan->info_mask = IIO_CHAN_INFO_RAW_SEPARATE_BIT;
+   }
+
+   indio_dev->channels =

[PATCH v4 5/5] MFD: ti_tscadc: add suspend/resume functionality

2012-09-25 Thread Patil, Rachna

This patch adds support for suspend/resume of
TSC/ADC MFDevice.

Signed-off-by: Patil, Rachna 
---
Changes in v2:
Added this patch newly in this patch series.

Changes in v3:
No changes.

Changes in v4:
Replaced suspend/resume callbacks with dev_pm_ops.

 drivers/iio/adc/ti_am335x_adc.c   |   42 +
 drivers/input/touchscreen/ti_am335x_tsc.c |   42 +
 drivers/mfd/ti_am335x_tscadc.c|   41 +++-
 include/linux/mfd/ti_am335x_tscadc.h  |3 ++
 4 files changed, 127 insertions(+), 1 deletions(-)

diff --git a/drivers/iio/adc/ti_am335x_adc.c b/drivers/iio/adc/ti_am335x_adc.c
index 9e1b3ac..b16f944 100644
--- a/drivers/iio/adc/ti_am335x_adc.c
+++ b/drivers/iio/adc/ti_am335x_adc.c
@@ -200,10 +200,52 @@ static int __devexit tiadc_remove(struct platform_device 
*pdev)
return 0;
 }
 
+#ifdef CONFIG_PM
+static int tiadc_suspend(struct device *dev)
+{
+   struct iio_dev *indio_dev = dev_get_drvdata(dev);
+   struct tiadc_device *adc_dev = iio_priv(indio_dev);
+   struct ti_tscadc_dev *tscadc_dev = dev->platform_data;
+   unsigned int idle;
+
+   if (!device_may_wakeup(tscadc_dev->dev)) {
+   idle = adc_readl(adc_dev, REG_CTRL);
+   idle &= ~(CNTRLREG_TSCSSENB);
+   adc_writel(adc_dev, REG_CTRL, (idle |
+   CNTRLREG_POWERDOWN));
+   }
+   return 0;
+}
+
+static int tiadc_resume(struct device *dev)
+{
+   struct iio_dev *indio_dev = dev_get_drvdata(dev);
+   struct tiadc_device *adc_dev = iio_priv(indio_dev);
+   unsigned int restore;
+
+   /* Make sure ADC is powered up */
+   restore = adc_readl(adc_dev, REG_CTRL);
+   restore &= ~(CNTRLREG_POWERDOWN);
+   adc_writel(adc_dev, REG_CTRL, restore);
+
+   adc_step_config(adc_dev);
+   return 0;
+}
+
+static const struct dev_pm_ops tiadc_pm_ops = {
+   .suspend = tiadc_suspend,
+   .resume = tiadc_resume,
+};
+#define TIADC_PM_OPS (&tiadc_pm_ops)
+#else
+#define TIADC_PM_OPS NULL
+#endif
+
 static struct platform_driver tiadc_driver = {
.driver = {
.name   = "tiadc",
.owner = THIS_MODULE,
+   .pm = TIADC_PM_OPS,
},
.probe  = tiadc_probe,
.remove = __devexit_p(tiadc_remove),
diff --git a/drivers/input/touchscreen/ti_am335x_tsc.c 
b/drivers/input/touchscreen/ti_am335x_tsc.c
index 2d9dec1..b17dbe4 100644
--- a/drivers/input/touchscreen/ti_am335x_tsc.c
+++ b/drivers/input/touchscreen/ti_am335x_tsc.c
@@ -338,12 +338,54 @@ static int __devexit tscadc_remove(struct platform_device 
*pdev)
return 0;
 }
 
+#ifdef CONFIG_PM
+static int titsc_suspend(struct device *dev)
+{
+   struct ti_tscadc_dev *tscadc_dev = dev->platform_data;
+   struct tscadc *ts_dev = tscadc_dev->tsc;
+   unsigned int idle;
+
+   if (device_may_wakeup(tscadc_dev->dev)) {
+   idle = tscadc_readl(ts_dev, REG_IRQENABLE);
+   tscadc_writel(ts_dev, REG_IRQENABLE,
+   (idle | IRQENB_HW_PEN));
+   tscadc_writel(ts_dev, REG_IRQWAKEUP, IRQWKUP_ENB);
+   }
+   return 0;
+}
+
+static int titsc_resume(struct device *dev)
+{
+   struct ti_tscadc_dev *tscadc_dev = dev->platform_data;
+   struct tscadc *ts_dev = tscadc_dev->tsc;
+
+   if (device_may_wakeup(tscadc_dev->dev)) {
+   tscadc_writel(ts_dev, REG_IRQWAKEUP,
+   0x00);
+   tscadc_writel(ts_dev, REG_IRQCLR, IRQENB_HW_PEN);
+   }
+   tscadc_step_config(ts_dev);
+   tscadc_writel(ts_dev, REG_FIFO0THR,
+   ts_dev->steps_to_configure);
+   return 0;
+}
+
+static const struct dev_pm_ops titsc_pm_ops = {
+   .suspend = titsc_suspend,
+   .resume  = titsc_resume,
+};
+#define TITSC_PM_OPS (&titsc_pm_ops)
+#else
+#define TITSC_PM_OPS NULL
+#endif
+
 static struct platform_driver ti_tsc_driver = {
.probe  = tscadc_probe,
.remove = __devexit_p(tscadc_remove),
.driver = {
.name   = "tsc",
.owner  = THIS_MODULE,
+   .pm = TITSC_PM_OPS,
},
 };
 module_platform_driver(ti_tsc_driver);
diff --git a/drivers/mfd/ti_am335x_tscadc.c b/drivers/mfd/ti_am335x_tscadc.c
index 45d66e5..7e949e8 100644
--- a/drivers/mfd/ti_am335x_tscadc.c
+++ b/drivers/mfd/ti_am335x_tscadc.c
@@ -170,6 +170,7 @@ static  int __devinit ti_tscadc_probe(struct 
platform_device *pdev)
if (err < 0)
goto err_disable_clk;
 
+   device_init_wakeup(&pdev->dev, true);
platform_set_drvdata(pdev, tscadc);
return 0;
 
@@ -203,14 +204,52 @@ static int __devexit ti_tscadc_remove(struct 
platform_device *pdev)
return 0;
 }
 
+#ifdef CONFIG_PM
+static int tscadc_suspend(struct device *dev)
+{
+   struct ti_tscadc_dev *tscadc_dev = dev_get_drvdata(dev);
+
+

[PATCH v4 3/5] input: TSC: ti_tsc: Convert TSC into a MFDevice

2012-09-25 Thread Patil, Rachna

This patch converts touchscreen into a MFD client.
All the register definitions, clock initialization,
etc has been moved to MFD core driver.

Signed-off-by: Patil, Rachna 
---
Changes in v2:
No changes

Changes in v3:
No changes

Changes in v4:
No changes

 drivers/input/touchscreen/ti_am335x_tsc.c |  277 ++---
 drivers/mfd/ti_am335x_tscadc.c|   11 ++
 include/linux/mfd/ti_am335x_tscadc.h  |   10 +-
 3 files changed, 74 insertions(+), 224 deletions(-)

diff --git a/drivers/input/touchscreen/ti_am335x_tsc.c 
b/drivers/input/touchscreen/ti_am335x_tsc.c
index e71dee6..2d9dec1 100644
--- a/drivers/input/touchscreen/ti_am335x_tsc.c
+++ b/drivers/input/touchscreen/ti_am335x_tsc.c
@@ -26,123 +26,31 @@
 #include 
 #include 
 #include 
-
-#define REG_RAWIRQSTATUS   0x024
-#define REG_IRQSTATUS  0x028
-#define REG_IRQENABLE  0x02C
-#define REG_IRQWAKEUP  0x034
-#define REG_CTRL   0x040
-#define REG_ADCFSM 0x044
-#define REG_CLKDIV 0x04C
-#define REG_SE 0x054
-#define REG_IDLECONFIG 0x058
-#define REG_CHARGECONFIG   0x05C
-#define REG_CHARGEDELAY 0x060
-#define REG_STEPCONFIG(n)  (0x64 + ((n - 1) * 8))
-#define REG_STEPDELAY(n)   (0x68 + ((n - 1) * 8))
-#define REG_FIFO0CNT   0xE4
-#define REG_FIFO0THR   0xE8
-#define REG_FIFO1THR   0xF4
-#define REG_FIFO0  0x100
-#define REG_FIFO1  0x200
-
-/* Register Bitfields  */
-#define IRQWKUP_ENB BIT(0)
-
-/* Step Enable */
-#define STEPENB_MASK   (0x1 << 0)
-#define STEPENB(val)   (val << 0)
-#define STPENB_STEPENB STEPENB(0x7FFF)
-
-/* IRQ enable */
-#define IRQENB_FIFO0THRES  BIT(2)
-#define IRQENB_FIFO1THRES  BIT(5)
-#define IRQENB_PENUP   BIT(9)
-
-/* Step Configuration */
-#define STEPCONFIG_MODE_MASK   (3 << 0)
-#define STEPCONFIG_MODE(val)   (val << 0)
-#define STEPCONFIG_MODE_HWSYNC STEPCONFIG_MODE(2)
-#define STEPCONFIG_AVG_MASK (7 << 2)
-#define STEPCONFIG_AVG(val) (val << 2)
-#define STEPCONFIG_AVG_16  STEPCONFIG_AVG(4)
-#define STEPCONFIG_XPP BIT(5)
-#define STEPCONFIG_XNN BIT(6)
-#define STEPCONFIG_YPP BIT(7)
-#define STEPCONFIG_YNN BIT(8)
-#define STEPCONFIG_XNP BIT(9)
-#define STEPCONFIG_YPN BIT(10)
-#define STEPCONFIG_INM_MASK (0xF << 15)
-#define STEPCONFIG_INM(val) (val << 15)
-#define STEPCONFIG_INM_ADCREFM STEPCONFIG_INM(8)
-#define STEPCONFIG_INP_MASK (0xF << 19)
-#define STEPCONFIG_INP(val) (val << 19)
-#define STEPCONFIG_INP_AN2 STEPCONFIG_INP(2)
-#define STEPCONFIG_INP_AN3 STEPCONFIG_INP(3)
-#define STEPCONFIG_INP_AN4 STEPCONFIG_INP(4)
-#define STEPCONFIG_INP_ADCREFM STEPCONFIG_INP(8)
-#define STEPCONFIG_FIFO1   BIT(26)
-
-/* Delay register */
-#define STEPDELAY_OPEN_MASK (0x3 << 0)
-#define STEPDELAY_OPEN(val) (val << 0)
-#define STEPCONFIG_OPENDLY STEPDELAY_OPEN(0x098)
-
-/* Charge Config */
-#define STEPCHARGE_RFP_MASK (7 << 12)
-#define STEPCHARGE_RFP(val) (val << 12)
-#define STEPCHARGE_RFP_XPUL STEPCHARGE_RFP(1)
-#define STEPCHARGE_INM_MASK (0xF << 15)
-#define STEPCHARGE_INM(val) (val << 15)
-#define STEPCHARGE_INM_AN1 STEPCHARGE_INM(1)
-#define STEPCHARGE_INP_MASK (0xF << 19)
-#define STEPCHARGE_INP(val) (val << 19)
-#define STEPCHARGE_INP_AN1 STEPCHARGE_INP(1)
-#define STEPCHARGE_RFM_MASK (3 << 23)
-#define STEPCHARGE_RFM(val) (val << 23)
-#define STEPCHARGE_RFM_XNUR STEPCHARGE_RFM(1)
-
-/* Charge delay */
-#define CHARGEDLY_OPEN_MASK (0x3 << 0)
-#define CHARGEDLY_OPEN(val) (val << 0)
-#define CHARGEDLY_OPENDLY  CHARGEDLY_OPEN(1)
-
-/* Control register */
-#define CNTRLREG_TSCSSENB  BIT(0)
-#define CNTRLREG_STEPID BIT(1)
-#define CNTRLREG_STEPCONFIGWRT BIT(2)
-#define CNTRLREG_AFE_CTRL_MASK (3 << 5)
-#define CNTRLREG_AFE_CTRL(val) (val << 5)
-#define CNTRLREG_4WIRE CNTRLREG_AFE_CTRL(1)
-#define CNTRLREG_5WIRE CNTRLREG_AFE_CTRL(2)
-#define CNTRLREG_8WIRE CNTRLREG_AFE_CTRL(3)
-#define CNTRLREG_TSCENB BIT(7)
+#include 
 
 #define ADCFSM_STEPID  0x10
 #define SEQ_SETTLE 275
-#define ADC_CLK300
 #define MAX_12BIT  ((1 << 12) - 1)
 
 struct tscadc {
struct input_dev *input;
-   struct clk  *tsc_ick;
-   void __iomem *tsc_base;
unsigned int irq;
unsigned int wires;
unsigned int x_plate_resistance;
bool pen_down;
int steps_to_configure;
+   struct ti_tscadc_dev *mfd_tscadc;
 };
 
 static unsigned int tscadc_readl(struct tscadc *ts, unsigned int reg)
 {
-   return readl(ts->tsc_base + reg);
+   return

[PATCH v4 2/5] MFD: ti_tscadc: Add support for TI's TSC/ADC MFDevice

2012-09-25 Thread Patil, Rachna

Add the mfd core driver which supports touchscreen
and ADC.
With this patch we are only adding infrastructure to
support the MFD clients.

Signed-off-by: Patil, Rachna 
---
Changes in v2:
Merged "[PATCH 5/5] MFD: ti_tscadc: Add check on number of i/p 
channels",
patch submitted in previous version into this file.

Changes in v3:
No changes

Changes in v4:
No changes

 drivers/mfd/Kconfig  |9 ++
 drivers/mfd/Makefile |1 +
 drivers/mfd/ti_am335x_tscadc.c   |  193 ++
 include/linux/mfd/ti_am335x_tscadc.h |  133 +++
 4 files changed, 336 insertions(+), 0 deletions(-)
 create mode 100644 drivers/mfd/ti_am335x_tscadc.c
 create mode 100644 include/linux/mfd/ti_am335x_tscadc.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index b1a1462..e472184 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -94,6 +94,15 @@ config MFD_TI_SSP
  To compile this driver as a module, choose M here: the
  module will be called ti-ssp.
 
+config MFD_TI_AM335X_TSCADC
+   tristate "TI ADC / Touch Screen chip support"
+   depends on ARCH_OMAP2PLUS
+   help
+ If you say yes here you get support for Texas Instruments series
+ of Touch Screen /ADC chips.
+ To compile this driver as a module, choose M here: the
+ module will be called ti_am335x_tscadc.
+
 config HTC_EGPIO
bool "HTC EGPIO support"
depends on GENERIC_HARDIRQS && GPIOLIB && ARM
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 79dd22d..bc6b2f0 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_HTC_I2CPLD)  += htc-i2cpld.o
 obj-$(CONFIG_MFD_DAVINCI_VOICECODEC)   += davinci_voicecodec.o
 obj-$(CONFIG_MFD_DM355EVM_MSP) += dm355evm_msp.o
 obj-$(CONFIG_MFD_TI_SSP)   += ti-ssp.o
+obj-$(CONFIG_MFD_TI_AM335X_TSCADC) += ti_am335x_tscadc.o
 
 obj-$(CONFIG_MFD_STA2X11)  += sta2x11-mfd.o
 obj-$(CONFIG_MFD_STMPE)+= stmpe.o
diff --git a/drivers/mfd/ti_am335x_tscadc.c b/drivers/mfd/ti_am335x_tscadc.c
new file mode 100644
index 000..ae93b48
--- /dev/null
+++ b/drivers/mfd/ti_am335x_tscadc.c
@@ -0,0 +1,193 @@
+/*
+ * TI Touch Screen / ADC MFD driver
+ *
+ * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static unsigned int tscadc_readl(struct ti_tscadc_dev *tsadc, unsigned int reg)
+{
+   return readl(tsadc->tscadc_base + reg);
+}
+
+static void tscadc_writel(struct ti_tscadc_dev *tsadc, unsigned int reg,
+   unsigned int val)
+{
+   writel(val, tsadc->tscadc_base + reg);
+}
+
+static void tscadc_idle_config(struct ti_tscadc_dev *config)
+{
+   unsigned int idleconfig;
+
+   idleconfig = STEPCONFIG_YNN | STEPCONFIG_INM_ADCREFM |
+   STEPCONFIG_INP_ADCREFM | STEPCONFIG_YPN;
+
+   tscadc_writel(config, REG_IDLECONFIG, idleconfig);
+}
+
+static int __devinit ti_tscadc_probe(struct platform_device *pdev)
+{
+   struct ti_tscadc_dev *tscadc;
+   struct resource *res;
+   struct clk  *clk;
+   struct mfd_tscadc_board *pdata = pdev->dev.platform_data;
+   int irq;
+   int err, ctrl;
+   int clk_value, clock_rate;
+
+   if (!pdata) {
+   dev_err(&pdev->dev, "Could not find platform data\n");
+   return -EINVAL;
+   }
+
+   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   if (!res) {
+   dev_err(&pdev->dev, "no memory resource defined.\n");
+   return -EINVAL;
+   }
+
+   irq = platform_get_irq(pdev, 0);
+   if (irq < 0) {
+   dev_err(&pdev->dev, "no irq ID is specified.\n");
+   return -EINVAL;
+   }
+
+   /* Allocate memory for device */
+   tscadc = kzalloc(sizeof(struct ti_tscadc_dev), GFP_KERNEL);
+   if (!tscadc) {
+   dev_err(&pdev->dev, "failed to allocate memory.\n");
+   return -ENOMEM;
+   }
+   tscadc->dev = &pdev->dev;
+   tscadc->irq = irq;
+
+   res = request_mem_region(res->start, resource_size(res), pdev->name);
+   if (!res) {
+   dev_err(&pdev->dev, "failed to reserve registers.\n");
+   err = -EBUSY;
+   goto err_free_mem;
+   }
+
+

[PATCH v4 1/5] input: TSC: ti_tscadc: Rename the existing touchscreen driver

2012-09-25 Thread Patil, Rachna

Make way for addition of MFD driver.
The existing touchscreen driver is a MFD client.
For better readability we rename the file to
indicate its functionality as only touchscreen.

Signed-off-by: Patil, Rachna 
---
Changes in v2:
Missed changing the name of touchscreen header file
in the previous version.
Adding the same.

Changes in v3:
no changes.

Changes in v4:
no changes.

 drivers/input/touchscreen/Kconfig  |4 ++--
 drivers/input/touchscreen/Makefile |2 +-
 .../touchscreen/{ti_tscadc.c => ti_am335x_tsc.c}   |2 +-
 .../linux/input/{ti_tscadc.h => ti_am335x_tsc.h}   |4 ++--
 4 files changed, 6 insertions(+), 6 deletions(-)
 rename drivers/input/touchscreen/{ti_tscadc.c => ti_am335x_tsc.c} (99%)
 rename include/linux/input/{ti_tscadc.h => ti_am335x_tsc.h} (88%)

diff --git a/drivers/input/touchscreen/Kconfig 
b/drivers/input/touchscreen/Kconfig
index 1ba232c..7bdb629 100644
--- a/drivers/input/touchscreen/Kconfig
+++ b/drivers/input/touchscreen/Kconfig
@@ -529,7 +529,7 @@ config TOUCHSCREEN_TOUCHWIN
  To compile this driver as a module, choose M here: the
  module will be called touchwin.
 
-config TOUCHSCREEN_TI_TSCADC
+config TOUCHSCREEN_TI_AM335X_TSC
tristate "TI Touchscreen Interface"
depends on ARCH_OMAP2PLUS
help
@@ -539,7 +539,7 @@ config TOUCHSCREEN_TI_TSCADC
  If unsure, say N.
 
  To compile this driver as a module, choose M here: the
- module will be called ti_tscadc.
+ module will be called ti_am335x_tsc.
 
 config TOUCHSCREEN_ATMEL_TSADCC
tristate "Atmel Touchscreen Interface"
diff --git a/drivers/input/touchscreen/Makefile 
b/drivers/input/touchscreen/Makefile
index 178eb12..7c4c78e 100644
--- a/drivers/input/touchscreen/Makefile
+++ b/drivers/input/touchscreen/Makefile
@@ -52,7 +52,7 @@ obj-$(CONFIG_TOUCHSCREEN_PIXCIR)  += pixcir_i2c_ts.o
 obj-$(CONFIG_TOUCHSCREEN_S3C2410)  += s3c2410_ts.o
 obj-$(CONFIG_TOUCHSCREEN_ST1232)   += st1232.o
 obj-$(CONFIG_TOUCHSCREEN_STMPE)+= stmpe-ts.o
-obj-$(CONFIG_TOUCHSCREEN_TI_TSCADC)+= ti_tscadc.o
+obj-$(CONFIG_TOUCHSCREEN_TI_AM335X_TSC)+= ti_am335x_tsc.o
 obj-$(CONFIG_TOUCHSCREEN_TNETV107X)+= tnetv107x-ts.o
 obj-$(CONFIG_TOUCHSCREEN_TOUCHIT213)   += touchit213.o
 obj-$(CONFIG_TOUCHSCREEN_TOUCHRIGHT)   += touchright.o
diff --git a/drivers/input/touchscreen/ti_tscadc.c 
b/drivers/input/touchscreen/ti_am335x_tsc.c
similarity index 99%
rename from drivers/input/touchscreen/ti_tscadc.c
rename to drivers/input/touchscreen/ti_am335x_tsc.c
index ec0a442..e71dee6 100644
--- a/drivers/input/touchscreen/ti_tscadc.c
+++ b/drivers/input/touchscreen/ti_am335x_tsc.c
@@ -24,7 +24,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #define REG_RAWIRQSTATUS   0x024
diff --git a/include/linux/input/ti_tscadc.h 
b/include/linux/input/ti_am335x_tsc.h
similarity index 88%
rename from include/linux/input/ti_tscadc.h
rename to include/linux/input/ti_am335x_tsc.h
index ad442a3..49269a2 100644
--- a/include/linux/input/ti_tscadc.h
+++ b/include/linux/input/ti_am335x_tsc.h
@@ -1,5 +1,5 @@
-#ifndef __LINUX_TI_TSCADC_H
-#define __LINUX_TI_TSCADC_H
+#ifndef __LINUX_TI_AM335X_TSC_H
+#define __LINUX_TI_AM335X_TSC_H
 
 /**
  * struct tsc_data Touchscreen wire configuration
-- 
1.7.0.4


[PATCH v4 0/5] Support for TSC/ADC MFD driver

2012-09-25 Thread Patil, Rachna

This patch set adds a MFD core driver which registers
touchscreen and ADC as its client drivers.
The existing touchscreen has been modified to work as
a MFD client driver and a new ADC driver has been added
in the IIO subsystem.

There are 8 analog input lines, which can be used as:
1. 8 general purpose ADC channels
2. 4 wire TS, with 4 general purpose ADC channels
3. 5 wire TS, with 3 general purpose ADC channels
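
The split listed above, together with the step assignment described in the comment of the ADC patch (4/5), can be sketched in plain C as follows. The values 16 and 8 for TOTAL_STEPS and TOTAL_CHANNELS are taken from that comment rather than from the header, and struct step_map, adc_channels_for_wires() and adc_plan_steps() are hypothetical helpers for illustration, not part of the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOTAL_STEPS	16	/* assumed, per the comment in patch 4/5 */
#define TOTAL_CHANNELS	8	/* assumed, per the comment in patch 4/5 */

/* One planned sequencer step and the analog input it samples. */
struct step_map {
	int step;
	int channel;
};

/* ADC channels left over for a given touchscreen wiring (0 = no TS). */
int adc_channels_for_wires(int wires)
{
	if (wires != 0 && wires != 4 && wires != 5)
		return -EINVAL;
	return TOTAL_CHANNELS - wires;
}

/*
 * Same arithmetic as adc_step_config() in the patch: the ADC takes the
 * last adc_channels steps and the last adc_channels input lines.
 * out must have room for adc_channels entries.
 * Returns the number of entries written, or a negative errno.
 */
int adc_plan_steps(int adc_channels, struct step_map *out)
{
	int steps, channel, i, n = 0;

	assert(out != NULL);
	if (adc_channels < 0 || adc_channels > TOTAL_CHANNELS)
		return -EINVAL;

	steps = TOTAL_STEPS - adc_channels;
	channel = TOTAL_CHANNELS - adc_channels;

	for (i = steps + 1; i <= TOTAL_STEPS; i++) {
		out[n].step = i;
		out[n].channel = channel++;
		n++;
	}
	return n;
}
```

For a 4 wire touchscreen this leaves 4 ADC channels, which the plan maps to steps 13..16 sampling input lines 4..7.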

This patch set has been tested on AM335x EVM.

This set of patches is based on top of the Touchscreen patch
set submitted.
Subject: [PATCH 0/4] input: TSC: ti_tscadc: TI Touchscreen driver updates [1]

[1] http://www.spinics.net/lists/linux-input/msg22107.html

Changes in v2:
Dropped one patch sent in the last version after
receiving internal comments. I have merged the changes
into existing patches.
Also added a new patch to support suspend/resume feature.
Fixed review comments by Matthias Kaehlcke.

Changes in v3:
Addressed review comments by Jonathan Cameron.
Improved ADC driver with more readable labels,
spaces and comments.

Changes in v4:
Renamed the drivers from ti_xxx to ti_am335x_xxx.
For consistency with other drivers renamed idev to indio_dev
in IIO ADC driver.
Replaced suspend/resume callbacks with dev_pm_ops.

Patil, Rachna (5):
  input: TSC: ti_tscadc: Rename the existing touchscreen driver
  MFD: ti_tscadc: Add support for TI's TSC/ADC MFDevice
  input: TSC: ti_tsc: Convert TSC into a MFDevice
  IIO : ADC: tiadc: Add support of TI's ADC driver
  MFD: ti_tscadc: add suspend/resume functionality

 drivers/iio/adc/Kconfig|7 +
 drivers/iio/adc/Makefile   |1 +
 drivers/iio/adc/ti_am335x_adc.c|  258 
 drivers/input/touchscreen/Kconfig  |4 +-
 drivers/input/touchscreen/Makefile |2 +-
 .../touchscreen/{ti_tscadc.c => ti_am335x_tsc.c}   |  319 ++--
 drivers/mfd/Kconfig|9 +
 drivers/mfd/Makefile   |1 +
 drivers/mfd/ti_am335x_tscadc.c |  259 
 .../linux/input/{ti_tscadc.h => ti_am335x_tsc.h}   |4 +-
 include/linux/mfd/ti_am335x_tscadc.h   |  151 +
 include/linux/platform_data/ti_am335x_adc.h|   14 +
 12 files changed, 801 insertions(+), 228 deletions(-)
 create mode 100644 drivers/iio/adc/ti_am335x_adc.c
 rename drivers/input/touchscreen/{ti_tscadc.c => ti_am335x_tsc.c} (55%)
 create mode 100644 drivers/mfd/ti_am335x_tscadc.c
 rename include/linux/input/{ti_tscadc.h => ti_am335x_tsc.h} (88%)
 create mode 100644 include/linux/mfd/ti_am335x_tscadc.h
 create mode 100644 include/linux/platform_data/ti_am335x_adc.h


[PATCH] x86/fixup_irq: Clean the offlining CPU from the irq affinity mask

2012-09-25 Thread Chuansheng Liu


When a CPU is going offline and fixup_irqs() re-sets the irq
affinity (which it does in some cases), we should also remove the
offlining CPU from the irq affinity mask.

Keeping the offlining CPU in the affinity mask is useless.
Moreover, the smp_affinity value becomes confusing when the
offlining CPU comes back online.

Example:
For irq 93 with 4 CPUs, the default affinity is f.
In the normal case, all 4 CPUs receive irq 93 interrupts.

After echo 0 > /sys/devices/system/cpu/cpu3/online, only CPU0,1,2
receive the interrupts.

But after CPU3 is brought online again, the affinity is not set again,
so the result is:
smp_affinity is f, but still only CPU0,1,2 can receive the interrupts.

So we should remove the offlining CPU from the irq affinity mask
in fixup_irqs().

Signed-off-by: liu chuansheng 
---
 arch/x86/kernel/irq.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index d44f782..671d462 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -239,6 +239,7 @@ void fixup_irqs(void)
struct irq_desc *desc;
struct irq_data *data;
struct irq_chip *chip;
+   int cpu = smp_processor_id();
 
for_each_irq_desc(irq, desc) {
int break_affinity = 0;
@@ -271,7 +272,8 @@ void fixup_irqs(void)
if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
break_affinity = 1;
affinity = cpu_online_mask;
-   }
+   } else if (cpumask_test_cpu(cpu, data->affinity))
+   cpumask_clear_cpu(cpu, data->affinity);
 
chip = irq_data_get_irq_chip(data);
if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
-- 
1.7.0.4
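
The affinity fix-up described in the changelog above can be modelled in user space with a plain bitmask. This is only a sketch under stated assumptions: cpumask_model_t and fixup_affinity() are hypothetical names, one bit stands for one CPU, and the online mask is assumed to already exclude the CPU going offline.

```c
#include <stdint.h>

/* Hypothetical user-space model: bit n set means CPU n is in the mask. */
typedef uint32_t cpumask_model_t;

/*
 * Model of the fix-up for one irq when `cpu` goes offline.
 * `online` is assumed to already exclude `cpu`.
 * Returns the affinity the irq should end up with.
 */
static cpumask_model_t fixup_affinity(cpumask_model_t affinity,
				      cpumask_model_t online,
				      unsigned int cpu)
{
	/* No online CPU left in the mask: fall back to all online CPUs. */
	if ((affinity & online) == 0)
		return online;

	/* Otherwise drop the offlining CPU so the mask matches reality. */
	return affinity & ~((cpumask_model_t)1 << cpu);
}
```

With affinity f, online mask 7 and cpu 3, the model returns 7, which is the smp_affinity value the example in the changelog argues for.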




Re: [QUESTION] Can uprobe_event support @ADDR, $retval, offs(FETCHARG)?

2012-09-25 Thread Masami Hiramatsu

(2012/09/26 11:52), Hyeoncheol Lee wrote:
> Hi,
> 
> uprobe_event only supports %REG arguments. I think that memory fetch,
> return value fetch, memory dereference functions in
> kernel/trace/trace_probe.c are good for uprobe_event. So with a little
> modification of parse_probe_arg(), uprobe_event can support @ADDR,
> $retval, offs(FETCHARGS) except @SYM, $stack, $stackN. Is it right?

Hi Hyeoncheol,

It may not be such a small change, but at least we can try.
In userspace, memory (pages) can be paged out to swap or
files. In that case, the memory dereference function needs to track
down the data on disk, which causes I/O. This means we would
see visible performance degradation while tracing.
Also, sometimes a pointer value (address) is broken; in that
case we have to ensure the address is actually valid before
accessing it.

Of course, without tracking paged-out data, it is easy
to support, because that is already done in kprobe events.
I'm not sure how useful it would be, because it will sometimes
fail to gather the data.
However, I think it is good as a first step.

Srikar, what do you think?

BTW, if we can support offs(FETCHARGS), $stack and $stackN
are also available. ;)
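
To make the argument forms under discussion concrete, here is a small user-space sketch that classifies a fetch-argument string by its leading characters. It is not the kernel's parse_probe_arg(); the enum and function names are hypothetical, and it only looks at the prefix without validating the rest of the string.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical classification of the argument forms named in this thread. */
enum fetch_kind {
	FETCH_REG,	/* %REG */
	FETCH_ADDR,	/* @ADDR */
	FETCH_SYM,	/* @SYM */
	FETCH_RETVAL,	/* $retval */
	FETCH_STACK,	/* $stack */
	FETCH_STACKN,	/* $stackN */
	FETCH_DEREF,	/* +|-offs(FETCHARG) */
	FETCH_INVALID
};

/* Classify by prefix only; the remainder is not validated. */
static enum fetch_kind classify_fetch_arg(const char *arg)
{
	size_t len = strlen(arg);

	switch (arg[0]) {
	case '%':
		return arg[1] ? FETCH_REG : FETCH_INVALID;
	case '@':
		if (isdigit((unsigned char)arg[1]))
			return FETCH_ADDR;
		return arg[1] ? FETCH_SYM : FETCH_INVALID;
	case '$':
		if (strcmp(arg + 1, "retval") == 0)
			return FETCH_RETVAL;
		if (strncmp(arg + 1, "stack", 5) == 0) {
			if (arg[6] == '\0')
				return FETCH_STACK;
			return isdigit((unsigned char)arg[6]) ?
				FETCH_STACKN : FETCH_INVALID;
		}
		return FETCH_INVALID;
	case '+':
	case '-':
		/* offs(FETCHARG): needs an opening and a trailing paren. */
		if (strchr(arg, '(') && arg[len - 1] == ')')
			return FETCH_DEREF;
		return FETCH_INVALID;
	default:
		return FETCH_INVALID;
	}
}
```

Of these forms, only FETCH_REG corresponds to what uprobe_event is said to support at this point in the thread; the others are the candidates being discussed.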

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com


Re: [QUESTION] Can uprobe_event support @ADDR, $retval, offs(FETCHARG)?

2012-09-25 Thread Srikar Dronamraju

Hi Hyeoncheol,

> uprobe_event only supports %REG arguments. I think that memory fetch,
> return value fetch, memory dereference functions in
> kernel/trace/trace_probe.c are good for uprobe_event.

Yes, these would be good to have and are listed as a todo.

> So with a little
> modification of parse_probe_arg(), uprobe_event can support @ADDR,
> $retval, offs(FETCHARGS) except @SYM, $stack, $stackN. Is it right?

For some of these, like @SYM, trace_uprobe.c may not be the right
place because the kernel will not be able to decode user space symbols.

$retval needs uretprobes support.

-- 
Thanks and Regards
Srikar


Re: [PATCH v5 0/25] Generic Red-Black Trees (still WIP)

2012-09-25 Thread Daniel Santos

Hmm, looks like I've had some type of mailer problem as this message
didn't appear on LKML :(  I hope this one goes through, but sorry my
patches aren't properly grouped.

On 09/25/2012 06:24 PM, Daniel Santos wrote:
> First I want to apologize for not being able to work on this over most of the
> summer. I see that some other changes are happening with red-black and
> interval trees in the kernel which look good.  This patch set is based on v3.5
> and is not adjusted for many of the changes in Michel Lespinasse's patches.
> This is still WIP as I have added a good deal of new test code and made a fair
> number of performance tweaks, but I needed to get something out for review
> again to keep this thing rolling.
>
> Summary
> ===
> This patch set improves on Andrea Arcangeli's original Red-Black Tree
> implementation by adding generic search and insert functions with
> complete support for:
>
> o leftmost - keeps a pointer to the leftmost (lowest value) node cached
>   in your container struct
> o rightmost - ditto for rightmost (greatest value)
> o count - optionally update a count variable when you perform inserts
>   or deletes
> o unique or non-unique keys
> o find and insert "near" functions - when you already have a node that
>   is likely near another one you want to search for
> o augmented / interval tree support
> o type-safe wrapper interface available via pre-processor macro
>
> Outstanding Issues
> ==
> General
> ---
> o Need to change comments at head of rbtree.h.
> o Need something in Documents to explain generic rbtrees.
> o Descriptions for new KConfig values incomplete.
> o Due to a bug in gcc's optimizer, extra instructions are generated in various
>   places.  Pavel Pisa has provided me a possible work-around that should be
>   examined more closely to see if it can be worked in (Discussed in
>   Performance section).
> o Doc-comments are missing or out of date in some places for the new
>   ins_compare field of struct rb_relationship (including at least one code
>   example).
>
> Selftests
> -
> o In-kernel test module not completed, although the option to build it has
>   already been added to KConfig.
> o Userspace selftest's Makefile should run modules_prepare in KERNELDIR.
> o Validation in self-tests doesn't yet cover tests for
>   - insert_near
>   - find_{first,last,next,prev}
> o Selftest scripts need better portability.
> o It would be nice to have some fault-injection in test code to verify that
>   CONFIG_DEBUG_RBTREE and CONFIG_DEBUG_RBTREE_VALIDATE (and its
>   RB_VERIFY_INTEGRITY counterpart flag) catch the errors they are supposed to.
>
> Undecided (Opinions Requested!)
> ---
> o With the exception of the rb_node & rb_root structs, "Layer 2" of the code
>   (see below) completely abstracts away the underlying red-black tree
>   mechanism.  The structs rb_node and rb_root can also be abstracted away via
>   a typeset or some other mechanism. Thus, should the "Layer 2" code be
>   separated from "Layer 1" and renamed "Generic Tree (gtree)" or some such,
>   paving the way for an alternate tree implementation in the future?
> o Do we need RB_INSERT_DUPE_RIGHT? (see the last patch)
>
>
> Theory of Operation
> ===
> Historically, genericity in C meant function pointers, the overhead of a
> function call and the inability of the compiler to optimize code across
> the function call boundary.  GCC has been getting better and better at
> optimization and determining when a value is a compile-time constant and
> compiling it out.  As of gcc 4.6, it has finally reached a point where
> it's possible to have generic search & insert cores that optimize
> exactly as well as if they were hand-coded. (see also gcc man page:
> -findirect-inlining)
>
> This implementation actually consists of two layers written on top of the
> existing rbtree implementation.
>
> Layer 1: Type-Specific (But Not Type-Safe)
> --
> The first layer consists of enum rb_flags, struct rb_relationship and
> some generic inline functions(see patch for doc comments).
>
> enum rb_flags {
>   RB_HAS_LEFTMOST = 0x0001,
>   RB_HAS_RIGHTMOST= 0x0002,
>   RB_HAS_COUNT= 0x0004,
>   RB_UNIQUE_KEYS  = 0x0008,
>   RB_INSERT_REPLACES  = 0x0010,
>   RB_IS_AUGMENTED = 0x0040,
>   RB_VERIFY_USAGE = 0x0080,
>   RB_VERIFY_INTEGRITY = 0x0100
> };
>
> struct rb_relationship {
>   ssize_t root_offset;
>   ssize_t left_offset;
>   ssize_t right_offset;
>   ssize_t count_offset;
>   ssize_t node_offset;
>   ssize_t key_offset;
>   int flags;
>   const rb_compare_f compare; /* comparitor for lookups */
>   const rb_compare_f ins_compare; /* comparitor for inserts */
>   const rb_augment_f augment;
>   unsigned key_size;
> };
>
> /* these function for
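
The idea in the quoted "Theory of Operation" (a descriptor of offsets plus a comparison function, consumed by inline functions the compiler can specialise) can be sketched on a plain, unbalanced binary search tree. This is not the patch set's API: every bst_* name and struct item below is hypothetical, and no red-black balancing is performed.

```c
#include <stddef.h>

struct bst_node { struct bst_node *left, *right; };
struct bst_root { struct bst_node *node; };

typedef long (*bst_compare_f)(const void *a, const void *b);

/* Hypothetical, much-reduced analogue of struct rb_relationship. */
struct bst_relationship {
	ptrdiff_t node_offset;	/* offset of the embedded node in the object */
	ptrdiff_t key_offset;	/* offset of the key in the object */
	bst_compare_f compare;
};

/* Address of the key belonging to the object that embeds node n. */
static inline const void *bst_key(const struct bst_node *n,
				  const struct bst_relationship *rel)
{
	return (const char *)n - rel->node_offset + rel->key_offset;
}

/* Generic lookup: returns the containing object or NULL. */
static inline void *bst_find(struct bst_root *root, const void *key,
			     const struct bst_relationship *rel)
{
	struct bst_node *n = root->node;

	while (n) {
		long diff = rel->compare(key, bst_key(n, rel));

		if (diff < 0)
			n = n->left;
		else if (diff > 0)
			n = n->right;
		else
			return (char *)n - rel->node_offset;
	}
	return NULL;
}

/* Generic insert without rebalancing; equal keys go to the right. */
static inline void bst_insert(struct bst_root *root, void *obj,
			      const struct bst_relationship *rel)
{
	struct bst_node *node = (struct bst_node *)((char *)obj + rel->node_offset);
	const void *key = (const char *)obj + rel->key_offset;
	struct bst_node **p = &root->node;

	node->left = node->right = NULL;
	while (*p) {
		long diff = rel->compare(key, bst_key(*p, rel));

		p = diff < 0 ? &(*p)->left : &(*p)->right;
	}
	*p = node;
}

/* Example user type with an int key and an embedded node. */
struct item { int id; struct bst_node node; };

static inline long item_compare(const void *a, const void *b)
{
	return (long)*(const int *)a - (long)*(const int *)b;
}

static const struct bst_relationship item_rel = {
	.node_offset = offsetof(struct item, node),
	.key_offset = offsetof(struct item, id),
	.compare = item_compare,
};
```

Because item_rel is a compile-time constant and the helpers are inline, an optimising compiler has the chance to fold the offsets and inline the comparison, which is the effect the quoted text attributes to newer gcc versions.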

linux-next: manual merge of the arm-soc tree with the fbdev tree

2012-09-25 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the arm-soc tree got a conflict in
drivers/video/msm/mdp_hw.h between commit 8abf0b31e161 ("video: msm:
Remove useless mach/* includes") from the fbdev tree and commit
1ef21f6343ff ("ARM: msm: move platform_data definitions") from the
arm-soc tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/video/msm/mdp_hw.h
index 2a84137,a0bacf5..000
--- a/drivers/video/msm/mdp_hw.h
+++ b/drivers/video/msm/mdp_hw.h
@@@ -15,7 -15,8 +15,7 @@@
  #ifndef _MDP_HW_H_
  #define _MDP_HW_H_
  
- #include 
 -#include 
+ #include 
  
  struct mdp_info {
struct mdp_device mdp_dev;



Re: [PATCH 00/10] workqueue: restructure flush_workqueue() and start all flusher at the same time

2012-09-25 Thread Lai Jiangshan

On 09/26/2012 04:24 AM, Tejun Heo wrote:
> Hello, Lai.
> 
> On Tue, Sep 25, 2012 at 05:02:43PM +0800, Lai Jiangshan wrote:
>> It is not possible to remove cascading. If cascading code is
>> not in flush_workqueue(), it must be in some where else.
> 
> Yeah, sure, I liked that it didn't have to be done explicitly as a
> separate step.
> 
>> If you force overflow to wait for freed color before do flush(which also
>> force only one flusher for one color), and force the sole flush_workqueue()
>> to grab ->flush_mutex twice, we can simplify the flush_workqueue().
>> (see the attached patch, it remove 100 LOC, and the cascading code becomes
>> only 3 LOC). But these two forcing slow down the caller a little.
> 
> Hmmm... so, that's a lot simpler.  flush_workqueue() isn't a super-hot
> code path and I don't think grabbing mutex twice is too big a deal.  I
> haven't actually reviewed the code but if it can be much simpler and
> thus easier to understand and verify, I might go for that.

I updated it; the new version is attached. It forces flush_workqueue() to
grab the mutex twice (there is no other forcing).
The overflow queue is implemented in a different way. This new algorithm is
likely to become our choice, so please review this one.

> 
>> (And if you allow to use SRCU(which is only TWO colors), you can remove another
>> 150 LOC. flush_workqueue() will become single line. But it will add some more overhead
>> in flush_workqueue() because SRCU's readsite is lockless)
> 
> I'm not really following how SRCU would factor into this but
> supporting multiple colors was something explicitly requested by
> Linus.  The initial implementation was a lot simpler which supported
> only two colors.  Linus was worried that the high possibility of
> flusher clustering could lead to chaining of latencies.
> 

I did not know this history, thank you.

But the number of colors is not essential.
Whether the algorithm chains flushers is what is essential.

If we can have multiple flushers for each color, it is not chained.
If we have only one flusher per color, it is chained. Even with multiple
colors it is still partially chained (imagine very frequent
flush_workqueue() calls).

The initial implementation of flush_workqueue() was a "chained" algorithm.
The initial implementation of SRCU was also a "chained" algorithm,
but the current SRCU, which I implemented, is not "chained".
(I am not proposing to use SRCU for flush_workqueue(); I am just discussing it.)

The simple version of flush_workqueue() which I sent yesterday is "chained",
because it forces overflow flushers to wait for a free color and allows only
one flusher per color.

Since "not chaining" is essential, I am sending a new draft implementation today.
It uses multiple queues, one for each color (like SRCU).
This version is also simple; it removes 90 LOC.
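
The "not chained" property argued for above, several flushers sharing one color and all being released together when that color drains, can be illustrated with a single-threaded toy model. This is only a sketch and not the attached patch: NR_COLORS, struct flush_model and the model_* functions are hypothetical, and locking and completions are left out entirely.

```c
#define NR_COLORS 4	/* illustrative; not the kernel's WORK_NR_COLORS */

struct flush_model {
	int pending[NR_COLORS];		/* work items still in flight per color */
	int flushers[NR_COLORS];	/* flushers waiting on each color */
	int released;			/* flushers released so far */
};

/* A flusher arrives and waits on `color`; several may share one color. */
static void model_add_flusher(struct flush_model *m, int color)
{
	m->flushers[color]++;
}

/*
 * One work item of `color` finishes. If it was the last one, every
 * flusher queued on that color is released together.
 * Returns the number of flushers released by this call.
 */
static int model_work_done(struct flush_model *m, int color)
{
	int n = 0;

	if (--m->pending[color] == 0) {
		n = m->flushers[color];
		m->flushers[color] = 0;
		m->released += n;
	}
	return n;
}
```

In this model, three flushers queued on a color with two pending work items are all released by the second model_work_done() call, without any of them waiting for another flusher.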

Thanks,
Lai

This patch still applies on top of patch 7. It replaces patches 8~10.

 workqueue.c |  152
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index be407e1..00f02ba 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -208,7 +208,6 @@ struct cpu_workqueue_struct {
  */
 struct wq_flusher {
 	struct list_head	list;		/* F: list of flushers */
-	int			flush_color;	/* F: flush color waiting for */
 	struct completion	done;		/* flush completion */
 };
 
@@ -250,9 +249,7 @@ struct workqueue_struct {
 	int			work_color;	/* F: current work color */
 	int			flush_color;	/* F: current flush color */
 	atomic_t		nr_cwqs_to_flush[WORK_NR_COLORS];
-	struct wq_flusher	*first_flusher;	/* F: first flusher */
-	struct list_head	flusher_queue;	/* F: flush waiters */
-	struct list_head	flusher_overflow; /* F: flush overflow list */
+	struct list_head	flusher[WORK_NR_COLORS]; /* F: flushers */
 
 	mayday_mask_t		mayday_mask;	/* cpus requesting rescue */
 	struct worker		*rescuer;	/* I: rescue worker */
@@ -1000,8 +997,11 @@ static void wq_dec_flusher_ref(struct workqueue_struct *wq, int color)
 	 * If this was the last reference, wake up the first flusher.
 	 * It will handle the rest.
 	 */
-	if (atomic_dec_and_test(&wq->nr_cwqs_to_flush[color]))
-		complete(&wq->first_flusher->done);
+	if (atomic_dec_and_test(&wq->nr_cwqs_to_flush[color])) {
+		BUG_ON(color != wq->flush_color);
+		complete(&list_first_entry(&wq->flusher[color],
+					   struct wq_flusher, list)->done);
+	}
 }
 
 /**
@@ -2540,27 +2540,20 @@ static void insert_wq_barrier(struct cpu_workqueue_struct *cwq,
  * becomes new flush_color and work_color is advanced by one.
  * All cwq's work_color are set to new work_color(advanced by one).
  *
- * The caller should have initialized @wq->first_flusher prior to
- * calling this function.
- *

[RFC GIT PULL rcu/next] v2 RCU commits for 3.7

2012-09-25 Thread Paul E. McKenney

Hello, Ingo,

This is now an RFC pull for only one of the previous two reasons.
The commits have now been through -next without the adaptive-idle patches,
but although good progress has been made on the adaptive-idle patches,
they are still not quite there yet.  Again, the current adaptive-idle
series has been rebased as rcu/idle on top of rcu/next, the latter branch
being the subject of this pull request.  Also, I have merged the
conflicting -tip branches into rcu/next, resolving the conflicts.
(One set was adjacent insertions/deletions, the other was resolved
as Peter Zijlstra described at https://lkml.org/lkml/2012/9/5/585.)

Again, the major features of this series are:

0.  A fix for a latent bug that has been in RCU ever since the
addition of CPU stall warnings.  This bug results in
false-positive stall warnings, but thus far only on embedded
systems with severely cut-down userspace configurations.
This fix is located on an rcu/urgent branch, with the rest
of the commits based on top of it.  This commit CCs stable.
Given that the merge window is coming quite soon and given
the small number of affected users, I do -not- recommend
pushing it to 3.6, but the separate branch makes it easy to
find if someone needs it.

1.  Further reductions in latency spikes for huge systems, along
with additional boot-time adaptation to the actual hardware.
This is a large change, as it moves RCU grace-period
initialization and cleanup, along with quiescent-state forcing,
from softirq to a kthread.  However, it appears to be in
quite good shape (famous last words).  Posted to LKML at
https://lkml.org/lkml/2012/9/20/427.

2.  Updates to documentation and rcutorture, the latter category
including keeping statistics on CPU-hotplug latencies and
fixing some initialization-time races.  Posted to LKML at
https://lkml.org/lkml/2012/8/30/193.

3.  Miscellaneous fixes and improvements, posted to LKML at
https://lkml.org/lkml/2012/8/30/199.

4.  CPU-hotplug fixes and improvements, posted to LKML at
https://lkml.org/lkml/2012/8/30/292 for first three and at
https://lkml.org/lkml/2012/8/3/416.

5.  Idle-loop fixes that were omitted on an earlier submission,
posted to LKML at https://lkml.org/lkml/2012/8/30/251.

As noted earlier, all of these commits have been exposed to -next testing,
albeit in combination with adaptive-idle commits that will hopefully be
following soon.

These changes are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next

Thanx, Paul

Dimitri Sivanich (1):
  rcu: Segregate rcu_state fields to improve cache locality

Frederic Weisbecker (11):
  alpha: Fix preemption handling in idle loop
  alpha: Add missing RCU idle APIs on idle loop
  cris: Add missing RCU idle APIs on idle loop
  frv: Add missing RCU idle APIs on idle loop
  h8300: Add missing RCU idle APIs on idle loop
  m32r: Add missing RCU idle APIs on idle loop
  m68k: Add missing RCU idle APIs on idle loop
  mn10300: Add missing RCU idle APIs on idle loop
  parisc: Add missing RCU idle APIs on idle loop
  score: Add missing RCU idle APIs on idle loop
  xtensa: Add missing RCU idle APIs on idle loop

Li Zhong (1):
  rcu: Move TINY_RCU quiescent state out of extended quiescent state

Michael Wang (1):
  kmemleak: Replace list_for_each_continue_rcu with new interface

Paul E. McKenney (46):
  rcu: Move RCU grace-period initialization into a kthread
  rcu: Prevent initialization-time quiescent-state race
  rcu: Allow RCU grace-period initialization to be preempted
  rcu: Move RCU grace-period cleanup into kthread
  rcu: Allow RCU grace-period cleanup to be preempted
  rcu: Break up rcu_gp_kthread() into subfunctions
  rcu: Prevent offline CPUs from executing RCU core code
  rcu: Provide OOM handler to motivate lazy RCU callbacks
  rcu: Move quiescent-state forcing into kthread
  rcu: Allow RCU quiescent-state forcing to be preempted
  rcu: Adjust debugfs tracing for kthread-based quiescent-state forcing
  rcu: Prevent force_quiescent_state() memory contention
  rcu: Control grace-period duration from sysfs
  rcu: Make rcutree module parameters visible in sysfs
  rcu: Fix day-zero grace-period initialization/cleanup race
  rcu: Add random PROVE_RCU_DELAY to grace-period initialization
  rcu: Adjust for unconditional ->completed assignment
  rcu: Eliminate signed overflow in synchronize_rcu_expedited()
  rcu: Reduce synchronize_rcu_expedited() latency
  rcu: Simplify quiescent-state detection
  rcu: Handle unbalanced rcu_node configurations with few CPUs
  rcu: Shrink RCU based on

Re: XHCI Bug discovered in 3.6-RC6 (solution included)

2012-09-25 Thread Vivek Gautam

Apologies for that. I should have been a bit more patient.

Best regards
Vivek

On Tue, Sep 25, 2012 at 7:10 PM, Greg KH  wrote:
>
> A: No.
> Q: Should I include quotations after my reply?
>
> http://daringfireball.net/2007/07/on_top
>
> On Tue, Sep 25, 2012 at 03:12:33PM +0530, Vivek Gautam wrote:
>> Hi
>>
>> I have posted the required patch for this:
>> usb: host: xhci: Fix Null pointer dereferencing with 71c731a for non-x86 systems
>
> You sent this last saturday, a mere 3 days ago, two of them on a
> weekend, please be patient.
>
> greg k-h

RE: [PATCH v2] pwm_backlight: Add device tree support for Low Threshold Brightness

2012-09-25 Thread Philip, Avinash

On Tue, Sep 25, 2012 at 11:49:14, Stephen Warren wrote:
> On 09/24/2012 10:29 PM, Philip, Avinash wrote:
> > On Fri, Sep 21, 2012 at 23:13:39, Stephen Warren wrote:
> >> On 09/21/2012 12:03 AM, Philip, Avinash wrote:
> >>> Hi Stephen,
> >>>
> >>> On Fri, Sep 21, 2012 at 10:46:45, Stephen Warren wrote:
>  On 09/20/2012 10:51 PM, Philip, Avinash wrote:
> > Some backlights perform poorly when driven by a PWM with a short
> > duty-cycle. For such devices, the low threshold can be used to specify a
> > lower bound for the duty-cycle and should be chosen to exclude the
> > problematic range.
> >
> > This patch adds support for an optional low-threshold-brightness
> > property.
> 
> > diff --git 
> > a/Documentation/devicetree/bindings/video/backlight/pwm-backlight.txt 
> > b/Documentation/devicetree/bindings/video/backlight/pwm-backlight.txt
> 
> >  Optional properties:
> >- pwm-names: a list of names for the PWM devices specified in the
> > "pwms" property (see PWM binding[0])
> > +  - low-threshold-brightness: brightness threshold low level. Low 
> > threshold
> > +brightness set to value so that backlight present on low end of
> > +brightness.
> 
>  For my education, why not just specify values above this value in the
>  brightness-levels array; how do those two interact?
> >>>
> >>> Please find details from 
> >>> https://lkml.org/lkml/2012/7/18/284
> >>
> >> Hmm. That still doesn't really explain what this property does.
> >>
> >> I'm going to guess that if this property is present, and values in the
> >> brightness-levels property get scaled between the
> >> low-threshold-brightness and 255 instead of being used directly.
> > 
> > This is correct.
> > 
> >> But then, in the email you linked to, what does "But brightness-levels
> >> won't be uniformly divided" mean?
> > 
> > For some panels, backlight would absent on low end of brightness due to low
> > percentage in duty_cycle. Consider following example where backlight absent
> > for brightness levels from 0 - 51.
> > 
> > pwms = < 0 5>;
> > brightness-levels = <0 51 53 56 62 75 101 152 255>; 
> > default-brightness-level = <6>;
> > 
> > So in the example, brightness-levels are set to have values for backlight present.
> > Here levels are not uniformly divided.
> 
> So why not just change the values so they /are/ what you want? After
> all, it's just data and you can put whatever values you want there. What
> is preventing you from doing this?

brightness_threshold_level was added to make use of the lth_brightness support
already present in the non-DT case.
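
The scaling Stephen guessed at and I confirmed above, where levels are mapped into the range between the low threshold and the maximum instead of being used directly, can be sketched as follows. This is a minimal sketch and not the driver code: scale_duty_ns() and its parameter names are hypothetical, the threshold is assumed to be already converted to nanoseconds of duty cycle, and any special handling of level 0 is ignored.

```c
/*
 * Map a brightness level in [0, max] to a PWM duty cycle in ns,
 * compressing the range into [lth_ns, period_ns].
 * Assumes max > 0 and lth_ns <= period_ns.
 */
static unsigned int scale_duty_ns(unsigned int level, unsigned int max,
				  unsigned int lth_ns, unsigned int period_ns)
{
	/* Widen before multiplying so level * range cannot overflow. */
	unsigned long long range = period_ns - lth_ns;

	return lth_ns + (unsigned int)(level * range / max);
}
```

With a low threshold of zero this reduces to the plain proportional mapping, so level 51 of 255 on a 5000000 ns period yields a 1000000 ns duty cycle.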
 
> 
> Perhaps e.g.:
> 
> brightness-levels = <0 101 106 112 124 150 202 304 511>;
> (just multiplying everything by N, for arbitrary N=2, to get extra
> precision)
> 
> ... plus whatever adjustments are required to make the data "uniformly
> divided", which I can't do in the example here since I'd need to know
> whatever non-linear equation characterizes the backlight's PWM % duty
> cycle to perceived brightness mapping.
> 
> The only thing that could be preventing this is mathematical precision.
> While all the PWM DT examples I've seen have brightness-levels range
> from 0..255, I don't think there is any such actual limit; you could
> range from say 0..100 if you wanted, right?

The observation is correct. There are no fixed levels; these values can be
configured as required. They form the scale on which brightness is divided.
A linear division of brightness won't give much perceptible difference at the
high end of the brightness-levels scale, so adopting a binary division in
brightness-levels allows better resolution in brightness.

> 
> >> Either way, the DT binding should explain exactly what this value is
> >> used for, and how it affects the interpretation of values in
> >> brightness-levels.
> > 
> > Is DT binding documentation a good place to explain this feature?
> > Initially Thierry suggested document option. So I left out.
> 
> The binding documents are supposed to be a standalone description of
> what the data in DT does; given general no-Linux-specific domain
> knowledge, the binding document should be detailed enough for someone to
> understand how to fill in the DT. So, yes, I think the binding document
> would be a great place to put such documentation.

I will add details.

Thanks
Avinash

> 


Re: INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=10002 jiffies)

2012-09-25 Thread Paul E. McKenney

On Wed, Sep 26, 2012 at 12:22:37PM +0800, Fengguang Wu wrote:
> On Tue, Sep 25, 2012 at 08:07:01AM -0700, Paul E. McKenney wrote:
> > On Tue, Sep 25, 2012 at 07:19:38PM +0800, Fengguang Wu wrote:
> > > Hi Paul,
> > > 
> > > I've just bisected down one RCU stall problem:
> > > 
> > > [   12.035785] pktgen: Packet Generator for packet performance testing. 
> > > Version: 2.74
> > > [   12.435439] atkbd: probe of serio0 rejects match -19
> > > [  111.700160] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} 
> > > (detected by 0, t=10002 jiffies)
> > > [  111.700171] Pid: 0, comm: swapper/0 Not tainted 
> > > 3.6.0-rc5-4-gda10491 #1
> > > [  111.700178] Call Trace:
> > > [  111.700475]  [] rcu_check_callbacks+0x544/0x570
> > > [  111.700538]  [] update_process_times+0x36/0x70
> > > [  111.700547]  [] tick_sched_timer+0x57/0xc0
> > > [  111.700552]  [] __run_hrtimer.isra.31+0x4a/0xc0
> > > [  111.700557]  [] ? tick_nohz_handler+0xf0/0xf0
> > > [  111.700559]  [] hrtimer_interrupt+0xf5/0x290
> > > [  111.700562]  [] ? sched_clock_idle_wakeup_event+0x18/0x20
> > > [  111.700565]  [] ? tick_nohz_stop_idle+0x39/0x40
> > > [  111.700572]  [] smp_apic_timer_interrupt+0x4f/0x80
> > > [  111.700587]  [] apic_timer_interrupt+0x2a/0x30
> > > [  111.700593]  [] ? native_safe_halt+0x5/0x10
> > > [  111.700599]  [] default_idle+0x29/0x50
> > > [  111.700601]  [] cpu_idle+0x68/0xb0
> > > [  111.700609]  [] rest_init+0x67/0x70
> > > [  111.700627]  [] start_kernel+0x2ea/0x2f0
> > > [  111.700629]  [] ? repair_env_string+0x51/0x51
> > > [  111.700631]  [] i386_start_kernel+0x78/0x7d
> > > [  127.040302] bus: 'serio': driver_probe_device: matched device serio0 
> > > with driver atkbd
> > > [  127.041308] CPA self-test:
> > > 
> > > to this commit:
> > > 
> > > commit 06ae115a1d551cd952d80df06eaf8b5153351875
> > > Author: Paul E. McKenney 
> > > Date:   Sun Aug 14 15:56:54 2011 -0700
> > > 
> > > rcu: Avoid having just-onlined CPU resched itself when RCU is idle
> > 
> > Interesting.  Of course the stack is from the CPU that detected the
> > problem rather than the problematic CPU.  ;-)
> > 
> > Could you please try the following patch?
> 
> Paul, thanks for the quick fix! However it may still stall sometimes.
> Attached are 3 dmesgs with the stalls.

Do you have c96ea7cf from -rcu applied?  Corresponding patch is below.

Thanx, Paul



rcu: Avoid spurious RCU CPU stall warnings

If a given CPU avoids the idle loop but also avoids starting a new
RCU grace period for a full minute, RCU can issue spurious RCU CPU
stall warnings.  This commit fixes this issue by adding a check for
ongoing grace period to avoid these spurious stall warnings.

Reported-by: Becky Bruce 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
Reviewed-by: Josh Triplett 

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 2cf8eb3..98f2752 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -819,7 +819,8 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
j = ACCESS_ONCE(jiffies);
js = ACCESS_ONCE(rsp->jiffies_stall);
rnp = rdp->mynode;
-   if ((ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
+   if (rcu_gp_in_progress(rsp) &&
+   (ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
 
/* We haven't checked in, so go dump stack. */
print_cpu_stall(rsp);
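
The condition changed by the patch above can be exercised in user space with a reduced model. This is a sketch under stated assumptions: struct stall_model, gp_in_progress() and should_warn() are hypothetical stand-ins for the rcu_state, rcu_node and rcu_data fields the real check reads, and the ACCESS_ONCE() annotations are omitted.

```c
#include <limits.h>
#include <stdbool.h>

/* Wraparound-safe "a >= b" for free-running unsigned long counters. */
#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))

/* Hypothetical, reduced stand-in for the fields the check reads. */
struct stall_model {
	unsigned long gpnum;		/* most recently started grace period */
	unsigned long completed;	/* most recently completed grace period */
	unsigned long qsmask;		/* CPUs still owing a quiescent state */
	unsigned long grpmask;		/* this CPU's bit in qsmask */
	unsigned long jiffies_stall;	/* time at which to start complaining */
};

/* A grace period is in progress when one was started but not completed. */
static bool gp_in_progress(const struct stall_model *m)
{
	return m->completed != m->gpnum;
}

/* Warn only if a grace period is actually waiting on this CPU. */
static bool should_warn(const struct stall_model *m, unsigned long now)
{
	return gp_in_progress(m) &&
	       (m->qsmask & m->grpmask) &&
	       ULONG_CMP_GE(now, m->jiffies_stall);
}
```

Without the gp_in_progress() term, a stale qsmask bit and an old jiffies_stall value are enough to trigger a warning even though no grace period is waiting, which is the spurious case the changelog describes.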


Re: [PATCH 2/3] workqueue: create __flush_delayed_work to avoid duplicating code

2012-09-25 Thread Viresh Kumar

On 25 September 2012 23:17, Tejun Heo  wrote:
> On Tue, Sep 25, 2012 at 04:06:07PM +0530, Viresh Kumar wrote:
>> flush_delayed_work() and flush_delayed_work_sync() had major portion of code
>> similar. This patch introduces another routine __flush_delayed_work() which
>> contains the common part to avoid code duplication.
>
> This part has seen a lot of update in pending wq/for-3.7 branch.
> Please rebase on top of that.
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-3.7

So this patch is not required anymore, as those changes are already merged :)

Re: [RFC v2 05/10] vfs: introduce one hash table

2012-09-25 Thread Zhi Yong Wu

On Tue, Sep 25, 2012 at 5:54 PM, Ram Pai  wrote:
> On Sun, Sep 23, 2012 at 08:56:30PM +0800, zwu.ker...@gmail.com wrote:
>> From: Zhi Yong Wu 
>>
>>   Adds a hash table structure which contains
>> a lot of hash list and is used to efficiently
>> look up the data temperature of a file or its
>> ranges.
>>   In each hash list of hash table, the hash node
>> will keep track of temperature info.
>>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>>  fs/hot_tracking.c|   77 
>> -
>>  include/linux/hot_tracking.h |   35 +++
>>  2 files changed, 110 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/hot_tracking.c b/fs/hot_tracking.c
>> index fa89f70..5f96442 100644
>> --- a/fs/hot_tracking.c
>> +++ b/fs/hot_tracking.c
>> @@ -16,6 +16,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -24,6 +25,9 @@
>
> ...snip...
>
>> +/* Hash list heads for hot hash table */
>> +struct hot_hash_head {
>> + struct hlist_head hashhead;
>> + rwlock_t rwlock;
>> + u32 temperature;
>> +};
>> +
>> +/* Nodes stored in each hash list of hash table */
>> +struct hot_hash_node {
>> + struct hlist_node hashnode;
>> + struct list_head node;
>> + struct hot_freq_data *hot_freq_data;
>> + struct hot_hash_head *hlist;
>> + spinlock_t lock; /* protects hlist */
>> +
>> + /*
>> +  * number of references to this node
>> +  * equals 1 (hashlist entry)
>> +  */
>> + struct kref refs;
>> +};
>
> Dont see why you need yet another datastructure to hash the inode_item
> and the range_item into a hash list.  You can just add another
With such a structure, we can quickly find which hash bucket an
inode_item is currently linked to via its hlist field.
This is useful when the inode_item for a file needs to move from
one bucket to another as the file gets colder or hotter.
Does that make sense to you?

> hlist_node in the inode_item and range_item. This field can be then used
> to link into the corresponding hash list.
>
> You can use the container_of() get to the inode_item or the range_item
> using the hlist_node field.
>
> You can thus eliminate a lot of code.
As I said above, I don't think it can eliminate a lot of code.
>
>> +
>>  /* An item representing an inode and its access frequency */
>>  struct hot_inode_item {
>>   /* node for hot_inode_tree rb_tree */
>> @@ -68,6 +93,8 @@ struct hot_inode_item {
>>   spinlock_t lock;
>>   /* prevents kfree */
>>   struct kref refs;
>> + /* hashlist node for this inode */
>> + struct hot_hash_node *heat_node;
>
> this can be just
> struct hlist_node head_node; /* lookup hot_inode hash list */
>
> Use this field to link it into the corresponding hashlist.
>
>>  };
>>
> this can be just
>>  /*
>> @@ -91,6 +118,8 @@ struct hot_range_item {
>>   spinlock_t lock;
>>   /* prevents kfree */
>>   struct kref refs;
>> + /* hashlist node for this range */
>> + struct hot_hash_node *heat_node;
>
> this can be just
> struct hlist_node head_node; /* lookup hot_range hash list */
>
>
>>  };
>>
>>  struct hot_info {
>> @@ -98,6 +127,12 @@ struct hot_info {
>>
>>   /* red-black tree that keeps track of fs-wide hot data */
>>   struct hot_inode_tree hot_inode_tree;
>> +
>> + /* hash map of inode temperature */
>> + struct hot_hash_head heat_inode_hl[HEAT_HASH_SIZE];
>> +
>> + /* hash map of range temperature */
>> + struct hot_hash_head heat_range_hl[HEAT_HASH_SIZE];
>>  };
>>
>>  #endif  /* _LINUX_HOTTRACK_H */
>
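To make the two designs discussed above easier to compare, here is a minimal userspace sketch of the embedded-node approach Ram describes. Every name in it (demo_node, demo_inode_item, demo_bucket, and the local container_of() macro) is a hypothetical stand-in, not the hot_tracking code or the kernel's hlist API. It keeps no back-pointer from an item to its bucket, which is the part Zhi Yong argues is needed when an item moves between buckets.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); illustration only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical singly linked node, standing in for struct hlist_node. */
struct demo_node {
	struct demo_node *next;
};

/* Hypothetical item with the list node embedded directly in it. */
struct demo_inode_item {
	unsigned long ino;
	unsigned int temperature;
	struct demo_node heat_node;	/* links the item into one bucket */
};

/* Hypothetical bucket head, standing in for struct hot_hash_head. */
struct demo_bucket {
	struct demo_node *first;
};

/* Push an item onto the front of a bucket. */
static void demo_bucket_add(struct demo_bucket *b, struct demo_inode_item *it)
{
	it->heat_node.next = b->first;
	b->first = &it->heat_node;
}

/* Walk a bucket and recover each item from its embedded node. */
static struct demo_inode_item *demo_bucket_find(struct demo_bucket *b,
						unsigned long ino)
{
	struct demo_node *n;

	for (n = b->first; n; n = n->next) {
		struct demo_inode_item *it =
			container_of(n, struct demo_inode_item, heat_node);

		if (it->ino == ino)
			return it;
	}
	return NULL;
}
```

With this layout no separate node allocation is needed, but finding the bucket an item currently sits in requires either recomputing it from the temperature or storing an extra pointer in the item.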



-- 
Regards,

Zhi Yong Wu

Re: [PATCH] pagemap: fix wrong KPF_THP on slab pages

2012-09-25 Thread Naoya Horiguchi

On Wed, Sep 26, 2012 at 10:47:54AM +0800, Fengguang Wu wrote:
> On Tue, Sep 25, 2012 at 10:06:08PM -0400, Naoya Horiguchi wrote:
> > On Tue, Sep 25, 2012 at 05:20:48PM -0700, David Rientjes wrote:
> > > On Tue, 25 Sep 2012, Naoya Horiguchi wrote:
> > > 
> > > > KPF_THP can be set on non-huge compound pages like slab pages, because
> > > > PageTransCompound only sees PG_head and PG_tail. Obviously this is a bug
> > > > and breaks user space applications which look for thp via 
> > > > /proc/kpageflags.
> > > > Currently thp is constructed only on anonymous pages, so this patch 
> > > > makes
> > > > KPF_THP be set when both of PageAnon and PageTransCompound are true.
> > > > 
> > > > Changelog in v2:
> > > >   - add a comment in code
> > > > 
> > > > Signed-off-by: Naoya Horiguchi 
> > > 
> > > Wouldn't PageTransCompound(page) && !PageHuge(page) && !PageSlab(page) be 
> > > better for a future extension of thp support?
> > 
> > Yes, this saves us an additional change when thp starts handling pagecaches.
> > Andrew, can you replace the previous version in -mm tree with new one below?
> > 
> > Thanks,
> > Naoya
> > ---
> > From: Naoya Horiguchi 
> > Date: Tue, 25 Sep 2012 21:30:25 -0400
> > Subject: [PATCH v3] kpageflags: fix wrong KPF_THP on slab pages
> > 
> > KPF_THP can be set on non-huge compound pages like slab pages, because
> > PageTransCompound only sees PG_head and PG_tail. Obviously this is a bug
> 
> s/sees/checks/
> 
> > and breaks user space applications which look for thp via /proc/kpageflags.
> > This patch rules out setting KPF_THP wrongly by additional PageSlab check.
> > 
> > Changelog in v3:
> >   - check PageSlab instead of PageAnon
> >   - fix patch subject
> > 
> > Changelog in v2:
> >   - add a comment in code
> > 
> > Signed-off-by: Naoya Horiguchi 
> > ---
> >  fs/proc/page.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/proc/page.c b/fs/proc/page.c
> > index 7fcd0d6..e36d1f3 100644
> > --- a/fs/proc/page.c
> > +++ b/fs/proc/page.c
> > @@ -115,7 +115,12 @@ u64 stable_page_flags(struct page *page)
> > u |= 1 << KPF_COMPOUND_TAIL;
> > if (PageHuge(page))
> > u |= 1 << KPF_HUGE;
> > -   else if (PageTransCompound(page))
> > +   /*
> > +* PageTransCompound can be true for slab pages because it just sees
> 
> s/sees/checks/
> 
> > +* PG_head/PG_head, so we need to check PageSlab to make sure the given
> 
> PG_head/PG_head should be PG_head/PG_tail.

Ah, sorry for my carelessness.

> > +* page is a thp, not a non-huge compound page.
> > +*/
> > +   else if (PageTransCompound(page) && !PageSlab(page))
> > u |= 1 << KPF_THP;
> 
> Good catch!
> 
> Will this report THP for the various drivers that do __GFP_COMP
> page allocations?

I'm afraid it will. I considered checking PageLRU as an alternative,
but that needs compound_head() to report tail pages correctly.
In this context, pages are not pinned or locked, so it's unsafe to
use compound_head() because it can return a dangling pointer.
That may be a thp/hugetlbfs problem rather than a kpageflags-specific
one, so one option is to go ahead with compound_head() and expect it
to be fixed in future work.

Thanks,
Naoya
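
As a rough illustration of the v3 check discussed in this thread, the sketch below models the flag decision in plain userspace C. The DEMO_PG_* and DEMO_KPF_* values are hypothetical and do not match the kernel's page flag or kpageflags bit layout; only the shape of the condition follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page-state bits; not the kernel's PG_* layout. */
enum {
	DEMO_PG_HEAD = 1u << 0,
	DEMO_PG_TAIL = 1u << 1,
	DEMO_PG_SLAB = 1u << 2,
	DEMO_PG_HUGE = 1u << 3,	/* stands in for a hugetlbfs page */
};

/* Hypothetical report bit positions; not the real KPF_* numbers. */
enum {
	DEMO_KPF_HUGE = 0,
	DEMO_KPF_THP = 1,
};

/* Like PageTransCompound(): only looks at the head/tail bits. */
static bool demo_trans_compound(unsigned int flags)
{
	return (flags & (DEMO_PG_HEAD | DEMO_PG_TAIL)) != 0;
}

/* Model of the v3 condition: compound, not hugetlbfs, not slab. */
static uint64_t demo_stable_flags(unsigned int flags)
{
	uint64_t u = 0;

	if (flags & DEMO_PG_HUGE)
		u |= UINT64_C(1) << DEMO_KPF_HUGE;
	else if (demo_trans_compound(flags) && !(flags & DEMO_PG_SLAB))
		u |= UINT64_C(1) << DEMO_KPF_THP;
	return u;
}
```

In this model a compound page that is neither slab nor hugetlbfs, such as a driver's __GFP_COMP allocation, is still reported as thp, which is the remaining case raised above.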

Re: [PATCHv1 3/6] dmaengine: dw_dmac: Add PCI part of the driver

2012-09-25 Thread viresh kumar

On Tue, Sep 25, 2012 at 5:43 PM, Andy Shevchenko
 wrote:
> diff --git a/drivers/dma/dw_dmac_at32.c b/drivers/dma/dw_dmac_at32.c
> index 7bc7ac4..5c9180e 100644
> --- a/drivers/dma/dw_dmac_at32.c
> +++ b/drivers/dma/dw_dmac_at32.c
> @@ -12,6 +12,7 @@
>   * it under the terms of the GNU General Public License version 2 as
>   * published by the Free Software Foundation.
>   */
> +

:(

>  #include 
>  #include 
>  #include 
> diff --git a/drivers/dma/dw_dmac_pci.c b/drivers/dma/dw_dmac_pci.c
> new file mode 100644
> index 000..7490894
> --- /dev/null
> +++ b/drivers/dma/dw_dmac_pci.c
> @@ -0,0 +1,130 @@
> +/*
> + * PCI driver for the Synopsys DesignWare DMA Controller
> + *
> + * Copyright (C) 2012 Intel Corporation
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DW_DRIVER_NAME "dw_dmac_pci"
> +
> +#define DRIVER(_is_private, _chan_order, _chan_pri)\
> +   ((kernel_ulong_t)&(struct dw_dma_platform_data) {   \
> +   .is_private = (_is_private),\
> +   .chan_allocation_order = (_chan_order), \
> +   .chan_priority = (_chan_pri),   \
> +   })

See if you can align "\" at the end of every line in one column

> +
> +static int __devinit dw_pci_probe(struct pci_dev *pdev,
> + const struct pci_device_id *id)
> +{
> +   struct platform_device  *pd;

No need for multiple spaces before *pd.

> +   struct resource r[2];
> +   struct dw_dma_platform_data *driver = (void *)id->driver_data;
> +   static int  instance;
> +   int ret;

for all above lines too

> +
> +   ret = pci_enable_device(pdev);
> +   if (ret)
> +   return ret;
> +
> +   pci_set_power_state(pdev, PCI_D0);
> +   pci_set_master(pdev);
> +   pci_try_set_mwi(pdev);
> +
> +   ret = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
> +   if (ret)
> +   goto err0;
> +
> +   ret = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
> +   if (ret)
> +   goto err0;
> +
> +   pd = platform_device_alloc("dw_dmac", instance);
> +   if (!pd) {
> +   dev_err(&pdev->dev, "can't allocate dw_dmac platform device\n");
> +   ret = -ENOMEM;
> +   goto err0;
> +   }
> +
> +   memset(r, 0, sizeof(r));
> +
> +   r[0].start  = pci_resource_start(pdev, 0);
> +   r[0].end= pci_resource_end(pdev, 0);
> +   r[0].flags  = IORESOURCE_MEM;

ditto

> +
> +   r[1].start  = pdev->irq;
> +   r[1].flags  = IORESOURCE_IRQ;

ditto

> +   ret = platform_device_add_resources(pd, r, ARRAY_SIZE(r));
> +   if (ret) {
> +   dev_err(&pdev->dev, "can't add resources to platform device\n");
> +   goto err1;
> +   }
> +
> +   ret = platform_device_add_data(pd, driver, sizeof(*driver));
> +   if (ret)
> +   goto err1;
> +
> +   dma_set_coherent_mask(&pd->dev, pdev->dev.coherent_dma_mask);
> +   pd->dev.dma_mask = pdev->dev.dma_mask;
> +   pd->dev.dma_parms = pdev->dev.dma_parms;
> +   pd->dev.parent = &pdev->dev;
> +
> +   pci_set_drvdata(pdev, pd);
> +
> +   ret = platform_device_add(pd);
> +   if (ret) {
> +   dev_err(&pdev->dev, "platform_device_add failed\n");
> +   goto err1;
> +   }
> +
> +   instance++;
> +   return 0;
> +
> +err1:
> +   pci_set_drvdata(pdev, NULL);

Is this required?

> +   platform_device_put(pd);
> +err0:
> +   pci_disable_device(pdev);
> +
> +   return ret;
> +}
> +
> +static void __devexit dw_pci_remove(struct pci_dev *pdev)
> +{
> +   struct platform_device *pd = pci_get_drvdata(pdev);
> +
> +   platform_device_unregister(pd);
> +   pci_set_drvdata(pdev, NULL);
> +   pci_disable_device(pdev);
> +}
> +
> +static DEFINE_PCI_DEVICE_TABLE(dw_pci_id_table) = {
> +   { PCI_VDEVICE(INTEL, 0x0827), DRIVER(1, 0, 0) },
> +   { PCI_VDEVICE(INTEL, 0x0830), DRIVER(1, 0, 0) },
> +   { PCI_VDEVICE(INTEL, 0x0f06), DRIVER(1, 0, 0) },
> +   { 0, }
> +};
> +MODULE_DEVICE_TABLE(pci, dw_pci_id_table);
> +
> +static struct pci_driver dw_pci_driver = {
> +   .name   = DW_DRIVER_NAME,
> +   .id_table   = dw_pci_id_table,
> +   .probe  = dw_pci_probe,
> +   .remove = __devexit_p(dw_pci_remove),
> +};
> +
> +module_pci_driver(dw_pci_driver);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("DesignWare DMAC PCI driver");
> +MODULE_AUTHOR("Heikki Krogerus ");
> +MODULE_AUTHOR("Andy Shevchenko ");
> --
> 1.7.10.4
>

Re: [PATCHv1 2/6] dmaengine: dw_dmac: add driver for Atmel AT32

2012-09-25 Thread viresh kumar

On Tue, Sep 25, 2012 at 5:43 PM, Andy Shevchenko
 wrote:
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index df32537..e47cc90 100644
> --- a/drivers/dma/Kconfig
> +++ b/drivers/dma/Kconfig
> @@ -89,6 +89,15 @@ config DW_DMAC
>   Support the Synopsys DesignWare AHB DMA controller.  This
>   can be integrated in chips such as the Atmel AT32ap7000.
>
> +config DW_DMAC_AT32
> +   tristate "Synopsys DesignWare AHB DMA support for Atmel"
> +   depends on HAVE_CLK
> +   select DW_DMAC
> +   default y if CPU_AT32AP7000

Also, remove "default y if CPU_AT32AP7000" from config DW_DMAC.

[patch -mm] mm, numa: reclaim from all nodes within reclaim distance fix fix

2012-09-25 Thread David Rientjes

It's cleaner if the iteration is explicitly done only for NUMA kernels.  
No functional change.

Intended to be folded into 
mm-numa-reclaim-from-all-nodes-within-reclaim-distance.patch already in 
-mm.

Signed-off-by: David Rientjes 
---
 mm/page_alloc.c |   24 
 1 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1802,6 +1802,17 @@ static bool zone_allows_reclaim(struct zone *local_zone, 
struct zone *zone)
return node_isset(local_zone->node, zone->zone_pgdat->reclaim_nodes);
 }
 
+static void __paginginit init_zone_allows_reclaim(int nid)
+{
+   int i;
+
+   for_each_online_node(i)
+   if (node_distance(nid, i) <= RECLAIM_DISTANCE) {
+   node_set(i, NODE_DATA(nid)->reclaim_nodes);
+   zone_reclaim_mode = 1;
+   }
+}
+
 #else  /* CONFIG_NUMA */
 
 static nodemask_t *zlc_setup(struct zonelist *zonelist, int alloc_flags)
@@ -1827,6 +1838,10 @@ static bool zone_allows_reclaim(struct zone *local_zone, 
struct zone *zone)
 {
return true;
 }
+
+static inline void init_zone_allows_reclaim(int nid)
+{
+}
 #endif /* CONFIG_NUMA */
 
 /*
@@ -4551,20 +4566,13 @@ void __paginginit free_area_init_node(int nid, unsigned 
long *zones_size,
unsigned long node_start_pfn, unsigned long *zholes_size)
 {
pg_data_t *pgdat = NODE_DATA(nid);
-   int i;
 
/* pg_data_t should be reset to zero when it's allocated */
WARN_ON(pgdat->nr_zones || pgdat->classzone_idx);
 
pgdat->node_id = nid;
pgdat->node_start_pfn = node_start_pfn;
-   for_each_online_node(i)
-   if (node_distance(nid, i) <= RECLAIM_DISTANCE) {
-   node_set(i, pgdat->reclaim_nodes);
-#ifdef CONFIG_NUMA
-   zone_reclaim_mode = 1;
-#endif
-   }
+   init_zone_allows_reclaim(nid);
calculate_node_totalpages(pgdat, zones_size, zholes_size);
 
alloc_node_mem_map(pgdat);

Re: [PATCHv1 2/6] dmaengine: dw_dmac: add driver for Atmel AT32

2012-09-25 Thread viresh kumar

On Tue, Sep 25, 2012 at 5:43 PM, Andy Shevchenko
 wrote:
> From: Heikki Krogerus 
>
> This driver should be usable on all platforms that depend on clk API.

This is not what you mentioned in the subject :(

> Signed-off-by: Heikki Krogerus 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/dma/Kconfig|9 +++
>  drivers/dma/Makefile   |1 +
>  drivers/dma/dw_dmac.c  |   23 +--
>  drivers/dma/dw_dmac_at32.c |  150 
> 

I don't agree with the naming used here. There is nothing AT32-specific
in this file, and it is actively used by SPEAr. It should be named
dw_dmac_platform.c or *pltfm.c.

>  4 files changed, 162 insertions(+), 21 deletions(-)
>  create mode 100644 drivers/dma/dw_dmac_at32.c
>
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index df32537..e47cc90 100644
> --- a/drivers/dma/Kconfig
> +++ b/drivers/dma/Kconfig
> @@ -89,6 +89,15 @@ config DW_DMAC
>   Support the Synopsys DesignWare AHB DMA controller.  This
>   can be integrated in chips such as the Atmel AT32ap7000.
>
> +config DW_DMAC_AT32
> +   tristate "Synopsys DesignWare AHB DMA support for Atmel"

:(

> +   depends on HAVE_CLK

Even this is not a must for all users of the platform driver, so it can be left out.

> +   select DW_DMAC
> +   default y if CPU_AT32AP7000
> +   help
> + Support the Synopsys DesignWare AHB DMA controller in the chips
> + such as the Atmel AT32ap7000.
> +
>  config AT_HDMAC
> tristate "Atmel AHB DMA support"
> depends on ARCH_AT91

> diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
> index d9344a7..9c8a500 100644
> --- a/drivers/dma/dw_dmac.c
> +++ b/drivers/dma/dw_dmac.c
> @@ -18,7 +18,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -1681,35 +1680,17 @@ static const struct dev_pm_ops dw_dev_pm_ops = {
> .poweroff_noirq = dw_suspend_noirq,
>  };
>
> -#ifdef CONFIG_OF
> -static const struct of_device_id dw_dma_id_table[] = {
> -   { .compatible = "snps,dma-spear1340" },
> -   {}
> -};
> -MODULE_DEVICE_TABLE(of, dw_dma_id_table);
> -#endif
> -
>  static struct platform_driver dw_driver = {
> +   .probe  = dw_probe,
> .remove = __devexit_p(dw_remove),
> .shutdown   = dw_shutdown,
> .driver = {
> .name   = "dw_dmac",
> .pm = &dw_dev_pm_ops,
> -   .of_match_table = of_match_ptr(dw_dma_id_table),
> },
>  };
>
> -static int __init dw_init(void)
> -{
> -   return platform_driver_probe(&dw_driver, dw_probe);
> -}
> -subsys_initcall(dw_init);
> -
> -static void __exit dw_exit(void)
> -{
> -   platform_driver_unregister(&dw_driver);
> -}
> -module_exit(dw_exit);
> +module_platform_driver(dw_driver);

Shouldn't this be a separate patch?

>
>  MODULE_LICENSE("GPL v2");
>  MODULE_DESCRIPTION("Synopsys DesignWare DMA Controller driver");
> diff --git a/drivers/dma/dw_dmac_at32.c b/drivers/dma/dw_dmac_at32.c
> new file mode 100644
> index 000..7bc7ac4
> --- /dev/null
> +++ b/drivers/dma/dw_dmac_at32.c
> @@ -0,0 +1,150 @@
> +/*
> + * Driver for the Synopsys DesignWare DMA Controller

How is this file different from dw_dmac.c?

> + *
> + * The driver is based on the excerpts from the original dw_dmac.c. That's 
> why
> + * the same copyright holders are mentioned here as well.
> + *
> + * Copyright (C) 2007-2008 Atmel Corporation
> + * Copyright (C) 2010-2011 ST Microelectronics
> + * Copyright (C) 2012 Intel Corporation
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct dw_at32 {
> +   struct platform_device  *pdev;
> +   struct clk  *clk;
> +};
> +
> +static int __init dw_at32_probe(struct platform_device *pdev)
> +{
> +   struct dw_at32  *at32;
> +   struct platform_device  *pd;
> +   struct dw_dma_platform_data *pdata = pdev->dev.platform_data;
> +   static int  instance;
> +   int ret;
> +
> +   at32 = devm_kzalloc(&pdev->dev, sizeof(*at32), GFP_KERNEL);
> +   if (!at32) {
> +   dev_err(&pdev->dev, "can't allocate memory\n");
> +   return -ENOMEM;
> +   }
> +
> +   pd = platform_device_alloc("dw_dmac", instance);
> +   if (!pd) {
> +   dev_err(&pdev->dev, "can't allocate dw_dmac platform device\n");
> +   return -ENOMEM;
> +   }

Why create another device? Why not simply call dw_probe() from here?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the

Re: [GIT PULL] Asymmetric keys and module signing

2012-09-25 Thread Rusty Russell

David Howells  writes:
> The module signing patches provide:
>
>  - Some fixes to Rusty's patch.  Also an additional patch to extend the policy
>handling for modules signed with an unknown key and to handle FIPS mode.

Ok, I merged some of this (after our previous accidentally-off-list
discussion).

You previously wrote:
> You can't compare them that easily.  One has a FIPS-mode panic and the other
> doesn't.  Do we want to panic if we reject an unsigned module in enforcing
> mode when we're in FIPS mode?

It's a line ball, but I think consistency wins.  Not a validly signed
module => panic.

The code becomes pretty straightforward then:

if (!err) {
info->sig_ok = true;
return 0;
}
if (fips_enabled)
   panic("Module verification failed with error %d in FIPS mode\n",
 err);
if (err == -ENOKEY && !sig_enforce)
err = 0;

return err;

In preparation, I've changed that below (and also, fixed up the -ENOKEY
which I said I'd do, and didn't).

Thanks,
Rusty.
PS.  Agree with Kconfig options move, but I'll do that in separate patch.
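
For readers following the policy logic quoted earlier in this message, here is a minimal userspace model of it. Everything named demo_* or DEMO_* is hypothetical: the real code calls panic() instead of returning an outcome, and DEMO_ENOKEY and DEMO_EINVAL are local stand-ins for the kernel's error numbers. It only restates the branch order of the snippet above.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for kernel errno values; illustration only. */
#define DEMO_ENOKEY 126
#define DEMO_EINVAL 22

/* Hypothetical outcomes; the real code panics rather than returning. */
enum demo_outcome { DEMO_LOAD, DEMO_REJECT, DEMO_PANIC };

struct demo_result {
	enum demo_outcome outcome;
	int err;	/* error seen by the caller, 0 if the module loads */
	bool sig_ok;	/* true only for a verified signature */
};

/* err is the (assumed) result of signature verification, 0 on success. */
static struct demo_result demo_sig_policy(int err, bool fips_enabled,
					  bool sig_enforce)
{
	struct demo_result r = { DEMO_LOAD, 0, false };

	if (!err) {
		r.sig_ok = true;
		return r;
	}
	if (fips_enabled) {
		/* Any verification failure is fatal in FIPS mode. */
		r.outcome = DEMO_PANIC;
		r.err = err;
		return r;
	}
	/* A missing key is tolerated unless enforcement is on. */
	if (err == -DEMO_ENOKEY && !sig_enforce)
		err = 0;
	if (err) {
		r.outcome = DEMO_REJECT;
		r.err = err;
	}
	return r;
}
```

The model makes the consistency point visible: with fips_enabled set, a missing key and any other verification error end the same way.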

From: Rusty Russell 
Subject: module: signature checking hook

We do a very simple search for a particular string appended to the module
(which is cache-hot and about to be SHA'd anyway).  There's both a config
option and a boot parameter which control whether we accept (and taint) or
fail with unsigned modules.

(Useful feedback and tweaks by David Howells )

Signed-off-by: Rusty Russell 

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1582,6 +1582,12 @@ bytes respectively. Such letter suffixes
log everything. Information is printed at KERN_DEBUG
so loglevel=8 may also need to be specified.
 
+   module.sig_enforce
+   [KNL] When CONFIG_MODULE_SIG is set, this means that
+   modules without (valid) signatures will fail to load.
+   Note that if CONFIG_MODULE_SIG_ENFORCE is set, that
+   is always true, so this option does nothing.
+
mousedev.tap_time=
[MOUSE] Maximum time between finger touching and
leaving touchpad surface for touch to be considered
diff --git a/include/linux/module.h b/include/linux/module.h
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -21,6 +21,9 @@
 #include 
 #include 
 
+/* In stripped ARM and x86-64 modules, ~ is surprisingly rare. */
+#define MODULE_SIG_STRING "~Module signature appended~\n"
+
 /* Not Yet Implemented */
 #define MODULE_SUPPORTED_DEVICE(name)
 
@@ -260,6 +263,11 @@ struct module
const unsigned long *unused_gpl_crcs;
 #endif
 
+#ifdef CONFIG_MODULE_SIG
+   /* Signature was verified. */
+   bool sig_ok;
+#endif
+
/* symbols that will be GPL-only in the near future. */
const struct kernel_symbol *gpl_future_syms;
const unsigned long *gpl_future_crcs;
diff --git a/init/Kconfig b/init/Kconfig
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1585,6 +1585,20 @@ config MODULE_SRCVERSION_ALL
  the version).  With this option, such a "srcversion" field
  will be created for all modules.  If unsure, say N.
 
+config MODULE_SIG
+   bool "Module signature verification"
+   depends on MODULES
+   help
+ Check modules for valid signatures upon load: the signature
+ is simply appended to the module. For more information see
+ Documentation/module-signing.txt.
+
+config MODULE_SIG_FORCE
+   bool "Require modules to be validly signed"
+   depends on MODULE_SIG
+   help
+ Reject unsigned modules or signed modules for which we don't have a
+ key.  Without this, such modules will simply taint the kernel.
 endif # MODULES
 
 config INIT_ALL_POSSIBLE
diff --git a/kernel/Makefile b/kernel/Makefile
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -55,6 +55,7 @@ obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock
 obj-$(CONFIG_PROVE_LOCKING) += spinlock.o
 obj-$(CONFIG_UID16) += uid16.o
 obj-$(CONFIG_MODULES) += module.o
+obj-$(CONFIG_MODULE_SIG) += module_signing.o
 obj-$(CONFIG_KALLSYMS) += kallsyms.o
 obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
 obj-$(CONFIG_KEXEC) += kexec.o
diff --git a/kernel/module-internal.h b/kernel/module-internal.h
new file mode 100644
--- /dev/null
+++ b/kernel/module-internal.h
@@ -0,0 +1,13 @@
+/* Module internals
+ *
+ * Copyright (C) 2012 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowe...@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+

Re: [PATCH] x86: remove the useless branch in c_start()

2012-09-25 Thread Michael Wang

On 09/19/2012 01:42 PM, Michael Wang wrote:
> Since 'cpu == -1' in cpumask_next() is legal, no need to handle '*pos == 0'
> specially.
> 
> About the comments:
>   /* just in case, cpu 0 is not the first */
> A test with a cpumask in which cpu 0 is not the first has been done, and it
> works well.

Could I get some comments on this patch?

Regards,
Michael Wang

> 
> This patch will remove that useless branch to clean the code.
> 
> Signed-off-by: Michael Wang 
> ---
>  arch/x86/kernel/cpu/proc.c |5 +
>  1 files changed, 1 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
> index 8022c66..fbd8955 100644
> --- a/arch/x86/kernel/cpu/proc.c
> +++ b/arch/x86/kernel/cpu/proc.c
> @@ -140,10 +140,7 @@ static int show_cpuinfo(struct seq_file *m, void *v)
> 
>  static void *c_start(struct seq_file *m, loff_t *pos)
>  {
> - if (*pos == 0)  /* just in case, cpu 0 is not the first */
> - *pos = cpumask_first(cpu_online_mask);
> - else
> - *pos = cpumask_next(*pos - 1, cpu_online_mask);
> + *pos = cpumask_next(*pos - 1, cpu_online_mask);
>   if ((*pos) < nr_cpu_ids)
>   return &cpu_data(*pos);
>   return NULL;
> 


Re: [PATCH] sched: rewrite the wrong annotation for select_task_rq_fair()

2012-09-25 Thread Michael Wang

On 09/18/2012 04:16 PM, Michael Wang wrote:
> The annotation for select_task_rq_fair() is wrong since commit c88d5910, it's
> actually for a removed function.
> 
> This patch rewrite the wrong annotation to make it correct.

Could I get some comments on this patch?

Regards,
Michael Wang

> 
> Signed-off-by: Michael Wang 
> ---
>  kernel/sched/fair.c |   14 --
>  1 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6b800a1..35eb43a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2682,15 +2682,17 @@ done:
>  }
> 
>  /*
> - * sched_balance_self: balance the current task (running on cpu) in domains
> - * that have the 'flag' flag set. In practice, this is SD_BALANCE_FORK and
> - * SD_BALANCE_EXEC.
> + * select_task_rq_fair:
> + *   Select a suitable CPU for task p, in order to keep load balance.
>   *
> - * Balance, ie. select the least loaded group.
> + *   sd_flag:
> + *   the domain we try to balance should have "sd_flag" flag set.
>   *
> - * Returns the target CPU number, or the same CPU if no balancing is needed.
> + *   wake_flags:
> + *   indicate WF_SYNC.
>   *
> - * preempt must be disabled.
> + *   Must hold p->pi_lock and disable irq before invoke.
> + *   Return the CPU number for task p to running on.
>   */
>  static int
>  select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flags)
> 


Re: [PATCHv1 1/6] dmaengine: dw_dmac: Remove clk API dependency

2012-09-25 Thread viresh kumar

On Tue, Sep 25, 2012 at 5:43 PM, Andy Shevchenko
 wrote:
> From: Heikki Krogerus 
>
> Not all platforms support clk API.
>
> Signed-off-by: Heikki Krogerus 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/dma/Kconfig|1 -
>  drivers/dma/dw_dmac.c  |   18 +++---
>  drivers/dma/dw_dmac_regs.h |1 -
>  3 files changed, 3 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index 677cd6e..df32537 100644
> --- a/drivers/dma/Kconfig
> +++ b/drivers/dma/Kconfig
> @@ -83,7 +83,6 @@ config INTEL_IOP_ADMA
>
>  config DW_DMAC
> tristate "Synopsys DesignWare AHB DMA support"
> -   depends on HAVE_CLK
> select DMA_ENGINE
> default y if CPU_AT32AP7000
> help

This change is good.

> diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
> index 9316d03..d9344a7 100644
> --- a/drivers/dma/dw_dmac.c
> +++ b/drivers/dma/dw_dmac.c
> @@ -1,9 +1,9 @@
>  /*
> - * Driver for the Synopsys DesignWare DMA Controller (aka DMACA on
> - * AVR32 systems.)
> + * Core driver for the Synopsys DesignWare DMA Controller
>   *
>   * Copyright (C) 2007-2008 Atmel Corporation
>   * Copyright (C) 2010-2011 ST Microelectronics
> + * Copyright (C) 2012 Intel Corporation
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License version 2 as

these too.

> @@ -12,7 +12,6 @@
>  #define VERBOSE_DEBUG
>  #define DEBUG
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -224,7 +223,6 @@ static inline void dwc_dump_chan_regs(struct dw_dma_chan 
> *dwc)
> channel_readl(dwc, CTL_LO));
>  }
>
> -
>  static inline void dwc_chan_disable(struct dw_dma *dw, struct dw_dma_chan 
> *dwc)
>  {
> channel_clear_bit(dw, CH_EN, dwc->mask);
> @@ -1508,11 +1506,6 @@ static int __devinit dw_probe(struct platform_device 
> *pdev)
> if (!dw)
> return -ENOMEM;
>
> -   dw->clk = devm_clk_get(&pdev->dev, "hclk");
> -   if (IS_ERR(dw->clk))
> -   return PTR_ERR(dw->clk);
> -   clk_prepare_enable(dw->clk);
> -
> dw->regs = regs;
>
> /* get hardware configuration parameters */
> @@ -1657,19 +1650,14 @@ static int __devexit dw_remove(struct platform_device 
> *pdev)
>
>  static void dw_shutdown(struct platform_device *pdev)
>  {
> -   struct dw_dma   *dw = platform_get_drvdata(pdev);
> -
> dw_dma_off(platform_get_drvdata(pdev));
> -   clk_disable_unprepare(dw->clk);
>  }
>
>  static int dw_suspend_noirq(struct device *dev)
>  {
> struct platform_device *pdev = to_platform_device(dev);
> -   struct dw_dma   *dw = platform_get_drvdata(pdev);
>
> dw_dma_off(platform_get_drvdata(pdev));
> -   clk_disable_unprepare(dw->clk);
>
> return 0;
>  }
> @@ -1679,8 +1667,8 @@ static int dw_resume_noirq(struct device *dev)
> struct platform_device *pdev = to_platform_device(dev);
> struct dw_dma   *dw = platform_get_drvdata(pdev);
>
> -   clk_prepare_enable(dw->clk);
> dma_writel(dw, CFG, DW_CFG_DMA_EN);
> +
> return 0;
>  }
>
> diff --git a/drivers/dma/dw_dmac_regs.h b/drivers/dma/dw_dmac_regs.h
> index ff39fa6..50e0b63 100644
> --- a/drivers/dma/dw_dmac_regs.h
> +++ b/drivers/dma/dw_dmac_regs.h
> @@ -229,7 +229,6 @@ struct dw_dma {
> struct dma_device   dma;
> void __iomem*regs;
> struct tasklet_struct   tasklet;
> -   struct clk  *clk;
>
> u8  all_chan_mask;

But not these. Dummy clk routines are now available for platforms that
don't support the clk framework, so even if your platform doesn't support
it, you should still be able to compile your code.

--
viresh

Re: [PATCH] sched: fix should_resched() to avoid do schedule in atomic

2012-09-25 Thread Michael Wang

On 09/18/2012 11:13 AM, Michael Wang wrote:
> This patch try to fix the BUG:
> 
> [0.043953] BUG: scheduling while atomic: swapper/0/1/0x1002
> [0.044017] no locks held by swapper/0/1.
> [0.044692] Pid: 1, comm: swapper/0 Not tainted 3.6.0-rc1-00420-gb7aebb9 
> #34
> [0.045861] Call Trace:
> [0.048071]  [] __schedule_bug+0x5e/0x70
> [0.048890]  [] __schedule+0x91/0xb10
> [0.049660]  [] ? vsnprintf+0x33a/0x450
> [0.050444]  [] ? lg_local_lock+0x6/0x70
> [0.051256]  [] ? wait_for_xmitr+0x31/0x90
> [0.052019]  [] ? do_raw_spin_unlock+0xa5/0xf0
> [0.052903]  [] ? _raw_spin_unlock+0x22/0x30
> [0.053759]  [] ? up+0x1b/0x70
> [0.054421]  [] __cond_resched+0x1b/0x30
> [0.055228]  [] _cond_resched+0x45/0x50
> [0.056020]  [] mutex_lock_nested+0x28/0x370
> [0.056884]  [] ? console_unlock+0x3a2/0x4e0
> [0.057741]  [] __irq_alloc_descs+0x39/0x1c0
> [0.058589]  [] io_apic_setup_irq_pin+0x2c/0x310
> [0.060042]  [] setup_IO_APIC+0x101/0x744
> [0.060878]  [] ? clear_IO_APIC+0x31/0x50
> [0.061695]  [] native_smp_prepare_cpus+0x538/0x680
> [0.062644]  [] ? do_one_initcall+0x12c/0x12c
> [0.063517]  [] ? do_one_initcall+0x12c/0x12c
> [0.064016]  [] kernel_init+0x4b/0x17f
> [0.064790]  [] ? do_one_initcall+0x12c/0x12c
> [0.065660]  [] kernel_thread_helper+0x6/0x10
> 
> The process to trigger the BUG is:
> 
>   native_smp_prepare_cpus()
>   preempt_disable()   //preempt_count++
>   __irq_alloc_descs()
>   mutex_lock()
>   might_sleep()   //should_resched() return true
>   __schedule()
>   preempt_disable()   //preempt_count++
>   schedule_bug()  //preempt_count > 1, report bug
> 
> So the issue is that should_resched() should not return true while the preempt
> already disabled.

Hi, Peter

Could we use this solution to fix the bug?

Regards,
Michael Wang

> 
> This patch will fix the issue, then might_sleep() won't do schedule in atomic
> any more.
> 
> Reported-by: Fengguang Wu 
> Signed-off-by: Michael Wang 
> ---
>  kernel/sched/core.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index b38f00e..2b7cd15 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4171,7 +4171,7 @@ SYSCALL_DEFINE0(sched_yield)
> 
>  static inline int should_resched(void)
>  {
> - return need_resched() && !(preempt_count() & PREEMPT_ACTIVE);
> + return need_resched() && !preempt_count();
>  }
> 
>  static void __cond_resched(void)
> 
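
For readers following the thread, the difference between the two checks can be modelled in user space. This is only a sketch under stated assumptions: struct task_model and PREEMPT_ACTIVE_MODEL are made-up stand-ins for illustration, not kernel definitions, and the bit value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; the real constant is architecture-specific. */
#define PREEMPT_ACTIVE_MODEL 0x10000000u

/* Toy stand-in for the per-task state the scheduler consults. */
struct task_model {
	unsigned int preempt_count; /* nesting depth of preempt_disable() */
	bool need_resched;          /* "a reschedule is pending" flag */
};

/* Check before the patch: only the PREEMPT_ACTIVE bit blocks a reschedule. */
static bool should_resched_old(const struct task_model *t)
{
	assert(t != NULL);
	return t->need_resched && !(t->preempt_count & PREEMPT_ACTIVE_MODEL);
}

/* Check after the patch: any non-zero preempt count blocks a reschedule. */
static bool should_resched_new(const struct task_model *t)
{
	assert(t != NULL);
	return t->need_resched && !t->preempt_count;
}
```

With preempt_count == 1 and a pending reschedule, the old check says "go ahead" while the new one refuses, which is the situation described in the call chain above.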


Re: [PATCHv1 6/6] MAINTAINERS: add recently created files to dw_dmac section

2012-09-25 Thread viresh kumar

On Tue, Sep 25, 2012 at 7:07 PM, Andy Shevchenko
 wrote:
> On Tue, 2012-09-25 at 06:19 -0700, Joe Perches wrote:
>> On Tue, 2012-09-25 at 15:13 +0300, Andy Shevchenko wrote:
>> > Signed-off-by: Andy Shevchenko 
>>
>> You also added yourself as a maintainer.
>> Congrats/sympathies, etc...
> Actually I prefer to be just a supporter, because I introduce few
> additional files. But if I understood correctly, there is no way to
> distinguish person statuses in one section.

Then probably you just need to mark your entries in driver files?

>> > diff --git a/MAINTAINERS b/MAINTAINERS
>> []
>> > @@ -6007,10 +6007,13 @@ F:  drivers/tty/serial
>> >
>> >  SYNOPSYS DESIGNWARE DMAC DRIVER
>> >  M: Viresh Kumar 
>> > +M: Andy Shevchenko 
>> >  S: Maintained
>> >  F: include/linux/dw_dmac.h
>> >  F: drivers/dma/dw_dmac_regs.h
>> >  F: drivers/dma/dw_dmac.c
>> > +F: drivers/dma/dw_dmac_at32.c
>> > +F: drivers/dma/dw_dmac_pci.c
>>
>> Perhaps these F: lines instead
>>
>> F:include/linux/dw_dmac.h
>> F:drivers/dma/dw_dmac*
> Might be. There is another idea to move them under drivers/dma/dw/
> folder. So, Viresh, what do you think is better?

Yes. Can be done.
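
On the wildcard F: entry suggested above: a quick way to see which paths a shell-style pattern such as drivers/dma/dw_dmac* would cover is POSIX fnmatch(). This is just a sketch; pattern_covers() is a hypothetical helper, and scripts/get_maintainer.pl does its own matching, which may differ in detail.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Returns 1 when path is covered by the shell-style pattern, else 0. */
static int pattern_covers(const char *pattern, const char *path)
{
	assert(pattern != NULL && path != NULL);
	return fnmatch(pattern, path, 0) == 0;
}
```

Under this model the single pattern covers dw_dmac.c, dw_dmac_regs.h and the two new files, while include/linux/dw_dmac.h still needs its own F: line.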

--
viresh

Re: [PATCHv1 5/6] MAINTAINERS: fix indentation for Viresh Kumar

2012-09-25 Thread viresh kumar

On Tue, Sep 25, 2012 at 5:43 PM, Andy Shevchenko
 wrote:
> Signed-off-by: Andy Shevchenko 
> Cc: Viresh Kumar 
> ---
>  MAINTAINERS |   16 
>  1 file changed, 8 insertions(+), 8 deletions(-)

Acked-by: Viresh Kumar 

Though I am not sure whether this patch should be part of this set.

--
viresh

Re: [PATCHv3] dw_dmac: autoconfigure data_width or get it via platform data

2012-09-25 Thread viresh kumar

On Tue, Sep 25, 2012 at 5:09 PM, Andy Shevchenko
 wrote:
> Not all of the controllers support the 64 bit data width. Make it configurable
> via platform data. The driver will try to get a value from the component
> parameters, otherwise it will use the platform data.
>
> Signed-off-by: Andy Shevchenko 
> ---
> Since v2:
> - sometimes memory-to-memory test is failed, that's why we need to choose
>   minimum data portion between source and destination limits

Acked-by: Viresh Kumar 
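
The "choose the minimum between source and destination limits" point can be sketched roughly as follows. This is not the driver code: pick_width() and align_order() are hypothetical names, and widths are assumed to be encoded as log2 of the bus width in bytes (0 = 8 bit, 3 = 64 bit).

```c
#include <assert.h>
#include <stdint.h>

/* Largest n (capped at max_order) such that v is a multiple of 2^n. */
static unsigned int align_order(uint64_t v, unsigned int max_order)
{
	unsigned int n = 0;

	while (n < max_order && !(v & (1ull << n)))
		n++;
	return n;
}

/*
 * Pick a transfer width that both masters support and that the
 * addresses and length are aligned to.
 */
static unsigned int pick_width(unsigned int src_limit, unsigned int dst_limit,
			       uint64_t src, uint64_t dst, uint64_t len)
{
	unsigned int limit = src_limit < dst_limit ? src_limit : dst_limit;

	assert(limit <= 3);
	return align_order(src | dst | len, limit);
}
```

The narrower of the two limits always wins, so a 64 bit capable source paired with a 32 bit destination ends up using 32 bit transfers.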

Re: [RFC v2 01/10] vfs: introduce private rb structures

2012-09-25 Thread Zhi Yong Wu

On Tue, Sep 25, 2012 at 6:20 PM, Ram Pai  wrote:
> On Sun, Sep 23, 2012 at 08:56:26PM +0800, zwu.ker...@gmail.com wrote:
>> From: Zhi Yong Wu 
>>
>>   One root structure hot_info is defined, is hooked
>> up in super_block, and will be used to hold rb trees
>> root, hash list root and some other information, etc.
>>   Adds hot_inode_tree struct to keep track of
>> frequently accessed files, and be keyed by {inode, offset}.
>> Trees contain hot_inode_items representing those files
>> and ranges.
>>   Having these trees means that vfs can quickly determine the
>> temperature of some data by doing some calculations on the
>> hot_freq_data struct that hangs off of the tree item.
>>   Define two items hot_inode_item and hot_range_item,
>> one of them represents one tracked file
>> to keep track of its access frequency and the tree of
>> ranges in this file, while the latter represents
>> a file range of one inode.
>>   Each of the two structures contains a hot_freq_data
>> struct with its frequency of access metrics (number of
>> {reads, writes}, last {read,write} time, frequency of
>> {reads,writes}).
>>   Also, each hot_inode_item contains one hot_range_tree
>> struct which is keyed by {inode, offset, length}
>> and used to keep track of all the ranges in this file.
>>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>> +
> ..snip..
>
>> +/* A tree that sits on the hot_info */
>> +struct hot_inode_tree {
>> + struct rb_root map;
>> + rwlock_t lock;
>> +};
>> +
>> +/* A tree of ranges for each inode in the hot_inode_tree */
>> +struct hot_range_tree {
>> + struct rb_root map;
>> + rwlock_t lock;
>> +};
>
> Can as well have a generic datastructure called hot_tree instead
> of having two different datastructure which basically are the same.
OK.
>
>> +
>> +/* A frequency data struct holds values that are used to
>> + * determine temperature of files and file ranges. These structs
>> + * are members of hot_inode_item and hot_range_item
>> + */
>> +struct hot_freq_data {
>> + struct timespec last_read_time;
>> + struct timespec last_write_time;
>> + u32 nr_reads;
>> + u32 nr_writes;
>> + u64 avg_delta_reads;
>> + u64 avg_delta_writes;
>> + u8 flags;
>> + u32 last_temperature;
>> +};
>> +
>> +/* An item representing an inode and its access frequency */
>> +struct hot_inode_item {
>> + /* node for hot_inode_tree rb_tree */
>> + struct rb_node rb_node;
>> + /* tree of ranges in this inode */
>> + struct hot_range_tree hot_range_tree;
>> + /* frequency data for this inode */
>> + struct hot_freq_data hot_freq_data;
>> + /* inode number, copied from inode */
>> + unsigned long i_ino;
>> + /* used to check for errors in ref counting */
>> + u8 in_tree;
>> + /* protects hot_freq_data, i_no, in_tree */
>> + spinlock_t lock;
>> + /* prevents kfree */
>> + struct kref refs;
>> +};
>> +
>> +/*
>> + * An item representing a range inside of an inode whose frequency
>> + * is being tracked
>> + */
>> +struct hot_range_item {
>> + /* node for hot_range_tree rb_tree */
>> + struct rb_node rb_node;
>> + /* frequency data for this range */
>> + struct hot_freq_data hot_freq_data;
>> + /* the hot_inode_item associated with this hot_range_item */
>> + struct hot_inode_item *hot_inode;
>> + /* starting offset of this range */
>> + u64 start;
>> + /* length of this range */
>> + u64 len;
>> + /* used to check for errors in ref counting */
>> + u8 in_tree;
>> + /* protects hot_freq_data, start, len, and in_tree */
>> + spinlock_t lock;
>> + /* prevents kfree */
>> + struct kref refs;
>> +};
>
> might as well have just one generic datastructure called hot_item with
> all the common fields and then have
>
> struct hot_inode_item  {
> struct hot_item hot_inode;
> struct hot_tree hot_range_tree;
> unsigned long i_ino;
> }
>
> and
>
> struct hot_range_item {
> struct hot_item hot_range;
> u64 start;
> u64 len;/* length of this range */
> }
>
> This should help you eliminate some duplicate code as well.
OK, I will try to apply them. Thanks.
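
A user-space sketch of how the shared hot_item suggested above might look. The *_model names, the trimmed-down frequency struct and the helper functions are assumptions for illustration; only the embedding idea comes from the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same idea as the kernel's container_of(), spelled out for user space. */
#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed-down stand-in for hot_freq_data. */
struct hot_freq_model {
	uint32_t nr_reads;
	uint32_t nr_writes;
};

/* Fields shared by inode items and range items. */
struct hot_item {
	struct hot_freq_model freq;
	uint8_t in_tree;
};

struct hot_inode_item_model {
	struct hot_item hot_inode;
	unsigned long i_ino;
};

struct hot_range_item_model {
	struct hot_item hot_range;
	uint64_t start;
	uint64_t len;
};

/* One accounting helper now serves both item types. */
static void hot_item_account(struct hot_item *it, int is_write)
{
	assert(it != NULL);
	if (is_write)
		it->freq.nr_writes++;
	else
		it->freq.nr_reads++;
}

/* Recover the containing range item from its embedded hot_item. */
static struct hot_range_item_model *to_range_item(struct hot_item *it)
{
	return container_of_model(it, struct hot_range_item_model, hot_range);
}
```

The common code only ever sees struct hot_item, and type-specific code gets back to its own struct through the container_of-style helper.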
>
>
> RP
>



-- 
Regards,

Zhi Yong Wu

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson

On Wed, 2012-09-26 at 01:01 +0200, Florian Dazinger wrote:
> Am Tue, 25 Sep 2012 13:43:46 -0600
> schrieb Alex Williamson :
> 
> > On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
> > > Am Tue, 25 Sep 2012 12:32:50 -0600
> > > schrieb Alex Williamson :
> > > 
> > > > On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > > > > Hi,
> > > > > I think I've found a regression, which causes an early boot crash, I
> > > > > appended the kernel output via jpg file, since I do not have a serial
> > > > > console or sth.
> > > > > 
> > > > > after bisection, it boils down to this commit:
> > > > > 
> > > > > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > > > > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > > > > Author: Alex Williamson 
> > > > > Date:   Wed May 30 14:19:07 2012 -0600
> > > > > 
> > > > > amd_iommu: Support IOMMU groups
> > > > > 
> > > > > Add IOMMU group support to AMD-Vi device init and uninit code.
> > > > > Existing notifiers make sure this gets called for each device.
> > > > > 
> > > > > Signed-off-by: Alex Williamson 
> > > > > Signed-off-by: Joerg Roedel 
> > > > > 
> > > > > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > > > > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> > > > 
> > > > [switching back to mailing list thread]
> > > > 
> > > > I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> > > > 
> > > > [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 
> > > > 1300
> > > > [1.485683] AMD-Vi:mmio-addr: feb2
> > > > [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 
> > > > 00
> > > > [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> > > > [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 
> > > > flags: 00
> > > > [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 
> > > > 00
> > > > [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> > > > [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 
> > > > flags: 00
> > > > [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 
> > > > flags: 00
> > > > [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 
> > > > flags: 00
> > > > [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 
> > > > flags: 00
> > > > [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 
> > > > flags: 00
> > > > [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 
> > > > flags: 00
> > > > [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 
> > > > flags: 00
> > > > [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 
> > > > flags: 00
> > > > [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 
> > > > flags: 00
> > > > [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 
> > > > flags: 00
> > > > [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 
> > > > flags: 00
> > > > [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 
> > > > flags: 00
> > > > [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 
> > > > flags: 00 devid_to: 08:00.0
> > > > [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> > > > [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 
> > > > flags: 00
> > > > [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 
> > > > 00
> > > > [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> > > > [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 
> > > > 00
> > > > [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> > > > [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 
> > > > flags: d7
> > > > [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 
> > > > flags: 00
> > > > [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 
> > > > flags: 00
> > > > [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 
> > > > flags: 00 devid_to: 00:14.4
> > > > [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> > > > [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 
> > > > flags: 00
> > > > [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 
> > > > 00
> > > > [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> > > > [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> > > > 
> > > > 
> > > > > lspci:
> > > > > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI 
> > > > > to PCI bridge (external gfx0 port B) (rev 02)
> > > > > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > > > > Management Unit (IOMMU)
> > > > > 00:02.0 PCI bridge: Advanced Micro

Re: [PATCH v2 2/2] btrfs-progs: Fix up memory leakage

2012-09-25 Thread Zhi Yong Wu

On Wed, Sep 26, 2012 at 1:14 AM, Goffredo Baroncelli  wrote:
> On 09/25/2012 12:14 PM, David Sterba wrote:
>>
>> On Tue, Sep 25, 2012 at 10:02:16AM +0800, zwu.ker...@gmail.com wrote:
>>>
>>> From: Zhi Yong Wu
>>>
>>>Some code pathes forget to free memory on exit.
>>
>>
>> Same as with the fd's, kernel will free all memory for us at exit().
>
>
> I strongly disagree with this approach. The callee often don't know what
> happen after and before the call. The same is true for the programmer,
> because the code is quite often updated by several people. A clean exit() is
> the right thing to do as general rule. I don't see any valid reason (in the
> btrfs context) to do otherwise.
>
> Relying on the exit() for a proper clean-up increase the likelihood of bug
> when the code evolves (see my patch   [RESPOST][BTRFS-PROGS][PATCH]
> btrfs_read_dev_super(): uninitialized variable for an example of what means
> an incorrect deallocation of resource).
>
>
>> If there's lots of memory allocated, it may be even faster to leave the
>> unallocation process to kernel as it will do it in one go, while the
>> application would unnecessarily free it chunk by chunk.
>
>
> May be I am wrong, but I don't think that the increase of speed of the btrfs
> "command" is even measurable relying on exit instead of free()-ing each
> chunk of memory one at time The same should be true for the
> open()/close()

I fully agree with you. Within a single function, I find that some code
paths free system resources while other code paths don't. Freeing them
consistently on every path is the nicer way.
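
One common way to keep every exit path consistent is a single cleanup label. A minimal sketch, not taken from btrfs-progs; read_first_byte() is a hypothetical function used only to show the shape:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 0 on success, -1 on failure; every path releases what it acquired. */
static int read_first_byte(const char *path, unsigned char *out)
{
	int ret = -1;
	FILE *f = NULL;
	unsigned char *buf = NULL;

	assert(path != NULL && out != NULL);

	buf = malloc(4096);
	if (!buf)
		goto out;
	f = fopen(path, "rb");
	if (!f)
		goto out;
	if (fread(buf, 1, 1, f) != 1)
		goto out;
	*out = buf[0];
	ret = 0;
out:
	/* Safe on all paths: f may be NULL, and free(NULL) is a no-op. */
	if (f)
		fclose(f);
	free(buf);
	return ret;
}
```

Because success and failure both fall through the same label, a later change that adds another early exit only needs a "goto out" to stay leak-free.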

>
> My 2¢
>
> BR
> G.Baroncelli
>
>>
>> david
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> .
>>
>



-- 
Regards,

Zhi Yong Wu

Re: [RFC v2 03/10] vfs: add one new mount option '-o hottrack'

2012-09-25 Thread Zhi Yong Wu

On Tue, Sep 25, 2012 at 5:28 PM, Dave Chinner  wrote:
> On Sun, Sep 23, 2012 at 08:56:28PM +0800, zwu.ker...@gmail.com wrote:
>> From: Zhi Yong Wu 
>>
>>   Introduce one new mount option '-o hottrack',
>> and add its parsing support.
>>   Its usage looks like:
>>mount -o hottrack
>>mount -o nouser,hottrack
>>mount -o nouser,hottrack,loop
>>mount -o hottrack,nouser
>
> I think that this option parsing should be done by the filesystem,
> even though the tracking functionality is in the VFS. That way ony
> the filesystems that can use the tracking information will turn it
> on, rather than being able to turn it on for everything regardless
> of whether it is useful or not.
>
> Along those lines, just using a normal superblock flag to indicate
> it is active (e.g. MS_HOT_INODE_TRACKING in sb->s_flags) means you
> don't need to allocate the sb->s_hot_info structure just to be able
> to check whether we are tracking hot inodes or not.
>
> This then means the hot inode tracking for the superblock can be
> initialised by the filesystem as part of it's fill_super method,
> along with the filesystem specific code that will use the hot
> tracking information the VFS gathers
I can see what you mean, but I don't know whether others also agree with this.
If so, every filesystem that uses the hot tracking feature will have to add
the same chunk of code to its fill_super method. Is that good?
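
One possible way to limit that duplication would be a small shared helper that each filesystem calls from its own option parsing. The sketch below is user-space only; sb_model, MS_HOT_TRACK_MODEL and the helper names are hypothetical and merely stand in for a superblock flag like the MS_HOT_INODE_TRACKING mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bit, chosen arbitrarily for this model. */
#define MS_HOT_TRACK_MODEL (1ul << 30)

struct sb_model {
	unsigned long s_flags;
};

/* Returns 1 if name appears as a whole entry in a comma-separated list. */
static int has_option(const char *options, const char *name)
{
	size_t n = strlen(name);
	const char *p = options;

	while (p && *p) {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len == n && strncmp(p, name, n) == 0)
			return 1;
		p = end ? end + 1 : NULL;
	}
	return 0;
}

/* Shared helper: a filesystem would call this once while mounting. */
static void hot_track_setup(struct sb_model *sb, const char *options)
{
	assert(sb != NULL);
	if (options && has_option(options, "hottrack"))
		sb->s_flags |= MS_HOT_TRACK_MODEL;
}

/* Cheap check that needs no extra allocation. */
static int hot_track_enabled(const struct sb_model *sb)
{
	return (sb->s_flags & MS_HOT_TRACK_MODEL) != 0;
}
```

Each filesystem then adds one call rather than a copy of the parsing code, and checking whether tracking is active is a single flag test.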

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com



-- 
Regards,

Zhi Yong Wu

Re: [RFC v2 02/10] vfs: add support for updating access frequency

2012-09-25 Thread Zhi Yong Wu

Thanks a lot for your review, Dave. It is very helpful to me.

On Tue, Sep 25, 2012 at 5:17 PM, Dave Chinner  wrote:
> On Sun, Sep 23, 2012 at 08:56:27PM +0800, zwu.ker...@gmail.com wrote:
>> From: Zhi Yong Wu 
>>
>>   Add some utils helpers to update access frequencies
>> for one file or its range.
>>
>> Signed-off-by: Zhi Yong Wu 
>> ---
>>  fs/hot_tracking.c |  359 
>> +
>>  fs/hot_tracking.h |   15 +++
>>  2 files changed, 374 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/hot_tracking.c b/fs/hot_tracking.c
>> index 173054b..52ed926 100644
>> --- a/fs/hot_tracking.c
>> +++ b/fs/hot_tracking.c
>> @@ -106,6 +106,365 @@ inode_err:
>>  }
>>
>>  /*
>> + * Drops the reference out on hot_inode_item by one and free the structure
>> + * if the reference count hits zero
>> + */
>> +void hot_rb_free_hot_inode_item(struct hot_inode_item *he)
>
> hot_inode_item_put()
>
>> +{
>> + if (!he)
>> + return;
>
> It's a bug to call a put function on a kref counted item with a null
> pointer. Let the kernel crash so it is noticed and fixed.
OK, will remove it.
>
>> +
>> + if (atomic_dec_and_test(&he->refs.refcount)) {
>> + WARN_ON(he->in_tree);
>> + kmem_cache_free(hot_inode_item_cache, he);
>> + }
>
> Isn't this abusing krefs? i.e. this should be:
Sorry, and thanks for your explanation below:
>
> hot_inode_item_free()
> {
> WARN_ON(he->in_tree);
> kmem_cache_free(hot_inode_item_cache, he);
> }
>
> hot_inode_item_put()
> {
> kref_put(&he->refs, hot_inode_item_free)
> }
>
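
For reference, the put/release split described above can be modelled in user space roughly like this. It is a sketch only: kref_model is a simplified stand-in for the kernel's kref built on C11 atomics, and item_model and the helper names are made up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for struct kref. */
struct kref_model {
	atomic_int refcount;
};

static void kref_model_init(struct kref_model *k)
{
	atomic_init(&k->refcount, 1);
}

static void kref_model_get(struct kref_model *k)
{
	atomic_fetch_add(&k->refcount, 1);
}

/* Drop one reference; call release() and return 1 if it was the last one. */
static int kref_model_put(struct kref_model *k,
			  void (*release)(struct kref_model *))
{
	assert(k != NULL && release != NULL);
	if (atomic_fetch_sub(&k->refcount, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

struct item_model {
	int in_tree;
	struct kref_model refs;
};

static int items_freed; /* test hook: counts release calls */

/* Release callback: runs exactly once, when the count reaches zero. */
static void item_model_free(struct kref_model *k)
{
	struct item_model *it =
		(struct item_model *)((char *)k - offsetof(struct item_model, refs));

	items_freed++;
	free(it);
}

/* The put function never touches the counter directly. */
static int item_model_put(struct item_model *it)
{
	return kref_model_put(&it->refs, item_model_free);
}
```

The caller only ever says "put"; the decision to free lives in one place, inside the release callback.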
>> +/*
>> + * Drops the reference out on hot_range_item by one and free the structure
>> + * if the reference count hits zero
>> + */
>> +static void hot_rb_free_hot_range_item(struct hot_range_item *hr)
>
> same comments as above.
OK, thanks.
> 
>> +static struct rb_node *hot_rb_insert_hot_inode_item(struct rb_root *root,
>> + unsigned long inode_num,
>> + struct rb_node *node)
>
> static struct rb_node *
> hot_inode_item_find(. )
OK, thanks.
>
>> +{
>> + struct rb_node **p = &root->rb_node;
>> + struct rb_node *parent = NULL;
>> + struct hot_inode_item *entry;
>> +
>> + /* walk tree to find insertion point */
>> + while (*p) {
>> + parent = *p;
>> + entry = rb_entry(parent, struct hot_inode_item, rb_node);
>> +
>> + if (inode_num < entry->i_ino)
>> + p = &(*p)->rb_left;
>> + else if (inode_num > entry->i_ino)
>> + p = &(*p)->rb_right;
>> + else
>> + return parent;
>> + }
>> +
>> + entry = rb_entry(node, struct hot_inode_item, rb_node);
>> + entry->in_tree = 1;
>> + rb_link_node(node, parent, p);
>> + rb_insert_color(node, root);
>
> So the hot inode item search key is the inode number? Why use an
Yes
> rb-tree then? Wouldn't a btree be a more memory efficient way to
> hold a sparse index that has fixed key and pointer sizes?
Yes, I know, but if we don't use btree, what will be better? Radix tree?

>
> Also, the API seems quite strange. you pass in the the rb_node and
> the inode number which instead of passing in the hot inode item that
> already holds this information. You then convert the rb_node back to
> a hot inode item to set the in_tree variable. So why not just pass
> in the struct hot_inode_item in the first place?
Good catch, thanks for the reminder.
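
A sketch of what passing the item itself could look like. To keep it short and self-contained this uses a plain, unbalanced binary tree instead of an rb-tree (no rebalancing at all), and tree_item and item_find_or_insert() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct tree_item {
	unsigned long i_ino;            /* search key */
	int in_tree;
	struct tree_item *left, *right;
};

/*
 * Insert new_item keyed by its own i_ino. Returns NULL on success, or the
 * existing item with the same key, in which case nothing is inserted.
 */
static struct tree_item *item_find_or_insert(struct tree_item **root,
					     struct tree_item *new_item)
{
	struct tree_item **p = root;

	assert(root != NULL && new_item != NULL);

	/* Walk the links to find the insertion point. */
	while (*p) {
		if (new_item->i_ino < (*p)->i_ino)
			p = &(*p)->left;
		else if (new_item->i_ino > (*p)->i_ino)
			p = &(*p)->right;
		else
			return *p;
	}

	new_item->left = new_item->right = NULL;
	new_item->in_tree = 1;
	*p = new_item;
	return NULL;
}
```

The key and the in_tree flag both come from the item that is passed in, so there is no separate inode number argument to keep in sync and no conversion back from the node.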
>
>> +static u64 hot_rb_range_end(struct hot_range_item *hr)
>
> hot_range_item_end()
OK
>
>> +{
>> + if (hr->start + hr->len < hr->start)
>> + return (u64)-1;
>> +
>> + return hr->start + hr->len - 1;
>> +}
>> +
>> +static struct rb_node *hot_rb_insert_hot_range_item(struct rb_root *root,
>
> hot_range_item_find()
OK
>
>> + u64 start,
>> + struct rb_node *node)
>> +{
>> + struct rb_node **p = &root->rb_node;
>> + struct rb_node *parent = NULL;
>> + struct hot_range_item *entry;
>> +
>> + /* ensure start is on a range boundary */
>> + start = start & RANGE_SIZE_MASK;
>> + /* walk tree to find insertion point */
>> + while (*p) {
>> + parent = *p;
>> + entry = rb_entry(parent, struct hot_range_item, rb_node);
>> +
>> + if (start < entry->start)
>> + p = &(*p)->rb_left;
>> + else if (start >= hot_rb_range_end(entry))
>
> Shouldn't an aligned end always be one byte short of the start
> offset of the next aligned region? i.e. start ==
> hot_rb_range_end(entry) is an indication of an off-by one bug
> somewhere?
This is a really good catch, thanks.
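
To make the inclusive-end convention concrete, here is a small user-space sketch. range_last() and range_cmp() are hypothetical helpers, and a non-zero length is assumed. With an inclusive end, an offset equal to the end is still inside the range, so the "right of the range" test is a strict greater-than.

```c
#include <assert.h>
#include <stdint.h>

/* Last byte covered by [start, start + len), saturating on overflow. */
static uint64_t range_last(uint64_t start, uint64_t len)
{
	assert(len > 0);
	if (start + len < start)
		return UINT64_MAX;
	return start + len - 1;
}

/* <0: off lies left of the range, >0: right of it, 0: inside it. */
static int range_cmp(uint64_t off, uint64_t start, uint64_t len)
{
	if (off < start)
		return -1;
	if (off > range_last(start, len))
		return 1;
	return 0;
}
```

For a 4096 byte range starting at 0, offset 4095 is inside and offset 4096 is the first byte of the next range.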
>
>> + p = &(*p)->rb_right;
>> + else
>> + return parent;
>> + }
>> +
>> + entry = rb_entry(node, struct hot_range_item,

Re: [PATCH] pagemap: fix wrong KPF_THP on slab pages

2012-09-25 Thread Fengguang Wu

On Tue, Sep 25, 2012 at 10:06:08PM -0400, Naoya Horiguchi wrote:
> On Tue, Sep 25, 2012 at 05:20:48PM -0700, David Rientjes wrote:
> > On Tue, 25 Sep 2012, Naoya Horiguchi wrote:
> > 
> > > KPF_THP can be set on non-huge compound pages like slab pages, because
> > > PageTransCompound only sees PG_head and PG_tail. Obviously this is a bug
> > > and breaks user space applications which look for thp via 
> > > /proc/kpageflags.
> > > Currently thp is constructed only on anonymous pages, so this patch makes
> > > KPF_THP be set when both of PageAnon and PageTransCompound are true.
> > > 
> > > Changelog in v2:
> > >   - add a comment in code
> > > 
> > > Signed-off-by: Naoya Horiguchi 
> > 
> > Wouldn't PageTransCompound(page) && !PageHuge(page) && !PageSlab(page) be 
> > better for a future extension of thp support?
> 
> Yes, this saves us an additional change when thp starts handling pagecaches.
> Andrew, can you replace the previous version in -mm tree with new one below?
> 
> Thanks,
> Naoya
> ---
> From: Naoya Horiguchi 
> Date: Tue, 25 Sep 2012 21:30:25 -0400
> Subject: [PATCH v3] kpageflags: fix wrong KPF_THP on slab pages
> 
> KPF_THP can be set on non-huge compound pages like slab pages, because
> PageTransCompound only sees PG_head and PG_tail. Obviously this is a bug

s/sees/checks/

> and breaks user space applications which look for thp via /proc/kpageflags.
> This patch rules out setting KPF_THP wrongly by additional PageSlab check.
> 
> Changelog in v3:
>   - check PageSlab instead of PageAnon
>   - fix patch subject
> 
> Changelog in v2:
>   - add a comment in code
> 
> Signed-off-by: Naoya Horiguchi 
> ---
>  fs/proc/page.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 7fcd0d6..e36d1f3 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -115,7 +115,12 @@ u64 stable_page_flags(struct page *page)
>   u |= 1 << KPF_COMPOUND_TAIL;
>   if (PageHuge(page))
>   u |= 1 << KPF_HUGE;
> - else if (PageTransCompound(page))
> + /*
> +  * PageTransCompound can be true for slab pages because it just sees

s/sees/checks/

> +  * PG_head/PG_head, so we need to check PageSlab to make sure the given

PG_head/PG_head should be PG_head/PG_tail.

> +  * page is a thp, not a non-huge compound page.
> +  */
> + else if (PageTransCompound(page) && !PageSlab(page))
>   u |= 1 << KPF_THP;

Good catch!

Will this report THP for the various drivers that do __GFP_COMP
page allocations?

Thanks,
Fengguang
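
A toy model of the check may help when thinking about that question. All bit values and names below are invented for the sketch and are not the kernel's PG_* or KPF_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Toy page-state bits; not the kernel's PG_* values. */
#define PGM_HEAD    (1u << 0)
#define PGM_TAIL    (1u << 1)
#define PGM_SLAB    (1u << 2)
#define PGM_HUGETLB (1u << 3)

/* Illustrative output bit positions. */
#define KPFM_HUGE 17
#define KPFM_THP  22

/* Models a test that only looks at the head/tail bits. */
static int is_compound(unsigned int pg)
{
	return (pg & (PGM_HEAD | PGM_TAIL)) != 0;
}

/* Models the patched branch: hugetlb first, then compound && !slab. */
static uint64_t huge_flags(unsigned int pg)
{
	uint64_t u = 0;

	if (pg & PGM_HUGETLB)
		u |= 1ull << KPFM_HUGE;
	else if (is_compound(pg) && !(pg & PGM_SLAB))
		u |= 1ull << KPFM_THP;
	return u;
}
```

In this model a slab head page no longer gets the THP bit, but any other compound page that is neither slab nor hugetlb still does, which is the case the question is about.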

Re: [PATCH 1/3] virtio_console:Merge struct buffer_token into struct port_buffer

2012-09-25 Thread Masami Hiramatsu

(2012/09/25 22:47), sjur.brandel...@stericsson.com wrote:
> From: Sjur Brændeland 
> 
> This merge reduces code size by unifying the approach for
> sending scatter-lists and regular buffers. Any type of
> write operation (splice, write, put_chars) will now allocate
> a port_buffer and send_buf() and free_buf() can always be used.

Thanks!
This looks much nicer and simpler. I just have some comments below.

> Signed-off-by: Sjur Brændeland 
> cc: Rusty Russell 
> cc: Michael S. Tsirkin 
> cc: Amit Shah 
> cc: Linus Walleij 
> cc: Masami Hiramatsu 
> ---
>  drivers/char/virtio_console.c |  141 
> ++---
>  1 files changed, 62 insertions(+), 79 deletions(-)
> 
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> index 8ab9c3d..f4f7b04 100644
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -111,6 +111,11 @@ struct port_buffer {
>   size_t len;
>   /* offset in the buf from which to consume data */
>   size_t offset;
> +
> + /* If sgpages == 0 then buf is used, else sg is used */
> + unsigned int sgpages;
> +
> + struct scatterlist sg[1];
>  };
>  
>  /*
> @@ -338,23 +343,46 @@ static inline bool use_multiport(struct ports_device 
> *portdev)
>  
>  static void free_buf(struct port_buffer *buf)
>  {
> + int i;
> +
>   kfree(buf->buf);

this should be done only when !buf->sgpages, or (see below)

> +
> + if (buf->sgpages)
> + for (i = 0; i < buf->sgpages; i++) {
> + struct page *page = sg_page(&buf->sg[i]);
> + if (!page)
> + break;
> + put_page(page);
> + }
> +
>   kfree(buf);
>  }
>  
> -static struct port_buffer *alloc_buf(size_t buf_size)
> +static struct port_buffer *alloc_buf(struct virtqueue *vq, size_t buf_size,
> +  int nrbufs)
>  {
>   struct port_buffer *buf;
> + size_t alloc_size;
>  
> - buf = kmalloc(sizeof(*buf), GFP_KERNEL);
> + /* Allocate buffer and the scatter list */
> + alloc_size = sizeof(*buf) + sizeof(struct scatterlist) * nrbufs;

This allocates one redundant sg entry when nrbuf > 0,
but I think it is OK. (just a comment)

> + buf = kmalloc(alloc_size, GFP_ATOMIC);

This should be kzalloc(), or buf->buf and others are not initialized,
which will cause unexpected kfree bug at kfree(buf->buf) in free_buf.

>   if (!buf)
>   goto fail;
> - buf->buf = kzalloc(buf_size, GFP_KERNEL);
> +
> + buf->sgpages = nrbufs;
> + if (nrbufs > 0)
> + return buf;
> +
> + buf->buf = kmalloc(buf_size, GFP_ATOMIC);

You can also use kzalloc here as previous code does.
But if the reason why using kzalloc comes from the security,
I think kmalloc is enough here, since the host can access
all the guest pages anyway.

>   if (!buf->buf)
>   goto free_buf;
>   buf->len = 0;
>   buf->offset = 0;
>   buf->size = buf_size;
> +
> + /* Prepare scatter buffer for sending */
> + sg_init_one(buf->sg, buf->buf, buf_size);
>   return buf;
>  
>  free_buf:
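
To illustrate the zero-initialisation point, here is a user-space sketch in which calloc() stands in for kzalloc() and an array of void pointers stands in for the scatterlist. The *_model names are made up and this is not the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct port_buffer_model {
	char *buf;            /* used only when sgpages == 0 */
	size_t size;
	unsigned int sgpages; /* number of entries in sg[] */
	void *sg[];           /* stand-in for the scatterlist entries */
};

static struct port_buffer_model *alloc_buf_model(size_t buf_size,
						 unsigned int nrbufs)
{
	struct port_buffer_model *b;

	/* Zeroed allocation: buf starts out NULL on every path. */
	b = calloc(1, sizeof(*b) + nrbufs * sizeof(void *));
	if (!b)
		return NULL;

	b->sgpages = nrbufs;
	if (nrbufs > 0)
		return b; /* scatter-list case, buf stays NULL */

	b->buf = malloc(buf_size);
	if (!b->buf) {
		free(b);
		return NULL;
	}
	b->size = buf_size;
	return b;
}

static void free_buf_model(struct port_buffer_model *b)
{
	assert(b != NULL);
	free(b->buf); /* free(NULL) is a no-op, so this is safe in both cases */
	free(b);
}
```

With a zeroed header the free path does not need to know which kind of buffer it was handed; without it, the scatter-list case would pass an uninitialised pointer to the free call.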

Thank you,


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-25 Thread Mike Galbraith

On Tue, 2012-09-25 at 19:22 -0700, Linus Torvalds wrote: 
> On Tue, Sep 25, 2012 at 7:00 PM, Mike Galbraith  wrote:
> >
> > Yes.  On AMD, the best thing you can do for fast switchers AFAIKT is
> > turn it off.  Different story on Intel.
> 
> I doubt it's all that different on Intel.

The behavioral difference is pretty large, question is why.


> Am I on the right track here? Or do you mean something completely
> different? Please explain it more verbosely.

A picture is worth a thousand words they say...

x3550 M3 E5620, SMT off, revert reverted, nohz off, zero knob twiddles,
governor=performance.

tbench        1     2     4
            398   820  1574  -select_idle_sibling()
            454   902  1574  +select_idle_sibling()
            397   737  1556  +select_idle_sibling() virgin source

netperf TCP_RR, one unbound pair
114674   -select_idle_sibling()
131422   +select_idle_sibling()
111551   +select_idle_sibling() virgin source

These 1:1 buddy pairs scheduled cross core on E5620 feel no pain once
you kill the bouncing.  The bounce pain with 4 cores is _tons_ less
intense than on the 10 core Westmere, but it's still quite visible.  The
point though is that cross core doesn't hurt Westmere, but demolishes
Opteron for some reason.  (OTOH, bounce _helps_ fugly 1:N load.. grr;)


> Your patch showed improvement for Intel too on this same benchmark
> (tbench). Borislav just went even further. I'd suggest testing that
> patch on Intel too, and wouldn't be surprised at all if it shows
> improvement there too.

See above.

> It's pgbench that then regressed with your patch, and I suspect it
> will regress with Borislav's too.

Yeah, strongly suspect you're right.

> You probably looked at the fact that the original report from Nikolay
> says that the Intel E6300 hadn't regressed on pgbench, but I suspect
> you didn't realize that E6300 is just a dual-core CPU without even HT.
> So I doubt it's about "Intel vs AMD", it's more about "six cores" vs
> "just two".

No, I knew, and yeah, it's about number of paths.

> And the thing is - with just two cores, the fact that your patch
> didn't change the Intel numbers is totally irrelevant. With two cores,
> the whole "buddy_cpu" was equivalent to the old code, since there was
> ever only one other core to begin with!
> 
> So AMD and Intel do have differences, but they aren't all that radical.

Looks fairly radical to me, but as noted in mail to Boris, it boils down
to "what does it cost, and where does the breakeven lie?".

-Mike


linux-next: manual merge of the wireless-next tree with the pci tree

2012-09-25 Thread Sujith Manoharan

Stephen Rothwell wrote:
> Hi John,
> 
> Today's linux-next merge of the wireless-next tree got a conflict in
> drivers/net/wireless/ath/ath9k/pci.c between commit 08bd108096b6 ("ath9k:
> Use PCI Express Capability accessors") from the pci tree and commit
> 046b6802c8d3 ("ath9k: Disable ASPM only for AR9285") from the
> wireless-next tree.
> 
> I fixed it up (see below) and can carry the fix as necessary (no action
> is required).

The fix looks good, thanks.

Sujith

Re: [PATCH] pagemap: fix wrong KPF_THP on slab pages

2012-09-25 Thread David Rientjes

On Tue, 25 Sep 2012, Naoya Horiguchi wrote:

> KPF_THP can be set on non-huge compound pages like slab pages, because
> PageTransCompound only sees PG_head and PG_tail. Obviously this is a bug
> and breaks user space applications which look for thp via /proc/kpageflags.
> This patch rules out setting KPF_THP wrongly by additional PageSlab check.
> 
> Changelog in v3:
>   - check PageSlab instead of PageAnon
>   - fix patch subject
> 
> Changelog in v2:
>   - add a comment in code
> 
> Signed-off-by: Naoya Horiguchi 

Acked-by: David Rientjes 

Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-25 Thread Mike Galbraith

On Tue, 2012-09-25 at 20:42 +0200, Borislav Petkov wrote:

> Right, so why did we need it all, in the first place? There has to be
> some reason for it.

Easy.  Take two communicating tasks.  Is an affine wakeup a good idea?
It depends on how much execution overlap there is.  Wake affine when
there is overlap larger than cache miss cost, and you just tossed
throughput into the bin.

select_idle_sibling() was originally about shared L2, where any overlap
was salvageable.  On modern processors with no shared L2, you have to
get past the cost, but the gain is still there.  Intel wins with loads
that AMD loses very badly on, so I can only guess that Intel must feed
caches more efficiently.  Dunno.  It just doesn't matter though, point
is that there is a win to be had in both cases, the breakeven just isn't
at the same point.

-Mike


Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-25 Thread Linus Torvalds

On Tue, Sep 25, 2012 at 7:00 PM, Mike Galbraith  wrote:
>
> Yes.  On AMD, the best thing you can do for fast switchers AFAICT is
> turn it off.  Different story on Intel.

I doubt it's all that different on Intel.

Your patch showed improvement for Intel too on this same benchmark
(tbench). Borislav just went even further. I'd suggest testing that
patch on Intel too, and wouldn't be surprised at all if it shows
improvement there too.

It's pgbench that then regressed with your patch, and I suspect it
will regress with Borislav's too.

So I'm sure there are architecture differences (where HT in particular
probably changes optimal scheduling strategy, although I'd expect the
bulldozer approach to not be *that* different - but I don't know if BD
shows up as "HT siblings" or not, so dissimilar topology
interpretation may make it *look* very different).

So I suspect the architectural differences are smaller than you claim,
and it's much more about the loads in question.

You probably looked at the fact that the original report from Nikolay
says that the Intel E6300 hadn't regressed on pgbench, but I suspect
you didn't realize that E6300 is just a dual-core CPU without even HT.
So I doubt it's about "Intel vs AMD", it's more about "six cores" vs
"just two".

And the thing is - with just two cores, the fact that your patch
didn't change the Intel numbers is totally irrelevant. With two cores,
the whole "buddy_cpu" was equivalent to the old code, since there was
ever only one other core to begin with!

So AMD and Intel do have differences, but they aren't all that radical.

  Linus

Re: [PATCH] x86: Distinguish TLB shootdown interrupts from other functions call interrupts

2012-09-25 Thread Alex Shi

On 09/26/2012 10:11 AM, Tomoki Sekiyama wrote:

> Hi Alex,
> 
> On 2012/09/25 11:57, Alex Shi wrote:
>> On 09/24/2012 09:37 AM, Alex Shi wrote:
>>
>>> On 09/20/2012 04:50 PM, Tomoki Sekiyama wrote:
>>>
>>>> unsigned int irq_resched_count;
>>>> unsigned int irq_call_count;
>>>> +   /* irq_tlb_count is double-counted in irq_call_count, so it must be
>>>> +  subtracted from irq_call_count when displaying irq_call_count */
>>>> unsigned int irq_tlb_count;
>>>
>>> Reviewing this patch again: the comment above is not in the kernel
>>> comment format. Could you change it to the standard comment format:
>>>
>>> /*
>>>  * xxx
>>>  * 
>>>  */
>>>
>>
>> The 3.6 kernel will be closed soon, and it would be great to have this
>> patch in. So, could you refresh your patch with the usual comment format? :)
> 
> Fixed patch is below.
> Thank you for the review again.
> 


Acked-by: Alex Shi 

> --
> As TLB shootdown requests to other CPU cores are now using function call
> interrupts, TLB shootdowns entry in /proc/interrupts is always shown as 0.
> 
> This behavior change was introduced by commit 52aec3308db8 ("x86/tlb:
> replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR").
> 
> This patch reverts TLB shootdowns entry in /proc/interrupts to count TLB
> shootdowns separately from the other function call interrupts.
> 
> Signed-off-by: Tomoki Sekiyama 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Alex Shi 
> ---
>  arch/x86/include/asm/hardirq.h |4 
>  arch/x86/kernel/irq.c  |4 ++--
>  arch/x86/mm/tlb.c  |2 ++
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
> index d3895db..81f04ce 100644
> --- a/arch/x86/include/asm/hardirq.h
> +++ b/arch/x86/include/asm/hardirq.h
> @@ -18,6 +18,10 @@ typedef struct {
>  #ifdef CONFIG_SMP
>   unsigned int irq_resched_count;
>   unsigned int irq_call_count;
> + /*
> +  * irq_tlb_count is double-counted in irq_call_count, so it must be
> +  * subtracted from irq_call_count when displaying irq_call_count
> +  */
>   unsigned int irq_tlb_count;
>  #endif
>  #ifdef CONFIG_X86_THERMAL_VECTOR
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index d44f782..e4595f1 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -92,7 +92,8 @@ int arch_show_interrupts(struct seq_file *p, int prec)
>   seq_printf(p, "  Rescheduling interrupts\n");
>   seq_printf(p, "%*s: ", prec, "CAL");
>   for_each_online_cpu(j)
> - seq_printf(p, "%10u ", irq_stats(j)->irq_call_count);
> + seq_printf(p, "%10u ", irq_stats(j)->irq_call_count -
> + irq_stats(j)->irq_tlb_count);
>   seq_printf(p, "  Function call interrupts\n");
>   seq_printf(p, "%*s: ", prec, "TLB");
>   for_each_online_cpu(j)
> @@ -147,7 +148,6 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
>  #ifdef CONFIG_SMP
>   sum += irq_stats(cpu)->irq_resched_count;
>   sum += irq_stats(cpu)->irq_call_count;
> - sum += irq_stats(cpu)->irq_tlb_count;
>  #endif
>  #ifdef CONFIG_X86_THERMAL_VECTOR
>   sum += irq_stats(cpu)->irq_thermal_count;
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 613cd83..2d6d8ed 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -98,6 +98,8 @@ static void flush_tlb_func(void *info)
>  {
>   struct flush_tlb_info *f = info;
>  
> + inc_irq_stat(irq_tlb_count);
> +
>   if (f->flush_mm != this_cpu_read(cpu_tlbstate.active_mm))
>   return;
>  
> 



-- 
Thanks
Alex

Re: [PATCH] x86: Distinguish TLB shootdown interrupts from other functions call interrupts

2012-09-25 Thread Tomoki Sekiyama

Hi Alex,

On 2012/09/25 11:57, Alex Shi wrote:
> On 09/24/2012 09:37 AM, Alex Shi wrote:
>
>> On 09/20/2012 04:50 PM, Tomoki Sekiyama wrote:
>>
>>> unsigned int irq_resched_count;
>>> unsigned int irq_call_count;
>>> +   /* irq_tlb_count is double-counted in irq_call_count, so it must be
>>> +  subtracted from irq_call_count when displaying irq_call_count */
>>> unsigned int irq_tlb_count;
>>
>> Reviewing this patch again: the comment above is not in the kernel
>> comment format. Could you change it to the standard comment format:
>>
>> /*
>>  * xxx
>>  * 
>>  */
>>
>
> The 3.6 kernel will be closed soon, and it would be great to have this
> patch in. So, could you refresh your patch with the usual comment format? :)

Fixed patch is below.
Thank you for the review again.

--
As TLB shootdown requests to other CPU cores are now using function call
interrupts, TLB shootdowns entry in /proc/interrupts is always shown as 0.

This behavior change was introduced by commit 52aec3308db8 ("x86/tlb:
replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR").

This patch reverts TLB shootdowns entry in /proc/interrupts to count TLB
shootdowns separately from the other function call interrupts.
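
For readers following the accounting, here is a minimal user-space sketch of
the idea (a model only, not kernel code). The struct and the model_* helper
names are hypothetical stand-ins for the per-CPU counters; the only facts
taken from this patch are that a TLB shootdown is delivered as a function
call interrupt, that the CAL line shows irq_call_count minus irq_tlb_count,
and that the per-CPU sum no longer adds irq_tlb_count a second time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the per-CPU SMP counters. */
struct irq_stat_model {
	unsigned int irq_call_count;	/* every function call interrupt */
	unsigned int irq_tlb_count;	/* the subset that were TLB shootdowns */
};

/* A plain function call IPI bumps only irq_call_count. */
static void model_call_ipi(struct irq_stat_model *s)
{
	s->irq_call_count++;
}

/* A TLB shootdown arrives as a function call IPI, so both counters move. */
static void model_tlb_shootdown(struct irq_stat_model *s)
{
	s->irq_call_count++;
	s->irq_tlb_count++;
}

/* Value shown on the "CAL" line: function calls that were not shootdowns. */
static unsigned int model_show_cal(const struct irq_stat_model *s)
{
	assert(s->irq_tlb_count <= s->irq_call_count);
	return s->irq_call_count - s->irq_tlb_count;
}

/* Per-CPU total: irq_tlb_count is already contained in irq_call_count. */
static uint64_t model_irq_sum(const struct irq_stat_model *s)
{
	return s->irq_call_count;
}
```

With two plain IPIs and one shootdown, the model reports CAL as 2, TLB as 1,
and a total of 3, which is why the sum must not add irq_tlb_count again.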

Signed-off-by: Tomoki Sekiyama 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Alex Shi 
---
 arch/x86/include/asm/hardirq.h |4 
 arch/x86/kernel/irq.c  |4 ++--
 arch/x86/mm/tlb.c  |2 ++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index d3895db..81f04ce 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -18,6 +18,10 @@ typedef struct {
 #ifdef CONFIG_SMP
unsigned int irq_resched_count;
unsigned int irq_call_count;
+   /*
+* irq_tlb_count is double-counted in irq_call_count, so it must be
+* subtracted from irq_call_count when displaying irq_call_count
+*/
unsigned int irq_tlb_count;
 #endif
 #ifdef CONFIG_X86_THERMAL_VECTOR
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index d44f782..e4595f1 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -92,7 +92,8 @@ int arch_show_interrupts(struct seq_file *p, int prec)
seq_printf(p, "  Rescheduling interrupts\n");
seq_printf(p, "%*s: ", prec, "CAL");
for_each_online_cpu(j)
-   seq_printf(p, "%10u ", irq_stats(j)->irq_call_count);
+   seq_printf(p, "%10u ", irq_stats(j)->irq_call_count -
+   irq_stats(j)->irq_tlb_count);
seq_printf(p, "  Function call interrupts\n");
seq_printf(p, "%*s: ", prec, "TLB");
for_each_online_cpu(j)
@@ -147,7 +148,6 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
 #ifdef CONFIG_SMP
sum += irq_stats(cpu)->irq_resched_count;
sum += irq_stats(cpu)->irq_call_count;
-   sum += irq_stats(cpu)->irq_tlb_count;
 #endif
 #ifdef CONFIG_X86_THERMAL_VECTOR
sum += irq_stats(cpu)->irq_thermal_count;
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 613cd83..2d6d8ed 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -98,6 +98,8 @@ static void flush_tlb_func(void *info)
 {
struct flush_tlb_info *f = info;
 
+   inc_irq_stat(irq_tlb_count);
+
if (f->flush_mm != this_cpu_read(cpu_tlbstate.active_mm))
return;
 


Re: [PATCH] pagemap: fix wrong KPF_THP on slab pages

2012-09-25 Thread Naoya Horiguchi

On Tue, Sep 25, 2012 at 05:20:48PM -0700, David Rientjes wrote:
> On Tue, 25 Sep 2012, Naoya Horiguchi wrote:
> 
> > KPF_THP can be set on non-huge compound pages like slab pages, because
> > PageTransCompound only sees PG_head and PG_tail. Obviously this is a bug
> > and breaks user space applications which look for thp via /proc/kpageflags.
> > Currently thp is constructed only on anonymous pages, so this patch makes
> > KPF_THP be set when both of PageAnon and PageTransCompound are true.
> > 
> > Changelog in v2:
> >   - add a comment in code
> > 
> > Signed-off-by: Naoya Horiguchi 
> 
> Wouldn't PageTransCompound(page) && !PageHuge(page) && !PageSlab(page) be 
> better for a future extension of thp support?

Yes, this saves us an additional change when thp starts handling pagecaches.
Andrew, can you replace the previous version in -mm tree with new one below?

Thanks,
Naoya
---
From: Naoya Horiguchi 
Date: Tue, 25 Sep 2012 21:30:25 -0400
Subject: [PATCH v3] kpageflags: fix wrong KPF_THP on slab pages

KPF_THP can be set on non-huge compound pages like slab pages, because
PageTransCompound only sees PG_head and PG_tail. Obviously this is a bug
and breaks user space applications which look for thp via /proc/kpageflags.
This patch rules out setting KPF_THP wrongly by additional PageSlab check.

Changelog in v3:
  - check PageSlab instead of PageAnon
  - fix patch subject

Changelog in v2:
  - add a comment in code

Signed-off-by: Naoya Horiguchi 
---
 fs/proc/page.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index 7fcd0d6..e36d1f3 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -115,7 +115,12 @@ u64 stable_page_flags(struct page *page)
u |= 1 << KPF_COMPOUND_TAIL;
if (PageHuge(page))
u |= 1 << KPF_HUGE;
-   else if (PageTransCompound(page))
+   /*
+* PageTransCompound can be true for slab pages because it just sees
+* PG_head/PG_tail, so we need to check PageSlab to make sure the given
+* page is a thp, not a non-huge compound page.
+*/
+   else if (PageTransCompound(page) && !PageSlab(page))
u |= 1 << KPF_THP;
 
/*
-- 
1.7.11.4
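
The decision made by the patched branch can be modelled in user space as
follows. This is only a sketch under stated assumptions: struct page_model
and the MODEL_KPF_* bit positions are hypothetical and do not match the real
KPF_* values; the booleans stand in for PageTransCompound, PageHuge and
PageSlab.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page model: only the properties the check looks at. */
struct page_model {
	bool compound;	/* PG_head or PG_tail is set */
	bool huge;	/* hugetlbfs page */
	bool slab;	/* slab allocator page */
};

/* Illustrative bit positions only; not the real KPF_* values. */
enum { MODEL_KPF_HUGE = 0, MODEL_KPF_THP = 1 };

static uint64_t model_stable_flags(const struct page_model *p)
{
	uint64_t u = 0;

	assert(p);
	if (p->huge)
		u |= UINT64_C(1) << MODEL_KPF_HUGE;
	else if (p->compound && !p->slab)	/* the added PageSlab test */
		u |= UINT64_C(1) << MODEL_KPF_THP;
	return u;
}
```

A compound slab page now yields no THP bit, while a compound page that is
neither hugetlbfs nor slab still does.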


Re: [PATCH] ALSA: hda - Add inverted internal mic quirk for Lenovo IdeaPad U310

2012-09-25 Thread Greg KH

On Wed, Sep 26, 2012 at 01:20:44AM +0200, Felix Kaechele wrote:
> The Lenovo IdeaPad U310 has an internal mic where the right channel
> is phase inverted.
> 
> Signed-off-by: Felix Kaechele 
> ---
>  sound/pci/hda/patch_conexant.c | 1 +
>  1 file changed, 1 insertion(+)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.



linux-next: manual merge of the wireless-next tree with the pci tree

2012-09-25 Thread Stephen Rothwell

Hi John,

Today's linux-next merge of the wireless-next tree got a conflict in
drivers/net/wireless/ath/ath9k/pci.c between commit 08bd108096b6 ("ath9k:
Use PCI Express Capability accessors") from the pci tree and commit
046b6802c8d3 ("ath9k: Disable ASPM only for AR9285") from the
wireless-next tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/net/wireless/ath/ath9k/pci.c
index c4f6980,c0c5996..000
--- a/drivers/net/wireless/ath/ath9k/pci.c
+++ b/drivers/net/wireless/ath/ath9k/pci.c
@@@ -123,10 -128,12 +123,11 @@@ static void ath_pci_aspm_init(struct at
if (!parent)
return;
  
-   if (ath9k_hw_get_btcoex_scheme(ah) != ATH_BTCOEX_CFG_NONE) {
-   /* Bluetooth coexistance requires disabling ASPM. */
+   if ((ath9k_hw_get_btcoex_scheme(ah) != ATH_BTCOEX_CFG_NONE) &&
+   (AR_SREV_9285(ah))) {
+   /* Bluetooth coexistance requires disabling ASPM for AR9285. */
 -  pci_read_config_byte(pdev, pos + PCI_EXP_LNKCTL, &aspm);
 -  aspm &= ~(PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1);
 -  pci_write_config_byte(pdev, pos + PCI_EXP_LNKCTL, aspm);
 +  pcie_capability_clear_word(pdev, PCI_EXP_LNKCTL,
 +  PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1);
  
/*
 * Both upstream and downstream PCIe components should



Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

2012-09-25 Thread Mike Galbraith

On Tue, 2012-09-25 at 10:21 -0700, Linus Torvalds wrote: 
> On Tue, Sep 25, 2012 at 10:00 AM, Borislav Petkov  wrote:
> >
> > 3.6-rc6+tip/auto-latest-kill select_idle_sibling()
> 
> Is this literally just removing it entirely? Because apart from the
> latency spike at 4 procs (and the latency numbers look very noisy, so
> that's probably just noise), it looks clearly superior to everything
> else. On that benchmark, at least.

Yes.  On AMD, the best thing you can do for fast switchers AFAICT is
turn it off.  Different story on Intel.

> How does pgbench look? That's the one that apparently really wants to
> spread out, possibly due to user-level spinlocks. So I assume it will
> show the reverse pattern, with "kill select_idle_sibling" being the
> worst case. Sad, because it really would be lovely to just remove that
> thing ;)

It _is_ irritating.  There's nohz, governors, and then comes radically
different cross cpu data blasting ability on top. On Intel, it wins at
the same fast movers it demolishes on AMD.  Throttle it, and that goes
away, along with some other issues.

Or just kill it, then integrate what it does for you into a smarter
lighter wakeup balance.. but then that has to climb those same hills.

-Mike


[PATCH V2] GPIO: gpio-pxa: fix bug when get gpio value

2012-09-25 Thread Neil Zhang

pxa_gpio_get() must return 0 or 1 when reading a GPIO value, not the raw
masked register bit.

Signed-off-by: Neil Zhang 
---
 drivers/gpio/gpio-pxa.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/gpio/gpio-pxa.c b/drivers/gpio/gpio-pxa.c
index 9cac88a..3c9dc8c 100644
--- a/drivers/gpio/gpio-pxa.c
+++ b/drivers/gpio/gpio-pxa.c
@@ -269,7 +269,9 @@ static int pxa_gpio_direction_output(struct gpio_chip *chip,
 
 static int pxa_gpio_get(struct gpio_chip *chip, unsigned offset)
 {
-   return readl_relaxed(gpio_chip_base(chip) + GPLR_OFFSET) & (1 << offset);
+   u32 gplr = readl_relaxed(gpio_chip_base(chip) + GPLR_OFFSET);
+
+   return (gplr & (1 << offset)) ? 1 : 0;
 }
 
 static void pxa_gpio_set(struct gpio_chip *chip, unsigned offset, int value)
-- 
1.7.4.1
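
To illustrate why the normalization matters, here is a small user-space
sketch. The model_* functions are hypothetical and take the register value
as a plain argument instead of reading hardware; they only mirror the two
return expressions in the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: returns the raw masked bit, e.g. 0x80 for offset 7. */
static int model_gpio_get_raw(uint32_t gplr, unsigned int offset)
{
	assert(offset < 31);	/* keep the result representable as int */
	return (int)(gplr & (UINT32_C(1) << offset));
}

/* Patched behaviour: collapse any non-zero result to exactly 1. */
static int model_gpio_get(uint32_t gplr, unsigned int offset)
{
	assert(offset < 32);
	return (gplr & (UINT32_C(1) << offset)) ? 1 : 0;
}
```

A caller that compares the result against 1 would misread a set pin at
offset 7 with the raw variant (it sees 0x80), but not with the patched one.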


linux-next: manual merge of the net-next tree with Linus' tree

2012-09-25 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in
net/batman-adv/bat_iv_ogm.c between commit 7caf69fb9c50 ("batman-adv: Fix
symmetry check / route flapping in multi interface setups") from Linus'
tree and commit bbb1f90efba8 ("batman-adv: Don't break statements after
assignment operator") from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc net/batman-adv/bat_iv_ogm.c
index 469daab,df79300..000
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@@ -642,9 -652,9 +652,10 @@@ batadv_iv_ogm_orig_update(struct batadv
struct batadv_neigh_node *router = NULL;
struct batadv_orig_node *orig_node_tmp;
struct hlist_node *node;
 +  int if_num;
uint8_t sum_orig, sum_neigh;
uint8_t *neigh_addr;
+   uint8_t tq_avg;
  
batadv_dbg(BATADV_DBG_BATMAN, bat_priv,
           "update_originator(): Searching and updating originator entry of received packet\n");



Re: [PATCH v5 0/25] Generic Red-Black Trees (still WIP)

2012-09-25 Thread Steven Rostedt

On Tue, 2012-09-25 at 20:02 -0500, Daniel Santos wrote:
> >> Q
> >> ===
> >> Q: Why did you add BUILD_BUG_ON_NON_CONST() and
> >>BUILD_BUG_ON_NON_CONST42()?
> > A: Because BUILD_BUG_ON_NON_CONST42() will crash if it does not result
> > in the answer to life, the universe and everything!
> 
> By the way, I have a theory: before time, God was writing code on some
> cosmic computer (beyond our understanding, of course) when he
> accidentally tried to divide by zero, resulting in a core dump that we
> now know as the universe we live in.  So thus, we are just the excrement
> of some celestial computer after it failed to properly execute its code.
> 

And the operation on which God failed was:

ans = 42 / 0

Which totally explains the point where... "to understand the answer, you
must first understand the question".

-- Steve


Re: [PATCH] block: export trace_block_unplug

2012-09-25 Thread NeilBrown

On Tue, 25 Sep 2012 14:50:34 +0200 Jens Axboe  wrote:

> On 09/25/2012 01:57 AM, NeilBrown wrote:
> > 
> > Hi Jens,
> >  is there any chance this can be in the next merge window?  I'm
> > adding block tracing to md and found I need another export.
> 
> No problem, applied.
> 

Thanks.
NeilBrown



Re: [PATCH 0/3 v3] perf: precise mode and exclude_guest

2012-09-25 Thread David Ahern


On 9/13/12 2:59 PM, David Ahern wrote:

> Hopefully this wraps up the precise mode-exclude_guest dependency.
> I'm sure someone will let me know if I screwed up the attribution
> in the second patch.
>
> David Ahern (2):
>   perf tool: precise mode requires exclude_guest
>   perf tool: give user better message if precise is not supported
>
> Peter Zijlstra (1):
>   perf: require exclude_guest to use PEBS - kernel side enforcement
>
> v3: removed extra tab in patch 2
> v2: updated commit messages for patches 1 and 2


Ping? Is this version Peter approved?

David


[PATCH -next] device.h: Add missing inline to #ifndef CONFIG_PRINTK dev_vprintk_emit

2012-09-25 Thread Joe Perches

Also add __printf() verification for format string.

Reported-by: Geert Uytterhoeven 
Signed-off-by: Joe Perches 

---

 include/linux/device.h |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index 8873603..86ef6ab 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -902,8 +902,9 @@ extern const char *dev_driver_string(const struct device *dev);
 
 #ifdef CONFIG_PRINTK
 
-extern int dev_vprintk_emit(int level, const struct device *dev,
-   const char *fmt, va_list args);
+extern __printf(3, 0)
+int dev_vprintk_emit(int level, const struct device *dev,
+const char *fmt, va_list args);
 extern __printf(3, 4)
 int dev_printk_emit(int level, const struct device *dev, const char *fmt, ...);
 
@@ -927,8 +928,9 @@ int _dev_info(const struct device *dev, const char *fmt, ...);
 
 #else
 
-static int dev_vprintk_emit(int level, const struct device *dev,
-   const char *fmt, va_list args)
+static inline __printf(3, 0)
+int dev_vprintk_emit(int level, const struct device *dev,
+const char *fmt, va_list args)
 { return 0; }
 static inline __printf(3, 4)
 int dev_printk_emit(int level, const struct device *dev, const char *fmt, ...)
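
As a rough illustration of the two annotations used above, here is a
user-space sketch for GCC or Clang. model_printf() is an assumed equivalent
of the kernel's __printf() helper, and model_vemit()/model_emit() are
hypothetical functions that format into a buffer rather than logging. The
second attribute argument is 0 for the va_list variant, since there are no
variadic arguments to check against the format string.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Assumed equivalent of the kernel's __printf(a, b) for GCC/Clang. */
#define model_printf(a, b) __attribute__((format(printf, a, b)))

/* va_list variant: format string is argument 3, nothing to check (0). */
static inline model_printf(3, 0)
int model_vemit(char *buf, size_t len, const char *fmt, va_list args)
{
	assert(fmt);
	return vsnprintf(buf, len, fmt, args);
}

/* Variadic variant: format is argument 3, checked arguments start at 4. */
static inline model_printf(3, 4)
int model_emit(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = model_vemit(buf, len, fmt, args);
	va_end(args);
	return ret;
}
```

Marking the stub "static inline" also matters in a header: a plain "static"
definition is emitted, and possibly warned about as unused, in every file
that includes it.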



Re: [PATCH 05/10] mm, util: Use dup_user to duplicate user memory

2012-09-25 Thread Ezequiel Garcia

Hi Andrew,

On Tue, Sep 25, 2012 at 6:29 PM, Andrew Morton
 wrote:
> On Sat,  8 Sep 2012 17:47:54 -0300
> Ezequiel Garcia  wrote:
>
>> Previously the strndup_user allocation was being done through memdup_user,
>> and the caller was wrongly traced as being strndup_user
>> (the correct trace must report the caller of strndup_user).
>>
>> This is a common problem: in order to get accurate callsite tracing,
>> a utils function can't allocate through another utils function,
>> but must instead do the allocation itself (or inline it).
>>
>> Here we fix this by creating an always-inlined dup_user() function to
>> perform the real allocation and to be used by memdup_user and strndup_user.
>
> This patch increases util.o's text size by 238 bytes.  A larger kernel
> with a worsened cache footprint.
>
> And we did this to get marginally improved tracing output?  This sounds
> like a bad tradeoff to me.
>

Mmm, that's a bad tradeoff indeed.
It's certainly odd, since the patch shouldn't increase the text size
*that* much.
Would you mind sending your kernel config and gcc version?

My build (x86 kernel, gcc 4.7.1) shows a much smaller increase:

$ readelf -s util-dup-user.o | grep dup_user
   161: 1c10   108 FUNC    GLOBAL DEFAULT    1 memdup_user
   169: 1df0   159 FUNC    GLOBAL DEFAULT    1 strndup_user
$ readelf -s util.o | grep dup_user
   161: 1c10   108 FUNC    GLOBAL DEFAULT    1 memdup_user
   169: 1df0    98 FUNC    GLOBAL DEFAULT    1 strndup_user

$ size util.o
   text    data     bss     dec     hex filename
  18319    2077       0   20396    4fac util.o
$ size util-dup-user.o
   text    data     bss     dec     hex filename
  18367    2077       0   20444    4fdc util-dup-user.o

Am I doing anything wrong?
If you still feel this is unnecessary bloat, perhaps I could think of
something depending on CONFIG_TRACING (though I know
we all hate those nasty ifdefs).

Anyway, thanks for the review,
Ezequiel.

[PATCH] x86: use the correct macros

2012-09-25 Thread Yasuaki Ishimatsu

This patch replaces hard-coded -1 values with the appropriate macros,
NUMA_NO_NODE and PXM_INVAL.

CC: Len Brown 
CC: Thomas Gleixner 
CC: Ingo Molnar 
CC: H. Peter Anvin 
Signed-off-by: Yasuaki Ishimatsu 
---
 arch/x86/kernel/acpi/boot.c |2 +-
 drivers/acpi/numa.c |4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

Index: linux-3.6-rc5/arch/x86/kernel/acpi/boot.c
===
--- linux-3.6-rc5.orig/arch/x86/kernel/acpi/boot.c  2012-09-13 15:44:30.0 +0900
+++ linux-3.6-rc5/arch/x86/kernel/acpi/boot.c   2012-09-13 15:46:31.743850426 +0900
@@ -601,7 +601,7 @@ static void __cpuinit acpi_map_cpu2node(
int nid;
 
nid = acpi_get_node(handle);
-   if (nid == -1 || !node_online(nid))
+   if (nid == NUMA_NO_NODE || !node_online(nid))
return;
set_apicid_to_node(physid, nid);
numa_set_node(cpu, nid);
Index: linux-3.6-rc5/drivers/acpi/numa.c
===
--- linux-3.6-rc5.orig/drivers/acpi/numa.c  2012-09-13 15:44:59.0 +0900
+++ linux-3.6-rc5/drivers/acpi/numa.c   2012-09-13 15:46:03.079850552 +0900
@@ -327,12 +327,12 @@ int acpi_get_pxm(acpi_handle h)
return pxm;
status = acpi_get_parent(handle, &phandle);
} while (ACPI_SUCCESS(status));
-   return -1;
+   return PXM_INVAL;
 }
 
 int acpi_get_node(acpi_handle *handle)
 {
-   int pxm, node = -1;
+   int pxm, node = NUMA_NO_NODE;
 
pxm = acpi_get_pxm(handle);
if (pxm >= 0 && pxm < MAX_PXM_DOMAINS)


[PATCH v2] dma-debug: New interfaces to debug dma mapping errors

2012-09-25 Thread Shuah Khan

A recent dma mapping error analysis effort showed that a large percentage
of dma_map_single() and dma_map_page() returns are not checked for mapping
errors.

Reference:
http://linuxdriverproject.org/mediawiki/index.php/DMA_Mapping_Error_Analysis

Adding support for tracking dma mapping and unmapping errors to help assess
the following:

When do dma mapping errors get detected?
How often do these errors occur?
Why don't we see failures related to missing dma mapping error checks?
Are they silent failures?

Enhance dma-debug infrastructure to track dma mapping, and unmapping errors.

dma_map_errors: (system wide counter)
  Total number of dma mapping errors returned by the dma mapping interfaces,
  in response to mapping requests from all devices in the system.
dma_map_errors_not_checked: (system wide counter)
  Total number of dma mapping errors devices failed to check before using
  the returned address.
dma_unmap_errors: (system wide counter)
  Total number of times devices tried to unmap or free an invalid dma
  address.
dma_map_error_flag: (new field added to dma_debug_entry structure)
  New field to maintain dma mapping error check status. This flag is applicable
  to dma map page and dma map single entries tracked by dma-debug API. This
  status indicates whether or not a good mapping is checked by the device
  before it is used. dma_map_single() and dma_map_page() could fail to create
  a mapping in some cases, and drivers are expected to call dma_mapping_error()
  to check for errors.

Enhancements to dma-debug API are made to add new debugfs interfaces to
report total dma errors, dma errors that are not checked, and unmap errors
for the entire system. Please note that these are system wide counters for
all devices in the system.

The following new dma-debug interface is added:

debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr):
Sets dma map error checked status for the dma map entry if one is
found. Decrements the system wide dma_map_errors_not_checked counter
that is incremented by debug_dma_map_page() when it checks for
mapping error before adding it to the dma debug entry table.
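
The counter behaviour described so far can be sketched in user space as
below. This is a model under assumptions, not the dma-debug implementation:
the two structs and the model_* hooks are hypothetical, and the rule that
only the first check of a failed mapping decrements the unchecked counter is
an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the system wide counters named in this patch. */
struct dma_err_model {
	unsigned int dma_map_errors;
	unsigned int dma_map_errors_not_checked;
};

/* Hypothetical per-mapping state, standing in for dma_map_error_flag. */
struct dma_entry_model {
	bool failed;	/* the mapping call returned an invalid address */
	bool checked;	/* the driver has checked for a mapping error */
};

/* Map-time hook: a failed mapping counts as an error and as unchecked. */
static void model_map_page(struct dma_err_model *c,
			   struct dma_entry_model *e, bool failed)
{
	e->failed = failed;
	e->checked = false;
	if (failed) {
		c->dma_map_errors++;
		c->dma_map_errors_not_checked++;
	}
}

/* Check-time hook: the first check of a failed mapping clears "unchecked". */
static void model_mapping_error(struct dma_err_model *c,
				struct dma_entry_model *e)
{
	if (e->failed && !e->checked) {
		assert(c->dma_map_errors_not_checked > 0);
		c->dma_map_errors_not_checked--;
	}
	e->checked = true;
}
```

In this model, two failed mappings of which only one is checked leave
dma_map_errors at 2 and dma_map_errors_not_checked at 1.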

The following existing dma-debug APIs are changed to support this feature:

debug_dma_map_page():
Increments dma_map_errors and dma_map_errors_not_checked error totals
for the system when dma_addr is invalid. Please note that this routine
can no longer call dma_mapping_error(), because of the newly added
debug_dma_mapping_error() interface. Calling this routine at the time
dma error unchecked state is registered, will not help if state gets
changed right away.

check_unmap():
This is an existing internal routine that checks for various mapping
errors. Changed to increment system wide dma_unmap_errors, when a
device requests an invalid address to be unmapped. Please note that
this routine can no longer call dma_mapping_error(), because of the
newly added debug_dma_mapping_error() interface. Calling
dma_mapping_error() from this routine will change the dma map error
flag erroneously.

Changed arch/x86/include/asm/dma-mapping.h to call debug_dma_mapping_error()
to validate these new interfaces on x86_64. Other architectures will be
changed in a subsequent patch.

Tested: Intel iommu and swiotlb (iommu=soft) on x86-64 with
CONFIG_DMA_API_DEBUG enabled and disabled.

Signed-off-by: Shuah Khan 
---
 Documentation/DMA-API.txt  |   13 +
 arch/x86/include/asm/dma-mapping.h |1 +
 include/linux/dma-debug.h  |7 +++
 lib/dma-debug.c|   92 ++--
 4 files changed, 109 insertions(+), 4 deletions(-)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 66bd97a..59d58b9 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -638,6 +638,19 @@ this directory the following files can currently be found:
dma-api/error_count This file is read-only and shows the total
numbers of errors found.
 
+   dma-api/dma_map_errors  This file is read-only and shows the total
+   number of dma mapping errors detected.
+
+   dma-api/dma_map_errors_not_checked
+   This file is read-only and shows the total
+   number of dma mapping errors, device drivers
+   failed to check prior to using the returned
+   address.
+
+   dma-api/dma_unmap_errors
+   This file is read-only and shows the total
+   number of invalid dma unmapping attempts.
+
dma-api/num_errors  The number in this file shows how many
warnings will be printed to the kernel log
before it stops. This

Re: [PATCH v5 0/25] Generic Red-Black Trees (still WIP)

2012-09-25 Thread Daniel Santos


>> Q
>> ===
>> Q: Why did you add BUILD_BUG_ON_NON_CONST() and
>>BUILD_BUG_ON_NON_CONST42()?
> A: Because BUILD_BUG_ON_NON_CONST42() will crash if it does not result
> in the answer to life, the universe and everything!

By the way, I have a theory: before time, God was writing code on some
cosmic computer (beyond our understanding, of course) when he
accidentally tried to divide by zero, resulting in a core dump that we
now know as the universe we live in.  So thus, we are just the excrement
of some celestial computer after it failed to properly execute its code.

Daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] x86: rename mp_register_lapic in a comment

2012-09-25 Thread Yasuaki Ishimatsu

Commit 31d2092eb0c23636b73d2c24c0c11b66470cef58 (x86: move
mp_register_lapic_address to boot.c) renamed mp_register_lapic
to acpi_register_lapic. But mp_register_lapic remains in a
comment. So this patch renames it there.

CC: Len Brown 
CC: Thomas Gleixner 
CC: Ingo Molnar 
CC: H. Peter Anvin 
Signed-off-by: Yasuaki Ishimatsu 
---
 arch/x86/kernel/acpi/boot.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-3.6-rc5/arch/x86/kernel/acpi/boot.c
===
--- linux-3.6-rc5.orig/arch/x86/kernel/acpi/boot.c  2012-09-19 11:38:03.990715466 +0900
+++ linux-3.6-rc5/arch/x86/kernel/acpi/boot.c   2012-09-26 09:50:42.269534856 +0900
@@ -656,7 +656,7 @@ static int __cpuinit _acpi_map_lsapic(ac
acpi_register_lapic(physid, ACPI_MADT_ENABLED);
 
/*
-* If mp_register_lapic successfully generates a new logical cpu
+* If acpi_register_lapic successfully generates a new logical cpu
 * number, then the following will get us exactly what was mapped
 */
cpumask_andnot(new_map, cpu_present_mask, tmp_map);


Re: [RFC v4 Patch 0/4] fs/inode.c: optimization for inode lock usage

2012-09-25 Thread Dave Chinner

On Tue, Sep 25, 2012 at 04:59:55PM +0800, Guo Chao wrote:
> On Mon, Sep 24, 2012 at 06:26:54PM +1000, Dave Chinner wrote:
> > @@ -783,14 +783,19 @@ static void __wait_on_freeing_inode(struct inode *inode);
> >  static struct inode *find_inode(struct super_block *sb,
> > struct hlist_head *head,
> > int (*test)(struct inode *, void *),
> > -   void *data)
> > +   void *data, bool locked)
> >  {
> > struct hlist_node *node;
> > struct inode *inode = NULL;
> > 
> >  repeat:
> > -   hlist_for_each_entry(inode, node, head, i_hash) {
> > +   rcu_read_lock();
> > +   hlist_for_each_entry_rcu(inode, node, head, i_hash) {
> > spin_lock(&inode->i_lock);
> > +   if (inode_unhashed(inode)) {
> > +   spin_unlock(&inode->i_lock);
> > +   continue;
> > +   }
> 
> Is this check too early? If the unhashed inode happened to be the target
> inode, we are wasting our time to continue the traversal and we do not wait 
> on it.

If the inode is unhashed, then it is already passing through evict()
or has already passed through. If it has already passed through
evict() then it is too late to call __wait_on_freeing_inode() as the
wakeup occurs in evict() immediately after the inode is removed
from the hash. i.e:

remove_inode_hash(inode);

spin_lock(&inode->i_lock);
wake_up_bit(&inode->i_state, __I_NEW);
BUG_ON(inode->i_state != (I_FREEING | I_CLEAR));
spin_unlock(&inode->i_lock);

i.e. if we get the case:

Thread 1, RCU hash traversal        Thread 2, evicting foo

rcu_read_lock()
found inode foo
                                    remove_inode_hash(inode);
                                    spin_lock(&inode->i_lock);
                                    wake_up(I_NEW)
                                    spin_unlock(&inode->i_lock);
                                    destroy_inode()
                                    ..
spin_lock(foo->i_lock)
match sb, ino
I_FREEING
  rcu_read_unlock()

  wait_on_freeing_inode
    wait_on_bit(I_NEW)

Hence if the inode is unhashed, it doesn't matter what inode it is,
it is never valid to use it any further because it may have already
been freed and the only reason we can safely access it here is that
the RCU grace period will not expire until we call
rcu_read_unlock().
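
As a rough illustration of the rule above, here is a single-threaded
userspace sketch, not the kernel code. The names toy_inode and
toy_find_inode and the unhashed field are made up for this example; the
real lookup runs under rcu_read_lock() and takes i_lock before testing
inode_unhashed(). The only point it shows is that an unhashed entry is
skipped even when it is the one being searched for.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct inode; all names here are illustrative. */
struct toy_inode {
	unsigned long ino;
	bool unhashed;			/* models inode_unhashed(inode) */
	struct toy_inode *next;		/* models the i_hash chain */
};

/*
 * Walk the chain and return the first hashed entry matching ino.
 * An unhashed entry is skipped even when its number matches, because
 * it is already in, or past, eviction and must not be used.
 */
static struct toy_inode *toy_find_inode(struct toy_inode *head,
					unsigned long ino)
{
	struct toy_inode *inode;

	for (inode = head; inode; inode = inode->next) {
		if (inode->unhashed)
			continue;
		if (inode->ino == ino)
			return inode;
	}
	return NULL;
}
```

With a chain holding an unhashed inode 7 followed by a hashed inode 7,
the walk returns the second one; if both are unhashed it returns NULL
and the caller would go on to allocate a fresh inode.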

> > @@ -1078,8 +1098,7 @@ struct inode *iget_locked(struct super_block *sb, unsigned long ino)
> > struct inode *old;
> > 
> > spin_lock(&inode_hash_lock);
> > -   /* We released the lock, so.. */
> > -   old = find_inode_fast(sb, head, ino);
> > +   old = find_inode_fast(sb, head, ino, true);
> > if (!old) {
> > inode->i_ino = ino;
> > spin_lock(&inode->i_lock);
> 
> E ... couldn't we use memory barrier API instead of irrelevant spin
> lock on newly allocated inode to publish I_NEW?

Yes, we could.

However, having multiple synchronisation methods for a single
variable that should only be used in certain circumstances is
something that is easy to misunderstand and get wrong. Memory
barriers are much more subtle and harder to understand than spin
locks, and every memory barrier needs to be commented to explain
what the barrier is actually protecting against.

In the case where a spin lock is guaranteed to be uncontended and
the cache line hot in the CPU cache, it makes no sense to replace
the spin lock with a memory barrier, especially when every other
place we modify the i_state/i_hash fields we have to wrap them
with i_lock.

Simple code is good code - save the complexity for something that
needs it.

> I go through many mails of the last trend of scaling VFS. Many patches
> seem quite natural, say RCU inode lookup

Sure, but the implementation in those RCU lookup patches sucked.

> or per-bucket inode hash lock or 

It was a bad idea. At minimum, you can't use lockdep on it. Worse
for the realtime guys is the fact it can't be converted to a
sleeping lock. Worst was the refusal to change it in any way to
address concerns.

And realistically, the fundamental problem is not with the
inode_hash_lock, it's with the fact that the cache is based on a
hash table rather than a more scalable structure like a radix tree
or btree. This is a primary reason for XFS having its own inode
cache - hashes can only hold a certain number of entries before
performance collapses catastrophically and so don't scale well to
tens or hundreds of millions of entries.

> per-superblock inode list lock,

Because it isn't a particularly hot lock, and given that
most workloads hit on a single filesystem, scalability is not
improved by making this change. As such, as long as there is a
single linked list used to iterate all inodes in the superblock,
a single lock is as good as scalability will get.

> did

Re: [PATCH] slab: Ignore internal flags in cache creation

2012-09-25 Thread David Rientjes

On Tue, 25 Sep 2012, Christoph Lameter wrote:

> > No cache should ever pass those as a creation flags. We can just ignore
> > this bit if it happens to be passed (such as when duplicating a cache in
> > the kmem memcg patches)
> 
> Acked-by: Christoph Lameter 
> 

Nack, this is already handled by CREATE_MASK in the mm/slab.c allocator; 
the flag extensions beyond those defined in the generic slab.h header are 
implementation defined.  It may be true that SLAB uses a bit only 
internally (and already protects it with a BUG_ON() in 
__kmem_cache_create()) but that doesn't mean other implementations can't 
use such a flag that would be a no-op on another allocator.
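
To make the point about implementation-defined flag bits concrete, here
is a small userspace sketch. Every name and bit value in it is
hypothetical (the real SLAB_* flags and CREATE_MASK in mm/slab.c
differ); it only shows an allocator validating creation flags against
its own mask.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real SLAB_* values differ. */
#define TOY_SLAB_HWCACHE_ALIGN	(1ul << 0)	/* generic, public flag */
#define TOY_SLAB_PANIC		(1ul << 1)	/* generic, public flag */
#define TOY_INTERNAL_FLAG	(1ul << 31)	/* allocator-private bit */

/* Flags this particular allocator accepts from callers. */
#define TOY_CREATE_MASK	(TOY_SLAB_HWCACHE_ALIGN | TOY_SLAB_PANIC)

/* True if the caller passed only flags this allocator accepts. */
static bool toy_flags_valid(unsigned long flags)
{
	return (flags & ~TOY_CREATE_MASK) == 0;
}
```

Because the mask belongs to one allocator, a different allocator could
define its own mask and give the same high bit a meaning, which is why
stripping the bit in common code is not obviously right.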

Re: [PATCH 8/9] mm: compaction: Cache if a pageblock was scanned and no pages were isolated

2012-09-25 Thread Minchan Kim

On Tue, Sep 25, 2012 at 10:12:07AM +0100, Mel Gorman wrote:
> On Mon, Sep 24, 2012 at 02:26:44PM -0700, Andrew Morton wrote:
> > On Mon, 24 Sep 2012 10:39:38 +0100
> > Mel Gorman  wrote:
> > 
> > > On Fri, Sep 21, 2012 at 02:36:56PM -0700, Andrew Morton wrote:
> > > 
> > > > Also, what has to be done to avoid the polling altogether?  eg/ie, zap
> > > > a pageblock's PB_migrate_skip synchronously, when something was done to
> > > > that pageblock which justifies repolling it?
> > > > 
> > > 
> > > The "something" event you are looking for is pages being freed or
> > > allocated in the page allocator. A movable page being allocated in block
> > > or a page being freed should clear the PB_migrate_skip bit if it's set.
> > > Unfortunately this would impact the fast path of the alloc and free paths
> > > of the page allocator. I felt that that was too high a price to pay.
> > 
> > We already do a similar thing in the page allocator: clearing of
> > ->all_unreclaimable and ->pages_scanned. 
> 
> That is true but that is a simple write (shared cache line but still) to
> a struct zone. Worse, now that you point it out, that's pretty stupid. It
> should be checking if the value is non-zero before writing to it to avoid
> a cache line bounce.
> 
> Clearing the PG_migrate_skip in this path to avoid the need to ever poll is
> not as cheap as it needs to
> 
> set_pageblock_skip
>   -> set_pageblock_flags_group
> -> page_zone
> -> page_to_pfn
> -> get_pageblock_bitmap
> -> pfn_to_bitidx
> -> __set_bit
> 
> > But that isn't on the "fast
> > path" really - it happens once per pcp unload. 
> 
> That's still an important enough path that I'm wary of making it fatter
> and that only covers the free path. To avoid the polling, the allocation
> side needs to be handled too. It could be shoved down into rmqueue() to
> put it into a slightly colder path but still, it's a price to pay to keep
> compaction happy.
> 
> > Can we do something
> > like that?  Drop some hint into the zone without having to visit each
> > page?
> > 
> 
> Not without incurring a cost, but yes, it is possible to give a hint on when
> PG_migrate_skip should be cleared and move away from that time-based hammer.
> 
> First, we'd introduce a variant of get_pageblock_migratetype() that returns
> all the bits for the pageblock flags and then helpers to extract either the
> migratetype or the PG_migrate_skip. We already are incurring the cost of
> get_pageblock_migratetype() so it will not be much more expensive than what
> is already there. If there is an allocation or free within a pageblock that
> has the PG_migrate_skip bit set then we increment a counter. When the counter
> reaches some to-be-decided "threshold" then compaction may clear all the
> bits. This would match the criteria of the clearing being based on activity.
> 
> There are four potential problems with this
> 
> 1. The logic to retrieve all the bits and split them up will be a little
>convoluted but maybe it would not be that bad.
> 
> 2. The counter is a shared-writable cache line but obviously it could
>be moved to vmstat and incremented with inc_zone_page_state to offset
>the cost a little.
> 
> 3. The biggest weakness is that there is no way to know if the
>counter is incremented based on activity in a small subset of blocks.
> 
> 4. What should the threshold be?
> 
> The first problem is minor but the other three are potentially a mess.
> Adding another vmstat counter is bad enough in itself but if the counter
> is incremented based on a small subset of pageblocks, the hint becomes
> potentially useless.

Another idea is to add two bits (PG_check_migrate/PG_check_free)
to pageblock_flags_group.
In the allocation path, we set PG_check_migrate in the pageblock.
In the free path, we set PG_check_free in the pageblock.
Both are cleared by compaction's scan, as now.
That way we can discard problems 3 and 4 at least.
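
A minimal userspace sketch of the two-bit idea, assuming one flag word
per pageblock. The bit values, toy_pageblock and the helper names are
invented for illustration and are not the kernel's pageblock flags
layout:

```c
#include <stdbool.h>

/* Hypothetical per-pageblock hint bits; values are illustrative only. */
#define TOY_CHECK_MIGRATE	(1u << 0)	/* set on allocation in the block */
#define TOY_CHECK_FREE		(1u << 1)	/* set on free in the block */

struct toy_pageblock {
	unsigned int flags;
};

/* Allocation path: note that this block saw an allocation. */
static void toy_note_alloc(struct toy_pageblock *pb)
{
	pb->flags |= TOY_CHECK_MIGRATE;
}

/* Free path: note that this block saw a free. */
static void toy_note_free(struct toy_pageblock *pb)
{
	pb->flags |= TOY_CHECK_FREE;
}

/*
 * Called by the scanner: report whether the block saw activity of the
 * given kind since the last scan, then clear that hint.
 */
static bool toy_scan_and_clear(struct toy_pageblock *pb, unsigned int bit)
{
	bool active = pb->flags & bit;

	pb->flags &= ~bit;
	return active;
}
```

Since the hint lives in the pageblock itself, there is no shared
counter and no threshold to pick; the cost is still one extra write in
the allocation and free paths.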

Another idea is to cure it by fixing the fundamental problem:
make the zone's locks more fine-grained.
As time goes by, systems use more memory, but our zone locking
isn't scalable. Reports of lru_lock and zone->lock contention are no
longer rare, so I think it's a good time to take the next step.

How about defining a struct sub_zone per 2G or 4G?
A zone could then hold several sub_zones depending on its size; the
sub_zone would take over the current zone's role and the zone would
just be a container of sub_zones.
Of course, it's not easy to implement, but I think someday we should
go that way. Is it really overkill?

> 
> However, does this match what you have in mind or am I over-complicating
> things?
> 
> > > > >
> > > > > ...
> > > > >
> > > > > +static void reset_isolation_suitable(struct zone *zone)
> > > > > +{
> > > > > + unsigned long start_pfn = zone->zone_start_pfn;
> > > > > + unsigned long end_pfn = zone->zone_start_pfn + 
> > > > > zone->spanned_pages;
> > > > > + unsigned long pfn;
> > > > > +
> > > > > + /*
> > > > > +  * Do not reset more than once

[PATCH] Thermal: Fix bug on cpu_cooling, cooling device's id conflict problem.

2012-09-25 Thread Jonghwa Lee

This patch fixes a small bug in cpu_cooling. Each CPU cooling device has
its own id, generated with the idr method. However, in the previous
version that id was overwritten with 0 at the last stage of probing, so
every device ended up with the same id. This causes id collisions and
also an error when the id is released.

Signed-off-by: Jonghwa Lee 
---
 drivers/thermal/cpu_cooling.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 99a5d75..9050c1b 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -351,7 +351,7 @@ struct thermal_cooling_device *cpufreq_cooling_register(
struct cpufreq_cooling_device *cpufreq_dev = NULL;
unsigned int cpufreq_dev_count = 0, min = 0, max = 0;
char dev_name[THERMAL_NAME_LENGTH];
-   int ret = 0, id = 0, i;
+   int ret = 0, i;
struct cpufreq_policy policy;
 
list_for_each_entry(cpufreq_dev, &cooling_cpufreq_list, node)
@@ -396,7 +396,6 @@ struct thermal_cooling_device *cpufreq_cooling_register(
kfree(cpufreq_dev);
return ERR_PTR(-EINVAL);
}
-   cpufreq_dev->id = id;
cpufreq_dev->cool_dev = cool_dev;
cpufreq_dev->cpufreq_state = 0;
mutex_lock(&cooling_cpufreq_lock);
-- 
1.7.4.1
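
The id collision described in the commit message above can be
illustrated with a small userspace model. The fixed-size pool below
stands in for the idr, and all toy_* names are hypothetical; this is a
sketch of the failure mode, not the driver code.

```c
#include <assert.h>

#define TOY_MAX_IDS 8

/* Toy id pool standing in for the idr; names are illustrative. */
static int toy_id_used[TOY_MAX_IDS];

/* Hand out the lowest free id, or -1 if the pool is exhausted. */
static int toy_get_id(void)
{
	for (int i = 0; i < TOY_MAX_IDS; i++) {
		if (!toy_id_used[i]) {
			toy_id_used[i] = 1;
			return i;
		}
	}
	return -1;
}

/* Release an id; returns -1 if it was not allocated (double release). */
static int toy_release_id(int id)
{
	assert(id >= 0 && id < TOY_MAX_IDS);
	if (!toy_id_used[id])
		return -1;
	toy_id_used[id] = 0;
	return 0;
}

struct toy_cooling_dev {
	int id;
};

/*
 * Register a device. With clobber_id set, the allocated id is
 * overwritten with 0, as the pre-patch code did.
 */
static void toy_register(struct toy_cooling_dev *dev, int clobber_id)
{
	dev->id = toy_get_id();
	if (clobber_id)
		dev->id = 0;
}
```

Registering two devices with clobber_id set leaves both with id 0:
releasing the first succeeds, releasing the second fails, and the id
that was really allocated to the second device is never returned.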


[PATCH] gpio: add TS-5500 DIO headers support

2012-09-25 Thread Vivien Didelot

The Technologic Systems TS-5500 platform provides 3 digital I/O headers:
DIO1, DIO2, and the LCD port, that may be used as a DIO header.

Signed-off-by: Vivien Didelot 
Signed-off-by: Jerome Oufella 
---
 drivers/gpio/Kconfig  |   8 +
 drivers/gpio/Makefile |   1 +
 drivers/gpio/gpio-ts5500.c| 344 ++
 include/linux/platform_data/gpio-ts5500.h |  34 +++
 4 files changed, 387 insertions(+)
 create mode 100644 drivers/gpio/gpio-ts5500.c
 create mode 100644 include/linux/platform_data/gpio-ts5500.h

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index ba7926f5..f8bf8e1 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -395,6 +395,14 @@ config GPIO_TPS65912
help
  This driver supports TPS65912 gpio chip
 
+config GPIO_TS5500
+   tristate "TS-5500 DIO Headers"
+   depends on TS5500
+   help
+ This driver supports the 3 Digital I/O headers of the Technologic
+ Systems TS-5500 platform: DIO1, DIO2, and the LCD port which may be
+ used as a DIO header.
+
 config GPIO_TWL4030
tristate "TWL4030, TWL5030, and TPS659x0 GPIOs"
depends on TWL4030_CORE
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 153cace..48e8670 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -66,6 +66,7 @@ obj-$(CONFIG_ARCH_DAVINCI_TNETV107X) += gpio-tnetv107x.o
 obj-$(CONFIG_GPIO_TPS6586X)+= gpio-tps6586x.o
 obj-$(CONFIG_GPIO_TPS65910)+= gpio-tps65910.o
 obj-$(CONFIG_GPIO_TPS65912)+= gpio-tps65912.o
+obj-$(CONFIG_GPIO_TS5500)  += gpio-ts5500.o
 obj-$(CONFIG_GPIO_TWL4030) += gpio-twl4030.o
 obj-$(CONFIG_GPIO_UCB1400) += gpio-ucb1400.o
 obj-$(CONFIG_GPIO_VR41XX)  += gpio-vr41xx.o
diff --git a/drivers/gpio/gpio-ts5500.c b/drivers/gpio/gpio-ts5500.c
new file mode 100644
index 0000000..d91fee9
--- /dev/null
+++ b/drivers/gpio/gpio-ts5500.c
@@ -0,0 +1,344 @@
+/*
+ * GPIO (DIO) driver for Technologic Systems TS-5500
+ *
+ * Copyright (c) 2010-2012 Savoir-faire Linux Inc.
+ * Vivien Didelot 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * The TS-5500 platform has 38 Digital Input/Output lines (DIO), exposed by 3
+ * DIO headers: DIO1, DIO2, and the LCD port which may be used as a DIO header.
+ *
+ * The datasheet is available at:
+ * http://embeddedx86.com/documentation/ts-5500-manual.pdf.
+ * See section 6 "Digital I/O" for details about the pinout.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * This array describes the names of the DIO lines, but also the mapping between
+ * the datasheet, and corresponding offsets exposed by the driver.
+ */
+static const char * const ts5500_pinout[38] = {
+   /* DIO1 Header (offset 0-13) */
+   [0]  = "DIO1_0",  /* pin 1  */
+   [1]  = "DIO1_1",  /* pin 3  */
+   [2]  = "DIO1_2",  /* pin 5  */
+   [3]  = "DIO1_3",  /* pin 7  */
+   [4]  = "DIO1_4",  /* pin 9  */
+   [5]  = "DIO1_5",  /* pin 11 */
+   [6]  = "DIO1_6",  /* pin 13 */
+   [7]  = "DIO1_7",  /* pin 15 */
+   [8]  = "DIO1_8",  /* pin 4  */
+   [9]  = "DIO1_9",  /* pin 6  */
+   [10] = "DIO1_10", /* pin 8  */
+   [11] = "DIO1_11", /* pin 10 */
+   [12] = "DIO1_12", /* pin 12 */
+   [13] = "DIO1_13", /* pin 14 */
+
+   /* DIO2 Header (offset 14-26) */
+   [14] = "DIO2_0",  /* pin 1  */
+   [15] = "DIO2_1",  /* pin 3  */
+   [16] = "DIO2_2",  /* pin 5  */
+   [17] = "DIO2_3",  /* pin 7  */
+   [18] = "DIO2_4",  /* pin 9  */
+   [19] = "DIO2_5",  /* pin 11 */
+   [20] = "DIO2_6",  /* pin 13 */
+   [21] = "DIO2_7",  /* pin 15 */
+   [22] = "DIO2_8",  /* pin 4  */
+   [23] = "DIO2_9",  /* pin 6  */
+   [24] = "DIO2_10", /* pin 8  */
+   [25] = "DIO2_11", /* pin 10 */
+   [26] = "DIO2_13", /* pin 14 */
+
+   /* LCD Port as DIO (offset 27-37) */
+   [27] = "LCD_0",   /* pin 8  */
+   [28] = "LCD_1",   /* pin 7  */
+   [29] = "LCD_2",   /* pin 10 */
+   [30] = "LCD_3",   /* pin 9  */
+   [31] = "LCD_4",   /* pin 12 */
+   [32] = "LCD_5",   /* pin 11 */
+   [33] = "LCD_6",   /* pin 14 */
+   [34] = "LCD_7",   /* pin 13 */
+   [35] = "LCD_EN",  /* pin 5  */
+   [36] = "LCD_WR",  /* pin 6  */
+   [37] = "LCD_RS",  /* pin 3  */
+};
+
+#define IN  (1 << 0)
+#define OUT (1 << 1)
+#ifndef NO_IRQ
+#define NO_IRQ -1
+#endif
+
+/*
+ * This structure is used to describe capabilities of DIO lines,
+ * such as available directions, and mapped IRQ (if any).
+ */
+struct ts5500_dio {
+   const unsigned long value_addr;
+   const int value_bit;
+   const unsigned long control_addr;
+   const int control_bit;
+   const int irq;
+   const int direction;
+};
+

Re: [PATCH v3 05/26] md: Convert md_trim_bio() to use bio_advance()

2012-09-25 Thread NeilBrown

On Mon, 24 Sep 2012 15:34:45 -0700 Kent Overstreet wrote:

> Signed-off-by: Kent Overstreet 
> CC: Jens Axboe 
> CC: NeilBrown 
> ---
>  drivers/md/md.c | 19 +--
>  1 file changed, 5 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 7a2b079..51ce48c 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -190,25 +190,16 @@ void md_trim_bio(struct bio *bio, int offset, int size)
>   struct bio_vec *bvec;
>   int sofar = 0;
>  
> - size <<= 9;
>   if (offset == 0 && size == bio->bi_size)
>   return;

Safer to leave the "size <<= 9" where it is, otherwise that condition will
always fail.

Otherwise,
 Acked-by: NeilBrown 

NeilBrown


>  
> - bio->bi_sector += offset;
> - bio->bi_size = size;
> - offset <<= 9;
>   clear_bit(BIO_SEG_VALID, &bio->bi_flags);
>  
> - while (bio->bi_idx < bio->bi_vcnt &&
> -bio->bi_io_vec[bio->bi_idx].bv_len <= offset) {
> - /* remove this whole bio_vec */
> - offset -= bio->bi_io_vec[bio->bi_idx].bv_len;
> - bio->bi_idx++;
> - }
> - if (bio->bi_idx < bio->bi_vcnt) {
> - bio->bi_io_vec[bio->bi_idx].bv_offset += offset;
> - bio->bi_io_vec[bio->bi_idx].bv_len -= offset;
> - }
> + bio_advance(bio, offset << 9);
> +
> + size <<= 9;
> + bio->bi_size = size;
> +
>   /* avoid any complications with bi_idx being non-zero*/
>   if (bio->bi_idx) {
>   memmove(bio->bi_io_vec, bio->bi_io_vec+bio->bi_idx,
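
The point about the shift can be sketched in userspace as follows. This
assumes, as in the quoted code, that the size argument is in 512-byte
sectors while bi_size is in bytes; the toy_* names are made up for the
example.

```c
#include <stdbool.h>

#define TOY_SECTOR_SHIFT 9	/* 512-byte sectors */

/*
 * Model of the early-return test in md_trim_bio(): size_sectors is in
 * sectors while bi_size_bytes is in bytes, so the size must be shifted
 * before the two are compared.
 */
static bool toy_trim_is_noop(int offset, int size_sectors,
			     unsigned int bi_size_bytes)
{
	unsigned int size_bytes = (unsigned int)size_sectors << TOY_SECTOR_SHIFT;

	return offset == 0 && size_bytes == bi_size_bytes;
}

/* The same test with the shift forgotten, comparing sectors to bytes. */
static bool toy_trim_is_noop_unshifted(int offset, int size_sectors,
				       unsigned int bi_size_bytes)
{
	return offset == 0 && (unsigned int)size_sectors == bi_size_bytes;
}
```

For an 8-sector trim of a 4096-byte bio at offset 0, the first helper
reports a no-op and the second does not, so the early return would be
missed for every non-empty bio.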




Re: hot-added cpu is not assigned to the correct node

2012-09-25 Thread Yasuaki Ishimatsu


Hi Dan,

First of all, thank you for your comment.

2012/09/24 18:33, Dan Carpenter wrote:
> On Wed, Sep 12, 2012 at 02:33:11PM +0900, Yasuaki Ishimatsu wrote:
>> When I hot-added CPUs and memories simultaneously using container driver,
>> all the hot-added CPUs were mistakenly assigned to node0.
>
> Is this something which used to work correctly?  If so which was the
> most recent working kernel?

This is the first time I have hot-added a CPU on my x86 box, so I don't
know whether older kernels work correctly. But it seems that x86
does not permit creating a memory-less node, so I guess the problem
also occurs on older kernels.

Thanks,
Yasuaki Ishimatsu

> regards,
> dan carpenter


Re: [PATCH v4 3/3] tracing: format non-nanosec times from tsc clock without a decimal point.

2012-09-25 Thread David Sharp

On Tue, Sep 25, 2012 at 4:36 PM, Steven Rostedt  wrote:
> On Tue, 2012-09-25 at 15:29 -0700, David Sharp wrote:
>
>
>> >> + ret = trace_seq_printf(
>> >> + s, "[%08llx] %ld.%03ldms (+%ld.%03ldms): ",
>> >> + ns2usecs(iter->ts),
>> >> + abs_msec, abs_usec,
>> >> + rel_msec, rel_usec);
>> >> + } else if (verbose && !in_ns) {
>> >> + ret = trace_seq_printf(
>> >> + s, "[%016llx] %lld (+%lld): ",
>> >> + iter->ts, abs_ts, rel_ts);
>> >> + } else { /* !verbose */
>> >> + ret = trace_seq_printf(
>> >> + s, " %4lld%s%c: ",
>> >> + abs_ts,
>> >> + in_ns ? "us" : "",
>> >> + rel_ts > mark_thresh ? '!' :
>> >> +   rel_ts > 1 ? '+' : ' ');
>>
>> I just noticed something about this: with x86-tsc clock, this will
>> always print a '+'. Does it matter? Also, is the 200k cycle threshold
>> for '!' okay? I guess the counter clock will always end up with rel_ts
>> == 1, so marks should never appear.
>>
>
> Actually, I'm thinking that counters should not add those annotations.
> As it just doesn't make sense.

Right. But they won't appear anyway, since the delta will always be 1.

wait, by "counters" are you including TSC? Surely that makes sense,
since it is a measurement of time.

Eh... sorry I brought it up. I don't really want to change it. I never
use the latency tracer, so I mostly just don't want to break it.
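
For reference, the annotation choice discussed above boils down to the
following. This is a standalone sketch that mirrors the ternary in the
quoted hunk; toy_latency_mark is a made-up name and the unsigned
parameter types are an assumption.

```c
/*
 * Pick the annotation character for a latency-format line, following
 * the ternary in the quoted hunk: '!' above the threshold, '+' for any
 * delta above 1, otherwise a blank.
 */
static char toy_latency_mark(unsigned long long rel_ts,
			     unsigned long long mark_thresh)
{
	if (rel_ts > mark_thresh)
		return '!';
	if (rel_ts > 1)
		return '+';
	return ' ';
}
```

With a threshold of 200000, a counter clock whose delta is always 1
never gets a mark, while a TSC delta of a few thousand cycles always
gets '+'.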

Re: [PATCH] pagemap: fix wrong KPF_THP on slab pages

2012-09-25 Thread David Rientjes

On Tue, 25 Sep 2012, Naoya Horiguchi wrote:

> KPF_THP can be set on non-huge compound pages like slab pages, because
> PageTransCompound only sees PG_head and PG_tail. Obviously this is a bug
> and breaks user space applications which look for thp via /proc/kpageflags.
> Currently thp is constructed only on anonymous pages, so this patch makes
> KPF_THP be set when both of PageAnon and PageTransCompound are true.
> 
> Changelog in v2:
>   - add a comment in code
> 
> Signed-off-by: Naoya Horiguchi 

Wouldn't PageTransCompound(page) && !PageHuge(page) && !PageSlab(page) be 
better for a future extension of thp support?
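
The difference between the two checks can be seen in a small userspace
model. The struct below just records the result of each page test as a
boolean; toy_page and the helper names are hypothetical, and the model
says nothing about how the kernel evaluates these tests.

```c
#include <stdbool.h>

/* Toy page descriptor; each field models one kernel page test. */
struct toy_page {
	bool trans_compound;	/* PageTransCompound(): PG_head or PG_tail */
	bool huge;		/* PageHuge(): hugetlbfs page */
	bool slab;		/* PageSlab() */
	bool anon;		/* PageAnon() */
};

/* Check proposed in the patch: compound and anonymous. */
static bool toy_kpf_thp_anon(const struct toy_page *p)
{
	return p->trans_compound && p->anon;
}

/* Alternative suggested in the reply: compound, not hugetlbfs, not slab. */
static bool toy_kpf_thp_exclude(const struct toy_page *p)
{
	return p->trans_compound && !p->huge && !p->slab;
}
```

Both helpers reject a slab page and accept an anonymous thp. They only
differ for a compound page that is neither anonymous, slab nor
hugetlbfs, which is the "future extension" case in question.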

Re: [PATCH 5/9] mm: compaction: Acquire the zone->lru_lock as late as possible

2012-09-25 Thread Minchan Kim

On Tue, Sep 25, 2012 at 02:39:31PM -0700, Andrew Morton wrote:
> On Tue, 25 Sep 2012 17:13:27 +0900
> Minchan Kim  wrote:
> 
> > I see. To me, your saying is better than current comment.
> > I hope comment could be more explicit.
> > 
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index df01b4e..f1d2cc7 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -542,8 +542,9 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> >  * splitting and collapsing (collapsing has already happened
> >  * if PageLRU is set) but the lock is not necessarily taken
> >  * here and it is wasteful to take it just to check transhuge.
> > -* Check transhuge without lock and skip if it's either a
> > -* transhuge or hugetlbfs page.
> > +* Check transhuge without lock and *skip* if it's either a
> > +* transhuge or hugetlbfs page because it's not safe to call
> > +* compound_order.
> >  */
> > if (PageTransHuge(page)) {
> > if (!locked)
> 
> Going a bit further:
> 
> --- a/mm/compaction.c~mm-compaction-acquire-the-zone-lru_lock-as-late-as-possible-fix
> +++ a/mm/compaction.c
> @@ -415,7 +415,8 @@ isolate_migratepages_range(struct zone *
>* if PageLRU is set) but the lock is not necessarily taken
>* here and it is wasteful to take it just to check transhuge.
>* Check transhuge without lock and skip if it's either a
> -  * transhuge or hugetlbfs page.
> +  * transhuge or hugetlbfs page because calling compound_order()
> +  * requires lru_lock to exclude isolation and splitting.
>*/
>   if (PageTransHuge(page)) {
>   if (!locked)
> _
> 
> 
> but...  the requirement to hold lru_lock for compound_order() is news
> to me.  It doesn't seem to be written down or explained anywhere, and
> one wonders why the cheerily undocumented compound_lock() doesn't have
> this effect.  What's going on here??

First of all, I don't know why we should mention hugetlbfs in the comment.
I don't know hugetlbfs well, so I took some time to look through the code,
but I couldn't find a place that sets PG_lru, so I'm not sure a hugetlbfs
page can reach this code. Please correct me if I'm wrong.

On THP, I think the compound_lock you mentioned is okay too, but I think
it's a sort of optimization, because we don't need both lru_lock and
compound_lock. If we hold lru_lock, we can prevent the race with
__split_huge_page_refcount, so the page can't be freed.

Namely, it's safe to call compound_order.

> 

-- 
Kind regards,
Minchan Kim

mmotm 2012-09-25-17-06 uploaded

2012-09-25 Thread akpm

The mm-of-the-moment snapshot 2012-09-25-17-06 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (3.x
or 3.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.

A git tree which contains the memory management portion of this tree is
maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
by Michal Hocko.  It contains the patches which are between the
"#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
file, http://www.ozlabs.org/~akpm/mmotm/series.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/?p=linux-mmotm.git;a=summary

To develop on top of mmotm git:

  $ git remote add mmotm git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
  $ git remote update mmotm
  $ git checkout -b topic mmotm/master
  
  $ git send-email mmotm/master.. [...]

To rebase a branch with older patches to a new mmotm release:

  $ git remote update mmotm
  $ git rebase --onto mmotm/master  topic




The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/?p=linux-mmots.git;a=summary

and use of this tree is similar to
http://git.cmpxchg.org/?p=linux-mmotm.git, described above.


This mmotm tree contains the following patches against 3.6-rc7:
(patches marked "*" will be included in linux-next)

  origin.patch
* pwm-backlight-take-over-maintenance.patch
* checksyscalls-fix-here-document-handling.patch
* lib-flex_proportionsc-fix-corruption-of-denominator-in-flexible-proportions.patch
* c-r-prctl-fix-build-error-for-no-mmu-case.patch
* thp-avoid-vm_bug_on-page_countpage-false-positives-in-__collapse_huge_page_copy.patch
* pagemap-fix-wrong-kpf_thp-on-slab-pages.patch
  linux-next.patch
  i-need-old-gcc.patch
  arch-alpha-kernel-systblss-remove-debug-check.patch
* cris-fix-i-o-macros.patch
* selinux-fix-sel_netnode_insert-suspicious-rcu-dereference.patch
* vfs-d_obtain_alias-needs-to-use-as-default-name.patch
* cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved.patch
* cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved-fix.patch
* acpi_memhotplugc-fix-memory-leak-when-memory-device-is-unbound-from-the-module-acpi_memhotplug.patch
* acpi_memhotplugc-free-memory-device-if-acpi_memory_enable_device-failed.patch
* acpi_memhotplugc-remove-memory-info-from-list-before-freeing-it.patch
* acpi_memhotplugc-dont-allow-to-eject-the-memory-device-if-it-is-being-used.patch
* acpi_memhotplugc-bind-the-memory-device-when-the-driver-is-being-loaded.patch
* acpi_memhotplugc-auto-bind-the-memory-device-which-is-hotplugged-before-the-driver-is-loaded.patch
* arch-x86-platform-iris-irisc-register-a-platform-device-and-a-platform-driver.patch
* x86-numa-dont-check-if-node-is-numa_no_node.patch
* arch-x86-tools-insn_sanityc-identify-source-of-messages.patch
* audith-replace-defines-with-c-stubs.patch
* mn10300-only-add-mmem-funcs-to-kbuild_cflags-if-gcc-supports-it.patch
* dma-dmaengine-lower-the-priority-of-failed-to-get-dma-channel-message.patch
* pcmcia-move-unbind-rebind-into-dev_pm_opscomplete.patch
* prctl-use-access_ok-instead-of-task_size-in-prctl_set_mm.patch
* drm-i915-optimize-div_round_closest-call.patch
* gpu-drm-ttm-use-copy_highpage.patch
  cyber2000fb-avoid-palette-corruption-at-higher-clocks.patch
* timeconstpl-remove-deprecated-defined-array.patch
* time-dont-inline-export_symbol-functions.patch
* kbuild-make-fix-if_changed-when-command-contains-backslashes.patch
* h8300-select-generic-atomic64_t-support.patch
* unicore32-select-generic-atomic64_t-support.patch
* readahead-fault-retry-breaks-mmap-file-read-random-detection.patch
* drivers-scsi-atp870uc-fix-bad-use-of-udelay.patch
* cciss-cleanup-bitops-usage.patch
* cciss-use-check_signature.patch
* block-store-partition_meta_infouuid-as-a-string.patch
* init-reduce-partuuid-min-length-to-1-from-36.patch
*

Re: [RFC PATCH 2/17] input: RMI4 core bus and sensor drivers.

2012-09-25 Thread Christopher Heiny

Sorry about the delay in following up on this one - we're going over the 
feedback, and realized we'd missed this comment.

On 08/23/2012 01:55 AM, Linus Walleij wrote:

>+/* Create templates for given types */
>+#define simple_show_union_struct_unsigned(regtype, propname)\
>+simple_show_union_struct(regtype, propname, "%u\n")
>+
>+#define simple_show_union_struct_unsigned2(regtype, reg_group, propname)\
>+simple_show_union_struct2(regtype, reg_group, propname, "%u\n")
>+
>+#define show_union_struct_unsigned(regtype, reg_group, propname)\
>+show_union_struct(regtype, reg_group, propname, "%u\n")
>+
>+#define show_store_union_struct_unsigned(regtype, reg_group, propname)\
>+show_store_union_struct(regtype, reg_group, propname, "%u\n")
>+
>+#define show_repeated_union_struct_unsigned(regtype, reg_group, propname)\
>+show_repeated_union_struct(regtype, reg_group, propname, "%u")
>+
>+#define show_store_repeated_union_struct_unsigned(regtype, reg_group, propname)\
>+show_store_repeated_union_struct(regtype, reg_group, propname, "%u")
>+
>+/* Remove access to raw format string versions */
>+/*#undef simple_show_union_struct
>+#undef show_union_struct_unsigned
>+#undef show_store_union_struct
>+#undef show_repeated_union_struct
>+#undef show_store_repeated_union_struct*/

> This looks like trying to reimplement ioctl() in sysfs.
>
> If what you want is to send big structs in/out of the kernel,
> use either ioctl() on device nodes (should be trivial since input
> is using real device nodes) or use configfs.

I'm a little confused.  There's repeated emphasis in the kernel doc that 
you shouldn't use ioctl() anymore - use sysfs instead.  So we've been 
using sysfs, though it seems somewhat klutzy.  If it's actually OK to 
use ioctl(), that could simplify things.  On the other hand, using 
configfs might be more appropriate.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: ktest.pl always returns 0?

2012-09-25 Thread Steven Rostedt

On Tue, 2012-09-25 at 16:33 -0400, Steven Rostedt wrote:
> On Tue, 2012-09-25 at 12:40 -0700, Greg KH wrote:
> 
> > Hey, it's not my fault your employer has a crummy email system that
> > can't handle remote access well, I just went off of the Author: line in
> > your ktest.pl kernel commits :)
> 
> Yeah, I'm not upset by it. I just want to warn people that there's times
> I may spend long periods of not answering that email.
> 
> > 
> > > > I'm trying to use ktest to do build tests of the stable patch series to
> > > > verify I didn't mess anything up, but I'm finding that ktest always
> > > > returns 0 when finished, no matter if the build test was successful or
> > > > failed.
> > > 
> > > Hmm, I should fix that. Yeah, I agree, if it fails a test it should
> > > return something other than zero. But I think that only happens if you
> > > have DIE_ON_FAILURE = 0. As IIRC, the perl "die" command should exit the
> > > application with an error code.
> > > 
> > > But yeah, I agree, if one of the tests fail, the error code should not
> > > be zero. I'll write up a patch to fix that. Or at least add an option to
> > > make that happen.
> > 
> > That would be great.
> > 
> > > > Is this right?  Is there some other way to determine if ktest fails
> > > > other than greping the output log?
> > > 
> > > If you have DIE_ON_FAILURE = 1 (default) it should exit with non zero.
> > 
> > It doesn't do that, test it and see (this is with what is in Linus's
> > 3.6-rc7 tree, I didn't test linux-next if that is newer, my apologies.)
> 
> This should have been something from day one. I'll go ahead and try it
> out. According to the perl-doc man pages the "die" command has:
> 
>If an uncaught exception results in interpreter exit, the exit
>code is determined from the values of $! and $? with this
>pseudocode:
> 
>exit $! if $!;  # errno
>exit $? >> 8 if $? >> 8;# child exit status
>exit 255;   # last resort
> 
> I'll investigate this further.
> 
> > 
> > > > Oh, and any hints on kicking off a ktest process on a remote machine in
> > > > a "simple" way?  I'm just using ssh to copy over a script that runs
> > > > there, wrapping ktest.pl up with other stuff, I didn't miss the fact
> > > > that ktest itself can run remotely already, did I?
> > > 
> > > I'm a little confused by this question. Do you want a server ktest? That
> > > is, have a ktest daemon that listens for clients that sends it config
> > > files and then runs them? That would actually be a fun project ;-)
> > > 
> > > You're not running ktest on the target machine are you? The way I use it
> > > is the following:
> > > 
> > > I have a server that I ssh to and run ktest from. It does all the builds
> > > there on the server and this server has a means to monitor some target.
> > > I use ttywatch that connects to the serial of the target, in which ktest
> > > uses to read from.
> > > 
> > > Sometimes this "server" is the machine I'm logged in to.  And I just run
> > > ktest directly.
> > > 
> > > Can you explain more of what you are looking for?
> > 
> > I want to be able to say:
> > - take this set of stable patches and go run a 'make
> >   allmodconfig' build on a remote machine and email me back the
> >   answer because I might not be able to keep an internet
> >   connection open for the next 5-15 minutes it might take to
> >   complete that task.
> 
> I cheat and run all my ktests in screen sessions ;-)
> 
> > 
> > I don't do boot tests with these kernel build tests, although sometime
> > in the future it would be nice to do that.  Right now I do that testing
> > manually, as it's pretty infrequent (once per release usually.)
> > 
> > So yes, a 'ktest' server would be nice.  I've attached the (horrible)
> > script below that I'm using for this so far.  It seems to work well, and
> > I can do builds on a "cloud" server as well as my local build server
> > just fine, only thing needed to do is change the user and machine name
> > in the script.
> 
> This looks like my next "when I have time" project ;-).
> 
> 
> > 
> > I know ktest doesn't handle quilt patches yet, which is why I apply them
> > "by hand" now to a given git tree branch, if you ever do add that
> > option, I'll gladly test it out and change my script to use whatever
> > format it needs.
> > 
> 
> Yeah, I need to make ktest work with quilt, as I'm still a fan.
> 
> But currently the ones that pay me actually are giving me things to do.
> Something about satisfying customers or some other crap. Thus, my "down
> time" is limited at the moment :-(  But when things on the customer side
> slows down again, I'll definitely work on these changes.
> 
> Thanks for the ideas! I'm actually looking forward to working on this.
> But in the mean time, I will test the next time ktest fails on me to see
> what the result of $? is.
> 
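
The exit-status rule quoted above from the perl documentation can be
sketched in C.  This is only an illustration of the pseudocode, not how
ktest.pl or perl is implemented; die_exit_code() and its parameters are
hypothetical names, and the truncation of the status to 8 bits by the
operating system is not modelled.

```c
/*
 * Hypothetical helper mirroring the perldoc pseudocode for "die":
 *   exit $! if $!;            (errno)
 *   exit $? >> 8 if $? >> 8;  (child exit status)
 *   exit 255;                 (last resort)
 */
static int die_exit_code(int errno_value, int child_status)
{
	if (errno_value)
		return errno_value;
	if (child_status >> 8)
		return child_status >> 8;
	return 255;
}
```

Under this rule a die should never produce an exit status of 0, so a
zero status suggests that the script finished through some other path.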

I just forced a build failure to see what

[PATCH v5 20/25] selftest: Add generic tree self-test common code.

2012-09-25 Thread Daniel Santos

Self-test code for both performance and correctness testing. The files
tools/testing/selftests/grbtree/common.{h,c} contain code for use in
both the user- and kernel-space test program/module and depend upon a
few functions being made available by that program or module.

The purpose of these tests is to verify correctness across compilers and
document the performance difference between the generic and hand-coded
red-black tree implementations on various compilers, which is identified
as critical for determining feasibility of adding this tree
implementation to the kernel, as older compilers optimize the generic
code more poorly than its hand-coded counterpart.

Signed-off-by: Daniel Santos 
---
 tools/testing/selftests/grbtree/common.c |  957 ++
 tools/testing/selftests/grbtree/common.h |  252 
 2 files changed, 1209 insertions(+), 0 deletions(-)
 create mode 100644 tools/testing/selftests/grbtree/common.c
 create mode 100644 tools/testing/selftests/grbtree/common.h

diff --git a/tools/testing/selftests/grbtree/common.c b/tools/testing/selftests/grbtree/common.c
new file mode 100644
index 000..9625d60
--- /dev/null
+++ b/tools/testing/selftests/grbtree/common.c
@@ -0,0 +1,957 @@
+/* common.c - generic red-black tree test functions for use in both kernel and
+ * user space.
+ * Copyright (C) 2012  Daniel Santos 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include "common.h"
+
+#define _CONCAT2(a, b) a ## b
+#define _CONCAT(a, b) _CONCAT2(a, b)
+
+
+const char *grbtest_type_desc[GRBTEST_TYPE_COUNT] = {
+   "insertion performance",
+   "insertion & deletion performance",
+   "insertion validation"
+};
+
+#if GRBTEST_USE_AUGMENTED
+static void mytree_augment_fn(struct rb_node *node, void *data)
+{
+   // does nothing
+}
+#endif
+
+/* more efficient alternative to rb_init_node */
+static inline void grbtest_init_node(struct rb_node *node)
+{
+   node->rb_parent_color = (unsigned long)node;
+   node->rb_right = NULL;
+   node->rb_left = NULL;
+}
+
+#if GRBTEST_BUILD_GENERIC
+/*
+ * Generic Implementation
+ */
+
+#define __GRBTEST_FLAGS\
+   ((GRBTEST_UNIQUE_KEYS ? RB_UNIQUE_KEYS : 0) |   \
+(GRBTEST_INSERT_REPLACES ? RB_INSERT_REPLACES : 0))
+
+
+
+static inline long compare_s32(const s32 *a, const s32 *b) {return *a - *b;}
+static inline long greater_s32(const s32 *a, const s32 *b) {return *a > *b;}
+
+static inline long compare_u32(const u32 *a, const u32 *b) {return (long)*a - (long)*b;}
+static inline long greater_u32(const u32 *a, const u32 *b) {return *a > *b;}
+
+static inline long compare_s64(const s64 *a, const s64 *b) {return *a - *b;}
+static inline long greater_s64(const s64 *a, const s64 *b) {return *a > *b;}
+
+static inline long compare_u64(const u64 *a, const u64 *b) {return *a - *b;}
+static inline long greater_u64(const u64 *a, const u64 *b) {return *a > *b;}
+
+
+RB_DEFINE_INTERFACE(
+   mytree,
+   struct container, tree,
+#if GRBTEST_USE_LEFTMOST
+   leftmost
+#endif
+   ,
+#if GRBTEST_USE_RIGHTMOST
+   rightmost
+#endif
+   ,
+#if GRBTEST_USE_COUNT
+   count
+#endif
+   ,
+   struct object, node, key,
+   __GRBTEST_FLAGS, _CONCAT(compare_, GRBTEST_KEY_TYPE),
+#if GRBTEST_UNIQUE_KEYS
+   _CONCAT(compare_, GRBTEST_KEY_TYPE),
+#else
+   _CONCAT(greater_, GRBTEST_KEY_TYPE),
+#endif
+#if GRBTEST_USE_AUGMENTED
+   mytree_augment_fn
+#endif
+   ,
+   static __flatten inline,/* find */
+   static __flatten inline,/* insert */
+   static __flatten inline,/* find_near */
+   static __flatten inline);   /* insert_near */
+
+
+#else /* GRBTEST_BUILD_GENERIC */
+
+/*
+ * Hand-coded Implementation
+ *
+ * This section implements the find, insert & remove functions as one would do
+ * so were they hand-coding it, except that we use pre-processor to include (or
+ * omit) the various features & rules.
+ *
+ * In order to account for compilers that may fail to optimize out a simple
+ * if(0) or if(1) construct, we'll make sure that such extra code is

[PATCH v5 21/25] selftest: Add userspace test program.

2012-09-25 Thread Daniel Santos

Userspace test program using code from common.{c,h}.  Userspace
compilation is accomplished by overriding a few kernel headers in the
overrides directory.  This is an invasive hack (involving many internals
of kernel headers) that is expected to require fairly frequent
maintenance as kernel headers change.

The program grbtest can be run with -h option for help and outputs both
a human-readable format as well as a delimited text for more extensive
processing.

The exact behavior of the grbtest program depends upon the following
pre-processor variables, which the Makefile expects to be -Defined in a
CONFIG variable. Each variable should be assigned a value of 1 or 0, as
documented in common.h. If CONFIG is not set, a default is assigned in
the Makefile.

GRBTEST_BUILD_GENERIC
GRBTEST_USE_LEFTMOST
GRBTEST_USE_RIGHTMOST
GRBTEST_USE_COUNT
GRBTEST_UNIQUE_KEYS
GRBTEST_INSERT_REPLACES
GRBTEST_USE_AUGMENTED

Generation of the resultant executable is also dependent upon the below
.config values.

DEBUG_RBTREE
DEBUG_RBTREE_VALIDATE
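
A minimal sketch of how such 0/1 pre-processor variables can select
features at compile time.  The demo_* names and the default values below
are assumptions made for illustration only; the real structures and
defaults live in common.h and the Makefile.

```c
#include <stddef.h>

/* Assumed defaults, used only when no -D flag supplies a value. */
#ifndef GRBTEST_USE_LEFTMOST
#define GRBTEST_USE_LEFTMOST 1
#endif
#ifndef GRBTEST_USE_COUNT
#define GRBTEST_USE_COUNT 0
#endif

/* Hypothetical stand-ins for the test's node and container types. */
struct demo_node {
	struct demo_node *left, *right;
};

struct demo_container {
	struct demo_node *root;
#if GRBTEST_USE_LEFTMOST
	struct demo_node *leftmost;	/* cached smallest node */
#endif
#if GRBTEST_USE_COUNT
	unsigned int count;		/* number of nodes in the tree */
#endif
};

/* Return the smallest node, using the cache only if it was compiled in. */
static struct demo_node *demo_first(struct demo_container *c)
{
#if GRBTEST_USE_LEFTMOST
	return c->leftmost;
#else
	struct demo_node *n = c->root;

	while (n && n->left)
		n = n->left;
	return n;
#endif
}
```

Because the choice is made by the pre-processor, a disabled feature adds
neither fields nor code to the resulting executable.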

Signed-off-by: Daniel Santos 
---
 tools/testing/selftests/grbtree/user/Makefile  |   66 +++
 tools/testing/selftests/grbtree/user/facilities.c  |   58 ++
 tools/testing/selftests/grbtree/user/main.c|  614 
 .../grbtree/user/overrides/linux/export.h  |   31 +
 .../grbtree/user/overrides/linux/kernel.h  |   80 +++
 5 files changed, 849 insertions(+), 0 deletions(-)
 create mode 100644 tools/testing/selftests/grbtree/user/Makefile
 create mode 100644 tools/testing/selftests/grbtree/user/facilities.c
 create mode 100644 tools/testing/selftests/grbtree/user/main.c
 create mode 100644 tools/testing/selftests/grbtree/user/overrides/linux/export.h
 create mode 100644 tools/testing/selftests/grbtree/user/overrides/linux/kernel.h

diff --git a/tools/testing/selftests/grbtree/user/Makefile b/tools/testing/selftests/grbtree/user/Makefile
new file mode 100644
index 000..e487a85
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/Makefile
@@ -0,0 +1,66 @@
+# Default configuration (used if CONFIG not supplied)
+# See common.h for docs
+CONFIG ?= -DGRBTEST_KEY_TYPE=u32   \
+ -DGRBTEST_BUILD_GENERIC=0 \
+ -DGRBTEST_USE_LEFTMOST=1  \
+ -DGRBTEST_USE_RIGHTMOST=1 \
+ -DGRBTEST_USE_COUNT=1 \
+ -DGRBTEST_UNIQUE_KEYS=1   \
+ -DGRBTEST_INSERT_REPLACES=1   \
+ -DGRBTEST_USE_AUGMENTED=0
+
+ifeq ($(KERNELRELEASE),)
+# Assume the source tree is where the running kernel was built
+# You should set KERNELDIR in the environment if it's elsewhere
+KERNELDIR ?= /lib/modules/$(shell uname -r)/build
+endif
+
+PWD := $(shell pwd)
+
+# The below KERNEL_ARCH works on x86_64, but hasn't been tested elsewhere
+# (e.g., x86 32-bit, ARM, etc.)
+KERNEL_ARCH = $(shell uname -m | sed 's/x86_64/x86/')
+
+# Kernel include directories
+KERNEL_INCLUDES = -I$(KERNELDIR)/include \
+  -I$(KERNELDIR)/arch/$(KERNEL_ARCH)/include
+CPPFLAGS   += -DGRBTEST_USERLAND=1 -D__KERNEL__ -I$(PWD)/.. \
+  -I$(PWD)/overrides $(KERNEL_INCLUDES) $(CONFIG)
+WARN_FLAGS  = -Wall -Wundef -Wstrict-prototypes -Wno-unused-variable \
+  -Werror-implicit-function-declaration -Wno-trigraphs \
+  -Wno-format-security -Wno-unused-variable -Werror
+
+# Standard CFLAGS if not already set
+CFLAGS ?= -O2 -pipe
+# FIXME: breaks glibc
+#CFLAGS+= -mno-see
+# TODO: Can get these from KBUILD_CFLAGS in arch//Makefile?
+#CFLAGS+= -march=native -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
+CFLAGS += $(WARN_FLAGS) -fno-strict-aliasing -fno-common \
+  -fno-delete-null-pointer-checks
+CC ?= gcc
+CPPFLAGS   += -DGRBTEST_CFLAGS="$(CFLAGS)" -DGRBTEST_CONFIG="$(CONFIG)" \
+  -DGRBTEST_CC="$(CC)" \
+  -DGRBTEST_ARCH="$(KERNEL_ARCH)" \
+  -DGRBTEST_ARCH_FLAGS="-march=k8 -m64" \
+  -DGRBTEST_PROCESSOR="$(shell uname -p)" \
+
+all: grbtest
+
+OBJ_FILES = main.o rbtree.o common.o facilities.o
+HEADER_FILES = $(KERNELDIR)/include/linux/rbtree.h ../common.h
+
+rbtree.c:
+   ln -s $(KERNELDIR)/lib/rbtree.c $(PWD)/rbtree.c
+
+common.c:
+   ln -s ../common.c common.c
+
+rbtree.o: rbtree.c $(KERNELDIR)/include/linux/rbtree.h
+
+grbtest: $(OBJ_FILES) $(HEADER_FILES)
+   $(CC) $(CFLAGS) $(OBJ_FILES) -o grbtest
+
+clean:
+   rm -f grbtest *.o rbtree.c common.c
+
diff --git a/tools/testing/selftests/grbtree/user/facilities.c b/tools/testing/selftests/grbtree/user/facilities.c
new file mode 100644
index 000..63c29aa
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/facilities.c
@@ -0,0 +1,58 @@
+/* facilities.c - userspace facilities used by common.c/h
+ * Copyright (C) 2012  Daniel Santos 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU

[PATCH v5 22/25] selftest: Add script to compile & run userspace test program

2012-09-25 Thread Daniel Santos

This basic script is designed to automate the testing process (for
a single compiler) and generate delimited text output suitable for later
processing. All test parameters can be supplied via the environment, or
defaults will be used for them. The following variables affect the
script and, if not supplied, will have these default values:

CC="gcc"
KERNELDIR="../../../../.."
CFLAGS="-O2 -pipe -march=k8"
key_type="u64"
use_leftmost=0
use_rightmost=0
use_count=0
unique_keys=0
insert_replaces=0
augmented=0
test_num=0

Other variables are hard-coded and will have to be tweaked as needed.

Signed-off-by: Daniel Santos 
---
 tools/testing/selftests/grbtree/user/runtest.sh |  108 +++
 1 files changed, 108 insertions(+), 0 deletions(-)
 create mode 100755 tools/testing/selftests/grbtree/user/runtest.sh

diff --git a/tools/testing/selftests/grbtree/user/runtest.sh b/tools/testing/selftests/grbtree/user/runtest.sh
new file mode 100755
index 000..f371fea
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/runtest.sh
@@ -0,0 +1,108 @@
+#!/bin/bash
+
+# runtest.sh - script to compile and run userspace tests of generic red-black
+#  tree implementation for a single compiler.
+#
+# Copyright (C) 2012  Daniel Santos 
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+# Variables affecting code generation
+CC="${CC:-gcc}"
+KERNELDIR="${KERNELDIR:-../../../../..}"
+CFLAGS="${CFLAGS:-"-O2 -pipe -march=k8"}"
+# CONFIG parameters:
+key_type=${key_type:-u64}
+use_leftmost=${use_leftmost:-0}
+use_rightmost=${use_rightmost:-0}
+use_count=${use_count:-0}
+unique_keys=${unique_keys:-0}
+insert_replaces=${insert_replaces:-0}
+augmented=${augmented:-0}
+# For test_num, see grbtest -h
+test_num=${test_num:-0}
+
+# Variables passed to grbtest program
+reps=0x2
+count=0x800
+keymask=0xfff
+
+# Output files
+logfile=runtest.log
+datafile=runtest.out
+
+build_desc[0]="hand-coded"
+build_desc[1]="generic"
+
+die() {
+   echo "ERROR${@:+": "}$@" 1>&2
+   exit -1
+}
+
+. /etc/profile || die
+
+do_cpp() {
+   echo "$1" > /tmp/gnucver.$$.c || die
+   ${CC} -E /tmp/gnucver.$$.c | grep -v '^#' | tr -d ' '
+   rm /tmp/gnucver.$$.c
+}
+
+gccverstr=$(do_cpp "__GNUC__.__GNUC_MINOR__.__GNUC_PATCHLEVEL__") || die
+
+execute_tests() {
+   for build_type in 1 0; do
+   CONFIG=$(echo   \
+   -DGRBTEST_KEY_TYPE=${key_type}  \
+   -DGRBTEST_BUILD_GENERIC=${build_type}   \
+   -DGRBTEST_USE_LEFTMOST=${use_leftmost}  \
+   -DGRBTEST_USE_RIGHTMOST=${use_rightmost}\
+   -DGRBTEST_USE_COUNT=${use_count}\
+   -DGRBTEST_UNIQUE_KEYS=${unique_keys}\
+   -DGRBTEST_INSERT_REPLACES=${insert_replaces}\
+   -DGRBTEST_USE_AUGMENTED=${augmented}\
+   )
+
+   echo ""
+   echo "Starting build at $(date '+%Y-%m-%d %H:%M:%S')..."
+   echo "  build_type = ${build_desc[${build_type}]}"
+   echo "  compiler   = ${gccverstr}"
+   echo "  CFLAGS = ${CFLAGS}"
+   echo "  KERNELDIR  = ${KERNELDIR}"
+   echo
+
+   #set -x
+   CC="${CC}"  \
+   CFLAGS="${CFLAGS}"  \
+   CONFIG="${CONFIG}"  \
+   KERNELDIR="${KERNELDIR}"\
+   make clean all || die
+   set +x
+
+   echo
+   echo "Executing test..."
+   echo
+   ./grbtest --seed 1  \
+ --reps ${reps}\
+ --count ${count}  \
+ --keymask ${keymask}  \
+ --delim "|"   \
+ --quote ""\
+ --test ${test_num} | tee -a "${datafile}"
+   echo
+   echo
+   done
+}
+
+execute_tests | tee -a ${logfile}
-- 
1.7.3.4


[PATCH v5 16/25] kernel-doc: bugfix - empty line in Example section

2012-09-25 Thread Daniel Santos

If you have a section named "Example" that contains an empty line,
attempting to generate htmldocs gives you the error:

/path/Documentation/DocBook/kernel-api.xml:3455: parser error : Opening and 
ending tag mismatch: programlisting line 3449 and para
   
  ^
/path/Documentation/DocBook/kernel-api.xml:3473: parser error : Opening and 
ending tag mismatch: para line 3467 and programlisting

 ^
/path/Documentation/DocBook/kernel-api.xml:3678: parser error : Opening and 
ending tag mismatch: programlisting line 3672 and para
   
  ^
/path/Documentation/DocBook/kernel-api.xml:3701: parser error : Opening and 
ending tag mismatch: para line 3690 and programlisting

 ^
unable to parse
/path/Documentation/DocBook/kernel-api.xml

Essentially, the script attempts to close a <programlisting> with a
closing tag for a <para> block.  This patch corrects the problem by
simply not outputting anything extra when we're dumping pre-formatted
text, since the empty line will be rendered correctly anyway.

Signed-off-by: Daniel Santos 
---
 scripts/kernel-doc |   11 ++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 55ab5e4..69efb2f 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -230,6 +230,7 @@ my $dohighlight = "";
 
 my $verbose = 0;
 my $output_mode = "man";
+my $output_preformatted = 0;
 my $no_doc_sections = 0;
 my %highlights = %highlights_man;
 my $blankline = $blankline_man;
@@ -460,7 +461,9 @@ sub output_highlight {
 
 foreach $line (split "\n", $contents) {
if ($line eq ""){
-   print $lineprefix, local_unescape($blankline);
+   if (! $output_preformatted) {
+   print $lineprefix, local_unescape($blankline);
+   }
} else {
$line =~ s/\\/\&/g;
if ($output_mode eq "man" && substr($line, 0, 1) eq ".") {
@@ -643,10 +646,12 @@ sub output_section_xml(%) {
print "$section\n";
if ($section =~ m/EXAMPLE/i) {
print "\n";
+   $output_preformatted = 1;
} else {
print "\n";
}
output_highlight($args{'sections'}{$section});
+   $output_preformatted = 0;
if ($section =~ m/EXAMPLE/i) {
print "\n";
} else {
@@ -949,10 +954,12 @@ sub output_blockhead_xml(%) {
}
if ($section =~ m/EXAMPLE/i) {
print "\n";
+   $output_preformatted = 1;
} else {
print "\n";
}
output_highlight($args{'sections'}{$section});
+   $output_preformatted = 0;
if ($section =~ m/EXAMPLE/i) {
print "\n";
} else {
@@ -1028,10 +1035,12 @@ sub output_function_gnome {
print "\n $section\n";
if ($section =~ m/EXAMPLE/i) {
print "\n";
+   $output_preformatted = 1;
} else {
}
print "\n";
output_highlight($args{'sections'}{$section});
+   $output_preformatted = 0;
print "\n";
if ($section =~ m/EXAMPLE/i) {
print "\n";
-- 
1.7.3.4


[PATCH v5 13/25] fair.c: Use generic rbtree impl in fair scheduler

2012-09-25 Thread Daniel Santos

Signed-off-by: Daniel Santos 
---
 kernel/sched/fair.c |   75 ++
 1 files changed, 21 insertions(+), 54 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c099cc6..8feb4ea 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -447,6 +447,12 @@ static inline int entity_before(struct sched_entity *a,
return (s64)(a->vruntime - b->vruntime) < 0;
 }
 
+static inline long greater_vruntime(u64 *a, u64 *b)
+{
+   s64 diff = (s64)(*a - *b);
+   return diff > 0;
+}
+
 static void update_min_vruntime(struct cfs_rq *cfs_rq)
 {
u64 vruntime = cfs_rq->min_vruntime;
@@ -472,56 +478,17 @@ static void update_min_vruntime(struct cfs_rq *cfs_rq)
 #endif
 }
 
-/*
- * Enqueue an entity into the rb-tree:
+/* NOTE: we're passing greater_vruntime for both compare & greater because we
+ * don't need to use the find function.
  */
-static void __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
-{
-   struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
-   struct rb_node *parent = NULL;
-   struct sched_entity *entry;
-   int leftmost = 1;
-
-   /*
-* Find the right place in the rbtree:
-*/
-   while (*link) {
-   parent = *link;
-   entry = rb_entry(parent, struct sched_entity, run_node);
-   /*
-* We dont care about collisions. Nodes with
-* the same key stay together.
-*/
-   if (entity_before(se, entry)) {
-   link = &parent->rb_left;
-   } else {
-   link = &parent->rb_right;
-   leftmost = 0;
-   }
-   }
-
-   /*
-* Maintain a cache of leftmost tree entries (it is frequently
-* used):
-*/
-   if (leftmost)
-   cfs_rq->rb_leftmost = &se->run_node;
-
-   rb_link_node(&se->run_node, parent, link);
-   rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
-}
-
-static void __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
-{
-   if (cfs_rq->rb_leftmost == &se->run_node) {
-   struct rb_node *next_node;
-
-   next_node = rb_next(&se->run_node);
-   cfs_rq->rb_leftmost = next_node;
-   }
-
-   rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
-}
+RB_DEFINE_INTERFACE(
+   fair_tree,
+   struct cfs_rq, tasks_timeline, rb_leftmost, /* no right or count */, ,
+   struct sched_entity, run_node, vruntime,
+   0, greater_vruntime, greater_vruntime, /* no augment */,
+   /* find unused */ ,
+   static __flatten, /* let gcc decide whether or not to inline insert */
+   /* find_near unused */, /* insert_near unused */)
 
 struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
 {
@@ -1108,7 +1075,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
update_stats_enqueue(cfs_rq, se);
check_spread(cfs_rq, se);
if (se != cfs_rq->curr)
-   __enqueue_entity(cfs_rq, se);
+   fair_tree_insert(cfs_rq, se);
se->on_rq = 1;
 
if (cfs_rq->nr_running == 1) {
@@ -1189,7 +1156,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
clear_buddies(cfs_rq, se);
 
if (se != cfs_rq->curr)
-   __dequeue_entity(cfs_rq, se);
+   fair_tree_remove(cfs_rq, se);
se->on_rq = 0;
update_cfs_load(cfs_rq, 0);
account_entity_dequeue(cfs_rq, se);
@@ -1260,7 +1227,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 * runqueue.
 */
update_stats_wait_end(cfs_rq, se);
-   __dequeue_entity(cfs_rq, se);
+   fair_tree_remove(cfs_rq, se);
}
 
update_stats_curr_start(cfs_rq, se);
@@ -1339,7 +1306,7 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
if (prev->on_rq) {
update_stats_wait_start(cfs_rq, prev);
/* Put 'current' back into the tree. */
-   __enqueue_entity(cfs_rq, prev);
+   fair_tree_insert(cfs_rq, prev);
}
cfs_rq->curr = NULL;
 }
@@ -3593,7 +3560,7 @@ void update_group_power(struct sched_domain *sd, int cpu)
/*
 * !SD_OVERLAP domains can assume that child groups
 * span the current group.
-*/ 
+*/
 
group = child->groups;
do {
-- 
1.7.3.4
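
The greater_vruntime() helper added in the patch above compares two u64
values through a signed difference rather than with a plain ">".  A
userspace sketch of why that matters when the counter wraps around;
demo_greater_vruntime() is a hypothetical stand-in using <stdint.h>
types, and it assumes the usual two's-complement conversion from
uint64_t to int64_t.

```c
#include <stdint.h>

/*
 * Returns 1 if *a is "later" than *b, treating the values as points on
 * a 64-bit circle.  The unsigned subtraction wraps modulo 2^64 and the
 * cast to a signed type turns "slightly behind" into a negative number.
 */
static inline long demo_greater_vruntime(const uint64_t *a, const uint64_t *b)
{
	int64_t diff = (int64_t)(*a - *b);

	return diff > 0;
}
```

With a = 5 and b = UINT64_MAX - 2, a plain "a > b" is false, but the
signed difference is 8, so the value that was taken after the wrap is
still ordered after the one taken before it.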


[PATCH v5 15/25] kernel-doc: bugfix - multi-line macros

2012-09-25 Thread Daniel Santos

Prior to this patch the following code breaks:

/**
 * multiline_example - this breaks kernel-doc
 */
 #define multiline_example( \
myparam)

Producing this error:

Error(somefile.h:983): cannot understand prototype: 'multiline_example( \ '

This patch fixes the issue by appending all lines ending in a backslash
(optionally followed by whitespace), removing the backslash and any
whitespace after it prior to appending (just like the C pre-processor
would).

This fixes a break in kernel-doc introduced by the additions to rbtree.h.
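
The joining rule described above can be sketched in C.  This is an
illustration only, not the kernel-doc implementation (which is the Perl
shown in the diff); join_continuations() is a hypothetical helper that
works on a whole buffer instead of reading line by line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy "in" to "out", joining every line that ends in a backslash
 * (optionally followed by spaces, tabs or '\r') with the line after it.
 * The backslash, the whitespace after it and the newline are dropped.
 * Returns the length written, or (size_t)-1 if "out" is too small.
 */
static size_t join_continuations(const char *in, char *out, size_t out_size)
{
	size_t n = 0;

	while (*in) {
		const char *eol = strchr(in, '\n');
		const char *line_end = eol ? eol : in + strlen(in);
		const char *next = eol ? eol + 1 : line_end;
		const char *p = line_end;
		size_t keep;

		/* Skip whitespace that trails the last visible character. */
		while (p > in && (p[-1] == ' ' || p[-1] == '\t' || p[-1] == '\r'))
			p--;

		if (p > in && p[-1] == '\\')
			keep = (size_t)(p - 1 - in);	/* continued line */
		else
			keep = (size_t)(next - in);	/* keep the newline */

		if (n + keep + 1 > out_size)
			return (size_t)-1;
		memcpy(out + n, in, keep);
		n += keep;
		in = next;
	}
	if (out_size == 0)
		return (size_t)-1;
	out[n] = '\0';
	return n;
}
```

Applied to the two-line macro from the example, this yields the single
line "#define multiline_example( myparam)", which a prototype parser can
then handle like any other one-line definition.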

Signed-off-by: Daniel Santos 
---
 scripts/kernel-doc |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 9b0c0b8..55ab5e4 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -2045,6 +2045,9 @@ sub process_file($) {
 
 $section_counter = 0;
 while (<IN>) {
+   while (s/\\\s*$//) {
+   $_ .= <IN>;
+   }
if ($state == 0) {
if (/$doc_start/o) {
$state = 1; # next line is always the function name
-- 
1.7.3.4


[PATCH v5 6/25] bug.h: Replace __linktime_error with __compiletime_error

2012-09-25 Thread Daniel Santos

Signed-off-by: Daniel Santos 
---
 include/linux/bug.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index aaac4bb..298a916 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -73,7 +73,7 @@ extern int __build_bug_on_failed;
 #define BUILD_BUG()\
do {\
extern void __build_bug_failed(void)\
-   __linktime_error("BUILD_BUG failed");   \
+   __compiletime_error("BUILD_BUG failed");\
__build_bug_failed();   \
} while (0)
 
-- 
1.7.3.4


[PATCH v5 8/25] bug.h: Make BUILD_BUG_ON generate compile-time error

2012-09-25 Thread Daniel Santos

Negative sized arrays won't create a compile-time error in some cases
starting with gcc 4.4 (e.g., inlined functions), but gcc 4.3 introduced
the error function attribute that will.  This patch modifies
BUILD_BUG_ON to behave like BUILD_BUG already does, using the error
function attribute so that you don't have to build the entire kernel to
discover that you have a problem, and then enjoy trying to track it down
from a link-time error.
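
A minimal userspace sketch of the negative-sized-array half of this
technique.  DEMO_BUILD_BUG_ON and struct demo_header are hypothetical
names for illustration; the error-attribute half is left out here
because it depends on the compiler and on optimization being enabled.

```c
#include <stdint.h>

/*
 * If "condition" is true the array size becomes -1 and compilation
 * fails; if it is false the size is 1 and the expression compiles to
 * nothing, because sizeof is evaluated at compile time.
 */
#define DEMO_BUILD_BUG_ON(condition) \
	((void)sizeof(char[1 - 2 * !!(condition)]))

/* Hypothetical structure whose layout some other code depends on. */
struct demo_header {
	uint8_t type;
	uint8_t flags;
	uint16_t length;
};

static inline int demo_header_size(void)
{
	/* The build breaks here if someone changes the struct size. */
	DEMO_BUILD_BUG_ON(sizeof(struct demo_header) != 4);
	return (int)sizeof(struct demo_header);
}
```

Changing the constant 4 to any other value makes the translation unit
fail to compile, which is the behavior the check is meant to provide.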

Signed-off-by: Daniel Santos 
---
 include/linux/bug.h |   24 ++--
 1 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index 298a916..c70b833 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -42,24 +42,28 @@ struct pt_regs;
  * @condition: the condition which the compiler should know is false.
  *
  * If you have some code which relies on certain constants being equal, or
- * other compile-time-evaluated condition, you should use BUILD_BUG_ON to
+ * some other compile-time-evaluated condition, you should use BUILD_BUG_ON to
  * detect if someone changes it.
  *
  * The implementation uses gcc's reluctance to create a negative array, but
  * gcc (as of 4.4) only emits that error for obvious cases (eg. not arguments
- * to inline functions).  So as a fallback we use the optimizer; if it can't
- * prove the condition is false, it will cause a link error on the undefined
- * "__build_bug_on_failed".  This error message can be harder to track down
- * though, hence the two different methods.
+ * to inline functions).  Luckily, in 4.3 they added the "error" function
+ * attribute just for this type of case.  Thus, we use a negative sized array
+ * (should always create an error pre-gcc-4.4) and then call an undefined
+ * function with the error attribute (should always create an error 4.3+).  If
+ * for some reason, neither creates a compile-time error, we'll still have a
+ * link-time error, which is harder to track down.
  */
 #ifndef __OPTIMIZE__
 #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
 #else
-extern int __build_bug_on_failed;
-#define BUILD_BUG_ON(condition)\
-   do {\
-   ((void)sizeof(char[1 - 2*!!(condition)]));  \
-   if (condition) __build_bug_on_failed = 1;   \
+#define BUILD_BUG_ON(condition)                        \
+   do {\
+   extern void __build_bug_on_failed(void) \
+   __compiletime_error("BUILD_BUG_ON failed"); \
+   ((void)sizeof(char[1 - 2*!!(condition)]));  \
+   if (condition)  \
+   __build_bug_on_failed();\
} while(0)
 #endif
 
-- 
1.7.3.4


[PATCH v5 4/25] compiler-gcc{3,4}.h: Use GCC_VERSION macro

2012-09-25 Thread Daniel Santos

Using GCC_VERSION reduces complexity, is easier to read and is GCC's
recommended mechanism for doing version checks. (Just don't ask me why
they didn't define it in the first place.)  This also makes it easy to
merge compiler-gcc{3,4}.h should somebody want to.
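
As a rough illustration of why a single version number is easier to compare,
the sketch below assumes the encoding implied by the constants in the diff
(major * 10000 + minor * 100 + patchlevel). The demo_* helpers are
hypothetical and exist only for this example.

```c
/* Encoding assumed from the constants used in the diff (40300 == 4.3.0). */
static int demo_version_code(int major, int minor, int patch)
{
	return major * 10000 + minor * 100 + patch;
}

/* Old-style test: only meaningful while the major version is known. */
static int demo_minor_only_check(int minor)
{
	return minor >= 3;
}

/* One comparison that also orders correctly across major versions. */
static int demo_at_least_4_3(int major, int minor, int patch)
{
	return demo_version_code(major, minor, patch) >= 40300;
}
```

A minor-only test would reject a hypothetical 5.1.0 compiler, while the
combined code accepts it, which is part of why merging compiler-gcc{3,4}.h
becomes simpler.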

Signed-off-by: Daniel Santos 
---
 include/linux/compiler-gcc3.h |8 
 include/linux/compiler-gcc4.h |   12 ++--
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/compiler-gcc3.h b/include/linux/compiler-gcc3.h
index 37d4124..7d89feb 100644
--- a/include/linux/compiler-gcc3.h
+++ b/include/linux/compiler-gcc3.h
@@ -2,22 +2,22 @@
 #error "Please don't include <linux/compiler-gcc3.h> directly, include <linux/compiler.h> instead."
 #endif
 
-#if __GNUC_MINOR__ < 2
+#if GCC_VERSION < 30200
 # error Sorry, your compiler is too old - please upgrade it.
 #endif
 
-#if __GNUC_MINOR__ >= 3
+#if GCC_VERSION >= 30300
 # define __used        __attribute__((__used__))
 #else
 # define __used        __attribute__((__unused__))
 #endif
 
-#if __GNUC_MINOR__ >= 4
+#if GCC_VERSION >= 30400
 #define __must_check   __attribute__((warn_unused_result))
 #endif
 
 #ifdef CONFIG_GCOV_KERNEL
-# if __GNUC_MINOR__ < 4
+# if GCC_VERSION < 30400
 #   error "GCOV profiling support for gcc versions below 3.4 not included"
 # endif /* __GNUC_MINOR__ */
 #endif /* CONFIG_GCOV_KERNEL */
diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index a334107..7ad60cd 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -4,7 +4,7 @@
 
 /* GCC 4.1.[01] miscompiles __weak */
 #ifdef __KERNEL__
-# if __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ <= 1
+# if GCC_VERSION >= 40100 &&  GCC_VERSION <= 40101
 #  error Your version of gcc miscompiles the __weak directive
 # endif
 #endif
@@ -13,11 +13,11 @@
 #define __must_check   __attribute__((warn_unused_result))
 #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
 
-#if __GNUC_MINOR__ > 0
+#if GCC_VERSION >= 40102
 # define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
 #endif
 
-#if __GNUC_MINOR__ >= 3
+#if GCC_VERSION >= 40300
 /* Mark functions as cold. gcc will assume any path leading to a call
to them will be unlikely.  This means a lot of manual unlikely()s
are unnecessary now for any paths leading to the usual suspects
@@ -39,9 +39,9 @@
 # define __compiletime_warning(message) __attribute__((warning(message)))
 # define __compiletime_error(message) __attribute__((error(message)))
 #endif /* __CHECKER__ */
-#endif /* __GNUC_MINOR__ >= 3 */
+#endif /* GCC_VERSION >= 40300 */
 
-#if __GNUC_MINOR__ >= 5
+#if GCC_VERSION >= 40500
 /*
  * Mark a position in code as unreachable.  This can be used to
  * suppress control flow warnings after asm blocks that transfer
@@ -56,5 +56,5 @@
 /* Mark a function definition as prohibited from being cloned. */
 #define __noclone  __attribute__((__noclone__))
 
-#endif /* __GNUC_MINOR__ >= 5 */
+#endif /* GCC_VERSION >= 40500 */
 
-- 
1.7.3.4


Re: [PATCH v4 3/3] tracing: format non-nanosec times from tsc clock without a decimal point.

2012-09-25 Thread Steven Rostedt

On Tue, 2012-09-25 at 15:29 -0700, David Sharp wrote:


> >> + ret = trace_seq_printf(
> >> + s, "[%08llx] %ld.%03ldms (+%ld.%03ldms): ",
> >> + ns2usecs(iter->ts),
> >> + abs_msec, abs_usec,
> >> + rel_msec, rel_usec);
> >> + } else if (verbose && !in_ns) {
> >> + ret = trace_seq_printf(
> >> + s, "[%016llx] %lld (+%lld): ",
> >> + iter->ts, abs_ts, rel_ts);
> >> + } else { /* !verbose */
> >> + ret = trace_seq_printf(
> >> + s, " %4lld%s%c: ",
> >> + abs_ts,
> >> + in_ns ? "us" : "",
> >> + rel_ts > mark_thresh ? '!' :
> >> +   rel_ts > 1 ? '+' : ' ');
> 
> I just noticed something about this: with x86-tsc clock, this will
> always print a '+'. Does it matter? Also, is the 200k cycle threshold
> for '!' okay? I guess the counter clock will always end up with rel_ts
> == 1, so marks should never appear.
> 

Actually, I'm thinking that counters should not add those annotations.
As it just doesn't make sense.
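
To make the point concrete, here is a small sketch of the mark selection from
the quoted hunk, plus one possible way to skip it for counters. The second
helper and its in_ns parameter usage are only an assumption about how this
could look, not what the patch does.

```c
/* Selects the annotation the same way the quoted expression does. */
static char demo_latency_mark(unsigned long long rel_ts,
			      unsigned long long mark_thresh)
{
	return rel_ts > mark_thresh ? '!' : rel_ts > 1 ? '+' : ' ';
}

/* Hypothetical variant: only annotate when the clock counts nanoseconds. */
static char demo_latency_mark_ns_only(int in_ns, unsigned long long rel_ts,
				      unsigned long long mark_thresh)
{
	return in_ns ? demo_latency_mark(rel_ts, mark_thresh) : ' ';
}
```

With a cycle counter almost every delta is above 1, so the first helper
nearly always returns '+', which is the noise being discussed.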

-- Steve



[PATCH v5 25/25] rbtree.h: (optional?) Add RB_INSERT_DUPE_RIGHT flag

2012-09-25 Thread Daniel Santos

I'm not sure if this is needed anywhere in the kernel. If not, it will
cause inserts run on older compilers to slow down very slightly.

This flag affects the behavior of inserts in trees where duplicate keys
are allowed.  By default, a new node is added at the head of a group of
nodes with the same key value.  That behavior may not always be desired,
so this flag offers the option of inserting the new node at the tail of
such a group instead.
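
The head/tail distinction can be sketched with a plain, unbalanced binary
search tree. This is not the rbtree code from the patch; struct demo_node and
the demo_* functions are invented for the example, and rebalancing is left
out because rotations do not change in-order position.

```c
#include <stddef.h>

struct demo_node {
	int key;
	int seq;			/* insertion order, for inspection */
	struct demo_node *left, *right;
};

/* dupe_right == 0: equal keys descend left, so the new node precedes
 * existing equal keys.  dupe_right != 0: equal keys descend right. */
static void demo_insert(struct demo_node **root, struct demo_node *n,
			int dupe_right)
{
	struct demo_node **p = root;

	while (*p) {
		int diff = (n->key > (*p)->key) - (n->key < (*p)->key);

		if (diff < 0 || (diff == 0 && !dupe_right))
			p = &(*p)->left;
		else
			p = &(*p)->right;
	}
	n->left = n->right = NULL;
	*p = n;
}

/* In-order walk that records each node's seq; returns the next index. */
static size_t demo_inorder(const struct demo_node *n, int *out, size_t pos)
{
	if (!n)
		return pos;
	pos = demo_inorder(n->left, out, pos);
	out[pos++] = n->seq;
	return demo_inorder(n->right, out, pos);
}

/* Inserts three nodes with the same key; returns the seq that comes first. */
static int demo_first_seq(int dupe_right)
{
	struct demo_node n[3];
	struct demo_node *root = NULL;
	int order[3];
	int i;

	for (i = 0; i < 3; i++) {
		n[i].key = 5;
		n[i].seq = i + 1;
		demo_insert(&root, &n[i], dupe_right);
	}
	demo_inorder(root, order, 0);
	return order[0];
}
```

Without the flag the most recently inserted duplicate comes first in an
in-order walk; with it, the oldest one does.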

Signed-off-by: Daniel Santos 
---
 include/linux/rbtree.h |   79 ---
 1 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index f1fbdea..2ca553b 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -264,6 +264,11 @@ static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
  * @RB_INSERT_REPLACES: When set, the rb_insert() will replace a value if it
  * matches the supplied one (valid only when
  * RB_UNIQUE_KEYS is set).
+ * @RB_INSERT_DUPE_RIGHT: Normally, when inserting a duplicate value into a
+ * tree with non-unique keys, the new value is inserted at
+ * the head of the group of same-value objects.  This
+ * flag causes inserts in such cases to put the new value
+ * at the tail of the group.
  * @RB_IS_AUGMENTED:   is an augmented tree
  * @RB_VERIFY_USAGE:   Perform checks upon node insertion for a small run-time
  * overhead. This has other usage restrictions, so read
@@ -300,12 +305,14 @@ enum rb_flags {
RB_HAS_COUNT= 0x0004,
RB_UNIQUE_KEYS  = 0x0008,
RB_INSERT_REPLACES  = 0x0010,
+   RB_INSERT_DUPE_RIGHT= 0x0020,
RB_IS_AUGMENTED = 0x0040,
RB_VERIFY_USAGE = 0x0080 & __RB_DEBUG_ENABLE_MASK,
RB_VERIFY_INTEGRITY = 0x0100 & __RB_DEBUG_ENABLE_MASK,
RB_ALL_FLAGS= RB_HAS_LEFTMOST | RB_HAS_RIGHTMOST
| RB_HAS_COUNT | RB_UNIQUE_KEYS
| RB_INSERT_REPLACES | RB_IS_AUGMENTED
+   | RB_INSERT_DUPE_RIGHT
| RB_VERIFY_USAGE | RB_VERIFY_INTEGRITY,
 };
 
@@ -504,15 +511,19 @@ void __rb_assert_good_rel(const struct rb_relationship *rel)
/* Due to a bug in versions of gcc prior to 4.6, the following
 * expressions are always evalulated at run-time:
 *
+* ((rel->flags & RB_UNIQUE_KEYS) && (rel->flags & RB_INSERT_DUPE_RIGHT))
 * (!(rel->flags & RB_UNIQUE_KEYS) && (rel->flags & RB_INSERT_REPLACES))
 *
 * The work-around for this bug is separate each bitwise AND test using
 * an if/else construct and evaluate only the last test with the
 * BUILD_BUG_ON macro.
 */
-
-   if (rel->flags & RB_INSERT_REPLACES)
-   BUILD_BUG_ON42(!(rel->flags & RB_UNIQUE_KEYS));
+   if (rel->flags & RB_UNIQUE_KEYS)
+   /* only with non-unique keys */
+   BUILD_BUG_ON42(rel->flags & RB_INSERT_DUPE_RIGHT);
+   else
+   /* only with unique keys */
+   BUILD_BUG_ON42(rel->flags & RB_INSERT_REPLACES);
 }
 
 
@@ -840,6 +851,7 @@ struct rb_node *__rb_find_subtree(
 */
break;
else if (!diff) {
+   /* FIXME: non-unique keys broken here on inserts */
/* exact match */
*matched = go_left
 ? RB_MATCH_LEFT
@@ -1022,16 +1034,34 @@ struct rb_node *rb_insert(
 
parent = *p;
 
-   if (diff > 0) {
-   p = &(*p)->rb_right;
-   if (rel->flags & RB_HAS_LEFTMOST)
-   leftmost = 0;
-   } else if (!(rel->flags & RB_UNIQUE_KEYS) || diff < 0) {
-   p = &(*p)->rb_left;
-   if (rel->flags & RB_HAS_RIGHTMOST)
-   rightmost = 0;
-   } else
-   break;
+   /* when using non-unique keys, a new same-key object is
+* inserted at the head of any existing same-key objects unless
+* RB_INSERT_DUPE_RIGHT is specified, which causes them to be
+* inserted at the tail.
+*/
+   if (rel->flags & RB_INSERT_DUPE_RIGHT) {
+   if (diff < 0) {
+   p = &(*p)->rb_left;
+   if (rel->flags & RB_HAS_RIGHTMOST)
+   rightmost = 0;
+   } else if (!(rel->flags & RB_UNIQUE_KEYS) || diff > 0) {
+

[PATCH v5 24/25] selftest: report generation script for test results

2012-09-25 Thread Daniel Santos

A script that uses sqlite to load test results and generates a report
showing differences in performance per compiler used.
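
For reference, the "Insert Diff" and "Delete Diff" columns in the report are
a relative difference between the generic and hand-coded timings. A minimal
sketch of the same arithmetic follows; the function name is made up for
illustration.

```c
/* Mirrors the report expression "1.0 * a / b - 1.0": 0.10 means the generic
 * code took 10% longer, a negative value means it was faster. */
static double demo_relative_diff(long long generic_time,
				 long long hand_coded_time)
{
	return 1.0 * generic_time / hand_coded_time - 1.0;
}
```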

Signed-off-by: Daniel Santos 
---
 tools/testing/selftests/grbtree/user/gen_report.sh |  118 
 1 files changed, 118 insertions(+), 0 deletions(-)
 create mode 100755 tools/testing/selftests/grbtree/user/gen_report.sh

diff --git a/tools/testing/selftests/grbtree/user/gen_report.sh b/tools/testing/selftests/grbtree/user/gen_report.sh
new file mode 100755
index 000..d6a1d3d
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/gen_report.sh
@@ -0,0 +1,118 @@
+#!/bin/bash
+
+dbfile=results.$$.db
+datafile=runtest.out
+
+die() {
+   echo "ERROR${@:+": "}$@" 1>&2
+   exit -1
+}
+
+find_sqlite() {
+   for suffix in "" 4 3; do
+   which sqlite${suffix} 2> /dev/null && return 0
+   done
+   return 1
+}
+
+sqlite=$(find_sqlite) || die "failed to find sqlite"
+
+${sqlite} "${dbfile}" << asdf
+/* .echo on */
+.headers on
+create table if not exists grbtest_result (
+   compiler        varchar(255),
+   key_type        varchar(255),
+   userland        tinyint,
+   use_generic     tinyint,
+   use_leftmost    tinyint,
+   use_rightmost   tinyint,
+   use_count       tinyint,
+   unique_keys     tinyint,
+   insert_replaces tinyint,
+   use_augmented   tinyint,
+   debug           tinyint,
+   debug_validate  tinyint,
+   arch            varchar(255),
+   arch_flags      varchar(255),
+   processor       varchar(255),
+   cc              varchar(255),
+   cflags          varchar(255),
+   test            tinyint,
+   in_seed         bigint,
+   seed            bigint,
+   key_mask        int,
+   object_count    int,
+   pool_count      int,
+   reps            bigint,
+   node_size       int,
+   object_size     int,
+   pool_size       int,
+   insertions      bigint,
+   insertion_time  bigint,
+   evictions       bigint,
+   deletions       bigint,
+   deletion_time   bigint
+);
+.separator |
+.import ${datafile} grbtest_result
+/* .mode column */
+select distinct
+   key_type,
+   userland,
+   use_leftmost,
+   use_rightmost,
+   use_count,
+   unique_keys,
+   insert_replaces,
+   use_augmented,
+   debug,
+   debug_validate,
+   arch,
+   arch_flags,
+   processor,
+   cc,
+   test,
+   in_seed,
+   seed,
+   key_mask,
+   object_count,
+   pool_count,
+   reps,
+   node_size,
+   object_size,
+   pool_size,
+   insertions,
+   evictions,
+   deletions
+from grbtest_result;
+
+select distinct
+   a.compiler as 'Compiler',
+   a.key_type,
+   (case when a.userland then        'U' else 'K' end) ||
+   (case when a.use_leftmost then    'L' else '.' end) ||
+   (case when a.use_rightmost then   'R' else '.' end) ||
+   (case when a.use_count then       'C' else '.' end) ||
+   (case when a.unique_keys then     'U' else '.' end) ||
+   (case when a.insert_replaces then 'I' else '.' end) ||
+   (case when a.debug then           'D' else '.' end) ||
+   (case when a.debug_validate then  'V' else '.' end)
+   as config,
+   a.insertion_time as 'Generic Insert Time',
+   b.insertion_time as 'Hand-Coded Insert Time',
+   1.0 * a.insertion_time / b.insertion_time - 1.0 as 'Insert Diff',
+   a.deletion_time as 'Generic Delete Time',
+   b.deletion_time as 'Hand-Coded Delete Time',
+   1.0 * a.deletion_time / b.deletion_time - 1.0 as 'Delete Diff'
+from
+   grbtest_result as a inner join grbtest_result as b on (
+   a.compiler = b.compiler
+   )
+where
+   a.use_generic == 1
+   and b.use_generic = 0;
+asdf
+
+rm "${dbfile}"
+
-- 
1.7.3.4


[PATCH v5 23/25] selftest: Add basic compiler iterator test script

2012-09-25 Thread Daniel Santos

A Gentoo-specific script (to be run by root) to iterate through all
installed compilers, execute the runtest.sh script and collect the output
data.

Signed-off-by: Daniel Santos 
---
 tools/testing/selftests/grbtree/user/runtests.sh |   30 ++
 1 files changed, 30 insertions(+), 0 deletions(-)
 create mode 100755 tools/testing/selftests/grbtree/user/runtests.sh

diff --git a/tools/testing/selftests/grbtree/user/runtests.sh b/tools/testing/selftests/grbtree/user/runtests.sh
new file mode 100755
index 000..0192c10
--- /dev/null
+++ b/tools/testing/selftests/grbtree/user/runtests.sh
@@ -0,0 +1,30 @@
+#!/bin/bash
+
+# This script is designed for use on Gentoo systems, using gcc-config to
+# change the compiler and must be run as root. I'm lazy, so alter to fit your
+# system.
+
+user=daniel
+outfile=runtests.$$.out
+
+rm -f runtest.log runtest.out
+
+if [[ -e ${outfile} ]]; then
+   echo "File ${outfile} exists, please move it out of the way."
+   exit
+fi
+
+
+for ((gcc_inst_num = 1; gcc_inst_num < 10; ++gcc_inst_num)); do
+   gcc-config $gcc_inst_num || exit
+   . /etc/profile
+   nice -n -3 sudo -Hu ${user} \
+   key_type=u32        \
+   use_leftmost=1      \
+   use_rightmost=1     \
+   use_count=1         \
+   unique_keys=1       \
+   insert_replaces=0   \
+   ./runtest.sh >> ${outfile}
+
+done
-- 
1.7.3.4


[PATCH] ALSA: hda - Add inverted internal mic quirk for Lenovo IdeaPad U310

2012-09-25 Thread Felix Kaechele

The Lenovo IdeaPad U310 has an internal mic where the right channel
is phase inverted.

Signed-off-by: Felix Kaechele 
---
 sound/pci/hda/patch_conexant.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
index 5e22a8f..784017e 100644
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -4462,6 +4462,7 @@ static const struct snd_pci_quirk cxt5066_fixups[] = {
SND_PCI_QUIRK(0x17aa, 0x21ce, "Lenovo T420", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x21cf, "Lenovo T520", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x3975, "Lenovo U300s", CXT_FIXUP_STEREO_DMIC),
+   SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x397b, "Lenovo S205", CXT_FIXUP_STEREO_DMIC),
{}
 };
-- 
1.7.11.4


[PATCH v5 19/25] rbtree.h: add doc comments for struct rb_node

2012-09-25 Thread Daniel Santos

Signed-off-by: Daniel Santos 
---
 include/linux/rbtree.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 3ef30b9..f1fbdea 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -100,6 +100,19 @@ static inline struct page * rb_insert_page_cache(struct inode * inode,
 #include 
 #include 
 
+/**
+ * struct rb_node
+ * @rb_parent_color: Contains the color in the lower 2 bits (although only bit
+ *  zero is currently used) and the address of the parent in
+ *  the rest (lower 2 bits of address should always be zero on
+ *  any arch supported).  If the node is initialized and not a
+ *  member of any tree, the parent points to itself.  If the
+ *  node belongs to a tree, but is the root element, the
+ *  parent will be NULL.  Otherwise, parent will always
+ *  point to the parent node in the tree.
+ * @rb_right:Pointer to the right element.
+ * @rb_left: Pointer to the left element.
+ */
 struct rb_node
 {
unsigned long  rb_parent_color;
-- 
1.7.3.4


[PATCH v5 17/25] kernel-doc: Don't mangle whitespace in Example section

2012-09-25 Thread Daniel Santos

A section with the name "Example" (case-insensitive) has a special
meaning to kernel-doc.  These sections are output using monospace fonts.
However, leading whitespace is stripped, which defeats much of the
purpose, as indented code examples are mangled.

This patch preserves the leading whitespace for "Example" sections.
More accurately, it preserves it for all sections, but removes it later
if the section isn't an "Example" section.
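
The script itself is Perl, but the intended behavior can be sketched in C for
clarity. The helpers below are hypothetical and only mirror the idea: strip
the comment decoration (optional whitespace, a '*' and at most one space),
then drop the remaining leading whitespace unless the section is
preformatted.

```c
#include <ctype.h>

/* Skips what the pattern '\s*\* ?' would match at the start of a line. */
static const char *demo_skip_doc_prefix(const char *line)
{
	const char *p = line;

	while (isspace((unsigned char)*p))
		p++;
	if (*p != '*')
		return line;	/* no comment decoration found */
	p++;
	if (*p == ' ')
		p++;
	return p;
}

/* Returns the body text; indentation survives only when preformatted. */
static const char *demo_doc_body(const char *line, int preformatted)
{
	const char *p = demo_skip_doc_prefix(line);

	if (!preformatted)
		while (isspace((unsigned char)*p))
			p++;
	return p;
}
```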

Signed-off-by: Daniel Santos 
---
 scripts/kernel-doc |9 +++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 69efb2f..976e28c 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -281,9 +281,10 @@ my $doc_special = "\@\%\$\&";
 my $doc_start = '^/\*\*\s*$'; # Allow whitespace at end of comment start.
 my $doc_end = '\*/';
 my $doc_com = '\s*\*\s*';
+my $doc_com_body = '\s*\* ?';
 my $doc_decl = $doc_com . '(\w+)';
 my $doc_sect = $doc_com . '([' . $doc_special . ']?[\w\s]+):(.*)';
-my $doc_content = $doc_com . '(.*)';
+my $doc_content = $doc_com_body . '(.*)';
 my $doc_block = $doc_com . 'DOC:\s*(.*)?';
 
 my %constants;
@@ -460,6 +461,9 @@ sub output_highlight {
 #   print STDERR "contents af:$contents\n";
 
 foreach $line (split "\n", $contents) {
+   if (! $output_preformatted) {
+   $line =~ s/^\s*//;
+   }
if ($line eq ""){
if (! $output_preformatted) {
print $lineprefix, local_unescape($blankline);
@@ -2084,7 +2088,7 @@ sub process_file($) {
$descr= $1;
$descr =~ s/^\s*//;
$descr =~ s/\s*$//;
-   $descr =~ s/\s+/ /;
+   $descr =~ s/\s+/ /g;
$declaration_purpose = xml_escape($descr);
$in_purpose = 1;
} else {
@@ -2176,6 +2180,7 @@ sub process_file($) {
# Continued declaration purpose
chomp($declaration_purpose);
$declaration_purpose .= " " . xml_escape($1);
+   $declaration_purpose =~ s/\s+/ /g;
} else {
$contents .= $1 . "\n";
}
-- 
1.7.3.4


[PATCH v5 12/25] rbtree.h: include kconfig.h

2012-09-25 Thread Daniel Santos

We shouldn't depend upon kernel.h including this for us.  This also
fixes some issues with compiling in userland (coming later).

Signed-off-by: Daniel Santos 
---
 include/linux/rbtree.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index f1b53d5..66a99fd 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -98,6 +98,7 @@ static inline struct page * rb_insert_page_cache(struct inode * inode,
 #include 
 #include 
 #include 
+#include <linux/kconfig.h>
 
 struct rb_node
 {
-- 
1.7.3.4


[RFC/PATCH] zcache2 on PPC64 (Was: [RFC] mm: add support for zsmalloc and zcache)

2012-09-25 Thread Dan Magenheimer

Attached patch applies to staging-next and I _think_ should
fix the reported problem where zbud in zcache2 does not
work on a PPC64 with PAGE_SHIFT != 12.  I do not have a machine
to test this so testing by others would be appreciated.

Ideally there should also be a BUILD_BUG_ON to ensure
PAGE_SHIFT * 2 + 2 doesn't exceed BITS_PER_LONG, but
let's see if this fixes the problem first.
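
A rough userspace sketch of both the problem and the suggested check follows.
The DEMO_* constants are assumptions (64 KiB pages and a 64-bit long, as on
the PPC64 configuration in question), and the structs are simplified
stand-ins for struct zbudpage, not the real layout.

```c
#define DEMO_PAGE_SHIFT 16	/* assumption: 64 KiB pages */
#define DEMO_BITS_PER_LONG 64	/* assumption: 64-bit long, as on PPC64 */

/* Build-time check along the lines suggested above. */
_Static_assert(DEMO_PAGE_SHIFT * 2 + 2 <= DEMO_BITS_PER_LONG,
	       "zbud size fields do not fit in one long");

struct demo_fixed12 {
	unsigned size:12;
};

struct demo_page_shift {
	unsigned size:DEMO_PAGE_SHIFT;
};

/* A 12-bit field silently keeps only the low 12 bits of the size. */
static unsigned demo_store_12(unsigned size)
{
	struct demo_fixed12 z;

	z.size = size;
	return z.size;
}

/* A PAGE_SHIFT-wide field holds any size below the page size. */
static unsigned demo_store_page_shift(unsigned size)
{
	struct demo_page_shift z;

	z.size = size;
	return z.size;
}
```

With 64 KiB pages a size such as 40000 reads back as 3136 from the 12-bit
field, which matches the kind of breakage reported.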

Apologies if there are line breaks... I can't send this from
a linux mailer right now.  If it is broken, let me know,
and I will re-post tomorrow... though it should be easy
to apply manually for test purposes.

Signed-off-by: Dan Magenheimer 

diff --git a/drivers/staging/ramster/zbud.c b/drivers/staging/ramster/zbud.c
index a7c4361..6921af3 100644
--- a/drivers/staging/ramster/zbud.c
+++ b/drivers/staging/ramster/zbud.c
@@ -103,8 +103,8 @@ struct zbudpage {
struct {
unsigned long space_for_flags;
struct {
-   unsigned zbud0_size:12;
-   unsigned zbud1_size:12;
+   unsigned zbud0_size:PAGE_SHIFT;
+   unsigned zbud1_size:PAGE_SHIFT;
unsigned unevictable:2;
};
struct list_head budlist;

[PATCH v5 11/25] rbtree.h: Generic Red-Black Trees

2012-09-25 Thread Daniel Santos

Add generic red-black tree code to rbtree.h.

Signed-off-by: Daniel Santos 
---
 include/linux/rbtree.h | 1155 +++-
 1 files changed, 1153 insertions(+), 2 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 033b507..f1b53d5 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -1,7 +1,8 @@
 /*
   Red Black Trees
   (C) 1999  Andrea Arcangeli 
-  
+  (C) 2012  Daniel Santos 
+
   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
@@ -96,6 +97,7 @@ static inline struct page * rb_insert_page_cache(struct inode * inode,
 
 #include 
 #include 
+#include 
 
 struct rb_node
 {
@@ -148,6 +150,7 @@ extern void rb_insert_color(struct rb_node *, struct rb_root *);
 extern void rb_erase(struct rb_node *, struct rb_root *);
 
 typedef void (*rb_augment_f)(struct rb_node *node, void *data);
+typedef long (*rb_compare_f)(const void *a, const void *b);
 
 extern void rb_augment_insert(struct rb_node *node,
  rb_augment_f func, void *data);
@@ -162,7 +165,7 @@ extern struct rb_node *rb_first(const struct rb_root *);
 extern struct rb_node *rb_last(const struct rb_root *);
 
 /* Fast replacement of a single node without remove/rebalance/add/rebalance */
-extern void rb_replace_node(struct rb_node *victim, struct rb_node *new, 
+extern void rb_replace_node(struct rb_node *victim, struct rb_node *new,
struct rb_root *root);
 
 static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
@@ -174,4 +177,1152 @@ static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
*rb_link = node;
 }
 
+
+#define __JUNK junk,
+#define _iff_empty(test_or_junk, t, f) __iff_empty(test_or_junk, t, f)
+#define __iff_empty(__ignored1, __ignored2, result, ...) result
+
+/**
+ * IFF_EMPTY() - Expands to the second argument when the first is empty, the
+ *   third if non-empty.
+ * @test:An argument to test for emptiness.
+ * @t:   A value to expand to if test is empty.
+ * @f:   A value to expand to if test is non-empty.
+ *
+ * Caveats:
+ * IFF_EMPTY isn't perfect.  The test parameter must either be empty or a valid
+ * pre-processor token as well as result in a valid token when pasted to the
+ * end of a word.
+ *
+ * Valid Examples:
+ * IFF_EMPTY(a, b, c) = c
+ * IFF_EMPTY( , b, c) = b
+ * IFF_EMPTY( ,  , c) = (nothing)
+ *
+ * Invalid Examples:
+ * IFF_EMPTY(.,  b, c)
+ * IFF_EMPTY(+,  b, c)
+ */
+#define IFF_EMPTY(test, t, f) _iff_empty(__JUNK##test, t, f)
+
+/**
+ * IS_EMPTY() - test if a pre-processor argument is empty.
+ * @arg:An argument (empty or non-empty)
+ *
+ * If empty, expands to 1, 0 otherwise.  See IFF_EMPTY() for caveats &
+ * limitations.
+ */
+#define IS_EMPTY(arg)  IFF_EMPTY(arg, 1, 0)
+
+/**
+ * OPT_OFFSETOF() - return the offsetof for the supplied expression, or zero
+ *  if m is an empty argument.
+ * @type:   struct/union type
+ * @member: (optional) struct member name
+ *
+ * Since any offsetof can return zero if the specified member is the first in
+ * the struct/union, you should also check if the argument is empty separately
+ * with IS_EMPTY(m).
+ */
+#define OPT_OFFSETOF(type, member) IFF_EMPTY(member, 0, offsetof(type, member))
+
+/**
+ * enum rb_flags - values for struct rb_relationship's flags
+ * @RB_HAS_LEFTMOST:   The container has a struct rb_node *leftmost member
+ * that will receive a pointer to the leftmost (smallest)
+ * object in the tree that is updated during inserts &
+ * deletions.
+ * @RB_HAS_RIGHTMOST:  Same as above (for right side of tree).
+ * @RB_HAS_COUNT:  The container has an unsigned long field that will
+ * receive updates of the object count in the tree.
+ * @RB_UNIQUE_KEYS:The tree contains only unique values.
+ * @RB_INSERT_REPLACES: When set, the rb_insert() will replace a value if it
+ * matches the supplied one (valid only when
+ * RB_UNIQUE_KEYS is set).
+ * @RB_IS_AUGMENTED:   is an augmented tree
+ * @RB_ALL_FLAGS:  (internal use)
+ */
+
+enum rb_flags {
+   RB_HAS_LEFTMOST = 0x0001,
+   RB_HAS_RIGHTMOST= 0x0002,
+   RB_HAS_COUNT= 0x0004,
+   RB_UNIQUE_KEYS  = 0x0008,
+   RB_INSERT_REPLACES  = 0x0010,
+   RB_IS_AUGMENTED = 0x0040,
+   RB_ALL_FLAGS= RB_HAS_LEFTMOST | RB_HAS_RIGHTMOST
+   | RB_HAS_COUNT | RB_UNIQUE_KEYS
+   | RB_INSERT_REPLACES | RB_IS_AUGMENTED,
+};
+
+/**
+ * struct rb_relationship - Defines relationship between a container and the
+ *
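
(The message is cut off here in the archive.)  The empty-argument macros near
the top of the hunk can be tried in userspace with the sketch below.  The
macros are renamed with a DEMO_ prefix to stay clear of reserved identifiers,
and struct demo_container is invented for the example; otherwise the
expansion trick is the same as in the patch.

```c
#include <stddef.h>

#define DEMO_JUNK junk,
#define demo_iff_empty(test_or_junk, t, f) demo_iff_empty_pick(test_or_junk, t, f)
#define demo_iff_empty_pick(ignored1, ignored2, result, ...) result

/* Empty test: DEMO_JUNK expands to "junk," and shifts t into the result
 * slot.  Non-empty test: the pasted token is inert and f is selected. */
#define DEMO_IFF_EMPTY(test, t, f) demo_iff_empty(DEMO_JUNK##test, t, f)
#define DEMO_IS_EMPTY(arg) DEMO_IFF_EMPTY(arg, 1, 0)
#define DEMO_OPT_OFFSETOF(type, member) \
	DEMO_IFF_EMPTY(member, 0, offsetof(type, member))

struct demo_container {
	long count;
	void *leftmost;
};

/* Member supplied: behaves like a plain offsetof(). */
static size_t demo_offset_leftmost(void)
{
	return DEMO_OPT_OFFSETOF(struct demo_container, leftmost);
}

/* Member omitted: expands to 0 instead of failing to compile. */
static size_t demo_offset_none(void)
{
	return DEMO_OPT_OFFSETOF(struct demo_container, );
}
```

As the doc comment warns, a zero offset is ambiguous on its own, so callers
would pair this with DEMO_IS_EMPTY() to tell "no member" from "first member".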

[PATCH v5 10/25] bug.h: Add gcc 4.2+ versions of BUILD_BUG_ON_* macros

2012-09-25 Thread Daniel Santos

BUILD_BUG_ON42(arg)
BUILD_BUG_ON_NON_CONST42(arg)

Prior to gcc 4.2, the optimizer was unable to determine that many
constant values stored in structs were indeed compile-time constants and
optimize them out.  Sometimes, it will find an integral value to be a
compile-time constant, but fail to perform a bit-wise AND at
compile-time.  These two macros provide a mechanism to perform these
build-time checks, but not break on older compilers where we already
know they can't be checked at compile time.

For specific details, consult the doc comments for BUILD_BUG_ON_NON_CONST.
These macros are used in the generic rbtree code.
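
A minimal sketch of the version gating, for illustration only: the DEMO_*
names are hypothetical, the version encoding is assumed to be the one used by
GCC_VERSION in this series, and the disabled branch expands to ((void)0)
here rather than to nothing so that it stays a valid statement.

```c
#if defined(__GNUC__)
# define DEMO_GCC_VERSION \
	(__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
# define DEMO_GCC_VERSION 0
#endif

#define DEMO_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#if DEMO_GCC_VERSION >= 40200
# define DEMO_BUILD_BUG_ON42(cond) DEMO_BUILD_BUG_ON(cond)
#else
# define DEMO_BUILD_BUG_ON42(cond) ((void)0)	/* check disabled */
#endif

enum {
	DEMO_UNIQUE	= 0x08,
	DEMO_REPLACES	= 0x10,
	DEMO_FLAGS	= DEMO_UNIQUE | DEMO_REPLACES,
};

/* "Replaces" only makes sense together with "unique"; checked at build time
 * where the compiler is new enough, skipped otherwise. */
static int demo_checked_flags(void)
{
	DEMO_BUILD_BUG_ON42((DEMO_FLAGS & DEMO_REPLACES) &&
			    !(DEMO_FLAGS & DEMO_UNIQUE));
	return DEMO_FLAGS;
}
```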

Signed-off-by: Daniel Santos 
---
 include/linux/bug.h |   36 
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index e30f600..d14c23c 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -2,6 +2,7 @@
 #define _LINUX_BUG_H
 
 #include 
+#include 
 
 enum bug_trap_type {
BUG_TRAP_TYPE_NONE = 0,
@@ -129,6 +130,41 @@ struct pt_regs;
 #define BUILD_BUG_ON_NON_CONST(exp)
 #endif
 
+
+#if GCC_VERSION >= 40200
+/**
+ * BUILD_BUG_ON_NON_CONST42 - break compile if expression cannot be determined
+ *to be a compile-time constant (disabled prior to
+ *gcc 4.2)
+ * @exp: value to test for compile-time constness
+ *
+ * Use this macro instead of BUILD_BUG_ON_NON_CONST when testing struct
+ * members or dereferenced arrays and pointers.  Note that the version checks
+ * for this macro are not perfect.  BUILD_BUG_ON_NON_CONST42 expands to nothing
+ * prior to gcc-4.2, after which it is the same as BUILD_BUG_ON_NON_CONST.
+ * However, there are still many checks that will break with this macro (see
+ * the Gory Details section of BUILD_BUG_ON_NON_CONST for more info).
+ *
+ * See also BUILD_BUG_ON_NON_CONST()
+ */
+# define BUILD_BUG_ON_NON_CONST42(exp) BUILD_BUG_ON_NON_CONST(exp)
+
+/**
+ * BUILD_BUG_ON42 - break compile if expression cannot be determined
+ *   (disabled prior to gcc 4.2)
+ *
+ * This gcc-version check is necessary due to breakages in testing struct
+ * members prior to gcc 4.2.
+ *
+ * See also BUILD_BUG_ON()
+ */
+# define BUILD_BUG_ON42(arg) BUILD_BUG_ON(arg)
+#else
+# define BUILD_BUG_ON_NON_CONST42(exp)
+# define BUILD_BUG_ON42(arg)
+#endif /* GCC_VERSION >= 40200 */
+
+
 #endif /* __CHECKER__ */
 
 #ifdef CONFIG_GENERIC_BUG
-- 
1.7.3.4


[PATCH v5 9/25] bug.h: Add BUILD_BUG_ON_NON_CONST macro

2012-09-25 Thread Daniel Santos

A very common use of __builtin_constant_p is to make sure that a certain
value is a compile time constant and generate a build-time error if it
is not.  However, __builtin_constant_p is broken in a variety of ways in
various situations (on various versions of gcc) and never returns one in
an unoptimized build. This macro provides a mechanism to perform these
build-time checks, but not break unoptimized builds (or modules being
built with -O0), of which there probably aren't many people that care
anyway.

This patch documents all of the relevant quirks I could find in the
"Gory Details" section of the doc-comments.  For almost all cases,
BUILD_BUG_ON_NON_CONST() should never fail on a primitive, non-pointer
type variable declared const.  A subsequent patch provides a separate
macro for performing tests which are known to be broken in older
compilers (pretty much, using __builtin_constant_p on arrays, pointers &
structs as well as testing those values).
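
A userspace sketch of the macro's shape, assuming a GCC-compatible compiler
that provides __builtin_constant_p.  The DEMO_* names are made up, and the
example deliberately sticks to enum constants: whether const variables,
struct members or dereferenced pointers pass depends on the gcc version and
optimization level, as the doc comment in the patch spells out.

```c
#define DEMO_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#ifdef __OPTIMIZE__
# define DEMO_BUILD_BUG_ON_NON_CONST(exp) \
	DEMO_BUILD_BUG_ON(!__builtin_constant_p(exp))
#else
# define DEMO_BUILD_BUG_ON_NON_CONST(exp) ((void)0)	/* unoptimized build */
#endif

enum { DEMO_SHIFT = 12 };

/* An enum constant is a constant expression, so the check passes whether or
 * not the build is optimized. */
static int demo_enum_is_constant(void)
{
	DEMO_BUILD_BUG_ON_NON_CONST(DEMO_SHIFT * 2 + 2);
	return __builtin_constant_p(DEMO_SHIFT);
}
```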

Signed-off-by: Daniel Santos 
---
 include/linux/bug.h |   48 
 1 files changed, 48 insertions(+), 0 deletions(-)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index c70b833..e30f600 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -81,6 +81,54 @@ struct pt_regs;
__build_bug_failed();   \
} while (0)
 
+/**
+ * BUILD_BUG_ON_NON_CONST - break compile if expression cannot be determined
+ *  to be a compile-time constant.
+ * @exp: value to test for compile-time constness
+ *
+ * __builtin_constant_p() is a work in progress and is broken in various ways
+ * on various versions of gcc and optimization levels. It can fail, even when
+ * gcc otherwise determines that the expression is compile-time constant when
+ * performing actual optimizations and thus, compile out the value anyway. Do
+ * not use this macro for struct members or dereferenced pointers and arrays,
+ * as these are broken in many versions of gcc -- use BUILD_BUG_ON_NON_CONST42
+ * or another gcc-version-checked macro instead.
+ *
+ * As long as you are passing a variable declared const (and not modified),
+ * this macro should never fail (except for floats).  For information on gcc's
+ * behavior in other cases, see below.
+ *
+ * Gory Details:
+ *
+ * Normal primitive variables
+ * - global non-static non-const values are never compile-time constants (but
+ *   you should already know that)
+ * - all const values (global/local, non/static) should never fail this test
+ *   (3.4+) with one exception (below)
+ * - floats (which we won't use anyway) are broken in various ways until 4.2
+ *   (-O1 broken until 4.4)
+ * - local static non-const broken until 4.2 (-O1 broken until 4.3)
+ * - local non-static non-const broken until 4.0
+ *
+ * Dereferencing pointers & arrays
+ * - all static const derefs broken until 4.4 (except arrays at -O2 or better,
+ *   which are fixed in 4.2)
+ * - global non-static const pointer derefs always fail (<=4.7)
+ * - local non-static const derefs broken until 4.3, except for array derefs
+ *   to a zero value, which works from 4.0+
+ * - local static non-const pointers always fail (<=4.7)
+ * - local static non-const arrays broken until 4.4
+ * - local non-static non-const arrays broken until 4.0 (unless zero deref,
+ *   works in 3.4+)
+
+ */
+#ifdef __OPTIMIZE__
+#define BUILD_BUG_ON_NON_CONST(exp) \
+   BUILD_BUG_ON(!__builtin_constant_p(exp))
+#else
+#define BUILD_BUG_ON_NON_CONST(exp)
+#endif
+
 #endif /* __CHECKER__ */
 
 #ifdef CONFIG_GENERIC_BUG
-- 
1.7.3.4
