Re: [PATCH] PCI / PM: Fix fallback to PCI_D0 in pci_platform_power_transition()

2013-04-12 Thread Yinghai Lu
On Fri, Apr 12, 2013 at 4:58 PM, Rafael J. Wysocki  wrote:
> From: Rafael J. Wysocki 
>
> Commit b51306c (PCI: Set device power state to PCI_D0 for device
> without native PM support) modified pci_platform_power_transition()
> by adding code causing dev->current_state for devices that don't
> support native PCI PM but are power-manageable by the platform to be
> changed to PCI_D0 regardless of the value returned by the preceding
> platform_pci_set_power_state().  In particular, that also is done
> if the platform_pci_set_power_state() has been successful, which
> causes the correct power state of the device set by
> pci_update_current_state() in that case to be overwritten by PCI_D0.
>
> Fix that mistake by making the fallback to PCI_D0 only happen if
> the platform_pci_set_power_state() has returned an error.
>
> Reported-by: Chris J. Benenati 
> Signed-off-by: Rafael J. Wysocki 
> Cc: 
> ---
>  drivers/pci/pci.c |3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> Index: linux-pm/drivers/pci/pci.c
> ===
> --- linux-pm.orig/drivers/pci/pci.c
> +++ linux-pm/drivers/pci/pci.c
> @@ -646,8 +646,7 @@ static int pci_platform_power_transition
> error = platform_pci_set_power_state(dev, state);
> if (!error)
> pci_update_current_state(dev, state);
> -   /* Fall back to PCI_D0 if native PM is not supported */
> -   if (!dev->pm_cap)
> +   else if (!dev->pm_cap) /* Fall back to PCI_D0 */
> dev->current_state = PCI_D0;
> } else {
> error = -ENODEV;
>

Acked-by: Yinghai Lu 

It could also be simplified further:

---
 drivers/pci/pci.c |   12 
 1 file changed, 4 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/pci/pci.c
===
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -646,15 +646,11 @@ static int pci_platform_power_transition
 error = platform_pci_set_power_state(dev, state);
 if (!error)
 pci_update_current_state(dev, state);
-/* Fall back to PCI_D0 if native PM is not supported */
-if (!dev->pm_cap)
-dev->current_state = PCI_D0;
-} else {
+} else
 error = -ENODEV;
-/* Fall back to PCI_D0 if native PM is not supported */
-if (!dev->pm_cap)
-dev->current_state = PCI_D0;
-}
+
+if (error && !dev->pm_cap) /* Fall back to PCI_D0 */
+dev->current_state = PCI_D0;

 return error;
 }
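
A minimal userspace sketch of the control flow in the simplified version,
assuming a stand-in struct in place of struct pci_dev. The names fake_dev,
platform_pm and platform_result are hypothetical; only the ordering of the
checks is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum power_state { D0, D1, D2, D3hot };

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct fake_dev {
	bool platform_pm;      /* platform is able to manage power */
	bool pm_cap;           /* native PCI PM capability present */
	int platform_result;   /* result of the stand-in platform call */
	enum power_state current_state;
};

static int power_transition(struct fake_dev *dev, enum power_state state)
{
	int error;

	if (dev->platform_pm) {
		error = dev->platform_result;
		if (!error)
			dev->current_state = state; /* keep the real state */
	} else {
		error = -ENODEV;
	}

	/* Fall back to D0 only on error and without native PM. */
	if (error && !dev->pm_cap)
		dev->current_state = D0;

	return error;
}
```

With this ordering, a successful platform transition leaves the state it
set untouched, which is the behaviour the original fix is after.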
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] arm: vt8500: Add SDHC support to WM8505 DT

2013-04-12 Thread Olof Johansson
On Fri, Apr 12, 2013 at 07:00:29AM +1200, Tony Prisk wrote:
> This patch adds the required node for the SDHC controller on WM8505 SoCs.
> 
> Signed-off-by: Tony Prisk 
> ---
> Arnd,
> 
> Any chance you can apply this for 3.10?

Applied to next/dt


-Olof


Re: sleep/fan problem bisected to 73201dbec64aebf6b0dca855b523f437972dc7bb

2013-04-12 Thread auxsvr
Jake Edge wrote:

> 
> Hi Peter,
> 
> I have been having a problem on my laptop (HP Compaq 2510p) over the
> last two months with >= 3.7 kernels.  After the first resume, it turns
> on the fan and leaves it running at top speed no matter what the system
> is doing.
> The output of "sensors" is interesting ... for "temp6" in
> "acpitz-virtual-0" it sits at +100C (near the +110C critical level)
> when things have gone bad (just in Fedora or my kernels >=3.7, not in
> the bisect, cuz those don't come back from the sleep) ... in the "good"
> case, it normally sits around 30, but goes as high as 50 (and sometimes
> reports 0 for a try or two -- frozen motherboard! :) ...
> 
> thoughts on this?  Other info you need or things I should be trying?

This is known, see https://lkml.org/lkml/2012/12/4/428 for more details.

> thanks,
> 
> jake

Regards,
Peter



Re: [PATCH v7] i2c: exynos5: add High Speed I2C controller driver

2013-04-12 Thread Naveen Krishna Ch
On 5 April 2013 10:22, Naveen Krishna Chatradhi
 wrote:
> From: Naveen Krishna Chatradhi 
>
> Adds support for High Speed I2C driver found in Exynos5 and
> later SoCs from Samsung.
> This driver currently supports Auto mode.
>
> Driver only supports Device Tree method.
> Note: Added debugfs support for registers view, not tested.
>
> Signed-off-by: Taekgyun Ko 
> Signed-off-by: Naveen Krishna Chatradhi 
> Reviewed-by: Simon Glass 
> Tested-by: Andrew Bresticker 
> ---
> change since v6:
> 1. clock divisor function hs split to handle the error cases
> 2. Other irq types are handled
> 3. FIFO are handled more efficiently in TX and RX
> 4. More function description added
> 5. handled the return cases in xfer_msg function
>
>  .../devicetree/bindings/i2c/i2c-exynos5.txt|   50 ++
>  drivers/i2c/busses/Kconfig |7 +
>  drivers/i2c/busses/Makefile|1 +
>  drivers/i2c/busses/i2c-exynos5.c   |  934 
> 
>  4 files changed, 992 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/i2c/i2c-exynos5.txt
>  create mode 100644 drivers/i2c/busses/i2c-exynos5.c
>
> diff --git a/Documentation/devicetree/bindings/i2c/i2c-exynos5.txt 
> b/Documentation/devicetree/bindings/i2c/i2c-exynos5.txt
> new file mode 100644
> index 000..0bc9347
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/i2c/i2c-exynos5.txt
> @@ -0,0 +1,50 @@
> +* Samsung's High Speed I2C controller
> +
> +The Samsung's High Speed I2C controller is used to interface with I2C devices
> +at various speeds ranging from 100khz to 3.4Mhz.
> +
> +Required properties:
> +  - compatible: value should be.
> +  (a) "samsung,exynos5-hsi2c", for i2c compatible with exynos5 hsi2c.
> +  - reg: physical base address of the controller and length of memory mapped
> +region.
> +  - interrupts: interrupt number to the cpu.
> +
> +  - Samsung GPIO variant (deprecated):
> +- gpios: The order of the gpios should be the following: .
> +  The gpio specifier depends on the gpio controller.
> +  - Pinctrl variant (preferred, if available):
> +- pinctrl-0: Pin control group to be used for this controller.
> +- pinctrl-names: Should contain only one value - "default".
> +
> +Optional properties:
> +  - samsung,hs-mode: Mode of operation, High speed or Fast speed mode. If not
> +specified, default value is 0.
> +  - samsung,hs-clock-freq: Desired operating frequency in Hz of the bus.
> +If not specified, the default value in Hz is 10.
> +  - samsung,fs-clock-freq: Desired operating frequency in Hz of the bus.
> +If not specified, the default value in Hz is 10.
> +
> +Example:
> +
> +   hsi2c@12ca {
> +   compatible = "samsung,exynos5-hsi2c";
> +   reg = <0x12ca 0x100>;
> +   interrupts = <56>;
> +   samsung,fs-clock-freq = <10>;
> +   /* Samsung GPIO variant begins here */
> +   gpios = < 2 0 /* SDA */
> + 3 0 /* SCL */>;
> +   /* Samsung GPIO variant ends here */
> +   /* Pinctrl variant begins here */
> +   pinctrl-0 = <_bus>;
> +   pinctrl-names = "default";
> +   /* Pinctrl variant ends here */
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +
> +   s2mps11_pmic@66 {
> +   compatible = "samsung,s2mps11-pmic";
> +   reg = <0x66>;
> +   };
> +   };
> diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
> index adfee98..9fbfa01 100644
> --- a/drivers/i2c/busses/Kconfig
> +++ b/drivers/i2c/busses/Kconfig
> @@ -434,6 +434,13 @@ config I2C_EG20T
>   ML7213/ML7223/ML7831 is companion chip for Intel Atom E6xx series.
>   ML7213/ML7223/ML7831 is completely compatible for Intel EG20T PCH.
>
> +config I2C_EXYNOS5
> +   tristate "Exynos5 high-speed I2C driver"
> +   depends on ARCH_EXYNOS5 && OF
> +   help
> + Say Y here to include support for High-speed I2C controller in the
> + Exynos5 based Samsung SoCs.
> +
>  config I2C_GPIO
> tristate "GPIO-based bitbanging I2C"
> depends on GENERIC_GPIO
> diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
> index 8f4fc23..b19366c 100644
> --- a/drivers/i2c/busses/Makefile
> +++ b/drivers/i2c/busses/Makefile
> @@ -42,6 +42,7 @@ i2c-designware-platform-objs := i2c-designware-platdrv.o
>  obj-$(CONFIG_I2C_DESIGNWARE_PCI)   += i2c-designware-pci.o
>  i2c-designware-pci-objs := i2c-designware-pcidrv.o
>  obj-$(CONFIG_I2C_EG20T)+= i2c-eg20t.o
> +obj-$(CONFIG_I2C_EXYNOS5)  += i2c-exynos5.o
>  obj-$(CONFIG_I2C_GPIO) += i2c-gpio.o
>  obj-$(CONFIG_I2C_HIGHLANDER)   += i2c-highlander.o
>  obj-$(CONFIG_I2C_IBM_IIC)  += i2c-ibm_iic.o
> diff --git a/drivers/i2c/busses/i2c-exynos5.c 
> 

Re: [RFC PATCH 0/2] sched: move content out of core files for load average

2013-04-12 Thread Rakib Mullick
On Sat, Apr 13, 2013 at 6:04 AM, Paul Gortmaker
 wrote:
> Recent activity has had a focus on moving functionally related blocks of stuff
> out of sched/core.c into stand-alone files.  The code relating to load average
> calculations has grown significantly enough recently to warrant placing it in
> a separate file.
>
> Here we do that, and in doing so, we shed ~20k of code from sched/core.c (~10%).
>
> A couple small static functions in the core sched.h header were also localized
> to their singular user in sched/fair.c at the same time, with the goal to also
> reduce the amount of "broadcast" content in that sched.h file.
>
> Paul.
> ---
>
> [ Patches sent here are tested on tip's sched/core, i.e. v3.9-rc1-38-gb329fd5
>
>   Assuming that this change is OK with folks, the timing can be whatever is most
>   convenient -- i.e. I can update/respin it close to the end of the merge window
>   for what will be v3.10-rc1, if that is what minimizes the inconvenience to folks
>   who might be changing the code that is relocated here. ]
>
> Paul Gortmaker (2):
>   sched: fork load calculation code from sched/core --> sched/load_avg
>   sched: move update_load_[add/sub/set] from sched.h to fair.c
>
>  kernel/sched/Makefile   |   2 +-
>  kernel/sched/core.c | 569 ---
>  kernel/sched/fair.c |  18 ++
>  kernel/sched/load_avg.c | 577 
> 
>  kernel/sched/sched.h|  26 +--
>  5 files changed, 604 insertions(+), 588 deletions(-)
>  create mode 100644 kernel/sched/load_avg.c
>

Is there any positive impact on vmlinuz size after these changes?

Thanks,
Rakib


Re: [PATCH v9 00/16] Get rid of the ACPI PCI subdriver mechanism

2013-04-12 Thread Jiang Liu
On 04/13/2013 06:17 AM, Bjorn Helgaas wrote:
> On Fri, Apr 12, 2013 at 9:44 AM, Jiang Liu  wrote:
> 
> I applied these to my pci/jiang-subdrivers branch with minor tweaks.
> The most significant is that I folded in the acpiphp.disable option to
> the patch that makes the driver builtin-only.  That way there's no
> window between removing the "edit modules.conf" workaround and adding
> the kernel parameter.
> 
> Take a look and make sure it's what you want:
> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/jiang-subdrivers
> 
> Bjorn
Hi Bjorn,
Thanks for your support, it seems OK to me.
Regards!
Gerry



[PATCH] alsa/usb: Add quirk for 192KHz recording on E-Mu devices

2013-04-12 Thread Calvin Owens
When recording at 176.4KHz or 192KHz, the device adds a 32-bit length
header to the capture packets, which obviously needs to be ignored for
recording to work properly.

Userspace expected:  L0 L1 L2 R0 R1 R2
...but actually got: R2 L0 L1 L2 R0 R1

Also, the last byte of the length header being interpreted as L0 of
the first sample caused spikes every 0.5ms, resulting in a loud 16KHz
tone (about the highest 'B' on a piano) being present throughout
captures.

Tested at all sample rates on an E-Mu 0404USB, and tested for
regressions on a generic USB headset.

Signed-off-by: Calvin Owens 
---
 sound/usb/card.h   | 1 +
 sound/usb/pcm.c| 2 +-
 sound/usb/quirks.c | 1 +
 sound/usb/stream.c | 1 +
 4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/sound/usb/card.h b/sound/usb/card.h
index 8a751b4..d32ea41 100644
--- a/sound/usb/card.h
+++ b/sound/usb/card.h
@@ -116,6 +116,7 @@ struct snd_usb_substream {
unsigned int altset_idx; /* USB data format: index of alternate setting */
unsigned int txfr_quirk:1;  /* allow sub-frame alignment */
unsigned int fmt_type;  /* USB audio format type (1-3) */
+   unsigned int pkt_offset_adj;/* Bytes to drop from beginning of packets (for non-compliant devices) */
 
unsigned int running: 1;/* running status */
 
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index f94397b..a481fea 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -1170,7 +1170,7 @@ static void retire_capture_urb(struct snd_usb_substream *subs,
stride = runtime->frame_bits >> 3;
 
for (i = 0; i < urb->number_of_packets; i++) {
-   cp = (unsigned char *)urb->transfer_buffer + urb->iso_frame_desc[i].offset;
+   cp = (unsigned char *)urb->transfer_buffer + urb->iso_frame_desc[i].offset + subs->pkt_offset_adj;
if (urb->iso_frame_desc[i].status && printk_ratelimit()) {
snd_printdd(KERN_ERR "frame %d active: %d\n", i, urb->iso_frame_desc[i].status);
// continue;
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
index 5325a38..7e292b9 100644
--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -837,6 +837,7 @@ static void set_format_emu_quirk(struct snd_usb_substream *subs,
break;
}
snd_emuusb_set_samplerate(subs->stream->chip, emu_samplerate_id);
+   subs->pkt_offset_adj = (emu_samplerate_id >= EMU_QUIRK_SR_176400HZ) ? 4 : 0;
 }
 
 void snd_usb_set_format_quirk(struct snd_usb_substream *subs,
diff --git a/sound/usb/stream.c b/sound/usb/stream.c
index ad181d5..0927cc6 100644
--- a/sound/usb/stream.c
+++ b/sound/usb/stream.c
@@ -94,6 +94,7 @@ static void snd_usb_init_substream(struct snd_usb_stream *as,
subs->dev = as->chip->dev;
subs->txfr_quirk = as->chip->txfr_quirk;
subs->speed = snd_usb_get_speed(subs->dev);
+   subs->pkt_offset_adj = 0;
 
snd_usb_set_pcm_ops(as->pcm, stream);
 
-- 
1.8.1.5
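
The effect of pkt_offset_adj can be sketched in userspace as follows. This
is only an illustration: copy_capture_packet() and emu_pkt_offset_adj() are
hypothetical helpers, the rate-in-Hz comparison stands in for the
emu_samplerate_id check, and shortening the copied length is an assumption
of the sketch (the patch itself only moves the start pointer).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: bytes to skip at the start of each capture packet. */
static size_t emu_pkt_offset_adj(unsigned int rate_hz)
{
	return rate_hz >= 176400 ? 4 : 0;
}

/*
 * Hypothetical: copy one packet's audio payload into dst, dropping
 * pkt_offset_adj leading bytes. Returns the number of bytes copied.
 */
static size_t copy_capture_packet(unsigned char *dst,
				  const unsigned char *pkt,
				  size_t pkt_len, size_t pkt_offset_adj)
{
	if (pkt_len <= pkt_offset_adj)
		return 0; /* nothing but header, or a short packet */

	memcpy(dst, pkt + pkt_offset_adj, pkt_len - pkt_offset_adj);
	return pkt_len - pkt_offset_adj;
}
```

Skipping the header keeps the first payload byte aligned with L0 of the
first frame, so the channel bytes are no longer rotated.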


Re: [PATCH PART3 v3 2/6] staging: ramster: Move debugfs code out of ramster.c file

2013-04-12 Thread Greg Kroah-Hartman
On Sat, Apr 13, 2013 at 08:36:06AM +0800, Wanpeng Li wrote:
> Note that at this point there is no CONFIG_RAMSTER_DEBUG
> option in the Kconfig. So in effect all of the counters
> are nop until that option gets introduced in patch:
> ramster/debug: Add CONFIG_RAMSTER_DEBUG Kconfig entry

This patch breaks the build again, so of course, I can't take it:

drivers/built-in.o: In function `ramster_flnode_alloc.isra.5':
ramster.c:(.text+0x1b6a6e): undefined reference to `ramster_flnodes_max'
ramster.c:(.text+0x1b6a7e): undefined reference to `ramster_flnodes_max'
drivers/built-in.o: In function `ramster_count_foreign_pages':
(.text+0x1b7205): undefined reference to `ramster_foreign_pers_pages_max'
drivers/built-in.o: In function `ramster_count_foreign_pages':
(.text+0x1b7215): undefined reference to `ramster_foreign_pers_pages_max'
drivers/built-in.o: In function `ramster_count_foreign_pages':
(.text+0x1b7235): undefined reference to `ramster_foreign_eph_pages_max'
drivers/built-in.o: In function `ramster_count_foreign_pages':
(.text+0x1b7249): undefined reference to `ramster_foreign_eph_pages_max'
drivers/built-in.o: In function `ramster_debugfs_init':
(.init.text+0xd620): undefined reference to `ramster_foreign_eph_pages_max'
drivers/built-in.o: In function `ramster_debugfs_init':
(.init.text+0xd656): undefined reference to `ramster_foreign_pers_pages_max'

I thought you fixed this :(
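
The link errors above come from code referencing *_max symbols that no
object file defines. One common way to keep such references resolvable,
sketched here with hypothetical demo_* names rather than the actual ramster
counters, is to hide the counters behind static inline helpers that compile
to nothing when the debug option is off. DEMO_DEBUG is an assumed stand-in
for a Kconfig symbol.

```c
#include <assert.h>

/* DEMO_DEBUG stands in for a Kconfig symbol; it is not a real option. */
#ifdef DEMO_DEBUG
static long demo_foreign_pages;
static long demo_foreign_pages_max;

static inline void demo_inc_foreign_pages(void)
{
	demo_foreign_pages++;
	if (demo_foreign_pages > demo_foreign_pages_max)
		demo_foreign_pages_max = demo_foreign_pages;
}

static inline long demo_foreign_pages_peak(void)
{
	return demo_foreign_pages_max;
}
#else
/* Debug off: same interface, no storage, so nothing is left undefined. */
static inline void demo_inc_foreign_pages(void) { }
static inline long demo_foreign_pages_peak(void) { return 0; }
#endif
```

Callers then use only the helpers and never name the counters directly, so
the build links with or without the option.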



Re: [PATCH PART2 v2 2/7] staging: ramster: Move debugfs code out of ramster.c file

2013-04-12 Thread Greg Kroah-Hartman
On Sat, Apr 13, 2013 at 08:29:39AM +0800, Wanpeng Li wrote:
> On Fri, Apr 12, 2013 at 03:17:44PM -0700, Greg Kroah-Hartman wrote:
> >On Fri, Apr 12, 2013 at 03:16:03PM -0700, Greg Kroah-Hartman wrote:
> >> On Fri, Apr 12, 2013 at 09:31:22AM +0800, Wanpeng Li wrote:
> >> > Note that at this point there is no CONFIG_RAMSTER_DEBUG
> >> > option in the Kconfig. So in effect all of the counters
> >> > are nop until that option gets re-introduced in:
> >> > zcache/ramster/debug: Add RAMSTE_DEBUG Kconfig entry
> >> 
> >> RAMSTE_DEBUG?  :)
> >> 
> >
> >And I fat-fingered my scripts, and deleted this email, sorry.
> >
> 
> No problem, I will send 2-7 ASAP. ;-)

Thanks.  5 years since my last email deletion, not that bad :)

greg k-h


Re: [PATCH] tracing: Check result of ring_buffer_read_prepare()

2013-04-12 Thread Steven Rostedt
On Wed, 2013-04-10 at 10:55 +0900, Namhyung Kim wrote:
> From: Namhyung Kim 
> 
> The ring_buffer_read_prepare() can return NULL if memory allocation
> fails.  Fail out in this case instead of succeeding and then having
> no output.
> 
> Suggested-by: Steven Rostedt 
> Signed-off-by: Namhyung Kim 
> ---
>  kernel/trace/trace.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 7270460cfe3c..13200de31f0b 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -2826,6 +2826,8 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
>   for_each_tracing_cpu(cpu) {
>   iter->buffer_iter[cpu] =
>   ring_buffer_read_prepare(iter->trace_buffer->buffer, cpu);
> + if (!iter->buffer_iter[cpu])
> + goto free;
>   }

OK, this totally fails. I guess we need to allow
ring_buffer_read_prepare() to return NULL, as it will return NULL if a
cpu is offline or the tracing_cpumask has a CPU down.

I'll just pull this patch out of my queue for now.

-- Steve

>   ring_buffer_read_prepare_sync();
>   for_each_tracing_cpu(cpu) {
> @@ -2836,6 +2838,9 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
>   cpu = iter->cpu_file;
>   iter->buffer_iter[cpu] =
>   ring_buffer_read_prepare(iter->trace_buffer->buffer, cpu);
> + if (!iter->buffer_iter[cpu])
> + goto free;
> +
>   ring_buffer_read_prepare_sync();
>   ring_buffer_read_start(iter->buffer_iter[cpu]);
>   tracing_iter_reset(iter, cpu);
> @@ -2847,6 +2852,23 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
>  
>   return iter;
>  
> +free:
> + /*
> +  * For simplicity, just keep single loop without comparing cpu_file.
> +  */
> + for_each_tracing_cpu(cpu) {
> + if (iter->buffer_iter[cpu])
> + ring_buffer_read_finish(iter->buffer_iter[cpu]);
> + }
> +
> + if (iter->trace && iter->trace->close)
> + iter->trace->close(iter);
> +
> + if (!iter->snapshot)
> + tracing_start_tr(tr);
> +
> + mutex_destroy(>mutex);
> + free_cpumask_var(iter->started);
>   fail:
>   mutex_unlock(_types_lock);
>   kfree(iter->trace);
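
The point about offline CPUs can be illustrated with a small userspace
model. Everything here is hypothetical (fake_iter, prepare(), prepare_all()
and the online[] array are stand-ins, not ring buffer API); the sketch only
shows a caller that treats a NULL per-CPU iterator as "skip this CPU"
rather than as a fatal error.

```c
#include <assert.h>
#include <stddef.h>

#define NR_DEMO_CPUS 4

struct fake_iter {
	int cpu;
};

/* Stand-in for a prepare call that yields NULL for an offline CPU. */
static struct fake_iter *prepare(struct fake_iter *storage,
				 const int *online, int cpu)
{
	if (!online[cpu])
		return NULL;
	storage[cpu].cpu = cpu;
	return &storage[cpu];
}

/* Fill iters[] for every CPU and return how many are usable. */
static int prepare_all(struct fake_iter **iters, struct fake_iter *storage,
		       const int *online)
{
	int cpu, usable = 0;

	for (cpu = 0; cpu < NR_DEMO_CPUS; cpu++) {
		iters[cpu] = prepare(storage, online, cpu);
		if (iters[cpu])
			usable++; /* NULL entries are kept, not an error */
	}
	return usable;
}
```

A check that bails out on the first NULL would reject this perfectly valid
configuration, which matches the failure described in the reply.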




[PATCH] gcc4: Disable __compiletime_object_size for GCC 4.6+

2013-04-12 Thread Guenter Roeck
__builtin_object_size is known to be broken on gcc 4.6+.
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48880 for details.

This causes unnecessary build warnings and errors such as

In function 'copy_from_user', inlined from 'sb16_copy_from_user'
at sound/oss/sb_audio.c:878:22:
arch/x86/include/asm/uaccess_32.h:211:26: error: call to 'copy_from_user_overflow'
declared with attribute error: copy_from_user() buffer size is not provably correct
make[3]: [sound/oss/sb_audio.o] Error 1 (ignored)

Disable it where broken.

Signed-off-by: Guenter Roeck 
---
 include/linux/compiler-gcc4.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 68b162d..842de22 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -13,7 +13,7 @@
 #define __must_check   __attribute__((warn_unused_result))
 #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
 
-#if GCC_VERSION >= 40100
+#if GCC_VERSION >= 40100 && GCC_VERSION < 40600
 # define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
 #endif
 
-- 
1.7.9.7
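
For reference, the version window in the patch can be checked with a small
userspace sketch. It assumes the usual encoding of GCC_VERSION as
major * 10000 + minor * 100 + patchlevel; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Encode a compiler version the way GCC_VERSION is assumed to be built. */
static int gcc_version(int major, int minor, int patchlevel)
{
	return major * 10000 + minor * 100 + patchlevel;
}

/* Mirrors the patched condition: enabled for 4.1.0 up to, not including, 4.6.0. */
static bool compiletime_object_size_enabled(int version)
{
	return version >= 40100 && version < 40600;
}
```

So gcc 4.5.3 still gets __compiletime_object_size, while 4.6.0 and later
do not.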



Re: [PATCH 5/6] tools lib traceevent: Add page_size field to pevent

2013-04-12 Thread Namhyung Kim
2013-04-12 (금), 22:14 -0400, Steven Rostedt:
> On Sat, 2013-04-13 at 11:08 +0900, Namhyung Kim wrote:
> > 2013-04-12 (금), 21:26 -0400, Steven Rostedt:
> 
> > I think it only affects trace-cmd as it's the only user that accesses
> > raw ring-buffer contents for now.  I found it during writing code also
> > accesses the ring buffer (with kbuffer code) - perf ftrace. :)
> > 
> > But the code is in a very early stage and needs to handle so many things
> > before posting to the list.  So I just wanted to post a part of the
> > preparation first.
> 
> I understand. But people tend to not like things in upstream that have no
> users. If Arnaldo wants to include it, that's his decision. I'm just
> worried that if it takes a while before your other work gets mainline,
> people might start sending patches to remove that code and your stuff
> will suddenly break.

Fair enough.

Arnaldo, what do you think?  Do you want me to resend it with a new
description now?

Thanks,
Namhyung




Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27

2013-04-12 Thread Michel Lespinasse
Sorry for the earlier message getting sent before I was done typing it.

On Fri, Apr 12, 2013 at 11:13 AM, Vivek Goyal  wrote:
> Hi,
>
> I am writing some code where I lock down a process's memory at exec() time.
> My patches were working fine until 3.9-rc4 and suddenly things broke down
> in 3.9-rc5.
>
> Whenever I try to exec() a process with memory locked down, my bash
> session hangs and after a while I get the following warning.
>
> login: [  174.669002] INFO: rcu_sched self-detected stall on CPU { 2}  
> (t=6 jiffies g=2580 c=2579 q=1085)
> [  174.669002] Pid: 4894, comm: kexec Not tainted 3.9.0-rc6+ #243
> [  174.669002] Call Trace:
> [  174.669002][] rcu_check_callbacks+0x21a/0x760
> [  174.669002]  [] ? acct_account_cputime+0x1c/0x20
> [  174.669002]  [] update_process_times+0x48/0x80
> [  174.669002]  [] tick_sched_handle+0x3d/0x50
> [  174.669002]  [] tick_sched_timer+0x45/0x70
> [  174.669002]  [] __run_hrtimer+0x81/0x220
> [  174.669002]  [] ? tick_nohz_handler+0xa0/0xa0
> [  174.669002]  [] ? ktime_get_update_offsets+0x4c/0xd0
> [  174.669002]  [] hrtimer_interrupt+0xf7/0x250
> [  174.669002]  [] smp_apic_timer_interrupt+0x69/0x99
> [  174.669002]  [] apic_timer_interrupt+0x6a/0x70
> [  174.669002][] ?  
> __mlock_vma_pages_range+0x57/0x70
> [  174.669002]  [] ? __mlock_vma_pages_range+0x68/0x70
> [  174.669002]  [] __mm_populate+0x71/0x140
> [  174.669002]  [] vm_brk+0x7f/0xa0
> [  174.669002]  [] load_elf_binary+0x1a73/0x1b10
> [  174.669002]  [] ? ima_bprm_check+0x55/0x70
> [  174.669002]  [] search_binary_handler+0x12a/0x3b0
> [  174.669002]  [] ? load_elf_library+0x210/0x210
> [  174.669002]  [] do_execve_common+0x500/0x5c0
> [  174.669002]  [] do_execve+0x37/0x40
> [  174.669002]  [] sys_execve+0x3d/0x60
> [  174.669002]  [] stub_execve+0x69/0xa0
>
> I did a git bisect and bisection says that following is first bad
> commit.
>
> commit 09a9f1d27892255cfb9c91203f19476765e2d8d1
> Author: Michel Lespinasse 
> Date:   Thu Mar 28 16:26:23 2013 -0700
>
> Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace 
> pr
>
> This reverts commit 186930500985 ("mm: introduce VM_POPULATE flag to
> better deal with racy userspace programs").
>
> I reverted above commit and problem gets fixed.
>
> Following is my simple patch to lock down a selected process memory.
>
> Index: linux-2.6/fs/binfmt_elf.c
> ===
> --- linux-2.6.orig/fs/binfmt_elf.c  2013-04-13 01:50:26.380184101
> -0400
> +++ linux-2.6/fs/binfmt_elf.c   2013-04-13 01:50:49.827184821 -0400
> @@ -721,6 +721,10 @@ static int load_elf_binary(struct linux_
>
> /* OK, This is the point of no return */
> current->mm->def_flags = def_flags;
> +   if (!strcmp(bprm->filename, "/sbin/kexec")) {
> +   printk("Memlocking /sbin/kexec\n");
> +   current->mm->def_flags |= VM_LOCKED;
> +   }
>
> /* Do this immediately, since STACK_TOP as used in setup_arg_pages
>may depend on the personality.  */
>
>
> Do you have any thoughts on what's going on. I am wondering if it indicates
> a bigger problem which can then be triggered from other paths too.
>
> Thanks
> Vivek

Based on your patch, it looks like from 3.9-rc1 to 3.9-rc5 your change
wouldn't actually cause pages to get mlocked during exec - for this
range of kernels, mlockall would need to set both VM_LOCKED and
VM_POPULATE. I suspect you would see the same crash if you included
VM_POPULATE in your change, too.

That said, I am not immediately sure what's wrong. It looks like a
deadlock situation; does CONFIG_LOCKDEP help here?

My first guess would be that mmap_sem is held during exec, so you
can't have __mm_populate() try holding it recursively.
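
As a toy model of the flag behaviour described above (not kernel code; the
DEMO_* bit values and would_populate() are hypothetical), population
requires both flags while the VM_POPULATE scheme is in place, and VM_LOCKED
alone once it is reverted:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_VM_LOCKED   0x1u
#define DEMO_VM_POPULATE 0x2u

/*
 * populate_scheme == true models 3.9-rc1 to 3.9-rc5 as described above;
 * false models the behaviour after the revert.
 */
static bool would_populate(unsigned int flags, bool populate_scheme)
{
	const unsigned int both = DEMO_VM_LOCKED | DEMO_VM_POPULATE;

	if (populate_scheme)
		return (flags & both) == both;
	return (flags & DEMO_VM_LOCKED) != 0;
}
```

Under this model, setting only VM_LOCKED in def_flags starts populating
pages during exec after the revert, which is consistent with the hang
first appearing in 3.9-rc5.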

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


Re: [PATCH 5/6] tools lib traceevent: Add page_size field to pevent

2013-04-12 Thread Steven Rostedt
On Sat, 2013-04-13 at 11:08 +0900, Namhyung Kim wrote:
> 2013-04-12 (금), 21:26 -0400, Steven Rostedt:

> I think it only affects trace-cmd as it's the only user that accesses
> raw ring-buffer contents for now.  I found it during writing code also
> accesses the ring buffer (with kbuffer code) - perf ftrace. :)
> 
> But the code is in a very early stage and needs to handle so many things
> before posting to the list.  So I just wanted to post a part of the
> preparation first.

I understand. But people tend to not like things in upstream that have no
users. If Arnaldo wants to include it, that's his decision. I'm just
worried that if it takes a while before your other work gets mainline,
people might start sending patches to remove that code and your stuff
will suddenly break.

-- Steve




Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27

2013-04-12 Thread Michel Lespinasse
On Fri, Apr 12, 2013 at 11:13 AM, Vivek Goyal  wrote:
> Hi,
>
> I am writing some code where I lock down a process's memory at exec() time.
> My patches were working fine until 3.9-rc4 and suddenly things broke down
> in 3.9-rc5.
>
> Whenever I try to exec() a process with memory locked down, my bash
> session hangs and after a while I get the following warning.
>
> login: [  174.669002] INFO: rcu_sched self-detected stall on CPU { 2}  
> (t=6 jiffies g=2580 c=2579 q=1085)
> [  174.669002] Pid: 4894, comm: kexec Not tainted 3.9.0-rc6+ #243
> [  174.669002] Call Trace:
> [  174.669002][] rcu_check_callbacks+0x21a/0x760
> [  174.669002]  [] ? acct_account_cputime+0x1c/0x20
> [  174.669002]  [] update_process_times+0x48/0x80
> [  174.669002]  [] tick_sched_handle+0x3d/0x50
> [  174.669002]  [] tick_sched_timer+0x45/0x70
> [  174.669002]  [] __run_hrtimer+0x81/0x220
> [  174.669002]  [] ? tick_nohz_handler+0xa0/0xa0
> [  174.669002]  [] ? ktime_get_update_offsets+0x4c/0xd0
> [  174.669002]  [] hrtimer_interrupt+0xf7/0x250
> [  174.669002]  [] smp_apic_timer_interrupt+0x69/0x99
> [  174.669002]  [] apic_timer_interrupt+0x6a/0x70
> [  174.669002][] ?  
> __mlock_vma_pages_range+0x57/0x70
> [  174.669002]  [] ? __mlock_vma_pages_range+0x68/0x70
> [  174.669002]  [] __mm_populate+0x71/0x140
> [  174.669002]  [] vm_brk+0x7f/0xa0
> [  174.669002]  [] load_elf_binary+0x1a73/0x1b10
> [  174.669002]  [] ? ima_bprm_check+0x55/0x70
> [  174.669002]  [] search_binary_handler+0x12a/0x3b0
> [  174.669002]  [] ? load_elf_library+0x210/0x210
> [  174.669002]  [] do_execve_common+0x500/0x5c0
> [  174.669002]  [] do_execve+0x37/0x40
> [  174.669002]  [] sys_execve+0x3d/0x60
> [  174.669002]  [] stub_execve+0x69/0xa0
>
> I did a git bisect and bisection says that following is first bad
> commit.
>
> commit 09a9f1d27892255cfb9c91203f19476765e2d8d1
> Author: Michel Lespinasse 
> Date:   Thu Mar 28 16:26:23 2013 -0700
>
> Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace 
> pr
>
> This reverts commit 186930500985 ("mm: introduce VM_POPULATE flag to
> better deal with racy userspace programs").
>
> I reverted above commit and problem gets fixed.
>
> Following is my simple patch to lock down a selected process memory.
>
> Index: linux-2.6/fs/binfmt_elf.c
> ===
> --- linux-2.6.orig/fs/binfmt_elf.c  2013-04-13 01:50:26.380184101
> -0400
> +++ linux-2.6/fs/binfmt_elf.c   2013-04-13 01:50:49.827184821 -0400
> @@ -721,6 +721,10 @@ static int load_elf_binary(struct linux_
>
> /* OK, This is the point of no return */
> current->mm->def_flags = def_flags;
> +   if (!strcmp(bprm->filename, "/sbin/kexec")) {
> +   printk("Memlocking /sbin/kexec\n");
> +   current->mm->def_flags |= VM_LOCKED;
> +   }
>
> /* Do this immediately, since STACK_TOP as used in setup_arg_pages
>may depend on the personality.  */
>
>
> Do you have any thoughts on what's going on. I am wondering if it indicates
> a bigger problem which can then be triggered from other paths too.

Based on your patch, it looks like from 3.9-rc1 to 3.9-rc5 your change
wouldn't actually cause pages to get mlocked during exec - for this
range of kernels, mlockall would need to set both VM_LOCKED and

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


Re: [PATCH 5/6] tools lib traceevent: Add page_size field to pevent

2013-04-12 Thread Namhyung Kim
2013-04-12 (Fri), 21:26 -0400, Steven Rostedt:
> On Sat, 2013-04-13 at 10:17 +0900, Namhyung Kim wrote:
> 
> > > 
> > > I just want to know more about this patch before I ack it.
> > 
> > The page size of traced system can be different than current system's
> > because the recorded data file might be analyzed in a different machine.
> > In this case we should use original page size of traced system when
> > accessing the data file, so this information needs to be saved.
> 
> I understand that, it's just strange that it's not used anywhere in the
> library, or the patch series (that I can find).
> 
> It would make more sense if another patch used this new interface. I'm
> sure new code will, but the patch should be part of a patch series that
> needs it.

I know you knew that already. :)

I think it only affects trace-cmd, as it's the only user that accesses
raw ring-buffer contents for now.  I found it while writing code that also
accesses the ring buffer (with the kbuffer code): perf ftrace. :)

But the code is in a very early stage and needs to handle so many things
before posting to the list.  So I just wanted to post a part of the
preparation first.

Thanks,
Namhyung




[PATCH, resend] sd: fix infinite kernel/udev loop on non-removable Medium Not Present

2013-04-12 Thread Steve Magnani
Commit eface65c336eff420d70beb0fb6787a732e05ffb (2.6.38) altered
set_media_not_present() in a way that prevents the sd driver from
remembering that a non-removable device has reported "Medium Not Present".
This condition can occur on hotplug of, e.g., a USB Mass Storage device
whose medium is offline due to an unrecoverable controller error,
but which is otherwise capable of SCSI communication (to download new 
microcode, etc.).

Under these conditions, the changed code results in an infinite loop
between the kernel and udevd. When udevd attempts to open the device
in response to a change notification, a SCSI "Medium Not Present" error
occurs which causes the kernel to signal another change. The cycle
repeats until the device is unplugged, resulting in udevd consuming
ever-increasing amounts of CPU and virtual memory.

Resolve this by remembering "media not present" whether the device has
declared itself "removable" or not.

Signed-off-by: Steven J. Magnani 
---
--- a/drivers/scsi/sd.c 2013-04-12 14:16:12.252531097 -0500
+++ b/drivers/scsi/sd.c 2013-04-12 14:21:55.197216521 -0500
@@ -1298,10 +1298,8 @@ out:
 
 static void set_media_not_present(struct scsi_disk *sdkp)
 {
-	if (sdkp->media_present)
+	if (sdkp->media_present) {
 		sdkp->device->changed = 1;
-
-	if (sdkp->device->removable) {
 		sdkp->media_present = 0;
 		sdkp->capacity = 0;
 	}
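
The loop described in the changelog can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: struct demo_disk and the demo_* helpers are hypothetical stand-ins, not the sd driver's types, and "udevd" is reduced to "reopen the device after every change event".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the relevant scsi_disk state. */
struct demo_disk {
	bool removable;      /* device declared itself removable */
	bool media_present;  /* driver's cached view of the medium */
	bool changed;        /* set when a change event should be signalled */
};

/* Pre-patch logic: only removable devices forget the medium. */
static void demo_not_present_old(struct demo_disk *d)
{
	if (d->media_present)
		d->changed = true;
	if (d->removable)
		d->media_present = false;
}

/* Patched logic: remember "not present" whether removable or not. */
static void demo_not_present_new(struct demo_disk *d)
{
	if (d->media_present) {
		d->changed = true;
		d->media_present = false;
	}
}

/*
 * Count the change events raised if the device is reopened after
 * every event, for at most max_opens attempts.
 */
static int demo_count_events(void (*not_present)(struct demo_disk *),
			     bool removable, int max_opens)
{
	struct demo_disk d = { .removable = removable, .media_present = true };
	int events = 0;

	assert(max_opens >= 0);
	for (int i = 0; i < max_opens; i++) {
		d.changed = false;
		not_present(&d);   /* each open fails with Medium Not Present */
		if (!d.changed)
			break;     /* no new event, so nobody reopens */
		events++;
	}
	return events;
}
```

In this model the old logic on a non-removable device raises an event on every open and never settles, while the patched logic raises a single event and then stays quiet.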



Re: [PATCH] mmc: debugfs: Add debugfs ability to read CID and CSD

2013-04-12 Thread Steven J. Magnani

On Sat, 2013-04-13 at 01:06 +0200, Johan Rudholm wrote: 
> I believe these registers are already available as sysfs nodes? I
> don't have access to the proper hardware right now, but I'm pretty
> sure you'll find something if you do for instance "find /sys -name
> csd".

Ahh. Thanks for the tip; I hadn't thought to look there since I found
the ext_csd under debugfs.

Steven J. Magnani   "I claim this network for MARS!
www.digidescorp.com  Earthling, return my space modulator!"

#include  




Re: linux-next: failed to fetch the osd tree

2013-04-12 Thread Stephen Rothwell
Hi Boaz,

On Fri, 12 Apr 2013 09:08:37 +0300 Boaz Harrosh  wrote:
>
> That old server has finally crapped out on me.
> It is being replaced by a new machine. It will take few days
> to set up and run.
> 
> So expect failure for the next few days

OK, I will ignore the errors for a while longer.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [RESEND][PATCH 1/3] PM / devfreq: exynos4_bus: Fix missing mutex_unlock if opp_find_freq_floor fails

2013-04-12 Thread Axel Lin
2013/4/13 Rafael J. Wysocki :
> On Friday, April 12, 2013 09:11:00 PM myungjoo.ham wrote:
>> > On Friday, April 12, 2013 11:52:01 AM 함명주 wrote:
>> > > > On Friday, April 12, 2013 01:54:18 PM Axel Lin wrote:
>> > > > > We need to call mutex_unlock() in the error path.
>> > > > >
>> > > > > Signed-off-by: Axel Lin 
>> > > >
>> > > > All three patches applied to linux-pm.git/linux-next.
>> > > >
>> > > > Exynos maintainers, if you have any objections, please holler.
>> > > >
>> > > > Thanks,
>> > > > Rafael
>> > >
>> > > This patch was included in the last pull-request patchset
>> > > though the path was updated. (Its predecessor patch moved
>> > > exynos drivers to /drivers/devfreq/exynos/* after adding
>> > > Exynos common driver files)
>> >
>> > OK, so do you want me to drop it?
>> >
>> > What about the remaining two?
>>
>> Yes, please drop 1/3. It's duplicated.
>>
>> The patches 2~3/3 can wait. They are actually not bugfixes.
>
> OK, I've dropped all three.
>
> Axel, please push [2-3/3] through the Exynos tree.

I thought I had already Cc'd all the devfreq maintainers.
Is there anything else I need to do?

Regards,
Axel


Re: [NEW DRIVER V4 6/7] DA9058 HWMON driver

2013-04-12 Thread Guenter Roeck
On Fri, Apr 12, 2013 at 05:53:48PM +0200, Lars-Peter Clausen wrote:
> On 04/12/2013 03:05 PM, Anthony Olech wrote:
> > This is the HWMON component driver of the Dialog DA9058 PMIC.
> > This driver is just one component of the whole DA9058 PMIC driver.
> > It depends on the CORE and ADC component drivers of the DA9058 MFD.
> > 
> > Signed-off-by: Anthony Olech 
> > Signed-off-by: David Dajun Chen 
> 
> Hi,
> 
> can't you use the generic IIO to HWMON bridge driver? And if not, it's probably
> better to extend the bridge driver than to write this custom driver.
> 
I like that idea.

Guenter


Re: BUG: Fn keys not working on EliteBook 8460p after fabf85e3ca15d5b94058f391dac8df870cdd427a

2013-04-12 Thread Matthew Garrett
On Sat, 2013-04-13 at 03:31 +0200, Pali Rohár wrote:

> All Fn keys, the wifi switch, and the web and mute buttons have stopped
> working on my HP EliteBook 8460p notebook. I bisected it to the git commit
> that broke all of the above keys: fabf85e3ca15d5b94058f391dac8df870cdd427a
> 
> When I reverted that commit and rebooted, the buttons started working
> again. Can you fix it or revert that broken commit? This is a
> critical problem that makes my notebook unusable...

Sure, I'll revert that. Kyle, can you look into figuring out a way to
only run this on machines that need it?

-- 
Matthew Garrett | mj...@srcf.ucam.org


BUG: Fn keys not working on EliteBook 8460p after fabf85e3ca15d5b94058f391dac8df870cdd427a

2013-04-12 Thread Pali Rohár
Hello,

All Fn keys, the wifi switch, and the web and mute buttons have stopped
working on my HP EliteBook 8460p notebook. I bisected it to the git commit
that broke all of the above keys: fabf85e3ca15d5b94058f391dac8df870cdd427a

When I reverted that commit and rebooted, the buttons started working
again. Can you fix it or revert that broken commit? This is a
critical problem that makes my notebook unusable...

-- 
Pali Rohár
pali.ro...@gmail.com




Re: [PATCH 5/6] tools lib traceevent: Add page_size field to pevent

2013-04-12 Thread Arnaldo Carvalho de Melo
On Sat, Apr 13, 2013 at 10:17:26AM +0900, Namhyung Kim wrote:
> > I just want to know more about this patch before I ack it.
> 
> The page size of traced system can be different than current system's
> because the recorded data file might be analyzed in a different machine.
> In this case we should use original page size of traced system when
> accessing the data file, so this information needs to be saved.

Excellent explanation! So good it deserves to be in the commit
changelog!

8-)

- Arnaldo


Re: [PATCH 5/6] tools lib traceevent: Add page_size field to pevent

2013-04-12 Thread Steven Rostedt
On Sat, 2013-04-13 at 10:17 +0900, Namhyung Kim wrote:

> > 
> > I just want to know more about this patch before I ack it.
> 
> The page size of traced system can be different than current system's
> because the recorded data file might be analyzed in a different machine.
> In this case we should use original page size of traced system when
> accessing the data file, so this information needs to be saved.

I understand that, it's just strange that it's not used anywhere in the
library, or the patch series (that I can find).

It would make more sense if another patch used this new interface. I'm
sure new code will, but the patch should be part of a patch series that
needs it.

Thanks,

-- Steve




Re: [PATCH] documentation: hwmon: Fix typo in documentation/hwmon

2013-04-12 Thread Guenter Roeck
On Sat, Apr 13, 2013 at 01:22:11AM +0900, Masanari Iida wrote:
> Correct spelling typo in Documentation/hwmon
> 
> Signed-off-by: Masanari Iida 
> ---
>  Documentation/hwmon/adt7410 | 2 +-
>  Documentation/hwmon/sht15   | 2 +-
>  Documentation/hwmon/zl6100  | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)

Partially applied to -next. adt7410 was already taken care of.

Thanks,
Guenter


Re: Excessive stall times on ext4 in 3.9-rc2

2013-04-12 Thread Dave Chinner
On Fri, Apr 12, 2013 at 11:19:52AM -0400, Theodore Ts'o wrote:
> On Fri, Apr 12, 2013 at 02:50:42PM +1000, Dave Chinner wrote:
> > > If that is the case, one possible solution that comes to mind would be
> > > to mark buffer_heads that contain metadata with a flag, so that the
> > > flusher thread can write them back at the same priority as reads.
> > 
> > Ext4 is already using REQ_META for this purpose.
> 
> We're using REQ_META | REQ_PRIO for reads, not writes.
> 
> > I'm surprised that no-one has suggested "change the IO elevator"
> > yet.
> 
> Well, testing to see if the stalls go away with the noop scheduler is a
> good thing to try just to validate the theory.

Exactly.

> The thing is, we do want to make ext4 work well with cfq, and
> prioritizing non-readahead read requests ahead of data writeback does
> make sense.  The issue is that metadata writes going through
> the block device could in some cases effectively cause a priority
> inversion when what had previously been an asynchronous writeback
> starts blocking a foreground, user-visible process.

Here's the historic problem with CFQ: its scheduling algorithms
change from release to release, and so what you tune the filesystem
to for this release is likely to cause different behaviour
in a few releases' time.

We've had this problem time and time again with CFQ+XFS, so we
stopped trying to "tune" to a particular elevator long ago.  The
best you can do is tag the IO as appropriately as possible (e.g.
metadata with REQ_META, sync IO with ?_SYNC, etc), and then hope CFQ
hasn't been broken since the last release.

> At least, that's the theory; we should confirm that this is indeed
> what is causing the data stalls which Mel is reporting on HDD's before
> we start figuring out how to fix this problem.

*nod*.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH 5/6] tools lib traceevent: Add page_size field to pevent

2013-04-12 Thread Namhyung Kim
Hi Steve,

2013-04-11 (Thu), 10:40 -0400, Steven Rostedt:
> On Thu, 2013-04-11 at 10:39 -0400, Steven Rostedt wrote:
> > On Thu, 2013-04-11 at 21:04 +0900, Namhyung Kim wrote:
> > > From: Namhyung Kim 
> > > 
> > > It's for saving the page size of traced system.
> > 
> > Can you add a bit more detail in the change log about why this patch is
> > necessary.
> 
> For now, you can add my
> 
> Acked-by: Steven Rostedt 
> 
> for all but this patch.

Thanks!

> 
> I just want to know more about this patch before I ack it.

The page size of traced system can be different than current system's
because the recorded data file might be analyzed in a different machine.
In this case we should use original page size of traced system when
accessing the data file, so this information needs to be saved.
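
To make that concrete, here is a rough sketch of why the reader has to trust the recorded value and not the analyzing machine's own page size. The struct and helper names are made up for illustration and are not part of libtraceevent:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a recorded trace data file. */
struct demo_trace_file {
	size_t page_size;  /* page size of the traced system, saved at record time */
	size_t data_len;   /* bytes of ring-buffer data in the file */
};

/* Number of whole ring-buffer pages, using the recorded page size. */
static size_t demo_nr_pages(const struct demo_trace_file *tf)
{
	assert(tf->page_size != 0);
	return tf->data_len / tf->page_size;
}

/* Byte offset of page idx within the ring-buffer data. */
static size_t demo_page_offset(const struct demo_trace_file *tf, size_t idx)
{
	assert(idx < demo_nr_pages(tf));
	return idx * tf->page_size;
}
```

For example, 256 KiB of data recorded on a system with 64 KiB pages is 4 pages; splitting it with a 4 KiB host page size would give 64 chunks whose boundaries fall in the middle of the recorded pages.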

Thanks,
Namhyung




Re: [PATCHSET v2] arch: unify task dump debug info

2013-04-12 Thread rkuo
> On Mon, Apr 08, 2013 at 08:31:07AM -0700, Tejun Heo wrote:
>> Andrew, ping?
>
> Ping #2.  Workqueue conversion of writeback in the block tree needs
> these patches to avoid losing debug information over the conversion,
> so it'd be great if this can be scheduled for 3.10.
>
> Thanks.
>
> --
> tejun
>

Sorry for the late reply; wasn't able to test this until today.

Hexagon could use the same "don't print into stacktrace machinery", but I
can add that to my tree.

So for the Hexagon bits:

Acked-by: Richard Kuo 


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation




Re: [ 026/171 ] sfc: Only use TX push if a single descriptor is to be written

2013-04-12 Thread Steven Rostedt
On Fri, 2013-04-12 at 23:05 +0100, Ben Hutchings wrote:
> On Thu, 2013-04-11 at 22:15 +0100, Ben Hutchings wrote:
> > Aside from #21-26 in this series, and the deadlock fix required on top
> > of #24, there are several more fixes for sfc that I think are suitable
> > for 3.6.11.y.
> > 
> > These commits were cherry-picked for 3.4.38 and can also be
> > cherry-picked cleanly on top of 3.6.11.1 plus the 7 patches you already
> > have:
> > 
> > d5e8cc6c946e sfc: Really disable flow control while flushing
> > bfeed902946a sfc: Convert firmware subtypes to native byte order in efx_mcdi_get_board_cfg()
> > 9724a8504c87 sfc: Add parentheses around use of bitfield macro arguments
> > 0a6e5008a9df sfc: Fix MCDI structure field lookup
> > 450783747f42 sfc: Avoid generating over-length MC_CMD_FLUSH_RX_QUEUES request
> > 525d9e824018 sfc: Work-around flush timeout when flushes have completed
> > ef492f11efed sfc: Correctly initialise reset_method in siena_test_chip()
> > ebf98e797b4e sfc: Fix timekeeping in efx_mcdi_poll()
> > 
> > Please let me know whether you're prepared to include these in the
> > current update.  I can then run some automated tests with your selected
> > set of patches applied.
> 
> The test suite found a regression which I'd forgotten about.  It
> was introduced in 3.6 by commit b7f514af7d6f 'sfc: Fix interface
> statistics running backward' and fixed in 3.8 by commit 876be083b669
> 'sfc: Reset driver's MAC stats after MC reboot seen'.
> 
> That latter fix is, again, a clean cherry-pick onto 3.6.y.  I don't
> think I'm going to be able to re-test with this but it's sufficiently
> low-risk that I'd be happy for you to add it anyway.

Thanks!

I included it, and will run some simple tests. If everything works, I'll
just keep it without another spamming of the mailing lists.

I won't post until after my 3.6.11.2-rt tests pass.

-- Steve




[PATCH 1/2] tracing: Fix possible NULL pointer dereferences

2013-04-12 Thread Steven Rostedt
From: Namhyung Kim 

Currently set_ftrace_pid and set_graph_function files use seq_lseek
for their fops.  However, seq_open() is called only for FMODE_READ in
the fops->open(), so if a user tries to seek one of those files
after opening it for writing, it sees a NULL seq_file and then panics.

It can be easily reproduced with the following command:

  $ cd /sys/kernel/debug/tracing
  $ echo 1234 | sudo tee -a set_ftrace_pid

In this example, GNU coreutils' tee opens the file with fopen(, "a")
and then the fopen() internally calls lseek().
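
The failure mode and the guard can be modelled in user space. The following is only a sketch: the demo_* types and DEMO_FMODE_* flags are hypothetical stand-ins for struct file, struct seq_file and FMODE_READ, and the assert marks the spot where the kernel would dereference NULL.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FMODE_READ  0x1u
#define DEMO_FMODE_WRITE 0x2u

struct demo_seq { long pos; };

struct demo_file {
	unsigned int mode;
	struct demo_seq *private_data; /* only set up when opened for reading */
	long f_pos;
};

/* Stand-in for seq_lseek(): uses private_data unconditionally. */
static long demo_seq_lseek(struct demo_file *f, long offset)
{
	assert(f->private_data != NULL); /* the kernel would oops here */
	f->private_data->pos = offset;
	f->f_pos = offset;
	return offset;
}

/* Guarded lseek, in the spirit of ftrace_filter_lseek(). */
static long demo_filter_lseek(struct demo_file *f, long offset)
{
	long ret;

	if (f->mode & DEMO_FMODE_READ)
		ret = demo_seq_lseek(f, offset);
	else
		f->f_pos = ret = 1; /* write-only: no seq_file to consult */

	return ret;
}
```

A write-only open never reaches the seq_file path in this model, which mirrors what the patch does for the files listed above.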

Link: http://lkml.kernel.org/r/1365663302-2170-1-git-send-email-namhy...@kernel.org

Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Cc: Namhyung Kim 
Cc: sta...@vger.kernel.org
Signed-off-by: Namhyung Kim 
Signed-off-by: Steven Rostedt 
---
 include/linux/ftrace.h |2 +-
 kernel/trace/ftrace.c  |   10 +-
 kernel/trace/trace_stack.c |2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 167abf9..eb3ce32 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -396,7 +396,7 @@ ssize_t ftrace_filter_write(struct file *file, const char __user *ubuf,
size_t cnt, loff_t *ppos);
 ssize_t ftrace_notrace_write(struct file *file, const char __user *ubuf,
 size_t cnt, loff_t *ppos);
-loff_t ftrace_regex_lseek(struct file *file, loff_t offset, int whence);
+loff_t ftrace_filter_lseek(struct file *file, loff_t offset, int whence);
 int ftrace_regex_release(struct inode *inode, struct file *file);
 
 void __init
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 926ebfb..affc35d 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2697,7 +2697,7 @@ ftrace_notrace_open(struct inode *inode, struct file *file)
 }
 
 loff_t
-ftrace_regex_lseek(struct file *file, loff_t offset, int whence)
+ftrace_filter_lseek(struct file *file, loff_t offset, int whence)
 {
loff_t ret;
 
@@ -3570,7 +3570,7 @@ static const struct file_operations ftrace_filter_fops = {
.open = ftrace_filter_open,
.read = seq_read,
.write = ftrace_filter_write,
-   .llseek = ftrace_regex_lseek,
+   .llseek = ftrace_filter_lseek,
.release = ftrace_regex_release,
 };
 
@@ -3578,7 +3578,7 @@ static const struct file_operations ftrace_notrace_fops = {
.open = ftrace_notrace_open,
.read = seq_read,
.write = ftrace_notrace_write,
-   .llseek = ftrace_regex_lseek,
+   .llseek = ftrace_filter_lseek,
.release = ftrace_regex_release,
 };
 
@@ -3783,8 +3783,8 @@ static const struct file_operations ftrace_graph_fops = {
.open   = ftrace_graph_open,
.read   = seq_read,
.write  = ftrace_graph_write,
+   .llseek = ftrace_filter_lseek,
.release= ftrace_graph_release,
-   .llseek = seq_lseek,
 };
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
 
@@ -4439,7 +4439,7 @@ static const struct file_operations ftrace_pid_fops = {
.open   = ftrace_pid_open,
.write  = ftrace_pid_write,
.read   = seq_read,
-   .llseek = seq_lseek,
+   .llseek = ftrace_filter_lseek,
.release= ftrace_pid_release,
 };
 
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 42ca822..83a8b5b 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -322,7 +322,7 @@ static const struct file_operations stack_trace_filter_fops = {
.open = stack_trace_filter_open,
.read = seq_read,
.write = ftrace_filter_write,
-   .llseek = ftrace_regex_lseek,
+   .llseek = ftrace_filter_lseek,
.release = ftrace_regex_release,
 };
 
-- 
1.7.10.4






[PATCH 2/2] ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section

2013-04-12 Thread Steven Rostedt
From: "Steven Rostedt (Red Hat)" 

As ftrace_filter_lseek is now used with ftrace_pid_fops, it needs to
be moved out of the #ifdef CONFIG_DYNAMIC_FTRACE section as the
ftrace_pid_fops is defined when DYNAMIC_FTRACE is not.

Cc: sta...@vger.kernel.org
Cc: Namhyung Kim 
Signed-off-by: Steven Rostedt 
---
 include/linux/ftrace.h |3 ++-
 kernel/trace/ftrace.c  |   28 ++--
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index eb3ce32..52da2a2 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -396,7 +396,6 @@ ssize_t ftrace_filter_write(struct file *file, const char __user *ubuf,
size_t cnt, loff_t *ppos);
 ssize_t ftrace_notrace_write(struct file *file, const char __user *ubuf,
 size_t cnt, loff_t *ppos);
-loff_t ftrace_filter_lseek(struct file *file, loff_t offset, int whence);
 int ftrace_regex_release(struct inode *inode, struct file *file);
 
 void __init
@@ -569,6 +568,8 @@ static inline int
 ftrace_regex_release(struct inode *inode, struct file *file) { return -ENODEV; }
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
+loff_t ftrace_filter_lseek(struct file *file, loff_t offset, int whence);
+
 /* totally disable ftrace - can not re-enable after this */
 void ftrace_kill(void);
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index affc35d..2461ede 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1052,6 +1052,19 @@ static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
 
 static struct pid * const ftrace_swapper_pid = &init_struct_pid;
 
+loff_t
+ftrace_filter_lseek(struct file *file, loff_t offset, int whence)
+{
+	loff_t ret;
+
+	if (file->f_mode & FMODE_READ)
+		ret = seq_lseek(file, offset, whence);
+	else
+		file->f_pos = ret = 1;
+
+	return ret;
+}
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 
 #ifndef CONFIG_FTRACE_MCOUNT_RECORD
@@ -2612,7 +2625,7 @@ static void ftrace_filter_reset(struct ftrace_hash *hash)
  * routine, you can use ftrace_filter_write() for the write
  * routine if @flag has FTRACE_ITER_FILTER set, or
  * ftrace_notrace_write() if @flag has FTRACE_ITER_NOTRACE set.
- * ftrace_regex_lseek() should be used as the lseek routine, and
+ * ftrace_filter_lseek() should be used as the lseek routine, and
  * release must call ftrace_regex_release().
  */
 int
@@ -2696,19 +2709,6 @@ ftrace_notrace_open(struct inode *inode, struct file *file)
 inode, file);
 }
 
-loff_t
-ftrace_filter_lseek(struct file *file, loff_t offset, int whence)
-{
-	loff_t ret;
-
-	if (file->f_mode & FMODE_READ)
-		ret = seq_lseek(file, offset, whence);
-	else
-		file->f_pos = ret = 1;
-
-	return ret;
-}
-
 static int ftrace_match(char *str, char *regex, int len, int type)
 {
int matched = 0;
-- 
1.7.10.4






[PATCH 0/2] [GIT PULL][v3.9-rc7] tracing: Another fix by Namhyung

2013-04-12 Thread Steven Rostedt

Linus,

Namhyung found and fixed another nasty bug, where you can crash the
kernel with: echo 1234 | tee -a /sys/kernel/debug/tracing/set_ftrace_pid

Luckily, only root has permissions to write to that file.

I also added a fix on top of Namhyung's as his patch added a reference
outside of the DYNAMIC_FTRACE to a function that is only defined
in DYNAMIC_FTRACE. This fixes compiling with FUNCTION_TRACING and
without DYNAMIC_FTRACE (although I don't know who does that anymore).

-- Steve

Please pull the latest trace-fixes-v3.9-rc-v3 tree, which can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-fixes-v3.9-rc-v3

Head SHA1: 0e1bb617b40659414778baf3203c2ea0dcda1ca7


Namhyung Kim (1):
  tracing: Fix possible NULL pointer dereferences

Steven Rostedt (Red Hat) (1):
  ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section


 include/linux/ftrace.h |3 ++-
 kernel/trace/ftrace.c  |   36 ++--
 kernel/trace/trace_stack.c |2 +-
 3 files changed, 21 insertions(+), 20 deletions(-)




Re: [PATCH v2] tracepoints: prevents null probe from being added

2013-04-12 Thread Steven Rostedt
On Thu, 2013-03-21 at 14:34 +0900, kpark3...@gmail.com wrote:
> From: Sahara 
> 
> Somehow tracepoint_entry_add_probe function allows a null probe function.
> And, this may lead to unexpected result since the number of probe
> functions in an entry can be counted by checking whether probe is null
> or not in for-loop.
> This patch prevents the null probe from being added.
> In tracepoint_entry_remove_probe function, checking probe parameter
> within for-loop is moved out for code efficiency leaving the null probe
> feature which removes all probe functions in the entry.
> 
> Signed-off-by: Sahara 
> Reviewed-by: Steven Rostedt 
> Reviewed-by: Mathieu Desnoyers 
> ---
>  kernel/tracepoint.c |   18 ++
>  1 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 0c05a45..7d69348 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -112,7 +112,8 @@ tracepoint_entry_add_probe(struct tracepoint_entry *entry,
>   int nr_probes = 0;
>   struct tracepoint_func *old, *new;
>  
> - WARN_ON(!probe);
> + if (WARN_ON(!probe))
> + return ERR_PTR(-EINVAL);
>  
>   debug_print_probes(entry);
>   old = entry->funcs;
> @@ -152,13 +153,15 @@ tracepoint_entry_remove_probe(struct tracepoint_entry *entry,
>  
>   debug_print_probes(entry);
>   /* (N -> M), (N > 1, M >= 0) probes */
> - for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
> - if (!probe ||
> - (old[nr_probes].func == probe &&
> -  old[nr_probes].data == data))
> - nr_del++;
> + if (probe) {
> + for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
> + if (old[nr_probes].func == probe &&
> +  old[nr_probes].data == data)
> + nr_del++;
> + }
>   }
>  
> + /* If probe is NULL, all funcs in the entry will be removed. */

OK, I first thought this was a bug as nr_del would be zero and not match
nr_probes, but then I realized that nr_probes would also be zero. Can
you update the above comment to say something like:

/*
 * If probe is NULL, then nr_probes = nr_del = 0, and then the
 * entire entry will be removed.
 */
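
For reference, the counting can be checked with a stand-alone sketch of the patched loop. The demo_* names are hypothetical, and this models only the counters, not the array reallocation:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_probe_fn)(void);

struct demo_func {
	demo_probe_fn func;  /* NULL terminates the array */
	void *data;
};

/*
 * Returns how many entries would remain after removal, i.e.
 * nr_probes - nr_del, following the patched counting loop.
 */
static int demo_remaining(const struct demo_func *old,
			  demo_probe_fn probe, void *data)
{
	int nr_probes = 0, nr_del = 0;

	assert(old != NULL);
	if (probe) {
		for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
			if (old[nr_probes].func == probe &&
			    old[nr_probes].data == data)
				nr_del++;
		}
	}
	/* With probe == NULL both counters stay 0, so the result is 0. */
	return nr_probes - nr_del;
}

/* Two distinct dummy probes for exercising the helper. */
static void demo_a(void) { }
static void demo_b(void) { }
```

With a NULL probe the loop is skipped, both counters stay at zero, and the "nr_probes - nr_del == 0" branch is the one that removes everything.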

Thanks,

-- Steve

>   if (nr_probes - nr_del == 0) {
>   /* N -> 0, (N > 1) */
>   entry->funcs = NULL;
> @@ -173,8 +176,7 @@ tracepoint_entry_remove_probe(struct tracepoint_entry *entry,
>   if (new == NULL)
>   return ERR_PTR(-ENOMEM);
>   for (i = 0; old[i].func; i++)
> - if (probe &&
> - (old[i].func != probe || old[i].data != data))
> + if (old[i].func != probe || old[i].data != data)
>   new[j++] = old[i];
>   new[nr_probes - nr_del].func = NULL;
>   entry->refcount = nr_probes - nr_del;




Re: [PATCHv2] ARM: arch_timer: Silence debug preempt warnings

2013-04-12 Thread Stephen Boyd
On 04/06/13 03:41, Marc Zyngier wrote:
> On Fri,  5 Apr 2013 13:57:29 -0700, Stephen Boyd wrote:
>> Hot-plugging with CONFIG_DEBUG_PREEMPT=y on a device with arm
>> architected timers causes a slew of "using smp_processor_id() in
>> preemptible" warnings:
>>
>>   BUG: using smp_processor_id() in preemptible [] code: sh/111
>>   caller is arch_timer_cpu_notify+0x14/0xc8
>>
>> This happens because sometimes the cpu notifier,
>> arch_timer_cpu_notify(), is called in preemptible context and
>> other times in non-preemptible context but we use this_cpu_ptr()
>> to retrieve the clockevent in all cases. We're only going to
>> actually use the pointer in non-preemptible context though, so
>> push the this_cpu_ptr() access down into the cases to force the
>> checks to occur only in non-preemptible contexts.
>>
>> Cc: Mark Rutland 
>> Cc: Marc Zyngier 
>> Signed-off-by: Stephen Boyd 
>> ---
>>
>> Changes since v1:
>>  * Pushed down this_cpu_ptr and added a comment
>>
>>  drivers/clocksource/arm_arch_timer.c | 10 ++
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
>> index d7ad425..a65a710 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -248,14 +248,16 @@ static void __cpuinit arch_timer_stop(struct clock_event_device *clk)
>>  static int __cpuinit arch_timer_cpu_notify(struct notifier_block *self,
>> unsigned long action, void *hcpu)
>>  {
>> -struct clock_event_device *evt = this_cpu_ptr(arch_timer_evt);
>> -
>> +/*
>> + * Grab cpu pointer in each case to avoid spurious
>> + * preemptible warnings
>> + */
>>  switch (action & ~CPU_TASKS_FROZEN) {
>>  case CPU_STARTING:
>> -arch_timer_setup(evt);
>> +arch_timer_setup(this_cpu_ptr(arch_timer_evt));
>>  break;
>>  case CPU_DYING:
>> -arch_timer_stop(evt);
>> +arch_timer_stop(this_cpu_ptr(arch_timer_evt));
>>  break;
>>  }
> Looks good to me.
>
> Acked-by: Marc Zyngier 
>

Thanks Marc.

Thomas/John, can you pick this up for 3.10?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



Re: [PATCH] module: Fix race condition between load and unload module

2013-04-12 Thread Anatol Pomozov
I ran the test case for ~30 minutes with no crash. Before the patch it
took ~10 seconds for me to reproduce the crash.

The only harmless warning I see is


[ 1553.658421] [ cut here ]
[ 1553.663211] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xbb/0xe0()
[ 1553.669571] Hardware name: MCP55
[ 1553.672834] sysfs: cannot create duplicate filename '/module/loop'
[ 1553.679035] Modules linked in: loop(+) sata_mv acpi_cpufreq mperf
freq_table processor msr cpuid genrtc bnx2x libcrc32c mdio ipv6 [last
unloaded: loop]
[ 1553.692983] Pid: 25935, comm: modprobe Tainted: GW
3.9.0-dbg-DEV #1
[ 1553.700221] Call Trace:
[ 1553.702699]  [] warn_slowpath_common+0x7f/0xc0
[ 1553.708724]  [] warn_slowpath_fmt+0x46/0x50
[ 1553.714495]  [] ? strlcat+0x60/0x80
[ 1553.719567]  [] sysfs_add_one+0xbb/0xe0
[ 1553.724988]  [] create_dir+0x7f/0xd0
[ 1553.730150]  [] sysfs_create_dir+0x89/0xe0
[ 1553.735831]  [] kobject_add_internal+0x9d/0x2a0
[ 1553.741945]  [] kobject_init_and_add+0x63/0x90
[ 1553.747973]  [] ? kset_find_obj+0x64/0x90
[ 1553.753571]  [] load_module+0xc2a/0x15b0
[ 1553.759077]  [] ? ddebug_proc_write+0x110/0x110
[ 1553.765191]  [] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 1553.771652]  [] sys_init_module+0xf7/0x140
[ 1553.777335]  [] cstar_dispatch+0x7/0x1f
[ 1553.782753] ---[ end trace a5e2ab42bb81f1b3 ]---
[ 1553.787405] [ cut here ]
[ 1553.792041] WARNING: at lib/kobject.c:196 kobject_add_internal+0x234/0x2a0()
[ 1553.799087] Hardware name: MCP55
[ 1553.802316] kobject_add_internal failed for loop with -EEXIST,
don't try to register things with the same name in the same directory.
[ 1553.814308] Modules linked in: loop(+) sata_mv acpi_cpufreq mperf
freq_table processor msr cpuid genrtc bnx2x libcrc32c mdio ipv6 [last
unloaded: loop]
[ 1553.828122] Pid: 25935, comm: modprobe Tainted: GW
3.9.0-dbg-DEV #1
[ 1553.835342] Call Trace:
[ 1553.837799]  [] warn_slowpath_common+0x7f/0xc0
[ 1553.843801]  [] warn_slowpath_fmt+0x46/0x50
[ 1553.849548]  [] kobject_add_internal+0x234/0x2a0
[ 1553.855731]  [] kobject_init_and_add+0x63/0x90
[ 1553.861750]  [] ? kset_find_obj+0x64/0x90
[ 1553.867329]  [] load_module+0xc2a/0x15b0
[ 1553.872821]  [] ? ddebug_proc_write+0x110/0x110
[ 1553.878938]  [] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 1553.885382]  [] sys_init_module+0xf7/0x140
[ 1553.891037]  [] cstar_dispatch+0x7/0x1f


That happens because the kobject is in the kobject_put() path and its
refcount is already 0, but its sysfs entry has not been cleaned up yet.
kset_find_obj() returns NULL (as per Linus's patch), so
kobject_init_and_add() tries to add a sysfs entry with an existing name
and fails.
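
To illustrate the lookup rule discussed here, a minimal userspace sketch follows. It is not the kernel code: struct obj, obj_get_unless_zero() and find_obj() are hypothetical names, C11 atomics stand in for the kernel's atomic operations, and the list lock is left out for brevity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kobject: a name plus a reference count. */
struct obj {
	const char *name;
	atomic_int refcount;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old;

	assert(o != NULL);
	old = atomic_load(&o->refcount);
	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Look up an object by name. An object whose count is already zero is
 * being torn down, so it is reported as absent instead of being revived.
 */
static struct obj *find_obj(struct obj *list, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(list[i].name, name) == 0)
			return obj_get_unless_zero(&list[i]) ? &list[i] : NULL;
	}
	return NULL;
}
```

With this rule a caller gets either NULL or an object it holds a reference on. A NULL result can still mean "the name is taken but going away", which is the situation behind the duplicate-filename warning above.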

On Fri, Apr 12, 2013 at 5:11 PM, Linus Torvalds
 wrote:
> On Fri, Apr 12, 2013 at 4:53 PM, Greg Kroah-Hartman
>  wrote:
>>
>> Linus, I think your patch will reduce the window the race could happen,
>> but it should still be there, although testing with it would be
>> interesting to see if the original problem can be triggered with it.
>
> Well, with my patch, there's no way you'll ever look up an object with
> a zero refcount, so you'll never release it twice. The atomic
> operations (atomic_inc_nonzero()) do guarantee that.
>
> The "kset->list_lock" means that the list traversal is safe too.
>
> So one particular race is definitely gone.
>
> Now, what people who call "kset_find_obj()" really expect when not
> locked against the last kobject_put(), I don't know. But at least it's
> conceptually safe now. They'll either get NULL (either because the
> object doesn't exist on the list, or because it does exist but is
> about to be removed), or they will get a valid object that has *not*
> started to be torn down yet.
>
>  Linus


[PATCH PART3 v3 5/6] staging: ramster/debug: Add CONFIG_RAMSTER_DEBUG Kconfig entry

2013-04-12 Thread Wanpeng Li
Add CONFIG_RAMSTER_DEBUG Kconfig entry.

Acked-by: Dan Magenheimer 
Signed-off-by: Wanpeng Li 
---
 drivers/staging/zcache/Kconfig |8 
 drivers/staging/zcache/Makefile|2 +-
 drivers/staging/zcache/ramster/debug.h |2 +-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zcache/Kconfig b/drivers/staging/zcache/Kconfig
index c3b8a10..05e87a1 100644
--- a/drivers/staging/zcache/Kconfig
+++ b/drivers/staging/zcache/Kconfig
@@ -33,6 +33,14 @@ config RAMSTER
  zcache2, compresses swap pages into local RAM, but then remotifies
  the compressed pages to another node in the RAMster cluster.
 
+config RAMSTER_DEBUG
+bool "Enable ramster debug statistics"
+depends on DEBUG_FS && RAMSTER
+default n
+help
+  This is used to provide a debugfs directory with counters of
+  how ramster is doing. You probably want to set this to 'N'.
+
 # Depends on not-yet-upstreamed mm patches to export end_swap_bio_write and
 # __add_to_swap_cache, and implement __swap_writepage (which is swap_writepage
 # without the frontswap call. When these are in-tree, the dependency on
diff --git a/drivers/staging/zcache/Makefile b/drivers/staging/zcache/Makefile
index 4956fa0..845a5c2 100644
--- a/drivers/staging/zcache/Makefile
+++ b/drivers/staging/zcache/Makefile
@@ -1,6 +1,6 @@
 zcache-y   :=  zcache-main.o tmem.o zbud.o
 zcache-$(CONFIG_ZCACHE_DEBUG) += debug.o
-zcache-$(CONFIG_RAMSTER) += ramster/debug.o
+zcache-$(CONFIG_RAMSTER_DEBUG) += ramster/debug.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/ramster.o ramster/r2net.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/nodemanager.o ramster/tcp.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/heartbeat.o ramster/masklog.o
diff --git a/drivers/staging/zcache/ramster/debug.h b/drivers/staging/zcache/ramster/debug.h
index 7b2deaa..7f80dd4 100644
--- a/drivers/staging/zcache/ramster/debug.h
+++ b/drivers/staging/zcache/ramster/debug.h
@@ -1,4 +1,4 @@
-#ifdef CONFIG_RAMSTER
+#ifdef CONFIG_RAMSTER_DEBUG
 
 extern long ramster_flnodes;
 static atomic_t ramster_flnodes_atomic = ATOMIC_INIT(0);
-- 
1.7.10.4



[PATCH PART2 v2 7/7] staging: ramster: add how-to for ramster

2013-04-12 Thread Wanpeng Li
Add how-to for ramster.

Acked-by: Dan Magenheimer 
Signed-off-by: Dan Magenheimer 
Signed-off-by: Wanpeng Li 
---
 drivers/staging/zcache/ramster/HOWTO.txt |  257 ++
 1 file changed, 257 insertions(+)
 create mode 100644 drivers/staging/zcache/ramster/HOWTO.txt

diff --git a/drivers/staging/zcache/ramster/HOWTO.txt b/drivers/staging/zcache/ramster/HOWTO.txt
new file mode 100644
index 000..a4ee979
--- /dev/null
+++ b/drivers/staging/zcache/ramster/HOWTO.txt
@@ -0,0 +1,257 @@
+Version: 130309
+ Dan Magenheimer 
+
+This is a how-to document for RAMster.  It applies to the March 9, 2013
+version of RAMster, re-merged with the new zcache codebase, built and tested
+on the 3.9 tree and submitted for the staging tree for 3.9.
+
+Note that this document was created from notes taken earlier.  I would
+appreciate any feedback from anyone who follows the process as described
+to confirm that it works and to clarify any possible misunderstandings,
+or to report problems.
+
+A. PRELIMINARY
+
+1) Install two or more Linux systems that are known to work when upgraded
+   to a recent upstream Linux kernel version (e.g. v3.9).  I used Oracle
+   Linux 6 ("OL6") on two Dell Optiplex 790s.  Note that it should be possible
+   to use ocfs2 as a filesystem on your systems but this hasn't been
+   tested thoroughly, so if you do use ocfs2 and run into problems, please
+   report them.  Up to eight nodes should work, but not much testing has
+   been done with more than three nodes.
+
+On each system:
+
+2) Configure, build and install then boot Linux (e.g. 3.9), just to ensure it
+   can be done with an unmodified upstream kernel.  Confirm you booted
+   the upstream kernel with "uname -a".
+
+3) Install ramster-tools.  The src.rpm and an OL6 rpm are available
+   in this directory.  I'm not very good at userspace stuff and
+   would welcome any help in turning ramster-tools into more
+   distributable rpms/debs for a wider range of distros.
+
+B. BUILDING RAMSTER INTO THE KERNEL
+
+Do the following on each system:
+
+1) Ensure you have the new codebase for drivers/staging/zcache in your source.
+
+2) Change your .config to have:
+
+   CONFIG_CLEANCACHE=y
+   CONFIG_FRONTSWAP=y
+   CONFIG_STAGING=y
+   CONFIG_ZCACHE=y
+   CONFIG_RAMSTER=y
+
+   You may have to reconfigure your kernel multiple times to ensure
+   all of these are set properly.  I use:
+
+   # yes "" | make oldconfig
+
+   and then manually check the .config file to ensure my selections
+   have "taken".
+
+   Do not bother to build the kernel until you are certain all of
+   the above config selections will stick for the build.
+
+3) Build this kernel and "make install" so that you have a new kernel
+   in /etc/grub.conf
+
+4) Add "ramster" to the kernel boot line in /etc/grub.conf.
+
+5) Reboot and check dmesg to ensure there are some messages from ramster
+   and that "ramster_enabled=1" appears.
+
+   # dmesg | grep ramster
+
+   You should also see a lot of files in:
+
+   # ls /sys/kernel/debug/zcache
+   # ls /sys/kernel/debug/ramster
+
+   and a few files in:
+
+   # ls /sys/kernel/mm/ramster
+
+   RAMster now will act as a single-system zcache but doesn't yet
+   know anything about the cluster so can't do anything remotely.
+
+C. BUILDING THE RAMSTER CLUSTER
+
+This is the error prone part unless you are a clustering expert.  We need
+to describe the cluster in /etc/ramster.conf file and the init scripts
+that parse it are extremely picky about the syntax.
+
+1) Create the /etc/ramster.conf file and ensure it is identical
+   on both systems.  There is a good amount of similar documentation
+   for ocfs2 /etc/cluster.conf that can be googled for this, but I use:
+
+   cluster:
+   name = ramster
+   node_count = 2
+   node:
+   name = system1
+   cluster = ramster
+   number = 0
+   ip_address = my.ip.ad.r1
+   ip_port = 
+   node:
+   name = system2
+   cluster = ramster
+   number = 0
+   ip_address = my.ip.ad.r2
+   ip_port = 
+
+   You must ensure that the "name" field in the file exactly matches
+   the output of "hostname" on each system.  The following assumes
+   you use "ramster" as the name of your cluster.
+
+2) Enable the ramster service and configure it:
+
+   # chkconfig --add ramster
+   # service ramster configure
+
+   Set "load on boot" to "y", cluster to start is "ramster" (or whatever
+   name you chose in ramster.conf), heartbeat dead threshold as "500",
+   network idle timeout as "100".  Leave the others as default.
+
+4) Reboot.  After reboot, try:
+
+   # service ramster status
+
+   You should see "Checking ramster cluster ramster: Online".  If you do
+   not, something is wrong and RAMster will not work.  Note that you
+   should also see that the driver for "configfs" is 

[PATCH PART3 v3 6/6] staging: zcache/debug: fix coding style

2013-04-12 Thread Wanpeng Li
Fix coding style issues: "ERROR: space prohibited before that '++' (ctx:WxO)"
and lines over 80 characters.

Acked-by: Dan Magenheimer 
Signed-off-by: Wanpeng Li 
---
 drivers/staging/zcache/debug.h |   95 
 1 file changed, 76 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/zcache/debug.h b/drivers/staging/zcache/debug.h
index ddad92f..8088d28 100644
--- a/drivers/staging/zcache/debug.h
+++ b/drivers/staging/zcache/debug.h
@@ -174,26 +174,83 @@ extern ssize_t zcache_writtenback_pages;
 extern ssize_t zcache_outstanding_writeback_pages;
 #endif
 
-static inline void inc_zcache_flush_total(void) { zcache_flush_total ++; };
-static inline void inc_zcache_flush_found(void) { zcache_flush_found ++; };
-static inline void inc_zcache_flobj_total(void) { zcache_flobj_total ++; };
-static inline void inc_zcache_flobj_found(void) { zcache_flobj_found ++; };
-static inline void inc_zcache_failed_eph_puts(void) { zcache_failed_eph_puts ++; };
-static inline void inc_zcache_failed_pers_puts(void) { zcache_failed_pers_puts ++; };
-static inline void inc_zcache_failed_getfreepages(void) { zcache_failed_getfreepages ++; };
-static inline void inc_zcache_failed_alloc(void) { zcache_failed_alloc ++; };
-static inline void inc_zcache_put_to_flush(void) { zcache_put_to_flush ++; };
-static inline void inc_zcache_compress_poor(void) { zcache_compress_poor ++; };
-static inline void inc_zcache_mean_compress_poor(void) { zcache_mean_compress_poor ++; };
-static inline void inc_zcache_eph_ate_tail(void) { zcache_eph_ate_tail ++; };
-static inline void inc_zcache_eph_ate_tail_failed(void) { zcache_eph_ate_tail_failed ++; };
-static inline void inc_zcache_pers_ate_eph(void) { zcache_pers_ate_eph ++; };
-static inline void inc_zcache_pers_ate_eph_failed(void) { zcache_pers_ate_eph_failed ++; };
-static inline void inc_zcache_evicted_eph_zpages(unsigned zpages) { zcache_evicted_eph_zpages += zpages; };
-static inline void inc_zcache_evicted_eph_pageframes(void) { zcache_evicted_eph_pageframes ++; };
+static inline void inc_zcache_flush_total(void)
+{
+   zcache_flush_total++;
+};
+static inline void inc_zcache_flush_found(void)
+{
+   zcache_flush_found++;
+};
+static inline void inc_zcache_flobj_total(void)
+{
+   zcache_flobj_total++;
+};
+static inline void inc_zcache_flobj_found(void)
+{
+   zcache_flobj_found++;
+};
+static inline void inc_zcache_failed_eph_puts(void)
+{
+   zcache_failed_eph_puts++;
+};
+static inline void inc_zcache_failed_pers_puts(void)
+{
+   zcache_failed_pers_puts++;
+};
+static inline void inc_zcache_failed_getfreepages(void)
+{
+   zcache_failed_getfreepages++;
+};
+static inline void inc_zcache_failed_alloc(void)
+{
+   zcache_failed_alloc++;
+};
+static inline void inc_zcache_put_to_flush(void)
+{
+   zcache_put_to_flush++;
+};
+static inline void inc_zcache_compress_poor(void)
+{
+   zcache_compress_poor++;
+};
+static inline void inc_zcache_mean_compress_poor(void)
+{
+   zcache_mean_compress_poor++;
+};
+static inline void inc_zcache_eph_ate_tail(void)
+{
+   zcache_eph_ate_tail++;
+};
+static inline void inc_zcache_eph_ate_tail_failed(void)
+{
+   zcache_eph_ate_tail_failed++;
+};
+static inline void inc_zcache_pers_ate_eph(void)
+{
+   zcache_pers_ate_eph++;
+};
+static inline void inc_zcache_pers_ate_eph_failed(void)
+{
+   zcache_pers_ate_eph_failed++;
+};
+static inline void inc_zcache_evicted_eph_zpages(unsigned zpages)
+{
+   zcache_evicted_eph_zpages += zpages;
+};
+static inline void inc_zcache_evicted_eph_pageframes(void)
+{
+   zcache_evicted_eph_pageframes++;
+};
 
-static inline void inc_zcache_eph_nonactive_puts_ignored(void) { zcache_eph_nonactive_puts_ignored ++; };
-static inline void inc_zcache_pers_nonactive_puts_ignored(void) { zcache_pers_nonactive_puts_ignored ++; };
+static inline void inc_zcache_eph_nonactive_puts_ignored(void)
+{
+   zcache_eph_nonactive_puts_ignored++;
+};
+static inline void inc_zcache_pers_nonactive_puts_ignored(void)
+{
+   zcache_pers_nonactive_puts_ignored++;
+};
 
 int zcache_debugfs_init(void);
 #else
-- 
1.7.10.4



[PATCH PART3 v3 2/6] staging: ramster: Move debugfs code out of ramster.c file

2013-04-12 Thread Wanpeng Li
Note that at this point there is no CONFIG_RAMSTER_DEBUG
option in the Kconfig. So in effect all of the counters
are nop until that option gets introduced in patch:
ramster/debug: Add CONFIG_RAMSTER_DEBUG Kconfig entry

Acked-by: Dan Magenheimer 
Signed-off-by: Wanpeng Li 
---
 drivers/staging/zcache/Makefile  |1 +
 drivers/staging/zcache/ramster/debug.c   |   70 ++
 drivers/staging/zcache/ramster/debug.h   |   76 
 drivers/staging/zcache/ramster/ramster.c |  115 ++
 4 files changed, 152 insertions(+), 110 deletions(-)
 create mode 100644 drivers/staging/zcache/ramster/debug.c
 create mode 100644 drivers/staging/zcache/ramster/debug.h

diff --git a/drivers/staging/zcache/Makefile b/drivers/staging/zcache/Makefile
index 24fd6aa..4956fa0 100644
--- a/drivers/staging/zcache/Makefile
+++ b/drivers/staging/zcache/Makefile
@@ -1,5 +1,6 @@
 zcache-y   :=  zcache-main.o tmem.o zbud.o
 zcache-$(CONFIG_ZCACHE_DEBUG) += debug.o
+zcache-$(CONFIG_RAMSTER) += ramster/debug.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/ramster.o ramster/r2net.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/nodemanager.o ramster/tcp.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/heartbeat.o ramster/masklog.o
diff --git a/drivers/staging/zcache/ramster/debug.c b/drivers/staging/zcache/ramster/debug.c
new file mode 100644
index 000..76861e4
--- /dev/null
+++ b/drivers/staging/zcache/ramster/debug.c
@@ -0,0 +1,70 @@
+#include 
+#include "debug.h"
+
+#ifdef CONFIG_DEBUG_FS
+#include 
+#define zdfs    debugfs_create_size_t
+#define zdfs64  debugfs_create_u64
+
+ssize_t ramster_eph_pages_remoted;
+ssize_t ramster_pers_pages_remoted;
+ssize_t ramster_eph_pages_remote_failed;
+ssize_t ramster_pers_pages_remote_failed;
+ssize_t ramster_remote_eph_pages_succ_get;
+ssize_t ramster_remote_pers_pages_succ_get;
+ssize_t ramster_remote_eph_pages_unsucc_get;
+ssize_t ramster_remote_pers_pages_unsucc_get;
+ssize_t ramster_pers_pages_remote_nomem;
+ssize_t ramster_remote_objects_flushed;
+ssize_t ramster_remote_object_flushes_failed;
+ssize_t ramster_remote_pages_flushed;
+ssize_t ramster_remote_page_flushes_failed;
+
+int __init ramster_debugfs_init(void)
+{
+   struct dentry *root = debugfs_create_dir("ramster", NULL);
+   if (root == NULL)
+   return -ENXIO;
+
+   zdfs("eph_pages_remoted", S_IRUGO, root, &ramster_eph_pages_remoted);
+   zdfs("pers_pages_remoted", S_IRUGO, root, &ramster_pers_pages_remoted);
+   zdfs("eph_pages_remote_failed", S_IRUGO, root,
+   &ramster_eph_pages_remote_failed);
+   zdfs("pers_pages_remote_failed", S_IRUGO, root,
+   &ramster_pers_pages_remote_failed);
+   zdfs("remote_eph_pages_succ_get", S_IRUGO, root,
+   &ramster_remote_eph_pages_succ_get);
+   zdfs("remote_pers_pages_succ_get", S_IRUGO, root,
+   &ramster_remote_pers_pages_succ_get);
+   zdfs("remote_eph_pages_unsucc_get", S_IRUGO, root,
+   &ramster_remote_eph_pages_unsucc_get);
+   zdfs("remote_pers_pages_unsucc_get", S_IRUGO, root,
+   &ramster_remote_pers_pages_unsucc_get);
+   zdfs("pers_pages_remote_nomem", S_IRUGO, root,
+   &ramster_pers_pages_remote_nomem);
+   zdfs("remote_objects_flushed", S_IRUGO, root,
+   &ramster_remote_objects_flushed);
+   zdfs("remote_pages_flushed", S_IRUGO, root,
+   &ramster_remote_pages_flushed);
+   zdfs("remote_object_flushes_failed", S_IRUGO, root,
+   &ramster_remote_object_flushes_failed);
+   zdfs("remote_page_flushes_failed", S_IRUGO, root,
+   &ramster_remote_page_flushes_failed);
+   zdfs("foreign_eph_pages", S_IRUGO, root,
+   &ramster_foreign_eph_pages);
+   zdfs("foreign_eph_pages_max", S_IRUGO, root,
+   &ramster_foreign_eph_pages_max);
+   zdfs("foreign_pers_pages", S_IRUGO, root,
+   &ramster_foreign_pers_pages);
+   zdfs("foreign_pers_pages_max", S_IRUGO, root,
+   &ramster_foreign_pers_pages_max);
+   return 0;
+}
+#undef  zdebugfs
+#undef  zdfs64
+#else
+static inline int ramster_debugfs_init(void)
+{
+   return 0;
+}
+#endif
diff --git a/drivers/staging/zcache/ramster/debug.h b/drivers/staging/zcache/ramster/debug.h
new file mode 100644
index 000..17a8435
--- /dev/null
+++ b/drivers/staging/zcache/ramster/debug.h
@@ -0,0 +1,76 @@
+#ifdef CONFIG_RAMSTER
+
+extern long ramster_flnodes;
+static atomic_t ramster_flnodes_atomic = ATOMIC_INIT(0);
+extern unsigned long ramster_flnodes_max;
+static inline void inc_ramster_flnodes(void)
+{
+   ramster_flnodes = atomic_inc_return(&ramster_flnodes_atomic);
+   if (ramster_flnodes > ramster_flnodes_max)
+   ramster_flnodes_max = ramster_flnodes;
+}
+static inline void dec_ramster_flnodes(void)
+{
+   ramster_flnodes = atomic_dec_return(&ramster_flnodes_atomic);
+}
+extern ssize_t ramster_foreign_eph_pages;
+static atomic_t ramster_foreign_eph_pages_atomic = ATOMIC_INIT(0);
+extern ssize_t 

[PATCH PART3 v3 0/6] staging: zcache/ramster: fix and ramster/debugfs improvement

2013-04-12 Thread Wanpeng Li
Changelog: 
 v2 -> v3:
  * update patch description of staging: ramster: Move debugfs code out of 
ramster.c file 
  * update patch title of staging: ramster/debug: Add RAMSTER_DEBUG Kconfig 
entry 
 v1 -> v2:  
  * fix bisect issue 
  * fix issue in patch staging: ramster: Provide accessory functions for 
counter decrease
  * drop patch staging: zcache: remove zcache_freeze 
  * Add Dan Acked-by

Fix bugs in zcache and move the debug counters out of ramster.c into a 
debug.c file. Introduce accessory functions for increasing/decreasing the 
counters; they are available when CONFIG_RAMSTER_DEBUG is set, otherwise 
they are empty non-debug functions. Use an array to initialize/use the 
debugfs attributes to make them neater. Dan Magenheimer confirmed that this 
work is needed: http://marc.info/?l=linux-mm&m=136535713106882&w=2
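
The accessory-function pattern described above can be sketched roughly as follows. This is a standalone illustration, not code from the series: DEMO_DEBUG stands in for CONFIG_RAMSTER_DEBUG, and the demo_* names and the read helper are made up for the example.

```c
#include <assert.h>

/*
 * DEMO_DEBUG is a hypothetical switch standing in for the Kconfig option.
 * Build with -DDEMO_DEBUG to get real counters; without it the accessors
 * compile down to empty stubs.
 */
#ifdef DEMO_DEBUG
static long demo_pages_remoted;
static inline void inc_demo_pages_remoted(void) { demo_pages_remoted++; }
static inline long read_demo_pages_remoted(void) { return demo_pages_remoted; }
#else
static inline void inc_demo_pages_remoted(void) { }
static inline long read_demo_pages_remoted(void) { return 0; }
#endif

/* The caller needs no #ifdef: it always calls the accessor. */
static int demo_remote_pages(int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		inc_demo_pages_remoted();
	return n;
}
```

The point of the design is that the #ifdef lives in one header, so the call sites in the main source file stay the same in debug and non-debug builds.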

Patch 1~2 fix bugs in zcache

Patch 3~8 rips out the debug counters out of ramster.c and sticks them 
  in a debug.c file 

Patch 9 fix coding style issue introduced in zcache2 cleanups 
(s/int/bool + debugfs movement) patchset 

Patch 10 add how-to for ramster 

Dan Magenheimer (1):
staging: ramster: add how-to for ramster

Wanpeng Li (6):
staging: ramster: decrease foreign pers pages when count < 0
staging: ramster: Move debugfs code out of ramster.c files
staging: ramster/debug: Use an array to initialize/use debugfs 
attributes
staging: ramster/debug: Add RAMSTER_DEBUG Kconfig entry
staging: ramster: Add incremental accessory counters
staging: zcache/debug: fix coding style

 drivers/staging/zcache/Kconfig   |8 +
 drivers/staging/zcache/Makefile  |1 +
 drivers/staging/zcache/debug.h   |   95 ---
 drivers/staging/zcache/ramster/HOWTO.txt |  257 ++
 drivers/staging/zcache/ramster/debug.c   |   66 
 drivers/staging/zcache/ramster/debug.h   |  143 +
 drivers/staging/zcache/ramster/ramster.c |  148 +++--
 7 files changed, 573 insertions(+), 145 deletions(-)
 create mode 100644 drivers/staging/zcache/ramster/HOWTO.txt
 create mode 100644 drivers/staging/zcache/ramster/debug.c
 create mode 100644 drivers/staging/zcache/ramster/debug.h

-- 
1.7.10.4



[PATCH PART3 v3 3/6] staging: ramster/debug: Use an array to initialize/use debugfs attributes

2013-04-12 Thread Wanpeng Li
Use an array to initialize/use the debugfs attributes, which makes them 
neater, as zcache/debug.c does.

Acked-by: Dan Magenheimer 
Signed-off-by: Wanpeng Li 
---
 drivers/staging/zcache/ramster/debug.c |   68 +++-
 1 file changed, 32 insertions(+), 36 deletions(-)
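
For readers unfamiliar with the table-driven style, here is a small userspace sketch of the same idea. It is only an approximation under stated assumptions: the demo_* counters and register_counter() are hypothetical, and register_counter() merely stands in for a debugfs creation call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counters; the patch uses the ramster_* variables. */
static long demo_eph_pages;
static long demo_pers_pages;

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* #x turns the argument into the file name, demo_##x into the variable. */
#define ATTR(x) { .name = #x, .val = &demo_##x }
static struct debug_entry {
	const char *name;
	long *val;
} attrs[] = {
	ATTR(eph_pages),
	ATTR(pers_pages),
};
#undef ATTR

/* Stand-in for the registration call: non-zero on success, 0 on failure. */
static int register_counter(const char *name, long *val)
{
	assert(name != NULL);
	return val != NULL;
}

/* One loop replaces one hand-written registration call per counter. */
static int demo_debug_init(void)
{
	for (size_t i = 0; i < ARRAY_SIZE(attrs); i++)
		if (!register_counter(attrs[i].name, attrs[i].val))
			return -1;
	return 0;
}
```

Adding a counter then means adding one ATTR() line to the table, and the name of the file can no longer drift from the name of the variable.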

diff --git a/drivers/staging/zcache/ramster/debug.c b/drivers/staging/zcache/ramster/debug.c
index 76861e4..bf34133 100644
--- a/drivers/staging/zcache/ramster/debug.c
+++ b/drivers/staging/zcache/ramster/debug.c
@@ -3,8 +3,6 @@
 
 #ifdef CONFIG_DEBUG_FS
 #include 
-#define zdfs    debugfs_create_size_t
-#define zdfs64  debugfs_create_u64
 
 ssize_t ramster_eph_pages_remoted;
 ssize_t ramster_pers_pages_remoted;
@@ -20,48 +18,46 @@ ssize_t ramster_remote_object_flushes_failed;
 ssize_t ramster_remote_pages_flushed;
 ssize_t ramster_remote_page_flushes_failed;
 
+#define ATTR(x)  { .name = #x, .val = &ramster_##x, }
+static struct debug_entry {
+   const char *name;
+   ssize_t *val;
+} attrs[] = {
+   ATTR(eph_pages_remoted),
+   ATTR(pers_pages_remoted),
+   ATTR(eph_pages_remote_failed),
+   ATTR(pers_pages_remote_failed),
+   ATTR(remote_eph_pages_succ_get),
+   ATTR(remote_pers_pages_succ_get),
+   ATTR(remote_eph_pages_unsucc_get),
+   ATTR(remote_pers_pages_unsucc_get),
+   ATTR(pers_pages_remote_nomem),
+   ATTR(remote_objects_flushed),
+   ATTR(remote_pages_flushed),
+   ATTR(remote_object_flushes_failed),
+   ATTR(remote_page_flushes_failed),
+   ATTR(foreign_eph_pages),
+   ATTR(foreign_eph_pages_max),
+   ATTR(foreign_pers_pages),
+   ATTR(foreign_pers_pages_max),
+};
+#undef ATTR
+
 int __init ramster_debugfs_init(void)
 {
+   int i;
struct dentry *root = debugfs_create_dir("ramster", NULL);
if (root == NULL)
return -ENXIO;
 
-   zdfs("eph_pages_remoted", S_IRUGO, root, &ramster_eph_pages_remoted);
-   zdfs("pers_pages_remoted", S_IRUGO, root, &ramster_pers_pages_remoted);
-   zdfs("eph_pages_remote_failed", S_IRUGO, root,
-   &ramster_eph_pages_remote_failed);
-   zdfs("pers_pages_remote_failed", S_IRUGO, root,
-   &ramster_pers_pages_remote_failed);
-   zdfs("remote_eph_pages_succ_get", S_IRUGO, root,
-   &ramster_remote_eph_pages_succ_get);
-   zdfs("remote_pers_pages_succ_get", S_IRUGO, root,
-   &ramster_remote_pers_pages_succ_get);
-   zdfs("remote_eph_pages_unsucc_get", S_IRUGO, root,
-   &ramster_remote_eph_pages_unsucc_get);
-   zdfs("remote_pers_pages_unsucc_get", S_IRUGO, root,
-   &ramster_remote_pers_pages_unsucc_get);
-   zdfs("pers_pages_remote_nomem", S_IRUGO, root,
-   &ramster_pers_pages_remote_nomem);
-   zdfs("remote_objects_flushed", S_IRUGO, root,
-   &ramster_remote_objects_flushed);
-   zdfs("remote_pages_flushed", S_IRUGO, root,
-   &ramster_remote_pages_flushed);
-   zdfs("remote_object_flushes_failed", S_IRUGO, root,
-   &ramster_remote_object_flushes_failed);
-   zdfs("remote_page_flushes_failed", S_IRUGO, root,
-   &ramster_remote_page_flushes_failed);
-   zdfs("foreign_eph_pages", S_IRUGO, root,
-   &ramster_foreign_eph_pages);
-   zdfs("foreign_eph_pages_max", S_IRUGO, root,
-   &ramster_foreign_eph_pages_max);
-   zdfs("foreign_pers_pages", S_IRUGO, root,
-   &ramster_foreign_pers_pages);
-   zdfs("foreign_pers_pages_max", S_IRUGO, root,
-   &ramster_foreign_pers_pages_max);
+   for (i = 0; i < ARRAY_SIZE(attrs); i++)
+   if (!debugfs_create_size_t(attrs[i].name,
+   S_IRUGO, root, attrs[i].val))
+   goto out;
return 0;
+out:
+   return -ENODEV;
 }
-#undef  zdebugfs
-#undef  zdfs64
 #else
 static inline int ramster_debugfs_init(void)
 {
-- 
1.7.10.4



[PATCH PART3 v3 4/6] staging: ramster: Add incremental accessory counters

2013-04-12 Thread Wanpeng Li
Add incremental accessory counters that are going to be used for 
debug fs entries.

Acked-by: Dan Magenheimer 
Signed-off-by: Wanpeng Li 
---
 drivers/staging/zcache/ramster/debug.h   |   67 ++
 drivers/staging/zcache/ramster/ramster.c |   32 +++---
 2 files changed, 83 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/zcache/ramster/debug.h b/drivers/staging/zcache/ramster/debug.h
index 17a8435..7b2deaa 100644
--- a/drivers/staging/zcache/ramster/debug.h
+++ b/drivers/staging/zcache/ramster/debug.h
@@ -60,6 +60,59 @@ extern ssize_t ramster_remote_page_flushes_failed;
 
 int ramster_debugfs_init(void);
 
+static inline void inc_ramster_eph_pages_remoted(void)
+{
+   ramster_eph_pages_remoted++;
+};
+static inline void inc_ramster_pers_pages_remoted(void)
+{
+   ramster_pers_pages_remoted++;
+};
+static inline void inc_ramster_eph_pages_remote_failed(void)
+{
+   ramster_eph_pages_remote_failed++;
+};
+static inline void inc_ramster_pers_pages_remote_failed(void)
+{
+   ramster_pers_pages_remote_failed++;
+};
+static inline void inc_ramster_remote_eph_pages_succ_get(void)
+{
+   ramster_remote_eph_pages_succ_get++;
+};
+static inline void inc_ramster_remote_pers_pages_succ_get(void)
+{
+   ramster_remote_pers_pages_succ_get++;
+};
+static inline void inc_ramster_remote_eph_pages_unsucc_get(void)
+{
+   ramster_remote_eph_pages_unsucc_get++;
+};
+static inline void inc_ramster_remote_pers_pages_unsucc_get(void)
+{
+   ramster_remote_pers_pages_unsucc_get++;
+};
+static inline void inc_ramster_pers_pages_remote_nomem(void)
+{
+   ramster_pers_pages_remote_nomem++;
+};
+static inline void inc_ramster_remote_objects_flushed(void)
+{
+   ramster_remote_objects_flushed++;
+};
+static inline void inc_ramster_remote_object_flushes_failed(void)
+{
+   ramster_remote_object_flushes_failed++;
+};
+static inline void inc_ramster_remote_pages_flushed(void)
+{
+   ramster_remote_pages_flushed++;
+};
+static inline void inc_ramster_remote_page_flushes_failed(void)
+{
+   ramster_remote_page_flushes_failed++;
+};
+
 #else
 
 static inline void inc_ramster_flnodes(void) { };
@@ -69,6 +122,20 @@ static inline void dec_ramster_foreign_eph_pages(void) { };
 static inline void inc_ramster_foreign_pers_pages(void) { };
 static inline void dec_ramster_foreign_pers_pages(void) { };
 
+static inline void inc_ramster_eph_pages_remoted(void) { };
+static inline void inc_ramster_pers_pages_remoted(void) { };
+static inline void inc_ramster_eph_pages_remote_failed(void) { };
+static inline void inc_ramster_pers_pages_remote_failed(void) { };
+static inline void inc_ramster_remote_eph_pages_succ_get(void) { };
+static inline void inc_ramster_remote_pers_pages_succ_get(void) { };
+static inline void inc_ramster_remote_eph_pages_unsucc_get(void) { };
+static inline void inc_ramster_remote_pers_pages_unsucc_get(void) { };
+static inline void inc_ramster_pers_pages_remote_nomem(void) { };
+static inline void inc_ramster_remote_objects_flushed(void) { };
+static inline void inc_ramster_remote_object_flushes_failed(void) { };
+static inline void inc_ramster_remote_pages_flushed(void) { };
+static inline void inc_ramster_remote_page_flushes_failed(void) { };
+
 static inline int ramster_debugfs_init(void)
 {
return 0;
diff --git a/drivers/staging/zcache/ramster/ramster.c b/drivers/staging/zcache/ramster/ramster.c
index 1d29f5b..8781627 100644
--- a/drivers/staging/zcache/ramster/ramster.c
+++ b/drivers/staging/zcache/ramster/ramster.c
@@ -156,9 +156,9 @@ int ramster_localify(int pool_id, struct tmem_oid *oidp, uint32_t index,
pr_err("UNTESTED pampd==NULL in ramster_localify\n");
 #endif
if (eph)
-   ramster_remote_eph_pages_unsucc_get++;
+   inc_ramster_remote_eph_pages_unsucc_get();
else
-   ramster_remote_pers_pages_unsucc_get++;
+   inc_ramster_remote_pers_pages_unsucc_get();
obj = NULL;
goto finish;
} else if (unlikely(!pampd_is_remote(pampd))) {
@@ -167,9 +167,9 @@ int ramster_localify(int pool_id, struct tmem_oid *oidp, uint32_t index,
pr_err("UNTESTED dup while waiting in ramster_localify\n");
 #endif
if (eph)
-   ramster_remote_eph_pages_unsucc_get++;
+   inc_ramster_remote_eph_pages_unsucc_get();
else
-   ramster_remote_pers_pages_unsucc_get++;
+   inc_ramster_remote_pers_pages_unsucc_get();
obj = NULL;
pampd = NULL;
ret = -EEXIST;
@@ -178,7 +178,7 @@ int ramster_localify(int pool_id, struct tmem_oid *oidp, uint32_t index,
/* no remote data, delete the local is_remote pampd */
pampd = NULL;
if (eph)
-   

[PATCHv2 2/4] ARM: arch_timers: Pass clock event to set_mode callback

2013-04-12 Thread Stephen Boyd
There isn't any reason why we don't pass the event here and we'll
need it in the near future for memory mapped arch timers anyway.

Cc: Mark Rutland 
Cc: Marc Zyngier 
Signed-off-by: Stephen Boyd 
---
 drivers/clocksource/arm_arch_timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 2abb861..545891b 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -140,7 +140,7 @@ static int __cpuinit arch_timer_setup(struct clock_event_device *clk)
 
clk->cpumask = cpumask_of(smp_processor_id());
 
-   clk->set_mode(CLOCK_EVT_MODE_SHUTDOWN, NULL);
+   clk->set_mode(CLOCK_EVT_MODE_SHUTDOWN, clk);
 
clockevents_config_and_register(clk, arch_timer_rate,
0xf, 0x7fff);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation



[PATCHv2 3/4] clocksource: arch_timer: Push the read/write wrappers deeper

2013-04-12 Thread Stephen Boyd
We're going to introduce support to read and write the memory
mapped timer registers in the next patch, so push the cp15
read/write functions one level deeper. This simplifies the next
patch and makes it clearer what's going on.

Cc: Mark Rutland 
Cc: Marc Zyngier 
Signed-off-by: Stephen Boyd 
---
 arch/arm/include/asm/arch_timer.h|  5 ++--
 arch/arm64/include/asm/arch_timer.h  |  4 ++--
 drivers/clocksource/arm_arch_timer.c | 44 
 3 files changed, 34 insertions(+), 19 deletions(-)
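
As a rough picture of what "one level deeper" means here, a userspace sketch follows. It is an illustration under assumptions, not the driver code: struct demo_clk stands in for struct clock_event_device, and a small array simulates the cp15 registers.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NREGS 2

/* Hypothetical stand-in for struct clock_event_device. */
struct demo_clk {
	const char *name;
};

/* Simulated register file standing in for the cp15 timer registers. */
static uint32_t demo_cp15_regs[DEMO_NREGS];

/* Lowest level: knows how to reach one specific backend. */
static inline void demo_reg_write_cp15(int reg, uint32_t val)
{
	assert(reg >= 0 && reg < DEMO_NREGS);
	demo_cp15_regs[reg] = val;
}

static inline uint32_t demo_reg_read_cp15(int reg)
{
	assert(reg >= 0 && reg < DEMO_NREGS);
	return demo_cp15_regs[reg];
}

/*
 * Generic wrappers: callers already pass the device, which is unused for
 * now. A second backend could later be selected here without touching
 * any caller.
 */
static inline void demo_reg_write(int reg, uint32_t val, struct demo_clk *clk)
{
	(void)clk;
	demo_reg_write_cp15(reg, val);
}

static inline uint32_t demo_reg_read(int reg, struct demo_clk *clk)
{
	(void)clk;
	return demo_reg_read_cp15(reg);
}
```

Since the wrappers are static inline and only forward, the compiler should be able to fold them away, so the extra level costs nothing until a second backend is added.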

diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h
index 35fea17..23d65f5 100644
--- a/arch/arm/include/asm/arch_timer.h
+++ b/arch/arm/include/asm/arch_timer.h
@@ -18,7 +18,8 @@ int arch_timer_sched_clock_init(void);
  * nicely work out which register we want, and chuck away the rest of
  * the code. At least it does so with a recent GCC (4.6.3).
  */
-static inline void arch_timer_reg_write(const int access, const int reg, u32 val)
+static inline void arch_timer_reg_write_cp15(const int access, const int reg,
+ u32 val)
 {
if (access == ARCH_TIMER_PHYS_ACCESS) {
switch (reg) {
@@ -45,7 +46,7 @@ static inline void arch_timer_reg_write(const int access, const int reg, u32 val
isb();
 }
 
-static inline u32 arch_timer_reg_read(const int access, const int reg)
+static inline u32 arch_timer_reg_read_cp15(const int access, const int reg)
 {
u32 val = 0;
 
diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
index 5307737..95db1a9 100644
--- a/arch/arm64/include/asm/arch_timer.h
+++ b/arch/arm64/include/asm/arch_timer.h
@@ -26,7 +26,7 @@
 
 #include 
 
-static inline void arch_timer_reg_write(int access, int reg, u32 val)
+static inline void arch_timer_reg_write_cp15(int access, int reg, u32 val)
 {
if (access == ARCH_TIMER_PHYS_ACCESS) {
switch (reg) {
@@ -57,7 +57,7 @@ static inline void arch_timer_reg_write(int access, int reg, u32 val)
isb();
 }
 
-static inline u32 arch_timer_reg_read(int access, int reg)
+static inline u32 arch_timer_reg_read_cp15(int access, int reg)
 {
u32 val;
 
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 545891b..4f1f002 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -43,14 +43,26 @@ static bool arch_timer_use_virtual = true;
  * Architected system timer support.
  */
 
+static inline void arch_timer_reg_write(int access, int reg, u32 val,
+   struct clock_event_device *clk)
+{
+   arch_timer_reg_write_cp15(access, reg, val);
+}
+
+static inline u32 arch_timer_reg_read(int access, int reg,
+ struct clock_event_device *clk)
+{
+   return arch_timer_reg_read_cp15(access, reg);
+}
+
 static inline irqreturn_t timer_handler(const int access,
struct clock_event_device *evt)
 {
unsigned long ctrl;
-   ctrl = arch_timer_reg_read(access, ARCH_TIMER_REG_CTRL);
+   ctrl = arch_timer_reg_read(access, ARCH_TIMER_REG_CTRL, evt);
if (ctrl & ARCH_TIMER_CTRL_IT_STAT) {
ctrl |= ARCH_TIMER_CTRL_IT_MASK;
-   arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl);
+   arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, evt);
evt->event_handler(evt);
return IRQ_HANDLED;
}
@@ -72,15 +84,16 @@ static irqreturn_t arch_timer_handler_phys(int irq, void *dev_id)
return timer_handler(ARCH_TIMER_PHYS_ACCESS, evt);
 }
 
-static inline void timer_set_mode(const int access, int mode)
+static inline void timer_set_mode(const int access, int mode,
+ struct clock_event_device *clk)
 {
unsigned long ctrl;
switch (mode) {
case CLOCK_EVT_MODE_UNUSED:
case CLOCK_EVT_MODE_SHUTDOWN:
-   ctrl = arch_timer_reg_read(access, ARCH_TIMER_REG_CTRL);
+   ctrl = arch_timer_reg_read(access, ARCH_TIMER_REG_CTRL, clk);
ctrl &= ~ARCH_TIMER_CTRL_ENABLE;
-   arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl);
+   arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk);
break;
default:
break;
@@ -90,36 +103,37 @@ static inline void timer_set_mode(const int access, int mode)
 static void arch_timer_set_mode_virt(enum clock_event_mode mode,
 struct clock_event_device *clk)
 {
-   timer_set_mode(ARCH_TIMER_VIRT_ACCESS, mode);
+   timer_set_mode(ARCH_TIMER_VIRT_ACCESS, mode, clk);
 }
 
 static void arch_timer_set_mode_phys(enum clock_event_mode mode,
 struct clock_event_device *clk)
 {
-   timer_set_mode(ARCH_TIMER_PHYS_ACCESS, mode);
+   

[PATCHv2 4/4] clocksource: arch_timer: Add support for memory mapped timers

2013-04-12 Thread Stephen Boyd
Add support for the memory mapped timers by filling in the
read/write functions and adding some parsing code. Note that we
only register one clocksource, preferring the cp15 based
clocksource over the mmio one.

To keep things simple we register one global clockevent. This
covers the case of UP and SMP systems with only mmio hardware and
systems where the memory mapped timers are used as the broadcast
timer in low power modes.

The DT binding allows for per-CPU memory mapped timers in case we
want to support that in the future, but the code isn't added
here. We also don't do much for hypervisor support, although it
should be possible to support it by searching for at least two
frames where one frame has the virtual capability and then
updating KVM timers to support it.

Cc: Mark Rutland 
Cc: Marc Zyngier 
Signed-off-by: Stephen Boyd 
---
 drivers/clocksource/arm_arch_timer.c | 415 ++-
 include/clocksource/arm_arch_timer.h |   4 +-
 2 files changed, 361 insertions(+), 58 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 4f1f002..7385fca 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -16,13 +16,39 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 
 #include 
 #include 
 
 #include 
 
+#define CNTTIDR         0x08
+#define CNTTIDR_VIRT(n) (BIT(1) << ((n) * 4))
+
+#define CNTVCT_LO  0x08
+#define CNTVCT_HI  0x0c
+#define CNTFRQ 0x10
+#define CNTP_TVAL  0x28
+#define CNTP_CTL   0x2c
+#define CNTV_TVAL  0x38
+#define CNTV_CTL   0x3c
+
+#define ARCH_CP15_TIMER BIT(0)
+#define ARCH_MEM_TIMER BIT(1)
+static unsigned arch_timers_present;
+
+static void __iomem *arch_counter_base;
+
+struct arch_timer {
+   void __iomem *base;
+   struct clock_event_device evt;
+};
+
+#define to_arch_timer(e) container_of(e, struct arch_timer, evt)
+
 static u32 arch_timer_rate;
 
 enum ppi_nr {
@@ -38,6 +64,7 @@ static int arch_timer_ppi[MAX_TIMER_PPI];
 static struct clock_event_device __percpu *arch_timer_evt;
 
 static bool arch_timer_use_virtual = true;
+static bool arch_timer_mem_use_virtual = false;
 
 /*
  * Architected system timer support.
@@ -46,13 +73,69 @@ static bool arch_timer_use_virtual = true;
 static inline void arch_timer_reg_write(int access, int reg, u32 val,
struct clock_event_device *clk)
 {
-   arch_timer_reg_write_cp15(access, reg, val);
+   if (access == ARCH_TIMER_MEM_PHYS_ACCESS) {
+   struct arch_timer *timer = to_arch_timer(clk);
+   switch (reg) {
+   case ARCH_TIMER_REG_CTRL:
+   writel_relaxed(val, timer->base + CNTP_CTL);
+   break;
+   case ARCH_TIMER_REG_TVAL:
+   writel_relaxed(val, timer->base + CNTP_TVAL);
+   break;
+   default:
+   BUILD_BUG();
+   }
+   } else if (access == ARCH_TIMER_MEM_VIRT_ACCESS) {
+   struct arch_timer *timer = to_arch_timer(clk);
+   switch (reg) {
+   case ARCH_TIMER_REG_CTRL:
+   writel_relaxed(val, timer->base + CNTV_CTL);
+   break;
+   case ARCH_TIMER_REG_TVAL:
+   writel_relaxed(val, timer->base + CNTV_TVAL);
+   break;
+   default:
+   BUILD_BUG();
+   }
+   } else {
+   arch_timer_reg_write_cp15(access, reg, val);
+   }
 }
 
 static inline u32 arch_timer_reg_read(int access, int reg,
  struct clock_event_device *clk)
 {
-   return arch_timer_reg_read_cp15(access, reg);
+   u32 val;
+
+   if (access == ARCH_TIMER_MEM_PHYS_ACCESS) {
+   struct arch_timer *timer = to_arch_timer(clk);
+   switch (reg) {
+   case ARCH_TIMER_REG_CTRL:
+   val = readl_relaxed(timer->base + CNTP_CTL);
+   break;
+   case ARCH_TIMER_REG_TVAL:
+   val = readl_relaxed(timer->base + CNTP_TVAL);
+   break;
+   default:
+   BUILD_BUG();
+   }
+   } else if (access == ARCH_TIMER_MEM_VIRT_ACCESS) {
+   struct arch_timer *timer = to_arch_timer(clk);
+   switch (reg) {
+   case ARCH_TIMER_REG_CTRL:
+   val = readl_relaxed(timer->base + CNTV_CTL);
+   break;
+   case ARCH_TIMER_REG_TVAL:
+   val = readl_relaxed(timer->base + CNTV_TVAL);
+   break;
+   default:
+   BUILD_BUG();
+   }
+   } else {
+   val = arch_timer_reg_read_cp15(access, reg);
+   }
+
+   
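
The to_arch_timer() macro added in the hunk above relies on container_of() to get from the struct clock_event_device passed to the read/write helpers back to the enclosing struct arch_timer and its frame base. A minimal user-space sketch of that pattern follows. The struct layouts here are simplified, hypothetical stand-ins (the kernel types carry many more fields), and container_of() is re-created with offsetof() without the kernel's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* User-space approximation of the kernel's container_of(); no type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, cut-down stand-in for the kernel's clock_event_device. */
struct clock_event_device {
	const char *name;
};

/* Mirrors the shape of the struct in the patch; base is void __iomem * there. */
struct arch_timer {
	void *base;
	struct clock_event_device evt;
};

#define to_arch_timer(e) container_of(e, struct arch_timer, evt)

/* Given only the embedded clock event, recover the base of its timer frame. */
static void *timer_base_of(struct clock_event_device *clk)
{
	assert(clk != NULL);
	return to_arch_timer(clk)->base;
}
```

This is why patch 2/4 passes clk rather than NULL to set_mode(): without the event pointer there is nothing to walk back from.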

[PATCHv2 1/4] Documentation: Add memory mapped ARM architected timer binding

2013-04-12 Thread Stephen Boyd
Add a binding for the arm architected timer hardware's memory
mapped interface. The mmio timer hardware is made up of one base
frame and a collection of up to 8 timer frames, where each of the
8 timer frames can have either one or two views. A frame
typically maps to a privilege level (user/kernel, hypervisor,
secure). The first view has full access to the registers within a
frame, while the second view can be restricted to particular
registers within a frame. Each frame must support a physical
timer. It's optional for a frame to support a virtual timer.

Cc: devicetree-disc...@lists.ozlabs.org
Cc: Mark Rutland 
Cc: Marc Zyngier 
Signed-off-by: Stephen Boyd 
---
 .../devicetree/bindings/arm/arch_timer.txt | 59 --
 1 file changed, 56 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/arch_timer.txt b/Documentation/devicetree/bindings/arm/arch_timer.txt
index 20746e5..ac20cde 100644
--- a/Documentation/devicetree/bindings/arm/arch_timer.txt
+++ b/Documentation/devicetree/bindings/arm/arch_timer.txt
@@ -1,10 +1,14 @@
 * ARM architected timer
 
-ARM cores may have a per-core architected timer, which provides per-cpu timers.
+ARM cores may have a per-core architected timer, which provides per-cpu timers,
+or a memory mapped architected timer, which provides up to 8 frames with a
+physical and optional virtual timer per frame.
 
-The timer is attached to a GIC to deliver its per-processor interrupts.
+The per-core architected timer is attached to a GIC to deliver its
+per-processor interrupts via PPIs. The memory mapped timer is attached to a GIC
+to deliver its interrupts via SPIs.
 
-** Timer node properties:
+** CP15 Timer node properties:
 
 - compatible : Should at least contain one of
"arm,armv7-timer"
@@ -26,3 +30,52 @@ Example:
 <1 10 0xf08>;
clock-frequency = <1>;
};
+
+** Memory mapped timer node properties
+
+- compatible : Should at least contain "arm,armv7-timer-mem".
+
+- clock-frequency : The frequency of the main counter, in Hz. Optional.
+
+- reg : The control frame base address.
+
+Note that #address-cells, #size-cells, and ranges shall be present to ensure
+the CPU can address a frame's registers.
+
+Frame:
+
+- frame-number: 0 to 7.
+
+- interrupts : Interrupt list for physical and virtual timers in that order.
+  The virtual timer interrupt is optional.
+
+- reg : The first and second view base addresses in that order. The second view
+  base address is optional.
+
+- status : "disabled" indicates the frame is not available for use.
+
+Example:
+
+   timer@f000 {
+   compatible = "arm,armv7-timer-mem";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges;
+   reg = <0xf000 0x1000>;
+   clock-frequency = <5000>;
+
+   frame@f0001000 {
+   frame-number = <0>;
+   interrupts = <0 13 0x8>,
+<0 14 0x8>;
+   reg = <0xf0001000 0x1000>,
+ <0xf0002000 0x1000>;
+   };
+
+   frame@f0003000 {
+   frame-number = <1>;
+   interrupts = <0 15 0x8>;
+   reg = <0xf0003000 0x1000>;
+   status = "disabled";
+   };
+   };
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation



[PATCHv2 0/4] Memory mapped architected timers

2013-04-12 Thread Stephen Boyd
This patchset adds support for memory mapped architected timers. We
don't have any other global broadcast timer in our system, so we use the
mmio timer during low power modes. The first patch is the binding.
The next two patches lay some groundwork so that the last patch is simpler.
The final patch adds support for mmio timers.

Patches are based on a recent patch from Mark that removes the
physical count reading (clocksource: arch_timer: use virtual counter,
message id <1364404312-4427-4-git-send-email-mark.rutl...@arm.com>).

Updates since v1:
 * Assigned counter reading function and commented why for arm64
 * Updated DT binding to replace frame-id with frame-number and use status
   property

Stephen Boyd (4):
  Documentation: Add memory mapped ARM architected timer binding
  ARM: arch_timers: Pass clock event to set_mode callback
  clocksource: arch_timer: Push the read/write wrappers deeper
  clocksource: arch_timer: Add support for memory mapped timers

 .../devicetree/bindings/arm/arch_timer.txt |  59 ++-
 arch/arm/include/asm/arch_timer.h  |   5 +-
 arch/arm64/include/asm/arch_timer.h|   4 +-
 drivers/clocksource/arm_arch_timer.c   | 455 +
 include/clocksource/arm_arch_timer.h   |   4 +-
 5 files changed, 449 insertions(+), 78 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation



Re: [PATCH] module: Fix race condition between load and unload module

2013-04-12 Thread Linus Torvalds
On Fri, Apr 12, 2013 at 4:53 PM, Greg Kroah-Hartman
 wrote:
>
> Linus, I think your patch will reduce the window the race could happen,
> but it should still be there, although testing with it would be
> interesting to see if the original problem can be triggered with it.

Well, with my patch, there's no way you'll ever look up an object with
a zero refcount, so you'll never release it twice. The atomic
operations (atomic_inc_not_zero()) do guarantee that.

The "kset->list_lock" means that the list traversal is safe too.

So one particular race is definitely gone.

Now, what people who call "kset_find_obj()" really expect when not
locked against the last kobject_put(), I don't know. But at least it's
conceptually safe now. They'll either get NULL (either because the
object doesn't exist on the list, or because it does exist but is
about to be removed), or they will get a valid object that has *not*
started to be torn down yet.
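
The "increment only if not already zero" rule described above can be sketched in user space with C11 atomics. This is an illustration of the idea, not the kernel implementation: struct obj, ref_get_unless_zero(), ref_put() and obj_get() are hypothetical names, and the kernel's atomic_inc_not_zero() is approximated here with a compare-and-swap loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object; not struct kobject. */
struct obj {
	atomic_int refcount;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns false when the last reference is already gone.
 */
static bool ref_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool ref_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);	/* dropping a reference nobody holds is a bug */
	return old == 1;
}

/* Lookup-style helper: returns NULL for an object that is being torn down. */
static struct obj *obj_get(struct obj *o)
{
	if (o && !ref_get_unless_zero(o))
		return NULL;
	return o;
}
```

A caller that finds the object on a list under the list lock and gets NULL back from obj_get() simply treats the object as absent, which matches the two outcomes listed above.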

 Linus


Re: [PATCH tip/core/rcu 0/2] SRCU changes for 3.11

2013-04-12 Thread Josh Triplett
On Fri, Apr 12, 2013 at 04:38:15PM -0700, Paul E. McKenney wrote:
> Hello!
> 
> This series provides some SRCU changes:
> 
> 1.  Remove srcu_read_lock_raw() and srcu_read_unlock_raw().  These
>     never did get used, and have not been used for some time, so
>     it is time for them to go.
> 
> 2.  Fix a bug where srcu_read_lock() is not released upon return
>     from kvmppc_hv_setup_htab_rma().

For both:
Reviewed-by: Josh Triplett 


[PATCH 2/2] sched: move update_load_[add/sub/set] from sched.h to fair.c

2013-04-12 Thread Paul Gortmaker
These inlines are only used by kernel/sched/fair.c so they do not
need to be present in the main kernel/sched/sched.h file.

Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Signed-off-by: Paul Gortmaker 
---
 kernel/sched/fair.c  | 18 ++
 kernel/sched/sched.h | 18 --
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 155783b..aeac57e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -113,6 +113,24 @@ unsigned int __read_mostly sysctl_sched_shares_window = 1000UL;
 unsigned int sysctl_sched_cfs_bandwidth_slice = 5000UL;
 #endif
 
+static inline void update_load_add(struct load_weight *lw, unsigned long inc)
+{
+   lw->weight += inc;
+   lw->inv_weight = 0;
+}
+
+static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
+{
+   lw->weight -= dec;
+   lw->inv_weight = 0;
+}
+
+static inline void update_load_set(struct load_weight *lw, unsigned long w)
+{
+   lw->weight = w;
+   lw->inv_weight = 0;
+}
+
 /*
  * Increase the granularity value when there are more CPUs,
  * because with more CPUs the 'effective latency' as visible
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index bfb0e37..ff5bf3b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -888,24 +888,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
 #define WF_FORK         0x02    /* child wakeup after fork */
 #define WF_MIGRATED     0x4     /* internal use, task got migrated */
 
-static inline void update_load_add(struct load_weight *lw, unsigned long inc)
-{
-   lw->weight += inc;
-   lw->inv_weight = 0;
-}
-
-static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
-{
-   lw->weight -= dec;
-   lw->inv_weight = 0;
-}
-
-static inline void update_load_set(struct load_weight *lw, unsigned long w)
-{
-   lw->weight = w;
-   lw->inv_weight = 0;
-}
-
 /*
  * To aid in avoiding the subversion of "niceness" due to uneven distribution
  * of tasks with abnormal "nice" values across CPUs the contribution that
-- 
1.8.1.2



[RFC PATCH 0/2] sched: move content out of core files for load average

2013-04-12 Thread Paul Gortmaker
Recent activity has had a focus on moving functionally related blocks of stuff
out of sched/core.c into stand-alone files.  The code relating to load average
calculations has grown significantly enough recently to warrant placing it in
a separate file.

Here we do that, and in doing so, we shed ~20k of code from sched/core.c (~10%).

A couple small static functions in the core sched.h header were also localized
to their singular user in sched/fair.c at the same time, with the goal to also
reduce the amount of "broadcast" content in that sched.h file.

Paul.
---

[ Patches sent here are tested on tip's sched/core, i.e. v3.9-rc1-38-gb329fd5

  Assuming that this change is OK with folks, the timing can be whatever is most
  convenient -- i.e. I can update/respin it close to the end of the merge window
  for what will be v3.10-rc1, if that is what minimizes the inconvenience to
  folks who might be changing the code that is relocated here. ]

Paul Gortmaker (2):
  sched: fork load calculation code from sched/core --> sched/load_avg
  sched: move update_load_[add/sub/set] from sched.h to fair.c

 kernel/sched/Makefile   |   2 +-
 kernel/sched/core.c | 569 ---
 kernel/sched/fair.c |  18 ++
 kernel/sched/load_avg.c | 577 
 kernel/sched/sched.h|  26 +--
 5 files changed, 604 insertions(+), 588 deletions(-)
 create mode 100644 kernel/sched/load_avg.c

-- 
1.8.1.2



[PATCH 1/2] sched: fork load calculation code from sched/core --> sched/load_avg

2013-04-12 Thread Paul Gortmaker
This large chunk of load calculation code can be easily divorced from
the main core.c scheduler file, with only a couple prototypes and
externs added to a kernel/sched header.

Some recent commits expanded the code and the documentation of it,
making it large enough to warrant separation.  For example, see:

  556061b, "sched/nohz: Fix rq->cpu_load[] calculations"
  5aaa0b7, "sched/nohz: Fix rq->cpu_load calculations some more"
  5167e8d, "sched/nohz: Rewrite and fix load-avg computation -- again"

More importantly, it helps reduce the size of the main sched/core.c
by yet another significant amount (~600 lines).

Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Frederic Weisbecker 
Cc: Thomas Gleixner 
Signed-off-by: Paul Gortmaker 
---
 kernel/sched/Makefile   |   2 +-
 kernel/sched/core.c | 569 ---
 kernel/sched/load_avg.c | 577 
 kernel/sched/sched.h|   8 +
 4 files changed, 586 insertions(+), 570 deletions(-)
 create mode 100644 kernel/sched/load_avg.c

diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index deaf90e..0efc670 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -11,7 +11,7 @@ ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
 CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer
 endif
 
-obj-y += core.o clock.o cputime.o idle_task.o fair.o rt.o stop_task.o
+obj-y += core.o load_avg.o clock.o cputime.o idle_task.o fair.o rt.o stop_task.o
 obj-$(CONFIG_SMP) += cpupri.o
 obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
 obj-$(CONFIG_SCHEDSTATS) += stats.o
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ee8c1bd..136f013 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2019,575 +2019,6 @@ unsigned long nr_iowait_cpu(int cpu)
return atomic_read(&this->nr_iowait);
 }
 
-unsigned long this_cpu_load(void)
-{
-   struct rq *this = this_rq();
-   return this->cpu_load[0];
-}
-
-
-/*
- * Global load-average calculations
- *
- * We take a distributed and async approach to calculating the global load-avg
- * in order to minimize overhead.
- *
- * The global load average is an exponentially decaying average of nr_running +
- * nr_uninterruptible.
- *
- * Once every LOAD_FREQ:
- *
- *   nr_active = 0;
- *   for_each_possible_cpu(cpu)
- * nr_active += cpu_of(cpu)->nr_running + cpu_of(cpu)->nr_uninterruptible;
- *
- *   avenrun[n] = avenrun[0] * exp_n + nr_active * (1 - exp_n)
- *
- * Due to a number of reasons the above turns in the mess below:
- *
- *  - for_each_possible_cpu() is prohibitively expensive on machines with
- *serious number of cpus, therefore we need to take a distributed approach
- *to calculating nr_active.
- *
- *\Sum_i x_i(t) = \Sum_i x_i(t) - x_i(t_0) | x_i(t_0) := 0
- *  = \Sum_i { \Sum_j=1 x_i(t_j) - x_i(t_j-1) }
- *
- *So assuming nr_active := 0 when we start out -- true per definition, we
- *can simply take per-cpu deltas and fold those into a global accumulate
- *to obtain the same result. See calc_load_fold_active().
- *
- *Furthermore, in order to avoid synchronizing all per-cpu delta folding
- *across the machine, we assume 10 ticks is sufficient time for every
- *cpu to have completed this task.
- *
- *This places an upper-bound on the IRQ-off latency of the machine. Then
- *again, being late doesn't loose the delta, just wrecks the sample.
- *
- *  - cpu_rq()->nr_uninterruptible isn't accurately tracked per-cpu because
- *this would add another cross-cpu cacheline miss and atomic operation
- *to the wakeup path. Instead we increment on whatever cpu the task ran
- *when it went into uninterruptible state and decrement on whatever cpu
- *did the wakeup. This means that only the sum of nr_uninterruptible over
- *all cpus yields the correct result.
- *
- *  This covers the NO_HZ=n code, for extra head-aches, see the comment below.
- */
-
-/* Variables and functions for calc_load */
-static atomic_long_t calc_load_tasks;
-static unsigned long calc_load_update;
-unsigned long avenrun[3];
-EXPORT_SYMBOL(avenrun); /* should be removed */
-
-/**
- * get_avenrun - get the load average array
- * @loads: pointer to dest load array
- * @offset:offset to add
- * @shift: shift count to shift the result left
- *
- * These values are estimates at best, so no need for locking.
- */
-void get_avenrun(unsigned long *loads, unsigned long offset, int shift)
-{
-   loads[0] = (avenrun[0] + offset) << shift;
-   loads[1] = (avenrun[1] + offset) << shift;
-   loads[2] = (avenrun[2] + offset) << shift;
-}
-
-static long calc_load_fold_active(struct rq *this_rq)
-{
-   long nr_active, delta = 0;
-
-   nr_active = this_rq->nr_running;
-   nr_active += (long) this_rq->nr_uninterruptible;
-
-   if (nr_active != this_rq->calc_load_active) {
-   delta = nr_active - this_rq->calc_load_active;
-   
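
For readers following the relocated comment block, the update rule it quotes (an exponentially decaying average of the active task count) can be sketched as a small fixed-point helper. This is modeled on the kernel's calc_load() as best I recall it; the constants FSHIFT, FIXED_1 and EXP_1 and the rounding step are assumptions taken from memory of include/linux/sched.h and are not shown in the truncated diff above.

```c
#include <assert.h>

#define FSHIFT	11			/* bits of fractional precision (assumed) */
#define FIXED_1	(1UL << FSHIFT)		/* 1.0 in fixed point */
#define EXP_1	1884UL			/* 1/exp(5s/1min) in fixed point (assumed) */

/*
 * One step of: load = load * exp + active * (1 - exp)
 * All three arguments are fixed-point values scaled by FIXED_1.
 */
static unsigned long calc_load(unsigned long load, unsigned long exp,
			       unsigned long active)
{
	assert(exp <= FIXED_1);		/* decay factor must be at most 1.0 */
	load *= exp;
	load += active * (FIXED_1 - exp);
	load += 1UL << (FSHIFT - 1);	/* round to nearest */
	return load >> FSHIFT;
}
```

Feeding the same active count repeatedly makes the result converge on that count, while feeding zero makes it decay by the factor exp on each step.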

Re: [PATCH] module: Fix race condition between load and unload module

2013-04-12 Thread Anatol Pomozov
Hi

On Fri, Apr 12, 2013 at 4:53 PM, Greg Kroah-Hartman
 wrote:
> On Fri, Apr 12, 2013 at 04:47:50PM -0700, Linus Torvalds wrote:
>> On Fri, Apr 12, 2013 at 3:32 PM, Anatol Pomozov
>>  wrote:
>> >
>> > Here is the timeline for the crash when kset_find_obj() searches for
>> > an object that nobody holds and another thread is doing kobject_put()
>> > on the same kobject:
>> >
>> > THREAD A (calls kset_find_obj())            THREAD B (calls kobject_put())
>> > spin_lock()
>> >                                             atomic_dec_return(kobj->kref), counter gets zero here
>> >                                             ... starts kobject cleanup
>> >                                             spin_lock() // WAIT thread A in kobj_kset_leave()
>> > iterate over kset->list
>> > atomic_inc(kobj->kref) (counter becomes 1)
>> > spin_unlock()
>> >                                             spin_lock() // taken
>> >                                             // it does not know that thread A increased counter so it
>> >                                             remove obj from list
>> >                                             spin_unlock()
>> >                                             vfree(module) // frees module object with containing kobj
>> >
>> > // kobj points to freed memory area!!
>> > kobject_put(kobj) // OOPS
>>
>> This is a much more generic bug in kobjects, and I would hate to add
>> some random workaround for just one case of this bug like you do. The
>> more fundamental bug needs to be fixed too.
>>
>> I think the more fundamental bugfix is to just fix kobject_get() to
>> return NULL if the refcount was zero, because in that case the kobject
>> no longer really exists.
>>
>> So instead of having
>>
>> kref_get(&kobj->kref);
>>
>> it should do
>>
>> if (!atomic_inc_not_zero(&kobj->kref.refcount))
>> kobj = NULL;
>>
>> and I think that should fix your race automatically, no? Proper patch
>> attached (but TOTALLY UNTESTED - it seems to compile, though).
>>
>> The problem is that we lose the warning for when the refcount is zero
>> and somebody does a kobject_get(), but that is ok *assuming* that
>> people actually check the return value of kobject_get() rather than
>> just "know" that if they passed in a non-NULL kobj, they'll get it
>> right back.
>>
>> Greg - please take a look... I'm adding Al to the discussion too,
>> because Al just *loooves* these kinds of races ;)
>
> We "should" have some type of "higher-up" lock to prevent the
> release/get races from happening, we have that in the driver core, and I
> thought we had such a lock already in the module subsystem as well,
> which will prevent any of this from being needed.
>
> Rusty, don't we have a lock for this somewhere?
>
> Linus, I think your patch will reduce the window the race could happen,
> but it should still be there, although testing with it would be
> interesting to see if the original problem can be triggered with it.

Linus's patch should fix the module race condition. vfree(module) cannot
be called while we hold kobj->kset->lock: THREAD_B calls vfree() only
after it has acquired the lock and removed kobj from the list. So if
THREAD_A finds kobj in kset->list and has not yet released the lock,
the memory has not been freed.

>
> I'll look at it some more tomorrow, about to go to dinner now...
>
> thanks,
>
> greg k-h


Re: [PATCH tip/core/rcu 0/7] RCU fixes for 3.11

2013-04-12 Thread Josh Triplett
On Fri, Apr 12, 2013 at 04:18:47PM -0700, Paul E. McKenney wrote:
> Hello!
> 
> This series contains the following fixes for RCU:
> 
> 1-2.  Convert remaining printk() calls to pr_*().
> 
> 3.    Kick adaptive-ticks CPUs that are holding up RCU grace periods.
> 
> 4.    Don't allocate bootmem from rcu_init(), courtesy of Sasha Levin.
> 
> 5.    Remove "Experimental" flags from old RCU Kconfig options.
> 
> 6.    Automatically tune defaults for delays between attempts to
>       force quiescent states.
> 
> 7.    Merge adjacent identical #ifdefs.

For 1-5 and 7:
Reviewed-by: Josh Triplett 

Responded to 6 separately.

- Josh Triplett


Re: [PATCH tip/core/rcu 6/7] rcu: Drive quiescent-state-forcing delay from HZ

2013-04-12 Thread Josh Triplett
On Fri, Apr 12, 2013 at 04:19:13PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" 
> 
> Systems with HZ=100 can have slow bootup times due to the default
> three-jiffy delays between quiescent-state forcing attempts.  This
> commit therefore auto-tunes the RCU_JIFFIES_TILL_FORCE_QS value based
> on the value of HZ.  However, this would break very large systems that
> require more time between quiescent-state forcing attempts.  This
> commit therefore also ups the default delay by one jiffy for each
> 256 CPUs that might be on the system (based off of nr_cpu_ids at
> runtime, -not- NR_CPUS at build time).
> 
> Reported-by: Paul Mackerras 
> Signed-off-by: Paul E. McKenney 

Something seems very wrong if RCU regularly hits the fqs code during
boot; feels like there's some more straightforward solution we're
missing.  What causes these CPUs to fall under RCU's scrutiny during
boot yet not actually hit the RCU codepaths naturally?

Also, a comment below.

> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -342,7 +342,17 @@ struct rcu_data {
>  #define RCU_FORCE_QS 3   /* Need to force quiescent state. */
>  #define RCU_SIGNAL_INIT  RCU_SAVE_DYNTICK
>  
> -#define RCU_JIFFIES_TILL_FORCE_QS 3  /* for rsp->jiffies_force_qs */
> +#if HZ > 500
> +#define RCU_JIFFIES_TILL_FORCE_QS 3  /* for jiffies_till_first_fqs */
> +#elif HZ > 250
> +#define RCU_JIFFIES_TILL_FORCE_QS 2
> +#else
> +#define RCU_JIFFIES_TILL_FORCE_QS 1
> +#endif

This seems like it really wants to use a duration calculated directly
from HZ; perhaps (HZ/100)?
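
To make the comparison concrete, here is a small sketch of the two choices as plain functions of hz. The function names are hypothetical, the clamp to at least one jiffy is my assumption, and the per-CPU adjustment uses a plain integer division by 256 since the exact rounding is not shown in the quoted hunk.

```c
#include <assert.h>

/* Tiered choice from the quoted hunk, written as a function of hz. */
static unsigned long fqs_delay_tiered(unsigned long hz)
{
	if (hz > 500)
		return 3;
	if (hz > 250)
		return 2;
	return 1;
}

/* Scaled choice: HZ/100 jiffies is about 10 ms; clamp is an assumption. */
static unsigned long fqs_delay_scaled(unsigned long hz)
{
	unsigned long j;

	assert(hz > 0);
	j = hz / 100;
	return j ? j : 1;
}

/* Extra delay for large systems: one more jiffy per 256 CPUs (rounding assumed). */
static unsigned long fqs_delay_for_cpus(unsigned long base, unsigned long nr_cpus)
{
	return base + nr_cpus / 256;
}
```

The two agree at HZ=100 (one jiffy, 10 ms) but diverge above that: at HZ=1000 the tiered version waits 3 jiffies (3 ms) while HZ/100 waits 10 jiffies (10 ms), so the scaled form keeps the wall-clock delay roughly constant rather than shrinking it.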

- Josh Triplett


Re: [PATCH] module: Fix race condition between load and unload module

2013-04-12 Thread Greg Kroah-Hartman
On Fri, Apr 12, 2013 at 04:47:50PM -0700, Linus Torvalds wrote:
> On Fri, Apr 12, 2013 at 3:32 PM, Anatol Pomozov
>  wrote:
> >
> > Here is the timeline for the crash when kset_find_obj() searches for
> > an object that nobody holds and another thread is doing kobject_put()
> > on the same kobject:
> >
> > THREAD A (calls kset_find_obj())            THREAD B (calls kobject_put())
> > spin_lock()
> >                                             atomic_dec_return(kobj->kref), counter gets zero here
> >                                             ... starts kobject cleanup
> >                                             spin_lock() // WAIT thread A in kobj_kset_leave()
> > iterate over kset->list
> > atomic_inc(kobj->kref) (counter becomes 1)
> > spin_unlock()
> >                                             spin_lock() // taken
> >                                             // it does not know that thread A increased counter so it
> >                                             remove obj from list
> >                                             spin_unlock()
> >                                             vfree(module) // frees module object with containing kobj
> >
> > // kobj points to freed memory area!!
> > kobject_put(kobj) // OOPS
> 
> This is a much more generic bug in kobjects, and I would hate to add
> some random workaround for just one case of this bug like you do. The
> more fundamental bug needs to be fixed too.
> 
> I think the more fundamental bugfix is to just fix kobject_get() to
> return NULL if the refcount was zero, because in that case the kobject
> no longer really exists.
> 
> So instead of having
> 
> kref_get(&kobj->kref);
> 
> it should do
> 
> if (!atomic_inc_not_zero(&kobj->kref.refcount))
> kobj = NULL;
> 
> and I think that should fix your race automatically, no? Proper patch
> attached (but TOTALLY UNTESTED - it seems to compile, though).
> 
> The problem is that we lose the warning for when the refcount is zero
> and somebody does a kobject_get(), but that is ok *assuming* that
> people actually check the return value of kobject_get() rather than
> just "know" that if they passed in a non-NULL kobj, they'll get it
> right back.
> 
> Greg - please take a look... I'm adding Al to the discussion too,
> because Al just *loooves* these kinds of races ;)

We "should" have some type of "higher-up" lock to prevent the
release/get races from happening.  We have that in the driver core, and I
thought we had such a lock already in the module subsystem as well,
which would prevent any of this from being needed.

Rusty, don't we have a lock for this somewhere?

Linus, I think your patch will narrow the window in which the race can
happen, but the race should still be there.  Testing with it would be
interesting, to see whether the original problem can still be triggered.

I'll look at it some more tomorrow, about to go to dinner now...

thanks,

greg k-h
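
The suggestion quoted above (have the "get" operation fail once the count
has already reached zero) can be sketched in userspace C.  This is only a
minimal model under stated assumptions, not kernel code: struct demo_obj,
demo_inc_not_zero(), demo_get() and demo_put() are hypothetical names, and
C11 atomics stand in for the kernel's atomic_inc_not_zero().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; stands in for a kobject. */
struct demo_obj {
	atomic_int refcount;
};

/* Increment *ref unless it is zero; returns true if a reference was taken. */
static bool demo_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns obj with an extra reference, or NULL if obj is already dying. */
static struct demo_obj *demo_get(struct demo_obj *obj)
{
	if (obj && !demo_inc_not_zero(&obj->refcount))
		return NULL;
	return obj;
}

/* Drops a reference; returns true when the caller dropped the last one. */
static bool demo_put(struct demo_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcount, 1);

	assert(old > 0);
	return old == 1;
}
```

With this shape, a lookup that races with the final put gets NULL back
instead of resurrecting the object, but only if the caller checks the
return value.  As the reply above notes, this alone may narrow the window
rather than close it.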


[PATCH tip/core/rcu 11/12] rcu: Remove TINY_PREEMPT_RCU tracing documentation

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Because TINY_PREEMPT_RCU is no more, this commit removes its tracing
formats from the documentation.

Signed-off-by: Paul E. McKenney 
---
 Documentation/RCU/trace.txt | 100 ++--
 1 file changed, 4 insertions(+), 96 deletions(-)

diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index c776968..f3778f8 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -530,113 +530,21 @@ o  "nos" counts the number of times we balked for other
reasons, e.g., the grace period ended first.
 
 
-CONFIG_TINY_RCU and CONFIG_TINY_PREEMPT_RCU debugfs Files and Formats
+CONFIG_TINY_RCU debugfs Files and Formats
 
 These implementations of RCU provides a single debugfs file under the
 top-level directory RCU, namely rcu/rcudata, which displays fields in
-rcu_bh_ctrlblk, rcu_sched_ctrlblk and, for CONFIG_TINY_PREEMPT_RCU,
-rcu_preempt_ctrlblk.
+rcu_bh_ctrlblk and rcu_sched_ctrlblk.
 
 The output of "cat rcu/rcudata" is as follows:
 
-rcu_preempt: qlen=24 gp=1097669 g197/p197/c197 tasks=...
- ttb=. btg=no ntb=184 neb=0 nnb=183 j=01f7 bt=0274
- normal balk: nt=1097669 gt=0 bt=371 b=0 ny=25073378 nos=0
- exp balk: bt=0 nos=0
 rcu_sched: qlen: 0
 rcu_bh: qlen: 0
 
-This is split into rcu_preempt, rcu_sched, and rcu_bh sections, with the
-rcu_preempt section appearing only in CONFIG_TINY_PREEMPT_RCU builds.
-The last three lines of the rcu_preempt section appear only in
-CONFIG_RCU_BOOST kernel builds.  The fields are as follows:
+This is split into rcu_sched and rcu_bh sections.  The field is as
+follows:
 
 o  "qlen" is the number of RCU callbacks currently waiting either
for an RCU grace period or waiting to be invoked.  This is the
only field present for rcu_sched and rcu_bh, due to the
short-circuiting of grace period in those two cases.
-
-o  "gp" is the number of grace periods that have completed.
-
-o  "g197/p197/c197" displays the grace-period state, with the
-   "g" number being the number of grace periods that have started
-   (mod 256), the "p" number being the number of grace periods
-   that the CPU has responded to (also mod 256), and the "c"
-   number being the number of grace periods that have completed
-   (once again mode 256).
-
-   Why have both "gp" and "g"?  Because the data flowing into
-   "gp" is only present in a CONFIG_RCU_TRACE kernel.
-
-o  "tasks" is a set of bits.  The first bit is "T" if there are
-   currently tasks that have recently blocked within an RCU
-   read-side critical section, the second bit is "N" if any of the
-   aforementioned tasks are blocking the current RCU grace period,
-   and the third bit is "E" if any of the aforementioned tasks are
-   blocking the current expedited grace period.  Each bit is "."
-   if the corresponding condition does not hold.
-
-o  "ttb" is a single bit.  It is "B" if any of the blocked tasks
-   need to be priority boosted and "." otherwise.
-
-o  "btg" indicates whether boosting has been carried out during
-   the current grace period, with "exp" indicating that boosting
-   is in progress for an expedited grace period, "no" indicating
-   that boosting has not yet started for a normal grace period,
-   "begun" indicating that boosting has bebug for a normal grace
-   period, and "done" indicating that boosting has completed for
-   a normal grace period.
-
-o  "ntb" is the total number of tasks subjected to RCU priority boosting
-   periods since boot.
-
-o  "neb" is the number of expedited grace periods that have had
-   to resort to RCU priority boosting since boot.
-
-o  "nnb" is the number of normal grace periods that have had
-   to resort to RCU priority boosting since boot.
-
-o  "j" is the low-order 16 bits of the jiffies counter in hexadecimal.
-
-o  "bt" is the low-order 16 bits of the value that the jiffies counter
-   will have at the next time that boosting is scheduled to begin.
-
-o  In the line beginning with "normal balk", the fields are as follows:
-
-   o   "nt" is the number of times that the system balked from
-   boosting because there were no blocked tasks to boost.
-   Note that the system will balk from boosting even if the
-   grace period is overdue when the currently running task
-   is looping within an RCU read-side critical section.
-   There is no point in boosting in this case, because
-   boosting a running task won't make it run any faster.
-
-   o   "gt" is the number of times that the system balked
-   from boosting because, although there were blocked tasks,
-   none of them were preventing the current grace period
-   from completing.
-
-   o   "bt" is the 

[PATCH tip/core/rcu 05/12] rcu: Remove rcu_preempt_process_callbacks()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_process_callbacks()
is now an empty function.  This commit therefore eliminates it by
inlining it.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutiny.c| 1 -
 kernel/rcutiny_plugin.h | 8 
 2 files changed, 9 deletions(-)

diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 6f5a2a6..7fc2339 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -314,7 +314,6 @@ static void rcu_process_callbacks(struct softirq_action *unused)
 {
 	__rcu_process_callbacks(&rcu_sched_ctrlblk);
 	__rcu_process_callbacks(&rcu_bh_ctrlblk);
-   rcu_preempt_process_callbacks();
 }
 
 /*
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 8b835b9..bfe9924 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -102,14 +102,6 @@ static void check_cpu_stalls(void)
RCU_TRACE(check_cpu_stall_preempt());
 }
 
-/*
- * Because preemptible RCU does not exist, it never has any callbacks
- * to process.
- */
-static void rcu_preempt_process_callbacks(void)
-{
-}
-
 /* Hold off callback invocation until early_initcall() time. */
 static int rcu_scheduler_fully_active __read_mostly;
 
-- 
1.8.1.5



[PATCH tip/core/rcu 03/12] rcu: Remove rcu_preempt_check_callbacks()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_check_callbacks()
is now an empty function.  This commit therefore eliminates it by
inlining it.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutiny.c| 1 -
 kernel/rcutiny_plugin.h | 8 
 2 files changed, 9 deletions(-)

diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index a0714a5..9178282 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -257,7 +257,6 @@ void rcu_check_callbacks(int cpu, int user)
rcu_sched_qs(cpu);
else if (!in_softirq())
rcu_bh_qs(cpu);
-   rcu_preempt_check_callbacks();
 }
 
 /*
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index cf0bc22..404b3a3 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -104,14 +104,6 @@ static void check_cpu_stalls(void)
 
 /*
  * Because preemptible RCU does not exist, it never has any callbacks
- * to check.
- */
-static void rcu_preempt_check_callbacks(void)
-{
-}
-
-/*
- * Because preemptible RCU does not exist, it never has any callbacks
  * to remove.
  */
 static void rcu_preempt_remove_callbacks(struct rcu_ctrlblk *rcp)
-- 
1.8.1.5



[PATCH tip/core/rcu 09/12] rcu: Remove rcu_preempt_note_context_switch()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_note_context_switch()
is now an empty function.  This commit therefore eliminates it by inlining it.

Signed-off-by: Paul E. McKenney 
---
 include/linux/rcutiny.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 07b5aff..51230b6 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -68,10 +68,6 @@ static inline void kfree_call_rcu(struct rcu_head *head,
call_rcu(head, func);
 }
 
-static inline void rcu_preempt_note_context_switch(void)
-{
-}
-
 static inline int rcu_needs_cpu(int cpu, unsigned long *delta_jiffies)
 {
*delta_jiffies = ULONG_MAX;
@@ -81,7 +77,6 @@ static inline int rcu_needs_cpu(int cpu, unsigned long *delta_jiffies)
 static inline void rcu_note_context_switch(int cpu)
 {
rcu_sched_qs(cpu);
-   rcu_preempt_note_context_switch();
 }
 
 /*
-- 
1.8.1.5



[PATCH] PCI / PM: Fix fallback to PCI_D0 in pci_platform_power_transition()

2013-04-12 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Commit b51306c (PCI: Set device power state to PCI_D0 for device
without native PM support) modified pci_platform_power_transition()
by adding code causing dev->current_state for devices that don't
support native PCI PM but are power-manageable by the platform to be
changed to PCI_D0 regardless of the value returned by the preceding
platform_pci_set_power_state().  In particular, that also is done
if the platform_pci_set_power_state() has been successful, which
causes the correct power state of the device set by
pci_update_current_state() in that case to be overwritten by PCI_D0.

Fix that mistake by making the fallback to PCI_D0 only happen if
the platform_pci_set_power_state() has returned an error.

Reported-by: Chris J. Benenati 
Signed-off-by: Rafael J. Wysocki 
Cc: 
---
 drivers/pci/pci.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-pm/drivers/pci/pci.c
===
--- linux-pm.orig/drivers/pci/pci.c
+++ linux-pm/drivers/pci/pci.c
@@ -646,8 +646,7 @@ static int pci_platform_power_transition
error = platform_pci_set_power_state(dev, state);
if (!error)
pci_update_current_state(dev, state);
-   /* Fall back to PCI_D0 if native PM is not supported */
-   if (!dev->pm_cap)
+   else if (!dev->pm_cap) /* Fall back to PCI_D0 */
dev->current_state = PCI_D0;
} else {
error = -ENODEV;
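
The control flow after this change can be modelled in a few lines of
standalone C.  This is a hedged sketch, not the driver code: struct
demo_dev, the DEMO_* states and demo_platform_set_power_state() are
hypothetical stand-ins, and pci_update_current_state() is reduced to a
plain assignment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PCI power states used in this sketch. */
enum demo_state { DEMO_D0 = 0, DEMO_D3HOT = 3 };

struct demo_dev {
	bool pm_cap;        /* device has native PCI PM capability */
	bool platform_ok;   /* whether the simulated platform call succeeds */
	int current_state;
};

/* Simulated platform_pci_set_power_state(): 0 on success, -1 on failure. */
static int demo_platform_set_power_state(struct demo_dev *dev, int state)
{
	(void)state;
	return dev->platform_ok ? 0 : -1;
}

static int demo_platform_power_transition(struct demo_dev *dev, int state)
{
	int error = demo_platform_set_power_state(dev, state);

	if (!error)
		dev->current_state = state;	/* simplified pci_update_current_state() */
	else if (!dev->pm_cap)
		dev->current_state = DEMO_D0;	/* fallback only on failure */
	return error;
}
```

In this model a device without native PM keeps the state set by a
successful platform call, and falls back to D0 only when that call fails,
which is the behaviour the changelog describes.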



[PATCH tip/core/rcu 10/12] rcu: Consolidate rcutiny_plugin.h ifdefs

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

This commit rearranges code in order to allow ifdefs to be consolidated
in kernel/rcutiny_plugin.h, simplifying the code.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutiny_plugin.h | 86 +++--
 1 file changed, 40 insertions(+), 46 deletions(-)

diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index bac3a6e..65ef180 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -53,54 +53,10 @@ static struct rcu_ctrlblk rcu_bh_ctrlblk = {
 };
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
+#include 
+
 int rcu_scheduler_active __read_mostly;
 EXPORT_SYMBOL_GPL(rcu_scheduler_active);
-#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
-
-#ifdef CONFIG_RCU_TRACE
-
-static void check_cpu_stall(struct rcu_ctrlblk *rcp)
-{
-   unsigned long j;
-   unsigned long js;
-
-   if (rcu_cpu_stall_suppress)
-   return;
-   rcp->ticks_this_gp++;
-   j = jiffies;
-   js = rcp->jiffies_stall;
-   if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
-   pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
-  rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting,
-  jiffies - rcp->gp_start, rcp->qlen);
-   dump_stack();
-   }
-   if (*rcp->curtail && ULONG_CMP_GE(j, js))
-   rcp->jiffies_stall = jiffies +
-   3 * rcu_jiffies_till_stall_check() + 3;
-   else if (ULONG_CMP_GE(j, js))
-   rcp->jiffies_stall = jiffies + rcu_jiffies_till_stall_check();
-}
-
-#endif /* #ifdef CONFIG_RCU_TRACE */
-
-static void reset_cpu_stall_ticks(struct rcu_ctrlblk *rcp)
-{
-#ifdef CONFIG_RCU_TRACE
-   rcp->ticks_this_gp = 0;
-   rcp->gp_start = jiffies;
-   rcp->jiffies_stall = jiffies + rcu_jiffies_till_stall_check();
-#endif /* #ifdef CONFIG_RCU_TRACE */
-}
-
-static void check_cpu_stalls(void)
-{
-   RCU_TRACE(check_cpu_stall(&rcu_bh_ctrlblk));
-   RCU_TRACE(check_cpu_stall(&rcu_sched_ctrlblk));
-}
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-#include 
 
 /*
  * During boot, we forgive RCU lockdep issues.  After this function is
@@ -179,4 +135,42 @@ MODULE_AUTHOR("Paul E. McKenney");
 MODULE_DESCRIPTION("Read-Copy Update tracing for tiny implementation");
 MODULE_LICENSE("GPL");
 
+static void check_cpu_stall(struct rcu_ctrlblk *rcp)
+{
+   unsigned long j;
+   unsigned long js;
+
+   if (rcu_cpu_stall_suppress)
+   return;
+   rcp->ticks_this_gp++;
+   j = jiffies;
+   js = rcp->jiffies_stall;
+   if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
+   pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
+  rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting,
+  jiffies - rcp->gp_start, rcp->qlen);
+   dump_stack();
+   }
+   if (*rcp->curtail && ULONG_CMP_GE(j, js))
+   rcp->jiffies_stall = jiffies +
+   3 * rcu_jiffies_till_stall_check() + 3;
+   else if (ULONG_CMP_GE(j, js))
+   rcp->jiffies_stall = jiffies + rcu_jiffies_till_stall_check();
+}
+
 #endif /* #ifdef CONFIG_RCU_TRACE */
+
+static void reset_cpu_stall_ticks(struct rcu_ctrlblk *rcp)
+{
+#ifdef CONFIG_RCU_TRACE
+   rcp->ticks_this_gp = 0;
+   rcp->gp_start = jiffies;
+   rcp->jiffies_stall = jiffies + rcu_jiffies_till_stall_check();
+#endif /* #ifdef CONFIG_RCU_TRACE */
+}
+
+static void check_cpu_stalls(void)
+{
+   RCU_TRACE(check_cpu_stall(&rcu_bh_ctrlblk));
+   RCU_TRACE(check_cpu_stall(&rcu_sched_ctrlblk));
+}
-- 
1.8.1.5
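
The check_cpu_stall() code moved by this patch compares jiffies values with
ULONG_CMP_GE() rather than a plain ">=".  The sketch below shows why,
assuming the macro is a wraparound-safe comparison of free-running unsigned
counters; DEMO_ULONG_CMP_GE() and demo_stall_deadline_passed() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Wraparound-safe "a >= b" for free-running unsigned long counters:
 * true when a is at most ULONG_MAX / 2 ticks ahead of b, modulo overflow.
 */
#define DEMO_ULONG_CMP_GE(a, b) \
	(ULONG_MAX / 2 >= (unsigned long)(a) - (unsigned long)(b))

/* Has the tick counter "now" reached or passed "deadline"? */
static bool demo_stall_deadline_passed(unsigned long now, unsigned long deadline)
{
	return DEMO_ULONG_CMP_GE(now, deadline);
}
```

A plain "now >= deadline" would give the wrong answer when the deadline was
computed shortly before the counter overflowed and "now" has already
wrapped; the subtraction-based form treats that case as "passed".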



[PATCH tip/core/rcu 07/12] rcu: Remove check_cpu_stall_preempt()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

With the removal of CONFIG_TINY_PREEMPT_RCU, check_cpu_stall_preempt()
is now an empty function.  This commit therefore eliminates it by
inlining it.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutiny_plugin.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 36fd83c..bac3a6e 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -82,8 +82,6 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
rcp->jiffies_stall = jiffies + rcu_jiffies_till_stall_check();
 }
 
-static void check_cpu_stall_preempt(void);
-
 #endif /* #ifdef CONFIG_RCU_TRACE */
 
 static void reset_cpu_stall_ticks(struct rcu_ctrlblk *rcp)
@@ -99,7 +97,6 @@ static void check_cpu_stalls(void)
 {
 	RCU_TRACE(check_cpu_stall(&rcu_bh_ctrlblk));
 	RCU_TRACE(check_cpu_stall(&rcu_sched_ctrlblk));
-   RCU_TRACE(check_cpu_stall_preempt());
 }
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -182,8 +179,4 @@ MODULE_AUTHOR("Paul E. McKenney");
 MODULE_DESCRIPTION("Read-Copy Update tracing for tiny implementation");
 MODULE_LICENSE("GPL");
 
-static void check_cpu_stall_preempt(void)
-{
-}
-
 #endif /* #ifdef CONFIG_RCU_TRACE */
-- 
1.8.1.5



[PATCH tip/core/rcu 01/12] rcu: Remove TINY_PREEMPT_RCU

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

TINY_PREEMPT_RCU adds significant code and complexity, but does not
offer commensurate benefits.  People currently using TINY_PREEMPT_RCU
can get much better memory footprint with TINY_RCU, or, if they really
need preemptible RCU, they can use TREE_PREEMPT_RCU with a relatively
minor degradation in memory footprint.  Please note that this move
has been widely publicized on LKML (https://lkml.org/lkml/2012/11/12/545)
and on LWN (http://lwn.net/Articles/541037/).

This commit therefore removes TINY_PREEMPT_RCU.

Signed-off-by: Paul E. McKenney 
---
 include/linux/hardirq.h  |   2 +-
 include/linux/rcupdate.h |   2 +-
 init/Kconfig |  10 +-
 kernel/rcutiny_plugin.h  | 854 ---
 4 files changed, 3 insertions(+), 865 deletions(-)

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index c1d6555..05bcc09 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -128,7 +128,7 @@ extern void synchronize_irq(unsigned int irq);
 # define synchronize_irq(irq)  barrier()
 #endif
 
-#if defined(CONFIG_TINY_RCU) || defined(CONFIG_TINY_PREEMPT_RCU)
+#if defined(CONFIG_TINY_RCU)
 
 static inline void rcu_nmi_enter(void)
 {
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 9ed2c9a..0a8276d 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -277,7 +277,7 @@ void wait_rcu_gp(call_rcu_func_t crf);
 
 #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU)
 #include 
-#elif defined(CONFIG_TINY_RCU) || defined(CONFIG_TINY_PREEMPT_RCU)
+#elif defined(CONFIG_TINY_RCU)
 #include 
 #else
 #error "Unknown RCU implementation specified to kernel configuration"
diff --git a/init/Kconfig b/init/Kconfig
index a3a2304..406b1a5 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -462,18 +462,10 @@ config TINY_RCU
  is not required.  This option greatly reduces the
  memory footprint of RCU.
 
-config TINY_PREEMPT_RCU
-   bool "Preemptible UP-only small-memory-footprint RCU"
-   depends on PREEMPT && !SMP
-   help
- This option selects the RCU implementation that is designed
- for real-time UP systems.  This option greatly reduces the
- memory footprint of RCU.
-
 endchoice
 
 config PREEMPT_RCU
-   def_bool ( TREE_PREEMPT_RCU || TINY_PREEMPT_RCU )
+   def_bool TREE_PREEMPT_RCU
help
  This option enables preemptible-RCU code that is common between
  the TREE_PREEMPT_RCU and TINY_PREEMPT_RCU implementations.
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 8a23300..29a4dd7 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -102,763 +102,6 @@ static void check_cpu_stalls(void)
RCU_TRACE(check_cpu_stall_preempt());
 }
 
-#ifdef CONFIG_TINY_PREEMPT_RCU
-
-#include 
-
-/* Global control variables for preemptible RCU. */
-struct rcu_preempt_ctrlblk {
-   struct rcu_ctrlblk rcb; /* curtail: ->next ptr of last CB for GP. */
-   struct rcu_head **nexttail;
-   /* Tasks blocked in a preemptible RCU */
-   /*  read-side critical section while an */
-   /*  preemptible-RCU grace period is in */
-   /*  progress must wait for a later grace */
-   /*  period.  This pointer points to the */
-   /*  ->next pointer of the last task that */
-   /*  must wait for a later grace period, or */
-   /*  to &->rcb.rcucblist if there is no */
-   /*  such task. */
-   struct list_head blkd_tasks;
-   /* Tasks blocked in RCU read-side critical */
-   /*  section.  Tasks are placed at the head */
-   /*  of this list and age towards the tail. */
-   struct list_head *gp_tasks;
-   /* Pointer to the first task blocking the */
-   /*  current grace period, or NULL if there */
-   /*  is no such task. */
-   struct list_head *exp_tasks;
-   /* Pointer to first task blocking the */
-   /*  current expedited grace period, or NULL */
-   /*  if there is no such task.  If there */
-   /*  is no current expedited grace period, */
-   /*  then there cannot be any such task. */
-#ifdef CONFIG_RCU_BOOST
-   struct list_head *boost_tasks;
-   /* Pointer to first task that needs to be */
-   /*  priority-boosted, or NULL if no priority */
-   /*  boosting is needed.  If there is no */
-   /*  current or expedited grace 

[PATCH tip/core/rcu 02/12] rcu: Remove show_tiny_preempt_stats()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

With the removal of CONFIG_TINY_PREEMPT_RCU, show_tiny_preempt_stats()
is now an empty function.  This commit therefore eliminates it by
inlining it.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutiny_plugin.h | 13 -
 1 file changed, 13 deletions(-)

diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 29a4dd7..cf0bc22 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -102,18 +102,6 @@ static void check_cpu_stalls(void)
RCU_TRACE(check_cpu_stall_preempt());
 }
 
-#ifdef CONFIG_RCU_TRACE
-
-/*
- * Because preemptible RCU does not exist, it is not necessary to
- * dump out its statistics.
- */
-static void show_tiny_preempt_stats(struct seq_file *m)
-{
-}
-
-#endif /* #ifdef CONFIG_RCU_TRACE */
-
 /*
  * Because preemptible RCU does not exist, it never has any callbacks
  * to check.
@@ -202,7 +190,6 @@ static void rcu_trace_sub_qlen(struct rcu_ctrlblk *rcp, int n)
  */
 static int show_tiny_stats(struct seq_file *m, void *unused)
 {
-   show_tiny_preempt_stats(m);
seq_printf(m, "rcu_sched: qlen: %ld\n", rcu_sched_ctrlblk.qlen);
seq_printf(m, "rcu_bh: qlen: %ld\n", rcu_bh_ctrlblk.qlen);
return 0;
-- 
1.8.1.5



[PATCH tip/core/rcu 06/12] rcu: Simplify RCU_TINY RCU callback invocation

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

TINY_PREEMPT_RCU could use a kthread to handle RCU callback invocation,
which required an API to abstract kthread vs. softirq invocation.
Now that TINY_PREEMPT_RCU is no longer with us, this commit retires
this API in favor of direct use of the relevant softirq primitives.

Signed-off-by: Paul E. McKenney 
---
 include/linux/rcupdate.h |  1 +
 include/linux/rcutiny.h  |  4 
 include/linux/rcutree.h  |  1 -
 kernel/rcutiny.c | 14 +-
 kernel/rcutiny_plugin.h  | 33 -
 5 files changed, 10 insertions(+), 43 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 0a8276d..1c53a9f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -216,6 +216,7 @@ static inline int rcu_preempt_depth(void)
 #endif /* #else #ifdef CONFIG_PREEMPT_RCU */
 
 /* Internal to kernel */
+extern void rcu_init(void);
 extern void rcu_sched_qs(int cpu);
 extern void rcu_bh_qs(int cpu);
 extern void rcu_check_callbacks(int cpu, int user);
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 4e56a9c..592fad8 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -27,10 +27,6 @@
 
 #include 
 
-static inline void rcu_init(void)
-{
-}
-
 static inline void rcu_barrier_bh(void)
 {
wait_rcu_gp(call_rcu_bh);
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 952b793..3f1aa8f 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -30,7 +30,6 @@
 #ifndef __LINUX_RCUTREE_H
 #define __LINUX_RCUTREE_H
 
-extern void rcu_init(void);
 extern void rcu_note_context_switch(int cpu);
 extern int rcu_needs_cpu(int cpu, unsigned long *delta_jiffies);
 extern void rcu_cpu_stall_reset(void);
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 7fc2339..4adc9e2 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -44,7 +44,6 @@
 
 /* Forward declarations for rcutiny_plugin.h. */
 struct rcu_ctrlblk;
-static void invoke_rcu_callbacks(void);
 static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp);
 static void rcu_process_callbacks(struct softirq_action *unused);
 static void __call_rcu(struct rcu_head *head,
@@ -227,7 +226,7 @@ void rcu_sched_qs(int cpu)
local_irq_save(flags);
 	if (rcu_qsctr_help(&rcu_sched_ctrlblk) +
 	    rcu_qsctr_help(&rcu_bh_ctrlblk))
-   invoke_rcu_callbacks();
+   raise_softirq(RCU_SOFTIRQ);
local_irq_restore(flags);
 }
 
@@ -240,7 +239,7 @@ void rcu_bh_qs(int cpu)
 
local_irq_save(flags);
 	if (rcu_qsctr_help(&rcu_bh_ctrlblk))
-   invoke_rcu_callbacks();
+   raise_softirq(RCU_SOFTIRQ);
local_irq_restore(flags);
 }
 
@@ -277,7 +276,7 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
  ACCESS_ONCE(rcp->rcucblist),
  need_resched(),
  is_idle_task(current),
- rcu_is_callbacks_kthread()));
+ false));
return;
}
 
@@ -307,7 +306,7 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
RCU_TRACE(rcu_trace_sub_qlen(rcp, cb_count));
RCU_TRACE(trace_rcu_batch_end(rcp->name, cb_count, 0, need_resched(),
  is_idle_task(current),
- rcu_is_callbacks_kthread()));
+ false));
 }
 
 static void rcu_process_callbacks(struct softirq_action *unused)
@@ -379,3 +378,8 @@ void call_rcu_bh(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 	__call_rcu(head, func, &rcu_bh_ctrlblk);
 }
 EXPORT_SYMBOL_GPL(call_rcu_bh);
+
+void rcu_init(void)
+{
+   open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
+}
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index bfe9924..36fd83c 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -102,39 +102,6 @@ static void check_cpu_stalls(void)
RCU_TRACE(check_cpu_stall_preempt());
 }
 
-/* Hold off callback invocation until early_initcall() time. */
-static int rcu_scheduler_fully_active __read_mostly;
-
-/*
- * Start up softirq processing of callbacks.
- */
-void invoke_rcu_callbacks(void)
-{
-   if (rcu_scheduler_fully_active)
-   raise_softirq(RCU_SOFTIRQ);
-}
-
-#ifdef CONFIG_RCU_TRACE
-
-/*
- * There is no callback kthread, so this thread is never it.
- */
-static bool rcu_is_callbacks_kthread(void)
-{
-   return false;
-}
-
-#endif /* #ifdef CONFIG_RCU_TRACE */
-
-static int __init rcu_scheduler_really_started(void)
-{
-   rcu_scheduler_fully_active = 1;
-   open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
-   raise_softirq(RCU_SOFTIRQ);  /* Invoke any callbacks from early boot. */
-   return 0;
-}
-early_initcall(rcu_scheduler_really_started);
-
 #ifdef 

[PATCH tip/core/rcu 04/12] rcu: Remove rcu_preempt_remove_callbacks()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_remove_callbacks()
is now an empty function.  This commit therefore eliminates it by
inlining it.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutiny.c| 1 -
 kernel/rcutiny_plugin.h | 8 
 2 files changed, 9 deletions(-)

diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 9178282..6f5a2a6 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -289,7 +289,6 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
*rcp->donetail = NULL;
if (rcp->curtail == rcp->donetail)
 		rcp->curtail = &rcp->rcucblist;
-   rcu_preempt_remove_callbacks(rcp);
 	rcp->donetail = &rcp->rcucblist;
local_irq_restore(flags);
 
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 404b3a3..8b835b9 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -104,14 +104,6 @@ static void check_cpu_stalls(void)
 
 /*
  * Because preemptible RCU does not exist, it never has any callbacks
- * to remove.
- */
-static void rcu_preempt_remove_callbacks(struct rcu_ctrlblk *rcp)
-{
-}
-
-/*
- * Because preemptible RCU does not exist, it never has any callbacks
  * to process.
  */
 static void rcu_preempt_process_callbacks(void)
-- 
1.8.1.5



[PATCH tip/core/rcu 08/12] rcu: Remove the CONFIG_TINY_RCU ifdefs in rcutiny.h

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Now that CONFIG_TINY_PREEMPT_RCU is no more, this commit removes
the CONFIG_TINY_RCU ifdefs from include/linux/rcutiny.h in favor of
unconditionally compiling the CONFIG_TINY_RCU legs of those ifdefs.

Signed-off-by: Paul E. McKenney 
---
 include/linux/rcutiny.h | 28 
 1 file changed, 28 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 592fad8..07b5aff 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -37,8 +37,6 @@ static inline void rcu_barrier_sched(void)
wait_rcu_gp(call_rcu_sched);
 }
 
-#ifdef CONFIG_TINY_RCU
-
 static inline void synchronize_rcu_expedited(void)
 {
synchronize_sched();/* Only one CPU, so pretty fast anyway!!! */
@@ -49,17 +47,6 @@ static inline void rcu_barrier(void)
rcu_barrier_sched();  /* Only one CPU, so only one list of callbacks! */
 }
 
-#else /* #ifdef CONFIG_TINY_RCU */
-
-void synchronize_rcu_expedited(void);
-
-static inline void rcu_barrier(void)
-{
-   wait_rcu_gp(call_rcu);
-}
-
-#endif /* #else #ifdef CONFIG_TINY_RCU */
-
 static inline void synchronize_rcu_bh(void)
 {
synchronize_sched();
@@ -81,8 +68,6 @@ static inline void kfree_call_rcu(struct rcu_head *head,
call_rcu(head, func);
 }
 
-#ifdef CONFIG_TINY_RCU
-
 static inline void rcu_preempt_note_context_switch(void)
 {
 }
@@ -93,19 +78,6 @@ static inline int rcu_needs_cpu(int cpu, unsigned long *delta_jiffies)
return 0;
 }
 
-#else /* #ifdef CONFIG_TINY_RCU */
-
-void rcu_preempt_note_context_switch(void);
-int rcu_preempt_needs_cpu(void);
-
-static inline int rcu_needs_cpu(int cpu, unsigned long *delta_jiffies)
-{
-   *delta_jiffies = ULONG_MAX;
-   return rcu_preempt_needs_cpu();
-}
-
-#endif /* #else #ifdef CONFIG_TINY_RCU */
-
 static inline void rcu_note_context_switch(int cpu)
 {
rcu_sched_qs(cpu);
-- 
1.8.1.5



[PATCH tip/core/rcu 0/12] TINY_RCU changes for 3.11

2013-04-12 Thread Paul E. McKenney
Hello!

This series removes TINY_PREEMPT_RCU, as promised/threatened at
http://lwn.net/Articles/541037/ and https://lkml.org/lkml/2012/11/12/545.

1.  Remove TINY_PREEMPT_RCU.  This is a straight syntactic removal,
with no attempt at cleanup.  The remaining patches do the cleanup.

2.  Inline the now-empty show_tiny_preempt_stats() function.

3.  Inline the now-empty rcu_preempt_check_callbacks() function.

4.  Inline the now-empty rcu_preempt_remove_callbacks() function.

5.  Inline the now-empty rcu_preempt_process_callbacks() function.

6.  Because TINY_RCU no longer has kthreads, remove the code that
used to abstract away kthread vs. softirq invocation.

7.  Inline the now-empty check_cpu_stall_preempt() function.

8.  Remove CONFIG_TINY_RCU ifdefs from include/linux/rcutiny.h

9.  Inline the now-empty rcu_preempt_note_context_switch() function.

10. Move code to allow consolidating ifdefs in kernel/rcutiny_plugin.h.

11. Remove TINY_PREEMPT_RCU's tracing formats from documentation.

12. Shrink TINY_RCU a bit by moving exit_rcu() to TREE_RCU, leaving
TINY_RCU with a static inline empty function.

Thanx, Paul


 b/Documentation/RCU/trace.txt |  100 
 b/include/linux/hardirq.h |2 
 b/include/linux/rcupdate.h|5 
 b/include/linux/rcutiny.h |   41 -
 b/include/linux/rcutree.h |3 
 b/init/Kconfig|   10 
 b/kernel/rcupdate.c   |   26 -
 b/kernel/rcutiny.c|   17 
 b/kernel/rcutiny_plugin.h | 1017 +-
 b/kernel/rcutree_plugin.h |   26 +
 10 files changed, 90 insertions(+), 1157 deletions(-)



[PATCH tip/core/rcu 12/12] rcu: Shrink TINY_RCU by moving exit_rcu()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Now that TINY_PREEMPT_RCU is no more, exit_rcu() is always an empty
function.  But if TINY_RCU is going to have an empty function, it should
be in include/linux/rcutiny.h, where it does not bloat the kernel.
This commit therefore moves exit_rcu() out of kernel/rcupdate.c to
kernel/rcutree_plugin.h, and places a static inline empty function in
include/linux/rcutiny.h in order to shrink TINY_RCU a bit.

Signed-off-by: Paul E. McKenney 
---
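A rough userspace illustration of the size argument above, not part of the
patch: an empty static inline stub in a header generates no code at its call
sites, whereas an out-of-line empty function costs a function body plus a
call per caller.  SKETCH_TINY, exit_hook() and do_exit_steps() are
hypothetical names used only for this sketch.

```c
/* Hypothetical config switch standing in for CONFIG_TINY_RCU. */
#define SKETCH_TINY 1

#if SKETCH_TINY
/* Header-style stub: an empty static inline the compiler can drop entirely. */
static inline void exit_hook(void)
{
}
#else
/* Out-of-line variant: a real function defined in some .c file. */
void exit_hook(void);
#endif

/* Hypothetical caller standing in for the task-exit path. */
static int do_exit_steps(void)
{
	int steps = 0;

	steps++;	/* some real teardown work */
	exit_hook();	/* empty inline stub when SKETCH_TINY is 1 */
	steps++;	/* more teardown work */
	return steps;
}
```
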
 include/linux/rcupdate.h |  2 --
 include/linux/rcutiny.h  |  4 
 include/linux/rcutree.h  |  2 ++
 kernel/rcupdate.c| 26 +-
 kernel/rcutree_plugin.h  | 26 ++
 5 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 1c53a9f..1d0145a 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -240,8 +240,6 @@ static inline void rcu_user_hooks_switch(struct task_struct *prev,
 struct task_struct *next) { }
 #endif /* CONFIG_RCU_USER_QS */
 
-extern void exit_rcu(void);
-
 /**
  * RCU_NONIDLE - Indicate idle-loop code that needs RCU readers
  * @a: Code that RCU needs to pay attention to.
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 51230b6..e31005e 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -119,6 +119,10 @@ static inline void rcu_cpu_stall_reset(void)
 {
 }
 
+static inline void exit_rcu(void)
+{
+}
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 extern int rcu_scheduler_active __read_mostly;
 extern void rcu_scheduler_starting(void);
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 3f1aa8f..226169d 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -85,6 +85,8 @@ extern void rcu_force_quiescent_state(void);
 extern void rcu_bh_force_quiescent_state(void);
 extern void rcu_sched_force_quiescent_state(void);
 
+extern void exit_rcu(void);
+
 extern void rcu_scheduler_starting(void);
 extern int rcu_scheduler_active __read_mostly;
 
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 48ab703..0be1fa2 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -104,31 +104,7 @@ void __rcu_read_unlock(void)
 }
 EXPORT_SYMBOL_GPL(__rcu_read_unlock);
 
-/*
- * Check for a task exiting while in a preemptible-RCU read-side
- * critical section, clean up if so.  No need to issue warnings,
- * as debug_check_no_locks_held() already does this if lockdep
- * is enabled.
- */
-void exit_rcu(void)
-{
-   struct task_struct *t = current;
-
-   if (likely(list_empty(&t->rcu_node_entry)))
-   return;
-   t->rcu_read_lock_nesting = 1;
-   barrier();
-   t->rcu_read_unlock_special = RCU_READ_UNLOCK_BLOCKED;
-   __rcu_read_unlock();
-}
-
-#else /* #ifdef CONFIG_PREEMPT_RCU */
-
-void exit_rcu(void)
-{
-}
-
-#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
+#endif /* #ifdef CONFIG_PREEMPT_RCU */
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 static struct lock_class_key rcu_lock_key;
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index e6cf7e5..086daaf 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -931,6 +931,24 @@ static void __init __rcu_init_preempt(void)
rcu_init_one(&rcu_preempt_state, &rcu_preempt_data);
 }
 
+/*
+ * Check for a task exiting while in a preemptible-RCU read-side
+ * critical section, clean up if so.  No need to issue warnings,
+ * as debug_check_no_locks_held() already does this if lockdep
+ * is enabled.
+ */
+void exit_rcu(void)
+{
+   struct task_struct *t = current;
+
+   if (likely(list_empty(&t->rcu_node_entry)))
+   return;
+   t->rcu_read_lock_nesting = 1;
+   barrier();
+   t->rcu_read_unlock_special = RCU_READ_UNLOCK_BLOCKED;
+   __rcu_read_unlock();
+}
+
 #else /* #ifdef CONFIG_TREE_PREEMPT_RCU */
 
 static struct rcu_state *rcu_state = &rcu_sched_state;
@@ -1099,6 +1117,14 @@ static void __init __rcu_init_preempt(void)
 {
 }
 
+/*
+ * Because preemptible RCU does not exist, tasks cannot possibly exit
+ * while in preemptible RCU read-side critical sections.
+ */
+void exit_rcu(void)
+{
+}
+
 #endif /* #else #ifdef CONFIG_TREE_PREEMPT_RCU */
 
 #ifdef CONFIG_RCU_BOOST
-- 
1.8.1.5



Re: [PATCH] module: Fix race condition between load and unload module

2013-04-12 Thread Linus Torvalds
On Fri, Apr 12, 2013 at 3:32 PM, Anatol Pomozov
 wrote:
>
> Here is the timeline for the crash in the case where kset_find_obj()
> searches for an object that nobody holds and another thread is doing
> kobject_put() on the same kobject:
>
> THREAD A (calls kset_find_obj())            THREAD B (calls kobject_put())
> spin_lock()
>                                             atomic_dec_return(kobj->kref), counter gets zero here
>                                             ... starts kobject cleanup
>                                             spin_lock() // WAIT thread A in kobj_kset_leave()
> iterate over kset->list
> atomic_inc(kobj->kref) (counter becomes 1)
> spin_unlock()
>                                             spin_lock() // taken
>                                             // it does not know that thread A increased counter so it
>                                             removes obj from list
>                                             spin_unlock()
>                                             vfree(module) // frees module object with containing kobj
>
> // kobj points to freed memory area!!
> kobject_put(kobj) // OOPS

This is a much more generic bug in kobjects, and I would hate to add
some random workaround for just one case of this bug like you do. The
more fundamental bug needs to be fixed too.

I think the more fundamental bugfix is to just fix kobject_get() to
return NULL if the refcount was zero, because in that case the kobject
no longer really exists.

So instead of having

    kref_get(&kobj->kref);

it should do

    if (!atomic_inc_not_zero(&kobj->kref.refcount))
        kobj = NULL;

and I think that should fix your race automatically, no? Proper patch
attached (but TOTALLY UNTESTED - it seems to compile, though).

The problem is that we lose the warning for when the refcount is zero
and somebody does a kobject_get(), but that is ok *assuming* that
people actually check the return value of kobject_get() rather than
just "know" that if they passed in a non-NULL kobj, they'll get it
right back.
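
A minimal userspace sketch of that get-unless-zero idea, using C11 atomics
rather than the kernel's kref/atomic_t API.  It is not the attached patch;
struct obj, count_inc_not_zero(), obj_get() and obj_put() are hypothetical
names for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; a stand-in for, not a copy of, struct kobject. */
struct obj {
	atomic_int refcount;
};

/* Increment the count unless it is already zero; true on success. */
static bool count_inc_not_zero(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 0) {
		/* On failure 'old' is refreshed with the current value; retry. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return true;
	}
	return false;
}

/* Return the object with an extra reference, or NULL if it is already dying. */
static struct obj *obj_get(struct obj *o)
{
	if (o && !count_inc_not_zero(&o->refcount))
		return NULL;
	return o;
}

/* Drop one reference; true if this was the last one (caller then frees). */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

The caveat from the paragraph above shows up directly in the sketch: a caller
that ignores the NULL return from obj_get() would go on to use an object
whose cleanup has already started.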

Greg - please take a look... I'm adding Al to the discussion too,
because Al just *loooves* these kinds of races ;)

  Linus


patch.diff
Description: Binary data


Re: [PATCH V4 1/6] clk: OMAP: introduce device tree binding to kernel clock data

2013-04-12 Thread Nishanth Menon
On 16:31-20130412, Tony Lindgren wrote:
> * Nishanth Menon  [130412 15:59]:
> > --- /dev/null
> > +++ b/drivers/clk/omap/clk.c
> > +/**
> > + * omap_clk_src_get() - Get OMAP clock from node name when needed
> > + * @clkspec:   clkspec argument
> > + * @data:  unused
> > + *
> > + * REVISIT: We assume the following:
> > + * 1. omap clock names end with _ck
> > + * 2. omap clock names are under 32 characters in length
> > + */
> > +static struct clk *omap_clk_src_get(struct of_phandle_args *clkspec, void *data)
> > +{
> > +   struct clk *clk;
> > +   char clk_name[32];
> > +   struct device_node *np = clkspec->np;
> > +
> > +   /* Set up things so consumer can call clk_get() with name */
> 
> I would leave out the comment above, it's a leftover from
> the clk_add_alias() version that we don't need because of
> of_clk_get().
> 
> > +   snprintf(clk_name, 32, "%s_ck", np->name);
> > +   clk = clk_get(NULL, clk_name);
> > +   if (IS_ERR(clk)) {
> > +   pr_err("%s: could not get clock %s(%ld)\n", __func__,
> > +  clk_name, PTR_ERR(clk));
> > +   goto out;
> > +   }
> > +   clk_put(clk);
> > +
> > +out:
> > +   return clk;
> > +}
> > +
> > +/**
> > + * omap_clk_probe() - create link from DT definition to clock data
> > + * @pdev:  device node
> > + *
> > + * NOTE: We assume that omap clocks are not removed.
> > + */
> 
> How about dropping the comment on clocks being removed above?
> It is no longer an issue, so maybe something like this instead:
> 
> * Note that we look up the clock lazily when the consumer
> * driver does of_clk_get() and initialize a NULL clock here.
> 
> > +static int omap_clk_probe(struct platform_device *pdev)
> > +{
> > +   int res;
> > +   struct device_node *np = pdev->dev.of_node;
> > +
> > +   /* This allows the driver to of_clk_get() */
> > +   res = of_clk_add_provider(np, omap_clk_src_get, NULL);
> > +   if (res)
> > +   dev_err(&pdev->dev, "could not add provider(%d)\n", res);
> > +
> > +   return res;
> > +}
> > +
> > +/* We assume here that OMAP clocks will not be removed */
> 
> Then the above comment can be removed too.
Thanks for checking up. Fixed all of them below, will post part of
series again, only if I need to address further comments in other
patches..

From f96c04860794f9bbfe240a8661641a7c90dd1640 Mon Sep 17 00:00:00 2001
From: Nishanth Menon 
Date: Tue, 9 Apr 2013 19:26:40 -0500
Subject: [PATCH V5 1/6] clk: OMAP: introduce device tree binding to kernel clock
 data

OMAP clock data is located in arch/arm/mach-omap2/cclockXYZ_data.c.
However, this presents an obstacle for using these clock nodes in
Device Tree definitions. This is especially true for board specific
clocks initially. The fixed clocks are currently found via clock
aliases table. There are many possible approaches to this problem as
discussed in the following thread:
http://marc.info/?t=13637032569=1=2.
Highlights of the options:
a) device specific clk_add_alias:
   cons: driver handling required
b) using a generic clk node and indexing to reach the clock required.
   This is similar to the approach taken by Tegra and a few other platforms.
   Example usage: clock = < 5>;
   cons: potential to have mismatches in indexed table and associated
   dtb data. In addition, managing continued documentation in bindings
   as clock indexing increases. Even though readability angle could be
   improved by using preprocessing of DT using macros, indexed
   approach is inherently risky from cases like the following:
   clk indexes in kernel:
   1 - mpu_dpll
   2 - aux_clk1
   3 - core_clk
   DT entry for peripheral X uses < 2> to reach aux_clk1. Now, let's
   say kernel updates indices to:
   1 - mpu_dpll
   2 - per_dpll
   3 - aux_clk1
   4 - core_clk
   using the old dtb(or dts missing an update), on new kernel which
   has updated indices will result in per_dpll now controlled for
   peripheral X without warning or any potential error detection.

   Even though we could claim this is user error, such errors are hard
   to track down and fix.
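
   The index hazard in the example above can be reproduced with a small
   sketch.  The tables and the helpers clk_by_index() and clk_by_name() are
   hypothetical and only mirror the example; they are not kernel or DT code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical clock tables mirroring the old and new index layouts above. */
static const char *const old_clks[] = { "mpu_dpll", "aux_clk1", "core_clk" };
static const char *const new_clks[] = { "mpu_dpll", "per_dpll", "aux_clk1", "core_clk" };

/* Index-based lookup (1-based, as in the example); NULL if out of range. */
static const char *clk_by_index(const char *const *tbl, size_t n, size_t idx)
{
	return (idx >= 1 && idx <= n) ? tbl[idx - 1] : NULL;
}

/* Name-based lookup; NULL if the name is absent, so a stale reference fails visibly. */
static const char *clk_by_name(const char *const *tbl, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(tbl[i], name) == 0)
			return tbl[i];
	return NULL;
}
```

   With the old table, index 2 resolves to aux_clk1; with the new table the
   same index silently resolves to per_dpll, while a lookup by name still
   finds aux_clk1 or fails outright.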

An alternate approach introduced here is to introduce device tree
bindings corresponding to the clock nodes required in DT definition
for SoC which automatically maps back to the definitions in
cclockXYZ_data.c.

The driver introduced here to do this mapping will eventually be the
place where the clock handling will migrate to. We need to consider
this angle as well so that the solution will be a valid transition
point for moving the clock data out of kernel image (into device tree
or firmware load etc..).

Overall strategy int

[PATCH tip/core/rcu 1/2] rcu: Remove srcu_read_lock_raw() and srcu_read_unlock_raw().

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

These interfaces never did get used, so this commit removes them,
their rcutorture tests, and documentation referencing them.

Signed-off-by: Paul E. McKenney 
Reviewed-by: Lai Jiangshan 
---
 Documentation/RCU/checklist.txt |  6 --
 Documentation/RCU/torture.txt   |  6 --
 Documentation/RCU/whatisRCU.txt | 22 +++--
 include/linux/srcu.h| 43 -
 kernel/rcutorture.c | 39 -
 5 files changed, 7 insertions(+), 109 deletions(-)

diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 79e789b8..7703ec7 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -354,12 +354,6 @@ over a rather long period of time, but improvements are always welcome!
using RCU rather than SRCU, because RCU is almost always faster
and easier to use than is SRCU.
 
-   If you need to enter your read-side critical section in a
-   hardirq or exception handler, and then exit that same read-side
-   critical section in the task that was interrupted, then you need
-   to srcu_read_lock_raw() and srcu_read_unlock_raw(), which avoid
-   the lockdep checking that would otherwise this practice illegal.
-
Also unlike other forms of RCU, explicit initialization
and cleanup is required via init_srcu_struct() and
cleanup_srcu_struct().  These are passed a "struct srcu_struct"
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 7dce8a1..d8a5023 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -182,12 +182,6 @@ torture_type   The type of RCU to test, with string values as follows:
"srcu_expedited": srcu_read_lock(), srcu_read_unlock() and
synchronize_srcu_expedited().
 
-   "srcu_raw": srcu_read_lock_raw(), srcu_read_unlock_raw(),
-   and call_srcu().
-
-   "srcu_raw_sync": srcu_read_lock_raw(), srcu_read_unlock_raw(),
-   and synchronize_srcu().
-
"sched": preempt_disable(), preempt_enable(), and
call_rcu_sched().
 
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 10df0b8..0f0fb7c 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -842,9 +842,7 @@ SRCU:   Critical sections   Grace period   Barrier
 
srcu_read_lock      synchronize_srcu    srcu_barrier
srcu_read_unlock    call_srcu
-   srcu_read_lock_raw  synchronize_srcu_expedited
-   srcu_read_unlock_raw
-   srcu_dereference
+   srcu_dereference    synchronize_srcu_expedited
 
 SRCU:  Initialization/cleanup
init_srcu_struct
@@ -865,38 +863,32 @@ list can be helpful:
 
 a. Will readers need to block?  If so, you need SRCU.
 
-b. Is it necessary to start a read-side critical section in a
-   hardirq handler or exception handler, and then to complete
-   this read-side critical section in the task that was
-   interrupted?  If so, you need SRCU's srcu_read_lock_raw() and
-   srcu_read_unlock_raw() primitives.
-
-c. What about the -rt patchset?  If readers would need to block
+b. What about the -rt patchset?  If readers would need to block
in an non-rt kernel, you need SRCU.  If readers would block
in a -rt kernel, but not in a non-rt kernel, SRCU is not
necessary.
 
-d. Do you need to treat NMI handlers, hardirq handlers,
+c. Do you need to treat NMI handlers, hardirq handlers,
and code segments with preemption disabled (whether
via preempt_disable(), local_irq_save(), local_bh_disable(),
or some other mechanism) as if they were explicit RCU readers?
If so, RCU-sched is the only choice that will work for you.
 
-e. Do you need RCU grace periods to complete even in the face
+d. Do you need RCU grace periods to complete even in the face
of softirq monopolization of one or more of the CPUs?  For
example, is your code subject to network-based denial-of-service
attacks?  If so, you need RCU-bh.
 
-f. Is your workload too update-intensive for normal use of
+e. Is your workload too update-intensive for normal use of
RCU, but inappropriate for other synchronization mechanisms?
If so, consider SLAB_DESTROY_BY_RCU.  But please be careful!
 
-g. Do you need read-side critical sections that are respected
+f. Do you need read-side critical sections that are respected
even though they are in the middle of the idle loop, during
user-mode execution, or on an offlined CPU?  If so, SRCU is the
only choice that will work for you.
 
-h. Otherwise, use RCU.
+g. Otherwise, use RCU.
 
 Of course, this all assumes 

[PATCH tip/core/rcu 2/2] powerpc,kvm: Fix unbalanced srcu_read_[un]lock()

2013-04-12 Thread Paul E. McKenney
From: Lai Jiangshan 

At the up_out: label in kvmppc_hv_setup_htab_rma(), srcu_read_lock()
is still held.  This commit therefore releases it before returning.

Signed-off-by: Lai Jiangshan 
Cc: Marcelo Tosatti 
Cc: Gleb Natapov 
Cc: Alexander Graf 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: k...@vger.kernel.org
Cc: kvm-...@vger.kernel.org
Signed-off-by: Paul E. McKenney 
---
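For readers unfamiliar with the goto-unwind pattern involved, here is a
simplified sketch of the control flow after this fix.  It is an assumption-laden
model, not the real function: srcu_depth and sem_depth are hypothetical
counters standing in for the SRCU read lock and mmap_sem, and setup() is a
hypothetical stand-in for kvmppc_hv_setup_htab_rma().

```c
#include <stdbool.h>

/* Hypothetical counters modelling how many times each lock is held. */
static int srcu_depth;
static int sem_depth;

/* Returns 0 on success, -1 on failure; both depths must be 0 on every exit. */
static int setup(bool fail_under_sem)
{
	int err = -1;

	srcu_depth++;		/* models srcu_read_lock() */
	sem_depth++;		/* models down_read() */
	if (fail_under_sem)
		goto up_out;
	sem_depth--;		/* models up_read() */
	err = 0;

out_srcu:
	srcu_depth--;		/* models srcu_read_unlock() */
	return err;

up_out:
	sem_depth--;		/* models up_read() */
	goto out_srcu;		/* jump to the label that drops the SRCU lock */
}
```

Jumping from up_out to a label placed after the SRCU unlock would leave
srcu_depth at 1 on the error path, which is the imbalance being fixed.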
 arch/powerpc/kvm/book3s_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 80dcc53..c26740e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1799,7 +1799,7 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu)
 
  up_out:
up_read(&current->mm->mmap_sem);
-   goto out;
+   goto out_srcu;
 }
 
 int kvmppc_core_init_vm(struct kvm *kvm)
-- 
1.8.1.5



[PATCH tip/core/rcu 0/2] SRCU changes for 3.11

2013-04-12 Thread Paul E. McKenney
Hello!

This series provides some SRCU changes:

1.  Remove srcu_read_lock_raw() and srcu_read_unlock_raw().  These
never did get used, and have not been used for some time, so
it is time for them to go.

2.  Fix a bug where srcu_read_lock() is not released upon return
from kvmppc_hv_setup_htab_rma().

Thanx, Paul


 b/Documentation/RCU/checklist.txt |6 -
 b/Documentation/RCU/torture.txt   |6 -
 b/Documentation/RCU/whatisRCU.txt |   22 ++-
 b/arch/powerpc/kvm/book3s_hv.c|2 -
 b/include/linux/srcu.h|   43 --
 b/kernel/rcutorture.c |   39 --
 6 files changed, 8 insertions(+), 110 deletions(-)



[PATCH tip/core/rcu 4/8] rcu: Switch callers from rcu_process_gp_end() to note_gp_changes()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Because note_gp_changes() now incorporates the rcu_process_gp_end() function,
this commit switches to the former and eliminates the latter.  In
addition, this commit changes external calls from __rcu_process_gp_end()
to __note_gp_changes().

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c| 31 +++
 kernel/rcutree_plugin.h |  2 +-
 2 files changed, 4 insertions(+), 29 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index dded193..9040e0f 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1334,28 +1334,6 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
- * Advance this CPU's callbacks, but only if the current grace period
- * has ended.  This may be called only from the CPU to whom the rdp
- * belongs.
- */
-static void
-rcu_process_gp_end(struct rcu_state *rsp, struct rcu_data *rdp)
-{
-   unsigned long flags;
-   struct rcu_node *rnp;
-
-   local_irq_save(flags);
-   rnp = rdp->mynode;
-   if (rdp->completed == ACCESS_ONCE(rnp->completed) || /* outside lock. */
-   !raw_spin_trylock(&rnp->lock)) { /* irqs already off, so later. */
-   local_irq_restore(flags);
-   return;
-   }
-   __rcu_process_gp_end(rsp, rnp, rdp);
-   raw_spin_unlock_irqrestore(&rnp->lock, flags);
-}
-
-/*
  * Did someone else start a new RCU grace period start since we last
  * checked?  Update local state appropriately if so.  Must be called
  * on the CPU corresponding to rdp.
@@ -1383,9 +1361,6 @@ check_for_new_grace_period(struct rcu_state *rsp, struct rcu_data *rdp)
 static void
 rcu_start_gp_per_cpu(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
 {
-   /* Prior grace period ended, so advance callbacks for current CPU. */
-   __rcu_process_gp_end(rsp, rnp, rdp);
-
/* Set state so that this CPU will detect the next quiescent state. */
__note_gp_changes(rsp, rnp, rdp);
 }
@@ -1521,7 +1496,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
ACCESS_ONCE(rnp->completed) = rsp->gpnum;
rdp = this_cpu_ptr(rsp->rda);
if (rnp == rdp->mynode)
-   __rcu_process_gp_end(rsp, rnp, rdp);
+   __note_gp_changes(rsp, rnp, rdp);
nocb += rcu_future_gp_cleanup(rsp, rnp);
raw_spin_unlock_irq(&rnp->lock);
cond_resched();
@@ -2254,7 +2229,7 @@ __rcu_process_callbacks(struct rcu_state *rsp)
WARN_ON_ONCE(rdp->beenonline == 0);
 
/* Handle the end of a grace period that some other CPU ended.  */
-   rcu_process_gp_end(rsp, rdp);
+   note_gp_changes(rsp, rdp);
 
/* Update RCU state based on any recent quiescent states. */
rcu_check_quiescent_state(rsp, rdp);
@@ -2340,7 +2315,7 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp,
if (unlikely(rdp->qlen > rdp->qlen_last_fqs_check + qhimark)) {
 
/* Are we ignoring a completed grace period? */
-   rcu_process_gp_end(rsp, rdp);
+   note_gp_changes(rsp, rdp);
check_for_new_grace_period(rsp, rdp);
 
/* Start a new grace period if one not already started. */
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index e6cf7e5..69af628 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1627,7 +1627,7 @@ static bool rcu_try_advance_all_cbs(void)
 */
if (rdp->completed != rnp->completed &&
rdp->nxttail[RCU_DONE_TAIL] != rdp->nxttail[RCU_NEXT_TAIL])
-   rcu_process_gp_end(rsp, rdp);
+   note_gp_changes(rsp, rdp);
 
if (cpu_has_callbacks_ready_to_invoke(rdp))
cbs_ready = true;
-- 
1.8.1.5



[PATCH tip/core/rcu 1/8] rcu: Move code to apply callback-numbering simplifications

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

The addition of callback numbering allows combining the detection of the
ends of old grace periods and the beginnings of new grace periods.  This
commit moves code to set the stage for this combining.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c | 118 +++
 1 file changed, 59 insertions(+), 59 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index bc3eac5..8ebc3ff 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -975,65 +975,6 @@ void rcu_cpu_stall_reset(void)
 }
 
 /*
- * Update CPU-local rcu_data state to record the newly noticed grace period.
- * This is used both when we started the grace period and when we notice
- * that someone else started the grace period.  The caller must hold the
- * ->lock of the leaf rcu_node structure corresponding to the current CPU,
- *  and must have irqs disabled.
- */
-static void __note_new_gpnum(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
-{
-   if (rdp->gpnum != rnp->gpnum) {
-   /*
-* If the current grace period is waiting for this CPU,
-* set up to detect a quiescent state, otherwise don't
-* go looking for one.
-*/
-   rdp->gpnum = rnp->gpnum;
-   trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpustart");
-   rdp->passed_quiesce = 0;
-   rdp->qs_pending = !!(rnp->qsmask & rdp->grpmask);
-   zero_cpu_stall_ticks(rdp);
-   }
-}
-
-static void note_new_gpnum(struct rcu_state *rsp, struct rcu_data *rdp)
-{
-   unsigned long flags;
-   struct rcu_node *rnp;
-
-   local_irq_save(flags);
-   rnp = rdp->mynode;
-   if (rdp->gpnum == ACCESS_ONCE(rnp->gpnum) || /* outside lock. */
-   !raw_spin_trylock(&rnp->lock)) { /* irqs already off, so later. */
-   local_irq_restore(flags);
-   return;
-   }
-   __note_new_gpnum(rsp, rnp, rdp);
-   raw_spin_unlock_irqrestore(&rnp->lock, flags);
-}
-
-/*
- * Did someone else start a new RCU grace period start since we last
- * checked?  Update local state appropriately if so.  Must be called
- * on the CPU corresponding to rdp.
- */
-static int
-check_for_new_grace_period(struct rcu_state *rsp, struct rcu_data *rdp)
-{
-   unsigned long flags;
-   int ret = 0;
-
-   local_irq_save(flags);
-   if (rdp->gpnum != rsp->gpnum) {
-   note_new_gpnum(rsp, rdp);
-   ret = 1;
-   }
-   local_irq_restore(flags);
-   return ret;
-}
-
-/*
  * Initialize the specified rcu_data structure's callback list to empty.
  */
 static void init_callback_list(struct rcu_data *rdp)
@@ -1350,6 +1291,45 @@ __rcu_process_gp_end(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_dat
 }
 
 /*
+ * Update CPU-local rcu_data state to record the newly noticed grace period.
+ * This is used both when we started the grace period and when we notice
+ * that someone else started the grace period.  The caller must hold the
+ * ->lock of the leaf rcu_node structure corresponding to the current CPU,
+ *  and must have irqs disabled.
+ */
+static void __note_new_gpnum(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
+{
+   if (rdp->gpnum != rnp->gpnum) {
+   /*
+* If the current grace period is waiting for this CPU,
+* set up to detect a quiescent state, otherwise don't
+* go looking for one.
+*/
+   rdp->gpnum = rnp->gpnum;
+   trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpustart");
+   rdp->passed_quiesce = 0;
+   rdp->qs_pending = !!(rnp->qsmask & rdp->grpmask);
+   zero_cpu_stall_ticks(rdp);
+   }
+}
+
+static void note_new_gpnum(struct rcu_state *rsp, struct rcu_data *rdp)
+{
+   unsigned long flags;
+   struct rcu_node *rnp;
+
+   local_irq_save(flags);
+   rnp = rdp->mynode;
+   if (rdp->gpnum == ACCESS_ONCE(rnp->gpnum) || /* outside lock. */
+   !raw_spin_trylock(&rnp->lock)) { /* irqs already off, so later. */
+   local_irq_restore(flags);
+   return;
+   }
+   __note_new_gpnum(rsp, rnp, rdp);
+   raw_spin_unlock_irqrestore(&rnp->lock, flags);
+}
+
+/*
  * Advance this CPU's callbacks, but only if the current grace period
  * has ended.  This may be called only from the CPU to whom the rdp
  * belongs.
@@ -1372,6 +1352,26 @@ rcu_process_gp_end(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
+ * Did someone else start a new RCU grace period start since we last
+ * checked?  Update local state appropriately if so.  Must be called
+ * on the CPU corresponding to rdp.
+ */
+static int
+check_for_new_grace_period(struct rcu_state *rsp, struct rcu_data *rdp)
+{
+   unsigned long flags;
+   int ret = 0;
+
+   local_irq_save(flags);
+   if (rdp->gpnum != 

[PATCH tip/core/rcu 3/8] rcu: Rename note_new_gpnum() to note_gp_changes()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Because note_new_gpnum() now also checks for the ends of old grace periods,
this commit changes its name to note_gp_changes().  Later commits will merge
rcu_process_gp_end() into note_gp_changes().

Signed-off-by: Paul E. McKenney 
---
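The "check without the lock, then trylock" shape that note_gp_changes() takes
in the hunk below can be sketched in userspace as follows.  This is a
simplified model under stated assumptions: struct node, struct cpu_data and
note_changes() are hypothetical stand-ins, a C11 atomic_flag plays the role
of the rcu_node ->lock, and interrupt disabling is not modelled at all.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for rcu_node and rcu_data. */
struct node {
	atomic_flag lock;	/* models the rcu_node ->lock */
	atomic_ulong gpnum;	/* most recently started grace period */
	atomic_ulong completed;	/* most recently completed grace period */
};

struct cpu_data {
	struct node *mynode;
	unsigned long gpnum;
	unsigned long completed;
};

/* True if local state was updated; false if nothing changed or the lock was busy. */
static bool note_changes(struct cpu_data *d)
{
	struct node *n = d->mynode;

	/* Unlocked check: skip the lock when neither counter has moved. */
	if (d->gpnum == atomic_load(&n->gpnum) &&
	    d->completed == atomic_load(&n->completed))
		return false;
	/* Trylock only: if the lock is contended, catch up on a later call. */
	if (atomic_flag_test_and_set(&n->lock))
		return false;
	d->completed = atomic_load(&n->completed);	/* ends of old grace periods first */
	d->gpnum = atomic_load(&n->gpnum);		/* then the start of a new one */
	atomic_flag_clear(&n->lock);
	return true;
}
```

Checking both counters in the unlocked test is what lets a single function
notice either a grace-period end or a grace-period beginning.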
 kernel/rcutree.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index a57bac3..dded193 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1297,7 +1297,7 @@ __rcu_process_gp_end(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_dat
  * ->lock of the leaf rcu_node structure corresponding to the current CPU,
  *  and must have irqs disabled.
  */
-static void __note_new_gpnum(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
+static void __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
 {
/* Handle the ends of any preceding grace periods first. */
__rcu_process_gp_end(rsp, rnp, rdp);
@@ -1316,19 +1316,20 @@ static void __note_new_gpnum(struct rcu_state *rsp, struct rcu_node *rnp, struct
}
 }
 
-static void note_new_gpnum(struct rcu_state *rsp, struct rcu_data *rdp)
+static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp)
 {
unsigned long flags;
struct rcu_node *rnp;
 
local_irq_save(flags);
rnp = rdp->mynode;
-   if (rdp->gpnum == ACCESS_ONCE(rnp->gpnum) || /* outside lock. */
+   if ((rdp->gpnum == ACCESS_ONCE(rnp->gpnum) &&
+rdp->completed == ACCESS_ONCE(rnp->completed)) || /* w/out lock. */
!raw_spin_trylock(&rnp->lock)) { /* irqs already off, so later. */
local_irq_restore(flags);
return;
}
-   __note_new_gpnum(rsp, rnp, rdp);
+   __note_gp_changes(rsp, rnp, rdp);
raw_spin_unlock_irqrestore(&rnp->lock, flags);
 }
 
@@ -1367,7 +1368,7 @@ check_for_new_grace_period(struct rcu_state *rsp, struct rcu_data *rdp)
 
local_irq_save(flags);
if (rdp->gpnum != rsp->gpnum) {
-   note_new_gpnum(rsp, rdp);
+   note_gp_changes(rsp, rdp);
ret = 1;
}
local_irq_restore(flags);
@@ -1386,7 +1387,7 @@ rcu_start_gp_per_cpu(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_dat
__rcu_process_gp_end(rsp, rnp, rdp);
 
/* Set state so that this CPU will detect the next quiescent state. */
-   __note_new_gpnum(rsp, rnp, rdp);
+   __note_gp_changes(rsp, rnp, rdp);
 }
 
 /*
-- 
1.8.1.5



[PATCH tip/core/rcu 7/8] rcu: Inline trivial wrapper function rcu_start_gp_per_cpu()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Given the changes that introduce note_gp_changes(), rcu_start_gp_per_cpu()
is now a trivial wrapper function with only one caller.  This commit
therefore inlines it into its sole call site.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f6cf5e1..0848341 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1298,18 +1298,6 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
- * Do per-CPU grace-period initialization for running CPU.  The caller
- * must hold the lock of the leaf rcu_node structure corresponding to
- * this CPU.
- */
-static void
-rcu_start_gp_per_cpu(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
-{
-   /* Set state so that this CPU will detect the next quiescent state. */
-   __note_gp_changes(rsp, rnp, rdp);
-}
-
-/*
  * Initialize a new grace period.
  */
 static int rcu_gp_init(struct rcu_state *rsp)
@@ -1357,7 +1345,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
WARN_ON_ONCE(rnp->completed != rsp->completed);
ACCESS_ONCE(rnp->completed) = rsp->completed;
if (rnp == rdp->mynode)
-   rcu_start_gp_per_cpu(rsp, rnp, rdp);
+   __note_gp_changes(rsp, rnp, rdp);
rcu_preempt_boost_start_gp(rnp);
trace_rcu_grace_period_init(rsp->name, rnp->gpnum,
rnp->level, rnp->grplo,
-- 
1.8.1.5



[PATCH tip/core/rcu 8/8] rcu: Move redundant call to note_gp_changes() into called function

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

The __rcu_process_callbacks() function invokes note_gp_changes()
immediately before invoking rcu_check_quiescent_state(), which
conditionally invokes that same function.  This commit therefore
eliminates the call to note_gp_changes() in __rcu_process_callbacks()
in favor of making unconditional the call from
rcu_check_quiescent_state() to note_gp_changes().

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 0848341..12094bd 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1694,11 +1694,8 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp)
 static void
 rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp)
 {
-   /* If there is now a new grace period, record and return. */
-   if (rdp->gpnum != rsp->gpnum) {
-   note_gp_changes(rsp, rdp);
-   return;
-   }
+   /* Check for grace-period ends and beginnings. */
+   note_gp_changes(rsp, rdp);
 
/*
 * Does this CPU still need to do its part for current grace period?
@@ -2162,9 +2159,6 @@ __rcu_process_callbacks(struct rcu_state *rsp)
 
WARN_ON_ONCE(rdp->beenonline == 0);
 
-   /* Handle the end of a grace period that some other CPU ended.  */
-   note_gp_changes(rsp, rdp);
-
/* Update RCU state based on any recent quiescent states. */
rcu_check_quiescent_state(rsp, rdp);
 
-- 
1.8.1.5



[PATCH tip/core/rcu 6/8] rcu: Eliminate check_for_new_grace_period() wrapper function

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

One of the calls to check_for_new_grace_period() is now redundant due to
an immediately preceding call to note_gp_changes().  Eliminating this
redundant call leaves a single caller, which is simpler if inlined.
This commit therefore eliminates the redundant call and inlines the
body of check_for_new_grace_period() into the single remaining call site.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c | 25 +++--
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index ca07f2d..f6cf5e1 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1298,26 +1298,6 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
- * Did someone else start a new RCU grace period start since we last
- * checked?  Update local state appropriately if so.  Must be called
- * on the CPU corresponding to rdp.
- */
-static int
-check_for_new_grace_period(struct rcu_state *rsp, struct rcu_data *rdp)
-{
-   unsigned long flags;
-   int ret = 0;
-
-   local_irq_save(flags);
-   if (rdp->gpnum != rsp->gpnum) {
-   note_gp_changes(rsp, rdp);
-   ret = 1;
-   }
-   local_irq_restore(flags);
-   return ret;
-}
-
-/*
  * Do per-CPU grace-period initialization for running CPU.  The caller
  * must hold the lock of the leaf rcu_node structure corresponding to
  * this CPU.
@@ -1727,8 +1707,10 @@ static void
 rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp)
 {
/* If there is now a new grace period, record and return. */
-   if (check_for_new_grace_period(rsp, rdp))
+   if (rdp->gpnum != rsp->gpnum) {
+   note_gp_changes(rsp, rdp);
return;
+   }
 
/*
 * Does this CPU still need to do its part for current grace period?
@@ -2280,7 +2262,6 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp,
 
/* Are we ignoring a completed grace period? */
note_gp_changes(rsp, rdp);
-   check_for_new_grace_period(rsp, rdp);
 
/* Start a new grace period if one not already started. */
if (!rcu_gp_in_progress(rsp)) {
-- 
1.8.1.5



[PATCH tip/core/rcu 5/8] rcu: Merge __rcu_process_gp_end() into __note_gp_changes()

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

This commit eliminates some duplicated code by merging
__rcu_process_gp_end() into __note_gp_changes().

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c | 48 ++--
 1 file changed, 6 insertions(+), 42 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 9040e0f..ca07f2d 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1244,18 +1244,16 @@ static void rcu_advance_cbs(struct rcu_state *rsp, struct rcu_node *rnp,
 }
 
 /*
- * Advance this CPU's callbacks, but only if the current grace period
- * has ended.  This may be called only from the CPU to whom the rdp
- * belongs.  In addition, the corresponding leaf rcu_node structure's
- * ->lock must be held by the caller, with irqs disabled.
+ * Update CPU-local rcu_data state to record the beginnings and ends of
+ * grace periods.  The caller must hold the ->lock of the leaf rcu_node
+ * structure corresponding to the current CPU, and must have irqs disabled.
  */
-static void
-__rcu_process_gp_end(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
+static void __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
 {
-   /* Did another grace period end? */
+   /* Handle the ends of any preceding grace periods first. */
if (rdp->completed == rnp->completed) {
 
-   /* No, so just accelerate recent callbacks. */
+   /* No grace period end, so just accelerate recent callbacks. */
rcu_accelerate_cbs(rsp, rnp, rdp);
 
} else {
@@ -1266,41 +1264,7 @@ __rcu_process_gp_end(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_dat
/* Remember that we saw this grace-period completion. */
rdp->completed = rnp->completed;
trace_rcu_grace_period(rsp->name, rdp->gpnum, "cpuend");
-
-   /*
-* If we were in an extended quiescent state, we may have
-* missed some grace periods that others CPUs handled on
-* our behalf. Catch up with this state to avoid noting
-* spurious new grace periods.  If another grace period
-* has started, then rnp->gpnum will have advanced, so
-* we will detect this later on.  Of course, any quiescent
-* states we found for the old GP are now invalid.
-*/
-   if (ULONG_CMP_LT(rdp->gpnum, rdp->completed)) {
-   rdp->gpnum = rdp->completed;
-   rdp->passed_quiesce = 0;
-   }
-
-   /*
-* If RCU does not need a quiescent state from this CPU,
-* then make sure that this CPU doesn't go looking for one.
-*/
-   if ((rnp->qsmask & rdp->grpmask) == 0)
-   rdp->qs_pending = 0;
}
-}
-
-/*
- * Update CPU-local rcu_data state to record the newly noticed grace period.
- * This is used both when we started the grace period and when we notice
- * that someone else started the grace period.  The caller must hold the
- * ->lock of the leaf rcu_node structure corresponding to the current CPU,
- *  and must have irqs disabled.
- */
-static void __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
-{
-   /* Handle the ends of any preceding grace periods first. */
-   __rcu_process_gp_end(rsp, rnp, rdp);
 
if (rdp->gpnum != rnp->gpnum) {
/*
-- 
1.8.1.5



[PATCH tip/core/rcu 2/8] rcu: Make __note_new_gpnum() check for ends of prior grace periods

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

The current implementation can detect the beginning of a new grace period
before noting the end of a previous grace period.  Although the current
implementation correctly handles this sort of nonsense, it would be
good to reduce RCU's state space by making such nonsense unnecessary,
which is now possible thanks to the fact that RCU's callback groups are
now numbered.

This commit therefore makes __note_new_gpnum() invoke
__rcu_process_gp_end() in order to note the ends of prior grace
periods before noting the beginnings of new grace periods.
Of course, this now means that note_new_gpnum() notes both the
beginnings and ends of grace periods, and could therefore be
used in place of rcu_process_gp_end().  But that is a job for
later commits.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8ebc3ff..a57bac3 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1299,6 +1299,9 @@ __rcu_process_gp_end(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_dat
  */
 static void __note_new_gpnum(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp)
 {
+   /* Handle the ends of any preceding grace periods first. */
+   __rcu_process_gp_end(rsp, rnp, rdp);
+
if (rdp->gpnum != rnp->gpnum) {
/*
 * If the current grace period is waiting for this CPU,
-- 
1.8.1.5
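
The ordering that this patch establishes can be sketched in userspace
as follows.  This is only an illustration under stated assumptions:
toy_node, toy_cpu, and the toy_*() functions are hypothetical stand-ins
and do not reproduce the kernel's rcu_node, rcu_data, locking, or
callback handling.

```c
#include <assert.h>

/* Hypothetical, heavily simplified stand-ins for rcu_node and rcu_data. */
struct toy_node {
	unsigned long gpnum;     /* most recently started grace period */
	unsigned long completed; /* most recently completed grace period */
};

struct toy_cpu {
	unsigned long gpnum;     /* this CPU's view of ->gpnum */
	unsigned long completed; /* this CPU's view of ->completed */
};

/* Models __rcu_process_gp_end(): note the end of prior grace periods. */
static void toy_process_gp_end(const struct toy_node *rnp, struct toy_cpu *rdp)
{
	if (rdp->completed != rnp->completed)
		rdp->completed = rnp->completed;
}

/* Models the patched __note_new_gpnum(): ends first, then beginnings. */
static void toy_note_new_gpnum(const struct toy_node *rnp, struct toy_cpu *rdp)
{
	toy_process_gp_end(rnp, rdp);

	/* By the time a new grace period is noted, prior ends are recorded. */
	assert(rdp->completed == rnp->completed);

	if (rdp->gpnum != rnp->gpnum)
		rdp->gpnum = rnp->gpnum;
}
```

In this toy model, a CPU whose view is stale in both fields catches up
on ->completed before ->gpnum in a single call, so it never sees a new
grace period begin while still missing the end of an earlier one.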



[PATCH tip/core/rcu 0/8] RCU callback-numbering simplifications for 3.11

2013-04-12 Thread Paul E. McKenney
Hello!

This series takes advantage of callback numbering to simplify RCU's
grace-period machinery, in some cases also reducing the number of
lock acquisitions (though the resulting change in performance is not
perceptible).  The individual patches are as follows:

1.  Move code to make way for the code-combining in later patches.
This commit makes no changes, just moves code.

2.  Make __note_new_gpnum() also check for the ends of prior grace
periods, thus eliminating the earlier possibility of a given
CPU becoming aware of the start of the next grace period before
becoming aware of the end of the previous grace period.  Yes,
the code did handle this correctly, but now it doesn't need to.
More important, now I don't need to think about how it handles
this correctly.

3.  Rename note_new_gpnum() to note_gp_changes() in preparation for
later merge of rcu_process_gp_end() into this function.

4.  Change calls to rcu_process_gp_end() to instead call
note_gp_changes(), and also remove the now-unused rcu_process_gp_end().

5.  Remove duplicate code by merging __rcu_process_gp_end() into
__note_gp_changes().

6.  Eliminate now-redundant call to check_for_new_grace_period().  This
leaves only a single caller, so inline check_for_new_grace_period().

7.  Given that rcu_start_gp_per_cpu() is a trivial wrapper function
with only one caller, inline it into its sole remaining call site.

8.  Eliminate now-redundant call to note_gp_changes().

Thanx, Paul

 b/kernel/rcutree.c|  262 ++
 b/kernel/rcutree_plugin.h |2 
 2 files changed, 85 insertions(+), 179 deletions(-)



Re: [PATCH V4 1/6] clk: OMAP: introduce device tree binding to kernel clock data

2013-04-12 Thread Tony Lindgren
* Nishanth Menon  [130412 15:59]:
> --- /dev/null
> +++ b/drivers/clk/omap/clk.c
> +/**
> + * omap_clk_src_get() - Get OMAP clock from node name when needed
> + * @clkspec: clkspec argument
> + * @data:unused
> + *
> + * REVISIT: We assume the following:
> + * 1. omap clock names end with _ck
> + * 2. omap clock names are under 32 characters in length
> + */
> +static struct clk *omap_clk_src_get(struct of_phandle_args *clkspec, void *data)
> +{
> + struct clk *clk;
> + char clk_name[32];
> + struct device_node *np = clkspec->np;
> +
> + /* Set up things so consumer can call clk_get() with name */

I would leave out the comment above, it's a leftover from
the clk_add_alias() version that we don't need because of
of_clk_get().

> + snprintf(clk_name, 32, "%s_ck", np->name);
> + clk = clk_get(NULL, clk_name);
> + if (IS_ERR(clk)) {
> + pr_err("%s: could not get clock %s(%ld)\n", __func__,
> +clk_name, PTR_ERR(clk));
> + goto out;
> + }
> + clk_put(clk);
> +
> +out:
> + return clk;
> +}
> +
> +/**
> + * omap_clk_probe() - create link from DT definition to clock data
> + * @pdev:device node
> + *
> + * NOTE: We assume that omap clocks are not removed.
> + */

How about dropping the comment on clocks being removed above?
It is no longer an issue, so maybe something like this instead:

* Note that we look up the clock lazily when the consumer
* driver does of_clk_get() and initialize a NULL clock here.

> +static int omap_clk_probe(struct platform_device *pdev)
> +{
> + int res;
> + struct device_node *np = pdev->dev.of_node;
> +
> + /* This allows the driver to of_clk_get() */
> + res = of_clk_add_provider(np, omap_clk_src_get, NULL);
> + if (res)
> + dev_err(&pdev->dev, "could not add provider(%d)\n", res);
> +
> + return res;
> +}
> +
> +/* We assume here that OMAP clocks will not be removed */

Then the above comment can be removed too.

Regards,

Tony
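
As an aside on the REVISIT note in the quoted patch: the snprintf()
call silently truncates names that do not fit in the 32-byte buffer.
A minimal userspace sketch of building the "_ck" name with an explicit
length check might look like the following.  The helper name
toy_build_clk_name() and its return convention are hypothetical and
are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<node_name>_ck" in buf.
 * Returns 0 on success, or -1 if the result would not fit in len bytes.
 */
static int toy_build_clk_name(char *buf, size_t len, const char *node_name)
{
	int n;

	assert(buf != NULL && node_name != NULL);

	/* sizeof("_ck") counts the suffix plus the terminating NUL. */
	if (strlen(node_name) + sizeof("_ck") > len)
		return -1;

	n = snprintf(buf, len, "%s_ck", node_name);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

With a 32-byte buffer this accepts node names of up to 28 characters
and rejects anything longer instead of looking up a truncated name.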


[v3] Support Elan Touchscreen eKTF product.

2013-04-12 Thread Tony Prisk

Hi Scott,

Not sure what is happening with the patch - seems to have been sitting 
for quite a while. Is it still being worked on?
I integrated it into our arm/arch-vt8500 testing tree (based on 3.9-rc6) 
as we have tablets with eKTF2127 controllers and noticed a few problems.


1) __dev attributes cause unused function warnings because they are 
deprecated.


2) elants_get_power_state() is only called from elants_resume() which is 
contained within a #ifdef CONFIG_PM_SLEEP. You should also move the 
elants_get_power_state() inside the #ifdef to remove the unused_function 
warning when compiled without CONFIG_PM.


Regards
Tony Prisk
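
For point 2, a minimal sketch of the suggested layout, assuming
nothing about the real driver: the toy_ts structure and both toy_*()
functions are hypothetical, and CONFIG_PM_SLEEP is defined by hand
here only so that the fragment builds standalone.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: defined here only so the sketch builds outside a kernel tree. */
#define CONFIG_PM_SLEEP 1

/* Hypothetical device state, standing in for the driver's private data. */
struct toy_ts {
	int power_state; /* 0 = powered down, 1 = powered up */
};

#ifdef CONFIG_PM_SLEEP
/*
 * The helper lives inside the same #ifdef as its only caller, so a
 * build without CONFIG_PM_SLEEP sees neither function and cannot warn
 * about an unused static function.
 */
static int toy_get_power_state(const struct toy_ts *ts)
{
	return ts->power_state;
}

static int toy_resume(struct toy_ts *ts)
{
	assert(ts != NULL);

	if (toy_get_power_state(ts) == 0)
		ts->power_state = 1;
	return 0;
}
#endif /* CONFIG_PM_SLEEP */
```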


Re: [PATCH 2/2] PCI/IA64: fix pci_dev->enable_cnt balance when doing pci hotplug

2013-04-12 Thread Bjorn Helgaas
On Mon, Apr 1, 2013 at 2:42 AM, Yijing Wang  wrote:
> In IA64 platform, we don't call pci_enable_bridges()
> when scan all pci buses during system boot up. But in
> X86 we do it in
>
> pcibios_assign_resources()
>pci_assign_unassigned_resources()
>...
>pci_enable_bridges()
>
> Then when we doing hot remove
>
> acpiphp_disable_slot()
>pci_stop_and_remove_bus_device()
>pci_stop_bus_device()
>   .
>   pcie_portdrv_remove()
>   pcie_port_device_remove()
>   pci_disable_device()   first decrease enable_cnt  here
>   pci_disable_device() second decrease enable_cnt
> So pci_dev->enable_cnt is unbalanced in IA64.
>
> Following Warning info found under IA64 when doing pci hotplug.
>
> [ cut here ]
> WARNING: at drivers/pci/pci.c:1397 pci_disable_device+0x1c0/0x220()
> Hardware name: MH8900
> Device pcieport
> disabling already-disabled device
> Modules linked in: acpiphp ipv6 ipmi_si(+) ipmi_devintf ipmi_msghandler fuse 
> vfaa
> t fat dm_mod iTCO_wdt iTCO_vendor_support lpc_ich i2c_i801 mfd_core i2c_core 
> sg
> sd_mod crc_t10dif ext3 mbcache jbd ata_piix
>
> Call Trace:
>  [] show_stack+0x80/0xa0
> sp=e00fd629fc00 bsp=e00fd62996e0
>  [] dump_stack+0x30/0x50
> sp=e00fd629fdd0 bsp=e00fd62996c8
>  [] warn_slowpath_common+0xc0/0x100
> sp=e00fd629fdd0 bsp=e00fd6299688
>  [] warn_slowpath_fmt+0x90/0xc0
> sp=e00fd629fdd0 bsp=e00fd6299628
>  [] pci_disable_device+0x1c0/0x220
> sp=e00fd629fe10 bsp=e00fd62995e8
>  [] pcie_portdrv_remove+0xc0/0xe0
> sp=e00fd629fe10 bsp=e00fd62995c8
>  [] pci_device_remove+0x90/0x1e0
> sp=e00fd629fe10 bsp=e00fd6299598
>  [] __device_release_driver+0x150/0x280
> sp=e00fd629fe10 bsp=e00fd6299560
>  [] device_release_driver+0x30/0x60
> sp=e00fd629fe10 bsp=e00fd6299538
>  [] bus_remove_device+0x2c0/0x3c0
> sp=e00fd629fe10 bsp=e00fd62994f0
>  [] device_del+0x290/0x440
> sp=e00fd629fe10 bsp=e00fd62994a8
>  [] pci_stop_bus_device+0x150/0x200
> sp=e00fd629fe10 bsp=e00fd6299478
>  [] pci_stop_bus_device+0x70/0x200
> sp=e00fd629fe10 bsp=e00fd6299448
>  [] pci_stop_bus_device+0x70/0x200
> sp=e00fd629fe10 bsp=e00fd6299418
>  [] pci_stop_and_remove_bus_device+0x20/0x60
> sp=e00fd629fe10 bsp=e00fd62993f0
>  [] acpiphp_disable_slot+0x240/0x4e0 [acpiphp]
> sp=e00fd629fe10 bsp=e00fd62993a0
>  [] disable_slot+0x50/0x160 [acpiphp]
> sp=e00fd629fe20 bsp=e00fd6299378
>  [] power_write_file+0x140/0x2a0
> sp=e00fd629fe20 bsp=e00fd6299348
>  [] pci_slot_attr_store+0x60/0xa0
> sp=e00fd629fe20 bsp=e00fd6299310
>  [] sysfs_write_file+0x240/0x340
> sp=e00fd629fe20 bsp=e00fd62992b8
>  [] vfs_write+0x1b0/0x3a0
> sp=e00fd629fe20 bsp=e00fd6299270
>  [] sys_write+0x90/0xe0
> sp=e00fd629fe20 bsp=e00fd62991f0
>  [] ia64_ret_from_syscall+0x0/0x20
> sp=e00fd629fe30 bsp=e00fd62991f0
>  [] __kernel_syscall_via_break+0x0/0x20
> sp=e00fd62a bsp=e00fd62991f0
> ---[ end trace 34d87c78dbff78ce ]---
> GSI 37 (level, low) -> CPU 15 (0x01e0) vector 68 unregistered
> pcie_pme :00:07.0:pcie01: unloading service driver pcie_pme
> aer :00:07.0:pcie02: unloading service driver aer
>
> Signed-off-by: Yijing Wang 
> Cc: Fenghua Yu 
> Cc: Yinghai Lu 
> Cc: Greg Kroah-Hartman 
> Cc: Thierry Reding 
> Cc: "Rafael J. Wysocki" 
> ---
>  arch/ia64/pci/pci.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
> index 60532ab..a557096 100644
> --- a/arch/ia64/pci/pci.c
> +++ b/arch/ia64/pci/pci.c
> @@ -383,6 +383,7 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
> }
>
> pci_scan_child_bus(pbus);
> +   pci_enable_bridges(pbus);
> return pbus;
>
>  out3:

I think that with this patch, if you hot-add a PCI host bridge, you
will call pci_enable_bridges() twice (once in pci_acpi_scan_root() and
again in acpi_pci_root_add()), so there will be an enable_cnt error in
the opposite direction.
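
To make the counting concrete, here is a toy userspace model of the
enable_cnt balance.  It is a sketch under assumptions: toy_pci_dev and
the toy_*() functions are hypothetical, each bridge-enable call is
assumed to contribute exactly one increment, and the real
pci_enable_device()/pci_disable_device() do considerably more.

```c
#include <assert.h>

/* Hypothetical stand-in for the enable_cnt bookkeeping in struct pci_dev. */
struct toy_pci_dev {
	int enable_cnt;
	int warnings; /* number of "disabling already-disabled device" events */
};

static void toy_pci_enable(struct toy_pci_dev *dev)
{
	assert(dev);
	dev->enable_cnt++;
}

static void toy_pci_disable(struct toy_pci_dev *dev)
{
	assert(dev);
	if (dev->enable_cnt <= 0)
		dev->warnings++; /* models the warning in the quoted trace */
	dev->enable_cnt--;
}

/* Run a sequence of enables followed by disables and return the result. */
static struct toy_pci_dev toy_run(int enables, int disables)
{
	struct toy_pci_dev dev = { 0, 0 };
	int i;

	for (i = 0; i < enables; i++)
		toy_pci_enable(&dev);
	for (i = 0; i < disables; i++)
		toy_pci_disable(&dev);
	return dev;
}
```

In this model, toy_run(1, 2) corresponds to the reported ia64 case
(one enable, two disables, one warning), toy_run(2, 2) to the balanced
case, and toy_run(3, 2) to the hot-add concern, where the count never
returns to zero.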

I'd like to see the pci_enable_bridges() 

[PATCH tip/core/rcu 4/7] rcu: Don't allocate bootmem from rcu_init()

2013-04-12 Thread Paul E. McKenney
From: Sasha Levin 

When rcu_init() is called we already have slab working, so allocating
bootmem at that point results in warnings and an allocation from
slab.  This commit therefore changes alloc_bootmem_cpumask_var() to
alloc_cpumask_var() in rcu_bootup_announce_oddness(), which is called
from rcu_init().

Signed-off-by: Sasha Levin 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree_plugin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index ca6e39c..44b0998 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -86,7 +86,7 @@ static void __init rcu_bootup_announce_oddness(void)
 #ifdef CONFIG_RCU_NOCB_CPU
 #ifndef CONFIG_RCU_NOCB_CPU_NONE
if (!have_rcu_nocb_mask) {
-   alloc_bootmem_cpumask_var(&rcu_nocb_mask);
+   alloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL);
have_rcu_nocb_mask = true;
}
 #ifdef CONFIG_RCU_NOCB_CPU_ZERO
-- 
1.8.1.5



[PATCH tip/core/rcu 3/7] rcu: Kick adaptive-ticks CPUs that are holding up RCU grace periods

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Adaptive-ticks CPUs inform RCU when they enter kernel mode, but they do
not necessarily turn the scheduler-clock tick back on.  This state of
affairs could result in RCU waiting on an adaptive-ticks CPU running
for an extended period in kernel mode.  Such a CPU will never run the
RCU state machine, and could therefore indefinitely extend the current
RCU grace period, sooner or later resulting in an OOM condition.

This patch, inspired by an earlier patch by Frederic Weisbecker, therefore
causes RCU's force-quiescent-state processing to check for this condition
and to send an IPI to CPUs that remain in that state for too long.
"Too long" currently means about three jiffies by default, which is
quite some time for a CPU to remain in the kernel without blocking.
The rcutree.jiffies_till_first_fqs and rcutree.jiffies_till_next_fqs
sysfs variables may be used to tune "too long" if needed.

Reported-by: Frederic Weisbecker 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c| 10 ++
 kernel/rcutree.h|  1 +
 kernel/rcutree_plugin.h | 17 +
 3 files changed, 28 insertions(+)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index bc3eac5..3710d74 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -799,6 +799,16 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
rdp->offline_fqs++;
return 1;
}
+
+   /*
+* There is a possibility that a CPU in adaptive-ticks state
+* might run in the kernel with the scheduling-clock tick disabled
+* for an extended time period.  Invoke rcu_kick_nohz_cpu() to
+* force the CPU to restart the scheduling-clock tick if this
+* CPU is in this state.
+*/
+   rcu_kick_nohz_cpu(rdp->cpu);
+
return 0;
 }
 
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 14ee407..08972c9 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -538,6 +538,7 @@ static bool rcu_nocb_adopt_orphan_cbs(struct rcu_state *rsp,
 static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp);
 static void rcu_spawn_nocb_kthreads(struct rcu_state *rsp);
 static bool init_nocb_callback_list(struct rcu_data *rdp);
+static void rcu_kick_nohz_cpu(int cpu);
 
 #endif /* #ifndef RCU_TREE_NONCORE */
 
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index e6cf7e5..ca6e39c 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -2336,3 +2336,20 @@ static bool init_nocb_callback_list(struct rcu_data *rdp)
 }
 
 #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
+
+/*
+ * An adaptive-ticks CPU can potentially execute in kernel mode for an
+ * arbitrarily long period of time with the scheduling-clock tick turned
+ * off.  RCU will be paying attention to this CPU because it is in the
+ * kernel, but the CPU cannot be guaranteed to be executing the RCU state
+ * machine because the scheduling-clock tick has been disabled.  Therefore,
+ * if an adaptive-ticks CPU is failing to respond to the current grace
+ * period and has not been idle from an RCU perspective, kick it.
+ */
+static void rcu_kick_nohz_cpu(int cpu)
+{
+#ifdef CONFIG_NO_HZ_EXTENDED
+   if (tick_nohz_full_cpu(cpu))
+   smp_send_reschedule(cpu);
+#endif /* #ifdef CONFIG_NO_HZ_EXTENDED */
+}
-- 
1.8.1.5



[PATCH tip/core/rcu 6/7] rcu: Drive quiescent-state-forcing delay from HZ

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Systems with HZ=100 can have slow bootup times due to the default
three-jiffy delays between quiescent-state forcing attempts.  This
commit therefore auto-tunes the RCU_JIFFIES_TILL_FORCE_QS value based
on the value of HZ.  However, this would break very large systems that
require more time between quiescent-state forcing attempts.  This
commit therefore also ups the default delay by one jiffy for each
256 CPUs that might be on the system (based off of nr_cpu_ids at
runtime, -not- NR_CPUS at build time).

Reported-by: Paul Mackerras 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c | 18 --
 kernel/rcutree.h | 12 +++-
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 3710d74..cbfb4ee 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -218,8 +218,8 @@ module_param(blimit, long, 0444);
 module_param(qhimark, long, 0444);
 module_param(qlowmark, long, 0444);
 
-static ulong jiffies_till_first_fqs = RCU_JIFFIES_TILL_FORCE_QS;
-static ulong jiffies_till_next_fqs = RCU_JIFFIES_TILL_FORCE_QS;
+static ulong jiffies_till_first_fqs = ULONG_MAX;
+static ulong jiffies_till_next_fqs = ULONG_MAX;
 
 module_param(jiffies_till_first_fqs, ulong, 0644);
 module_param(jiffies_till_next_fqs, ulong, 0644);
@@ -3252,11 +3252,25 @@ static void __init rcu_init_one(struct rcu_state *rsp,
  */
 static void __init rcu_init_geometry(void)
 {
+   ulong d;
int i;
int j;
int n = nr_cpu_ids;
int rcu_capacity[MAX_RCU_LVLS + 1];
 
+   /*
+* Initialize any unspecified boot parameters.
+* The default values of jiffies_till_first_fqs and
+* jiffies_till_next_fqs are set to the RCU_JIFFIES_TILL_FORCE_QS
+* value, which is a function of HZ, then adding one for each
+* RCU_JIFFIES_FQS_DIV CPUs that might be on the system.
+*/
+   d = RCU_JIFFIES_TILL_FORCE_QS + nr_cpu_ids / RCU_JIFFIES_FQS_DIV;
+   if (jiffies_till_first_fqs == ULONG_MAX)
+   jiffies_till_first_fqs = d;
+   if (jiffies_till_next_fqs == ULONG_MAX)
+   jiffies_till_next_fqs = d;
+
/* If the compile-time values are accurate, just leave. */
if (rcu_fanout_leaf == CONFIG_RCU_FANOUT_LEAF &&
nr_cpu_ids == NR_CPUS)
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 08972c9..7d5f876 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -342,7 +342,17 @@ struct rcu_data {
 #define RCU_FORCE_QS   3   /* Need to force quiescent state. */
 #define RCU_SIGNAL_INITRCU_SAVE_DYNTICK
 
-#define RCU_JIFFIES_TILL_FORCE_QS   3  /* for rsp->jiffies_force_qs */
+#if HZ > 500
+#define RCU_JIFFIES_TILL_FORCE_QS   3  /* for jiffies_till_first_fqs */
+#elif HZ > 250
+#define RCU_JIFFIES_TILL_FORCE_QS   2
+#else
+#define RCU_JIFFIES_TILL_FORCE_QS   1
+#endif
+#define RCU_JIFFIES_FQS_DIV256 /* Very large systems need */
+   /*  more delay between bouts */
+   /*  of quiescent-state */
+   /*  forcing. */
 
 #define RCU_STALL_RAT_DELAY2   /* Allow other CPUs time */
/*  to take at least one */
-- 
1.8.1.5
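
The default-delay computation in this patch can be sketched in
userspace as shown below.  The thresholds and the divisor of 256 are
taken from the patch; the toy_*() function names, and passing HZ and
nr_cpu_ids as runtime arguments instead of using the kernel's
build-time and boot-time values, are assumptions made for
illustration.

```c
#include <assert.h>
#include <limits.h>

#define TOY_JIFFIES_FQS_DIV 256 /* mirrors RCU_JIFFIES_FQS_DIV in the patch */

/* Mirrors the #if ladder that defines RCU_JIFFIES_TILL_FORCE_QS. */
static unsigned long toy_jiffies_till_force_qs(unsigned int hz)
{
	assert(hz > 0);

	if (hz > 500)
		return 3;
	if (hz > 250)
		return 2;
	return 1;
}

/* Base delay from HZ, plus one jiffy per 256 possible CPUs. */
static unsigned long toy_default_fqs_delay(unsigned int hz,
					   unsigned int nr_cpu_ids)
{
	return toy_jiffies_till_force_qs(hz) + nr_cpu_ids / TOY_JIFFIES_FQS_DIV;
}

/*
 * ULONG_MAX is the "not specified" sentinel: a value supplied by the
 * user is kept, otherwise the computed default is used.
 */
static unsigned long toy_resolve_fqs_delay(unsigned long param,
					   unsigned int hz,
					   unsigned int nr_cpu_ids)
{
	if (param == ULONG_MAX)
		return toy_default_fqs_delay(hz, nr_cpu_ids);
	return param;
}
```

For example, HZ=100 with 8 CPUs gives a one-jiffy delay, while HZ=1000
with 4096 possible CPUs gives 3 + 16 = 19 jiffies.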



[PATCH tip/core/rcu 5/7] rcu: Remove "Experimental" flags

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

After a release or two, features are no longer experimental.  Therefore,
this commit removes the "Experimental" tag from them.

Reported-by: Paul Gortmaker 
Signed-off-by: Paul E. McKenney 
---
 init/Kconfig|  2 +-
 kernel/rcutree_plugin.h | 10 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index a3a2304..bac0483 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -658,7 +658,7 @@ config RCU_BOOST_DELAY
  Accept the default if unsure.
 
 config RCU_NOCB_CPU
-   bool "Offload RCU callback processing from boot-selected CPUs (EXPERIMENTAL"
+   bool "Offload RCU callback processing from boot-selected CPUs"
depends on TREE_RCU || TREE_PREEMPT_RCU
default n
help
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 44b0998..dcab269 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -80,7 +80,7 @@ static void __init rcu_bootup_announce_oddness(void)
pr_info("\tFour-level hierarchy is enabled.\n");
 #endif
if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
-   pr_info("\tExperimental boot-time adjustment of leaf fanout to %d.\n", rcu_fanout_leaf);
+   pr_info("\tBoot-time adjustment of leaf fanout to %d.\n", rcu_fanout_leaf);
if (nr_cpu_ids != NR_CPUS)
pr_info("\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%d.\n", NR_CPUS, nr_cpu_ids);
 #ifdef CONFIG_RCU_NOCB_CPU
@@ -90,19 +90,19 @@ static void __init rcu_bootup_announce_oddness(void)
have_rcu_nocb_mask = true;
}
 #ifdef CONFIG_RCU_NOCB_CPU_ZERO
-   pr_info("\tExperimental no-CBs CPU 0\n");
+   pr_info("\tOffload RCU callbacks from CPU 0\n");
cpumask_set_cpu(0, rcu_nocb_mask);
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU_ZERO */
 #ifdef CONFIG_RCU_NOCB_CPU_ALL
-   pr_info("\tExperimental no-CBs for all CPUs\n");
+   pr_info("\tOffload RCU callbacks from all CPUs\n");
cpumask_setall(rcu_nocb_mask);
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_NONE */
if (have_rcu_nocb_mask) {
cpulist_scnprintf(nocb_buf, sizeof(nocb_buf), rcu_nocb_mask);
-   pr_info("\tExperimental no-CBs CPUs: %s.\n", nocb_buf);
+   pr_info("\tOffload RCU callbacks from CPUs: %s.\n", nocb_buf);
if (rcu_nocb_poll)
-   pr_info("\tExperimental polled no-CBs CPUs.\n");
+   pr_info("\tPoll for callbacks from no-CBs CPUs.\n");
}
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
 }
-- 
1.8.1.5



[PATCH tip/core/rcu 2/7] rcu: Convert rcutree_plugin.h printk calls

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

This commit converts printk() calls to the corresponding pr_*() calls.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree_plugin.h | 45 ++---
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index d084ae3..e6cf7e5 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -52,38 +52,37 @@ static char __initdata nocb_buf[NR_CPUS * 5];
 static void __init rcu_bootup_announce_oddness(void)
 {
 #ifdef CONFIG_RCU_TRACE
-   printk(KERN_INFO "\tRCU debugfs-based tracing is enabled.\n");
+   pr_info("\tRCU debugfs-based tracing is enabled.\n");
 #endif
 #if (defined(CONFIG_64BIT) && CONFIG_RCU_FANOUT != 64) || (!defined(CONFIG_64BIT) && CONFIG_RCU_FANOUT != 32)
-   printk(KERN_INFO "\tCONFIG_RCU_FANOUT set to non-default value of %d\n",
+   pr_info("\tCONFIG_RCU_FANOUT set to non-default value of %d\n",
   CONFIG_RCU_FANOUT);
 #endif
 #ifdef CONFIG_RCU_FANOUT_EXACT
-   printk(KERN_INFO "\tHierarchical RCU autobalancing is disabled.\n");
+   pr_info("\tHierarchical RCU autobalancing is disabled.\n");
 #endif
 #ifdef CONFIG_RCU_FAST_NO_HZ
-   printk(KERN_INFO
-  "\tRCU dyntick-idle grace-period acceleration is enabled.\n");
+   pr_info("\tRCU dyntick-idle grace-period acceleration is enabled.\n");
 #endif
 #ifdef CONFIG_PROVE_RCU
-   printk(KERN_INFO "\tRCU lockdep checking is enabled.\n");
+   pr_info("\tRCU lockdep checking is enabled.\n");
 #endif
 #ifdef CONFIG_RCU_TORTURE_TEST_RUNNABLE
-   printk(KERN_INFO "\tRCU torture testing starts during boot.\n");
+   pr_info("\tRCU torture testing starts during boot.\n");
 #endif
 #if defined(CONFIG_TREE_PREEMPT_RCU) && !defined(CONFIG_RCU_CPU_STALL_VERBOSE)
-   printk(KERN_INFO "\tDump stacks of tasks blocking RCU-preempt GP.\n");
+   pr_info("\tDump stacks of tasks blocking RCU-preempt GP.\n");
 #endif
 #if defined(CONFIG_RCU_CPU_STALL_INFO)
-   printk(KERN_INFO "\tAdditional per-CPU info printed with stalls.\n");
+   pr_info("\tAdditional per-CPU info printed with stalls.\n");
 #endif
 #if NUM_RCU_LVL_4 != 0
-   printk(KERN_INFO "\tFour-level hierarchy is enabled.\n");
+   pr_info("\tFour-level hierarchy is enabled.\n");
 #endif
if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
-   printk(KERN_INFO "\tExperimental boot-time adjustment of leaf fanout to %d.\n", rcu_fanout_leaf);
+   pr_info("\tExperimental boot-time adjustment of leaf fanout to %d.\n", rcu_fanout_leaf);
if (nr_cpu_ids != NR_CPUS)
-   printk(KERN_INFO "\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%d.\n", NR_CPUS, nr_cpu_ids);
+   pr_info("\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%d.\n", NR_CPUS, nr_cpu_ids);
 #ifdef CONFIG_RCU_NOCB_CPU
 #ifndef CONFIG_RCU_NOCB_CPU_NONE
if (!have_rcu_nocb_mask) {
@@ -122,7 +121,7 @@ static int rcu_preempted_readers_exp(struct rcu_node *rnp);
  */
 static void __init rcu_bootup_announce(void)
 {
-   printk(KERN_INFO "Preemptible hierarchical RCU implementation.\n");
+   pr_info("Preemptible hierarchical RCU implementation.\n");
rcu_bootup_announce_oddness();
 }
 
@@ -489,13 +488,13 @@ static void rcu_print_detail_task_stall(struct rcu_state *rsp)
 
 static void rcu_print_task_stall_begin(struct rcu_node *rnp)
 {
-   printk(KERN_ERR "\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
+   pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
   rnp->level, rnp->grplo, rnp->grphi);
 }
 
 static void rcu_print_task_stall_end(void)
 {
-   printk(KERN_CONT "\n");
+   pr_cont("\n");
 }
 
 #else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */
@@ -525,7 +524,7 @@ static int rcu_print_task_stall(struct rcu_node *rnp)
t = list_entry(rnp->gp_tasks,
   struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
-   printk(KERN_CONT " P%d", t->pid);
+   pr_cont(" P%d", t->pid);
ndetected++;
}
rcu_print_task_stall_end();
@@ -941,7 +940,7 @@ static struct rcu_state *rcu_state = &rcu_sched_state;
  */
 static void __init rcu_bootup_announce(void)
 {
-   printk(KERN_INFO "Hierarchical RCU implementation.\n");
+   pr_info("Hierarchical RCU implementation.\n");
rcu_bootup_announce_oddness();
 }
 
@@ -1882,7 +1881,7 @@ static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
 /* Initiate the stall-info list. */
 static void print_cpu_stall_info_begin(void)
 {
-   printk(KERN_CONT "\n");
+   pr_cont("\n");
 }
 
 /*
@@ -1913,7 +1912,7 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
ticks_value = rsp->gpnum - rdp->gpnum;
}
print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
-   printk(KERN_ERR "\t%d: (%lu %s) 

[PATCH tip/core/rcu 7/7] rcu: Merge adjacent identical ifdefs

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

Two ifdefs in kernel/rcupdate.c now have identical conditions with
nothing between them, so the commit merges them into a single ifdef.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcupdate.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 48ab703..faeea98 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -145,9 +145,6 @@ static struct lock_class_key rcu_sched_lock_key;
 struct lockdep_map rcu_sched_lock_map =
STATIC_LOCKDEP_MAP_INIT("rcu_read_lock_sched", &rcu_sched_lock_key);
 EXPORT_SYMBOL_GPL(rcu_sched_lock_map);
-#endif
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 int debug_lockdep_rcu_enabled(void)
 {
-- 
1.8.1.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH tip/core/rcu 1/7] rcu: Convert rcutree.c printk calls

2013-04-12 Thread Paul E. McKenney
From: "Paul E. McKenney" 

This commit converts printk() calls to the corresponding pr_*() calls.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 2d5f94c..bc3eac5 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -856,7 +856,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 * See Documentation/RCU/stallwarn.txt for info on how to debug
 * RCU CPU stall warnings.
 */
-   printk(KERN_ERR "INFO: %s detected stalls on CPUs/tasks:",
+   pr_err("INFO: %s detected stalls on CPUs/tasks:",
   rsp->name);
print_cpu_stall_info_begin();
rcu_for_each_leaf_node(rsp, rnp) {
@@ -889,7 +889,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
   smp_processor_id(), (long)(jiffies - rsp->gp_start),
   rsp->gpnum, rsp->completed, totqlen);
if (ndetected == 0)
-   printk(KERN_ERR "INFO: Stall ended before state dump start\n");
+   pr_err("INFO: Stall ended before state dump start\n");
else if (!trigger_all_cpu_backtrace())
rcu_dump_cpu_stacks(rsp);
 
@@ -912,7 +912,7 @@ static void print_cpu_stall(struct rcu_state *rsp)
 * See Documentation/RCU/stallwarn.txt for info on how to debug
 * RCU CPU stall warnings.
 */
-   printk(KERN_ERR "INFO: %s self-detected stall on CPU", rsp->name);
+   pr_err("INFO: %s self-detected stall on CPU", rsp->name);
print_cpu_stall_info_begin();
print_cpu_stall_info(rsp, smp_processor_id());
print_cpu_stall_info_end();
-- 
1.8.1.5
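
For readers unfamiliar with the pr_*() helpers, the conversion above can be
sketched in userspace. This is only an illustration under stated assumptions:
the printk() below is a hypothetical stand-in that records the formatted line
in a buffer, the KERN_* strings are placeholder prefixes rather than the
kernel's real encoding, and conversions_match() is an invented helper. The
point is that pr_info(fmt, ...) is a thin macro that pastes the log level in
front of the format string, so both spellings produce the same message.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Placeholder log-level prefixes (assumed values, not the kernel's). */
#define KERN_ERR  "<3>"
#define KERN_INFO "<6>"

static char log_buf[256];	/* last formatted message, kept for inspection */

/* Hypothetical stand-in for printk(): formats the message into log_buf. */
static int printk(const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(log_buf, sizeof(log_buf), fmt, ap);
	va_end(ap);
	return n;
}

/* The wrappers only prepend the level via string-literal concatenation. */
#define pr_fmt(fmt) fmt
#define pr_err(fmt, ...)  printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
#define pr_info(fmt, ...) printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)

/* Returns 1 if the old and new spellings format the same line. */
static int conversions_match(int cpu)
{
	char before[sizeof(log_buf)];

	printk(KERN_INFO "stall on CPU %d\n", cpu);
	strcpy(before, log_buf);
	pr_info("stall on CPU %d\n", cpu);
	return strcmp(before, log_buf) == 0;
}
```

Because the level is part of the format string in both forms, the conversion
is purely cosmetic as long as pr_fmt() is left at its default.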



[PATCH tip/core/rcu 0/7] RCU fixes for 3.11

2013-04-12 Thread Paul E. McKenney
Hello!

This series contains the following fixes for RCU:

1-2.  Convert remaining printk() calls to pr_*().

3.  Kick adaptive-ticks CPUs that are holding up RCU grace periods.

4.  Don't allocate bootmem from rcu_init(), courtesy of Sasha Levin.

5.  Remove "Experimental" flags from old RCU Kconfig options.

6.  Automatically tune defaults for delays between attempts to
    force quiescent states.

7.  Merge adjacent identical #ifdefs.

Thanx, Paul

 b/init/Kconfig            |    2 -
 b/kernel/rcupdate.c       |    3 -
 b/kernel/rcutree.c        |   34 ++---
 b/kernel/rcutree.h        |   13 +++-
 b/kernel/rcutree_plugin.h |   74 +++---
 5 files changed, 87 insertions(+), 39 deletions(-)



Re: [PATCH 5/5] kexec: X86: Pass memory ranges via e820 table instead of memmap= boot parameter

2013-04-12 Thread H. Peter Anvin
Yes... That is one reason I think it is a real problem.


Dave Hansen  wrote:

>On 04/12/2013 07:56 AM, H. Peter Anvin wrote:
>> On 04/12/2013 07:31 AM, Vivek Goyal wrote:
>>>> I also have to admit that I don't see the difference between /dev/mem
>>>> and /dev/oldmem, as the former allows access to memory ranges outside
>>>> the ones used by the current kernel, which is what the oldmem device
>>>> seems to be intended to do.
>
>It varies from arch to arch of course.
>
>But, /dev/mem has restrictions on it, like CONFIG_STRICT_DEVMEM or the
>ARCH_HAS_VALID_PHYS_ADDR_RANGE.  There's a lot of stuff that depends on
>it, *and* folks have tried to fix it up so that it's not _as_ blatant of
>a way to completely screw your system.
>
>/dev/mem also tries to be nice to arches that have restrictions like:
>
>> /*
>>  * On ia64 if a page has been mapped somewhere as
>>  * uncached, then it must also be accessed uncached
>>  * by the kernel or data corruption may occur
>>  */
>
>I think /dev/oldmem isn't so nice and could actually cause some real
>problems if used on ia64 where the cached/uncached accesses are mixed.

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


Re: sleep/fan problem bisected to 73201dbec64aebf6b0dca855b523f437972dc7bb

2013-04-12 Thread Jake Edge
On Fri, 12 Apr 2013 12:08:31 -0600 Jake Edge wrote:

> I have been having a problem on my laptop (HP Compaq 2510p) over the
> last two months with >= 3.7 kernels.  After the first resume, it turns
> on the fan and leaves it running at top speed no matter what the
> system is doing.  I finally bisected it over the last two days and
> that points the finger at your patch
> 73201dbec64aebf6b0dca855b523f437972dc7bb:
> 
>  x86, suspend: On wakeup always initialize cr4 and EFER
> 
> We already have a flag word to indicate the existence of
>  MISC_ENABLES, so use the same flag word to indicate existence of cr4
>  and EFER, and always restore them if they exist.  That way if
>  something passes a nonzero value when the value *should* be zero, we
>  will still initialize it.
> 
> I should note that the bisection (between 3.6, which works, and 3.7,
> which doesn't) did not exhibit the same symptoms as I see with any 3.7
> or 3.8 kernel (either from Fedora or built myself), but would instead
> refuse to come out of sleep (but still have the fan running at top
> speed) ... so maybe I've really bisected the "fails to come out of
> sleep" problem, rather than the "fan runs at top speed after resume"
> problem, dunno ... 

As you've undoubtedly surmised, I was in fact tracking down the resume
failure which was caused by the above and fixed by another of your
patches: 1396adc3c2bdc556d4cdd1cf107aa0b6d59fbb1e ... neither of which
were related to the fan issue as far as I can tell ... now I am off
bisecting while applying those ...

sorry for the noise ...

jake

-- 
Jake Edge - LWN - j...@lwn.net - http://lwn.net


Re: [RESEND][PATCH 1/3] PM / devfreq: exynos4_bus: Fix missing mutex_unlock if opp_find_freq_floor fails

2013-04-12 Thread Rafael J. Wysocki
On Friday, April 12, 2013 09:11:00 PM myungjoo.ham wrote:
> > On Friday, April 12, 2013 11:52:01 AM 함명주 wrote:
> > > > On Friday, April 12, 2013 01:54:18 PM Axel Lin wrote:
> > > > > We need to call mutex_unlock() in the error path.
> > > > > 
> > > > > Signed-off-by: Axel Lin 
> > > > 
> > > > All three patches applied to linux-pm.git/linux-next.
> > > > 
> > > > Exynos maintainers, if you have any objections, please holler.
> > > > 
> > > > Thanks,
> > > > Rafael
> > > 
> > > This patch was included in the last pull-request patchset
> > > > though the path was updated. (its predecessor patch moved
> > > exynos drivers to /drivers/devfreq/exynos/* after adding
> > > Exynos common driver files)
> > 
> > OK, so do you want me to drop it?
> > 
> > What about the remaining two?
> 
> Yes, please drop 1/3. It's duplicated.
> 
> The patches 2~3/3 can wait. They are actually not bugfixes.

OK, I've dropped all three.

Axel, please push [2-3/3] through the Exynos tree.

Thanks,
Rafael


> > > > > ---
> > > > >  drivers/devfreq/exynos4_bus.c |3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > 
> > > > > > diff --git a/drivers/devfreq/exynos4_bus.c b/drivers/devfreq/exynos4_bus.c
> > > > > index 1deee09..54b9615 100644
> > > > > --- a/drivers/devfreq/exynos4_bus.c
> > > > > +++ b/drivers/devfreq/exynos4_bus.c
> > > > > @@ -974,7 +974,8 @@ static int 
> > > > > exynos4_busfreq_pm_notifier_event(struct notifier_block *this,
> > > > >   rcu_read_unlock();
> > > > > >   dev_err(data->dev, "%s: unable to find a min freq\n",
> > > > >   __func__);
> > > > > - return PTR_ERR(opp);
> > > > > + err = PTR_ERR(opp);
> > > > > + goto unlock;
> > > > >   }
> > > > >   new_oppinfo.rate = opp_get_freq(opp);
> > > > >   new_oppinfo.volt = opp_get_voltage(opp);
> > > > > 
> > 
> >
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
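
The bug class addressed by patch 1/3 in this thread (an early return that
skips mutex_unlock()) can be sketched in plain C. This is a minimal
illustration, not the driver code: struct fake_mutex, fake_lock(),
fake_unlock(), find_floor() and notifier_event() are all hypothetical names,
and the "mutex" is just a depth counter so that a leaked lock is easy to
observe.

```c
#include <errno.h>

/* Hypothetical stand-in for a mutex: counts lock depth so leaks are visible. */
struct fake_mutex {
	int depth;
};

static void fake_lock(struct fake_mutex *m)
{
	m->depth++;
}

static void fake_unlock(struct fake_mutex *m)
{
	m->depth--;
}

/* Hypothetical lookup that fails for negative frequencies. */
static int find_floor(long freq, long *out)
{
	if (freq < 0)
		return -ERANGE;
	*out = freq;
	return 0;
}

/* Every exit after fake_lock() funnels through the unlock label. */
static int notifier_event(struct fake_mutex *lock, long freq, long *rate)
{
	int err;

	fake_lock(lock);
	err = find_floor(freq, rate);
	if (err)
		goto unlock;	/* a bare "return err" here would leak the lock */
	/* ... further work under the lock would go here ... */
unlock:
	fake_unlock(lock);
	return err;
}
```

Routing the error path through a single label keeps the lock and unlock calls
paired no matter how many failure points are added later.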


[PATCH 1/4] cgroup: convert cgroupfs_root flag bits to masks and add CGRP_ prefix

2013-04-12 Thread Tejun Heo
There's no reason to be using bitops, which tends to be more
cumbersome, to handle root flags.  Convert them to masks.  Also, as
they'll be moved to include/linux/cgroup.h and it's generally a good
idea, add CGRP_ prefix.

Note that flags are assigned from (1 << 1).  The first bit will be
used by a flag which will be added soon.

Signed-off-by: Tejun Heo 
---
 kernel/cgroup.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 678a22c..a372eaa 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -296,10 +296,10 @@ bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor)
 }
 EXPORT_SYMBOL_GPL(cgroup_is_descendant);
 
-/* bits in struct cgroupfs_root flags field */
+/* cgroupfs_root->flags */
 enum {
-   ROOT_NOPREFIX,  /* mounted subsystems have no named prefix */
-   ROOT_XATTR, /* supports extended attributes */
+   CGRP_ROOT_NOPREFIX  = (1 << 1), /* mounted subsystems have no named prefix */
+   CGRP_ROOT_XATTR = (1 << 2), /* supports extended attributes */
 };
 
 static int cgroup_is_releasable(const struct cgroup *cgrp)
@@ -1137,9 +1137,9 @@ static int cgroup_show_options(struct seq_file *seq, struct dentry *dentry)
mutex_lock(&cgroup_root_mutex);
for_each_subsys(root, ss)
seq_printf(seq, ",%s", ss->name);
-   if (test_bit(ROOT_NOPREFIX, &root->flags))
+   if (root->flags & CGRP_ROOT_NOPREFIX)
seq_puts(seq, ",noprefix");
-   if (test_bit(ROOT_XATTR, &root->flags))
+   if (root->flags & CGRP_ROOT_XATTR)
seq_puts(seq, ",xattr");
if (strlen(root->release_agent_path))
seq_printf(seq, ",release_agent=%s", root->release_agent_path);
@@ -1202,7 +1202,7 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
continue;
}
if (!strcmp(token, "noprefix")) {
-   set_bit(ROOT_NOPREFIX, &opts->flags);
+   opts->flags |= CGRP_ROOT_NOPREFIX;
continue;
}
if (!strcmp(token, "clone_children")) {
@@ -1210,7 +1210,7 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
continue;
}
if (!strcmp(token, "xattr")) {
-   set_bit(ROOT_XATTR, &opts->flags);
+   opts->flags |= CGRP_ROOT_XATTR;
continue;
}
if (!strncmp(token, "release_agent=", 14)) {
@@ -1293,8 +1293,7 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
 * with the old cpuset, so we allow noprefix only if mounting just
 * the cpuset subsystem.
 */
-   if (test_bit(ROOT_NOPREFIX, &opts->flags) &&
-   (opts->subsys_mask & mask))
+   if ((opts->flags & CGRP_ROOT_NOPREFIX) && (opts->subsys_mask & mask))
return -EINVAL;
 
 
@@ -2523,7 +2522,7 @@ static struct simple_xattrs *__d_xattrs(struct dentry *dentry)
 static inline int xattr_enabled(struct dentry *dentry)
 {
struct cgroupfs_root *root = dentry->d_sb->s_fs_info;
-   return test_bit(ROOT_XATTR, &root->flags);
+   return root->flags & CGRP_ROOT_XATTR;
 }
 
 static bool is_valid_xattr(const char *name)
@@ -2695,7 +2694,7 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
 
simple_xattrs_init(&cft->xattrs);
 
-   if (subsys && !test_bit(ROOT_NOPREFIX, &cgrp->root->flags)) {
+   if (subsys && !(cgrp->root->flags & CGRP_ROOT_NOPREFIX)) {
strcpy(name, subsys->name);
strcat(name, ".");
}
-- 
1.8.1.4
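
The bit-number versus mask distinction that this patch relies on can be
sketched in userspace C. The names below (DEMO_ROOT_*, struct demo_opts,
parse_token(), xattr_enabled()) are hypothetical and only mirror the shape of
the change: with mask-valued enumerators, setting and testing a flag is a
plain |= or &, whereas bit-number enumerators need set_bit()/test_bit() style
helpers that take a bit index and an address.

```c
#include <string.h>

/* Mask-valued flags, starting at (1 << 1) as in the patch. */
enum {
	DEMO_ROOT_NOPREFIX = (1 << 1),
	DEMO_ROOT_XATTR    = (1 << 2),
};

/* Hypothetical options structure holding the flag word. */
struct demo_opts {
	unsigned long flags;
};

/* Set the flag that matches a mount-option token; unknown tokens are ignored. */
static void parse_token(struct demo_opts *opts, const char *token)
{
	if (!strcmp(token, "noprefix"))
		opts->flags |= DEMO_ROOT_NOPREFIX;
	else if (!strcmp(token, "xattr"))
		opts->flags |= DEMO_ROOT_XATTR;
}

/* Test a flag with a plain mask; normalized to 0 or 1 for the caller. */
static int xattr_enabled(const struct demo_opts *opts)
{
	return (opts->flags & DEMO_ROOT_XATTR) != 0;
}
```

One trade-off worth keeping in mind is that |= on a shared word is not atomic
the way set_bit() is, so the mask form suits flags that are only changed
under a lock or before the structure becomes visible to other CPUs.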


