Re: [PATCH V3 2/3] clocksource: mmp: support CLOCKSOURCE_OF_DECLARE

2013-07-09 Thread Haojian Zhuang
On Wed, Jul 10, 2013 at 5:16 AM, Arnd Bergmann  wrote:
> On Tuesday 09 July 2013, Thomas Petazzoni wrote:
>> Dear Neil Zhang,
>>
>> On Tue, 9 Jul 2013 14:42:45 +0800, Neil Zhang wrote:
>> > support CLOCKSOURCE_OF_DECLARE for mmp timer.
>> >
>> > Signed-off-by: Neil Zhang 
>> > ---
>> >  arch/arm/mach-mmp/mmp-dt.c  |5 ++---
>> >  arch/arm/mach-mmp/mmp2-dt.c |3 +--
>> >  arch/arm/mach-mmp/time.c|   15 ++-
>> >  3 files changed, 5 insertions(+), 18 deletions(-)
>>
>> Maybe it would be good to take this opportunity to move
>> arch/arm/mach-mmp/time.c into drivers/clocksource/.
>
> +1
>
> Or we might want to have a more coordinated move of all clocksource
> drivers in arch/arm to drivers/clocksource now, as we have done for some
> other subsystems.
>
> Arnd

I already sent some patches for this, but I haven't received a response yet.
I'll rebase them and send them again. I hope they can be merged in this cycle.

Regards
Haojian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uclinux-dist-devel] [GIT PULL] Blackfin updates for 3.11

2013-07-09 Thread Stephen Rothwell
Hi Steven,

On Tue, 9 Jul 2013 17:15:30 +0800 Steven Miao  wrote:
>
> I've signed up for a kernel.org account and moved the blackfin tree to
> kernel.org for convenience, as some developers suggested. Please update the
> URL to:
> http://git.kernel.org/pub/scm/linux/kernel/git/realmz6/blackfin-linux.git

That tree only has for-linus and master branches.  linux-next uses the
blackfin-linus branch of your github tree ... so what should I use now?

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[PATCH v2 net 1/2] tuntap: correctly linearize skb when zerocopy is used

2013-07-09 Thread Jason Wang
Userspace may produce more vectors than MAX_SKB_FRAGS. When we
linearize part of the skb so that the rest of the iov fits in the
frags, we need to count copylen into linear when calling tun_alloc_skb()
instead of partly counting it into data_len. Otherwise this breaks
zerocopy_sg_from_iovec(), since its inner counter assumes nr_frags is
zero at the beginning. This causes nr_frags to be increased wrongly
without setting the correct frags.

This bug was introduced by 0690899b4d4501b3505be069b9a687e68ccbe15b
(tun: experimental zero copy tx support).

Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
- This patch is needed for stable.
- Changes from v1: introduce a local variable to track linear size
---
 drivers/net/tun.c |9 ++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 9c61f87..c3cb60b 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1044,7 +1044,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, 
struct tun_file *tfile,
 {
struct tun_pi pi = { 0, cpu_to_be16(ETH_P_IP) };
struct sk_buff *skb;
-   size_t len = total_len, align = NET_SKB_PAD;
+   size_t len = total_len, align = NET_SKB_PAD, linear;
struct virtio_net_hdr gso = { 0 };
int offset = 0;
int copylen;
@@ -1108,10 +1108,13 @@ static ssize_t tun_get_user(struct tun_struct *tun, 
struct tun_file *tfile,
copylen = gso.hdr_len;
if (!copylen)
copylen = GOODCOPY_LEN;
-   } else
+   linear = copylen;
+   } else {
copylen = len;
+   linear = gso.hdr_len;
+   }
 
-   skb = tun_alloc_skb(tfile, align, copylen, gso.hdr_len, noblock);
+   skb = tun_alloc_skb(tfile, align, copylen, linear, noblock);
if (IS_ERR(skb)) {
if (PTR_ERR(skb) != -EAGAIN)
tun->dev->stats.rx_dropped++;
-- 
1.7.1
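
For readers following the logic outside the diff, the if/else touched by the
hunk above can be sketched as a standalone function. This is only an
illustration under assumptions: pick_sizes(), struct skb_sizes and the value
given to GOODCOPY_LEN are made up for the example, and the sketch is
simplified from the hunk context shown rather than copied from
drivers/net/tun.c.

```c
#include <stddef.h>
#include <stdbool.h>

/* Assumed value for illustration; check the driver for the real definition. */
#define GOODCOPY_LEN 128

/* Hypothetical helper type: the two sizes handed to the skb allocator. */
struct skb_sizes {
	size_t copylen; /* bytes copied into the skb */
	size_t linear;  /* bytes requested for the linear area */
};

/*
 * Simplified mirror of the branch in the patch. In the zerocopy case
 * everything that is copied goes into the linear area, so no frags are
 * consumed before the zerocopy mapping starts. Otherwise the whole packet
 * is copied and only the header length is requested as linear.
 */
static struct skb_sizes pick_sizes(bool zerocopy, size_t hdr_len, size_t len)
{
	struct skb_sizes s;

	if (zerocopy) {
		s.copylen = hdr_len ? hdr_len : GOODCOPY_LEN;
		s.linear = s.copylen;
	} else {
		s.copylen = len;
		s.linear = hdr_len;
	}
	return s;
}
```

The point of returning both values together is that the caller cannot pass
copylen as the total while still passing the bare header length as linear in
the zerocopy case, which is the mismatch the patch description talks about.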



[PATCH v2 net 2/2] macvtap: correctly linearize skb when zerocopy is used

2013-07-09 Thread Jason Wang
Userspace may produce more vectors than MAX_SKB_FRAGS. When we
linearize part of the skb so that the rest of the iov fits in the
frags, we need to count copylen into linear when calling macvtap_alloc_skb()
instead of partly counting it into data_len. Otherwise this breaks
zerocopy_sg_from_iovec(), since its inner counter assumes nr_frags is
zero at the beginning. This causes nr_frags to be increased wrongly
without setting the correct frags.

This bug was introduced by b92946e2919134ebe2a4083e4302236295ea2a73
(macvtap: zerocopy: validate vectors before building skb).

Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
- This patch is needed for stable.
- Changes from v1: introduce a local variable to track linear size.
---
 drivers/net/macvtap.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index b6dd6a7..502d948 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -647,6 +647,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, 
struct msghdr *m,
int vnet_hdr_len = 0;
int copylen = 0;
bool zerocopy = false;
+   size_t linear;
 
if (q->flags & IFF_VNET_HDR) {
vnet_hdr_len = q->vnet_hdr_sz;
@@ -701,11 +702,14 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, 
struct msghdr *m,
copylen = vnet_hdr.hdr_len;
if (!copylen)
copylen = GOODCOPY_LEN;
-   } else
+   linear = copylen;
+   } else {
copylen = len;
+   linear = vnet_hdr.hdr_len;
+   }
 
skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
-   vnet_hdr.hdr_len, noblock, &err);
+   linear, noblock, &err);
if (!skb)
goto err;
 
-- 
1.7.1



Re: [Cocci] [PATCH] invoke arguments the right way in coccinelle script

2013-07-09 Thread Nicolas Palix
Hi Cristian,

You need Coccinelle version 1.0.0-rc11 or above. I think the minimal
required version is about one year old.

For more information see Documentation/coccinelle.txt, which has been updated.
If you prefer to stick to your current version, the commits to revert
are the following ones:

ec97946ed038f4b3faa587bc76152b198805b0c4
93f14468491747d6d3efd0b3a42785b1d51a127a

Regards,


On Tue, Jul 9, 2013 at 6:12 PM, Lars-Peter Clausen  wrote:
> On 07/09/2013 05:21 PM, Cristian Bercaru wrote:
>> Because the command line arguments were invoked incorrectly
>> 'make coccicheck' failed to run 'irqf_oneshot.cocci' and all tests that
>> followed. Fixed that.
>>
>> Signed-off-by: Cristian Bercaru 
>
> Time to update your coccinelle installation ;)
>
> --no-includes is supported in new versions and is preferred over -no_includes
>
>> ---
>>  scripts/coccinelle/misc/irqf_oneshot.cocci | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/scripts/coccinelle/misc/irqf_oneshot.cocci 
>> b/scripts/coccinelle/misc/irqf_oneshot.cocci
>> index 6cfde94..5cca680 100644
>> --- a/scripts/coccinelle/misc/irqf_oneshot.cocci
>> +++ b/scripts/coccinelle/misc/irqf_oneshot.cocci
>> @@ -4,7 +4,7 @@
>>  //
>>  // Confidence: Good
>>  // Comments:
>> -// Options: --no-includes
>> +// Options: -no_includes
>>
>>  virtual patch
>>  virtual context
>
> ___
> Cocci mailing list
> co...@systeme.lip6.fr
> https://systeme.lip6.fr/mailman/listinfo/cocci



-- 
Nicolas Palix
Tel: +33 4 76 51 46 27
http://membres-liglab.imag.fr/palix/


linux-next: Tree for Jul 10

2013-07-09 Thread Stephen Rothwell
Hi all,

Changes since 20130709:

New trees: metag-fixes, drm-intel-fixes

The ceph tree gained conflicts against Linus' tree.

The slab tree gained a build failure so I used the version from
next-20130709.

The ftrace tree gained a conflict against Linus' tree.

The akpm tree lost lots of patches that turned up elsewhere.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 230 trees (counting Linus' and 33 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (2e17c5a Merge branch 'drm-next' of 
git://people.freedesktop.org/~airlied/linux)
Merging fixes/master (8177a9d lseek(fd, n, SEEK_END) does *not* go to eof - n)
Merging kbuild-current/rc-fixes (42a0940 Merge branch 'yem-kconfig-rc-fixes' of 
git://gitorious.org/linux-kconfig/linux-kconfig into kbuild/rc-fixes)
Merging arc-current/for-curr (baadb8f ARC: warn on improper stack unwind FDE 
entries)
Merging arm-current/fixes (3e0a07f ARM: 7773/1: PJ4B: Add support for errata 
4742)
Merging m68k-current/for-linus (767bcb4 Merge branch 'exotic-arch-fixes' into 
for-next)
Merging metag-fixes/fixes (d903bca metag: checksum.h: fix carry in 
csum_tcpudp_nofold)
Merging powerpc-merge/merge (ea461ab powerpc/eeh: Fix fetching bus for 
single-dev-PE)
Merging sparc/master (c069114 mn10300: Fix include dependency in irqflags.h et 
al.)
Merging net/master (8bb495e Linux 3.10)
Merging ipsec/master (01cb71d net_sched: restore "overhead xxx" handling)
Merging sound-current/for-linus (cd63a5f ALSA: hda - Keep halting ALC5505 DSP)
Merging pci-current/for-linus (65694c5 x86/PCI: Map PCI setup data with 
ioremap() so it can be in highmem)
Merging wireless/master (57bf744 Merge branch 'master' of 
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth)
Merging driver-core.current/driver-core-linus (fc76a25 Merge tag 
'driver-core-3.11-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core)
Merging tty.current/tty-linus (9e895ac Linux 3.10-rc7)
Merging usb.current/usb-linus (fc76a25 Merge tag 'driver-core-3.11-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core)
Merging staging.current/staging-linus (fc76a25 Merge tag 'driver-core-3.11-rc1' 
of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core)
Merging char-misc.current/char-misc-linus (fc76a25 Merge tag 
'driver-core-3.11-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core)
Merging input-current/for-linus (62f548d Input: cyttsp4 - use 16bit address for 
I2C/SPI communication)
Merging md-current/for-linus (1376512 md/raid10: fix bug which causes all 
RAID10 reshapes to move no data.)
Merging audit-current/for-linus (c158a35 audit: no leading space in 
audit_log_d_path prefix)
Merging crypto-current/master (02c0241 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto)
Merging ide/master (bf6b438 ide: gayle: use module_platform_driver_probe())
Merging dwmw2/master (5950f08 pcmcia: remove RPX board stuff)
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to 
inline functions)
Merging irqdomain-current/irqdomain/merge (d94ea3f irqchip: Return -EPERM for 
reserved IRQs)
Merging devic

Re: [v3][PATCH 7/8] book3e/kexec/kdump: redefine VIRT_PHYS_OFFSET

2013-07-09 Thread tiejun.chen

On 07/10/2013 01:20 PM, Bhushan Bharat-R65777 wrote:
>
>
>> -Original Message-
>> From: Linuxppc-dev [mailto:linuxppc-dev-
>> bounces+bharat.bhushan=freescale@lists.ozlabs.org] On Behalf Of Tiejun Chen
>> Sent: Tuesday, July 09, 2013 1:33 PM
>> To: b...@kernel.crashing.org
>> Cc: linuxppc-...@lists.ozlabs.org; linux-kernel@vger.kernel.org
>> Subject: [v3][PATCH 7/8] book3e/kexec/kdump: redefine VIRT_PHYS_OFFSET
>>
>> Book3e is always aligned 1GB to create TLB so we should
>> use (KERNELBASE - MEMORY_START) as VIRT_PHYS_OFFSET to
>> get __pa/__va properly while boot kdump.
>>
>> Signed-off-by: Tiejun Chen 
>> ---
>>   arch/powerpc/include/asm/page.h |2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
>> index 988c812..5b00081 100644
>> --- a/arch/powerpc/include/asm/page.h
>> +++ b/arch/powerpc/include/asm/page.h
>> @@ -112,6 +112,8 @@ extern long long virt_phys_offset;
>>   /* See Description below for VIRT_PHYS_OFFSET */
>>   #ifdef CONFIG_RELOCATABLE_PPC32
>>   #define VIRT_PHYS_OFFSET virt_phys_offset
>> +#elif defined(CONFIG_PPC_BOOK3E_64)
>> +#define VIRT_PHYS_OFFSET (KERNELBASE - MEMORY_START)
>
> Can you please explain this code a bit more. I am not understanding this part:)


Nothing special: we only need to redefine this to make sure __va()/__pa()
work correctly for Book3E-64 in the BOOKE case:


#ifdef CONFIG_BOOKE
#define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) + VIRT_PHYS_OFFSET))
#define __pa(x) ((unsigned long)(x) - VIRT_PHYS_OFFSET)

The arch/powerpc/include/asm/page.h file also has a more detailed description
inline :)
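
As a worked illustration of the arithmetic, here is a small userspace sketch.
It is not kernel code: the two addresses are hypothetical values picked only
to make the numbers visible, and sketch_va()/sketch_pa() are stand-ins that
operate on plain integers instead of pointers.

```c
#include <stdint.h>

/* Hypothetical addresses chosen only to make the arithmetic visible. */
#define KERNELBASE   UINT64_C(0xc000000000000000)
#define MEMORY_START UINT64_C(0x0000000020000000)

/* Same shape as the definition added by the patch. */
#define VIRT_PHYS_OFFSET (KERNELBASE - MEMORY_START)

/* Userspace stand-ins for __va()/__pa(), using integers instead of pointers. */
static uint64_t sketch_va(uint64_t phys)
{
	return phys + VIRT_PHYS_OFFSET;
}

static uint64_t sketch_pa(uint64_t virt)
{
	return virt - VIRT_PHYS_OFFSET;
}
```

With these assumed values the first byte of memory (MEMORY_START) maps to
KERNELBASE, and the two conversions are exact inverses of each other.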

Tiejun


Re: [PATCH 0/5] Add phy support for AM335X platform using Generic PHy framework

2013-07-09 Thread George Cherian

On 7/10/2013 10:53 AM, Felipe Balbi wrote:
> On Wed, Jul 10, 2013 at 10:26:25AM +0530, George Cherian wrote:
>> On 7/9/2013 5:05 PM, Kishon Vijay Abraham I wrote:
>>> Hi,
>>>
>>> On Tuesday 09 July 2013 11:10 AM, George Cherian wrote:
>>>> On 7/9/2013 1:14 AM, Sebastian Andrzej Siewior wrote:
>>>>> On 07/08/2013 12:43 PM, George Cherian wrote:
>>>>>> This patch series adds phy support for AM335X platform.
>>>>>> This patch series is based on Generic PHY framework [1].
>>>>>>
>>>>>> This series has
>>>>>>  - adds dual musb instances support for am335x platform (just for testing)
>>>>>>  - adds phy-am-usb driver used in AM platforms
>>>>>>  - adds dt  bindings for the phys
>>>>>>  - removes usb-phy and replaced with generic phy apis in glue layer
>>>>> No, I don't like this all. You did the one thing I tried to avoid while
>>>>> posting my quick-and-dirty phy driver recently: You duplicated a lot of
>>>>> code which can be served by the nop driver and added only power
>>>>> on/power off callbacks.
>>>> I wanted to add phy wakeup control also, but currently phy_ops  dont have an op
>>>> for wkup_ctrl
>>>> Kishon, Can we add one?
>>> Since this should be a capability of the PHY, can't we have wkup_ctrl always
>>> enabled if the PHY has such a capability?
>>
>> No, we cant have wakeup always enabled. Normally we enable it only
>> when we go to low power states and
>> if the user needs USB a wakeup source.
>>
>> So how about enable/disable  phy wakeup from phy suspend/resume?
>
> you should use something like so on your ->suspend() or
> ->runtime_suspend() method
>
> static int my_phy_{suspend,runtime_suspend}(struct device *dev)
> {
> 	struct my_phy *phy = dev_get_drvdata(dev);
>
> 	if (device_may_wakeup(dev))
> 		my_phy_enable_wakeup(phy);
>
> 	return 0;
> }

Makes sense. will do it in v2.

>>> or if it needs more user control,
>>> should we implement a sysfs entry to enable wakeup?
>
> that already exists ;-)



--
-George



Re: [v3][PATCH 1/8] powerpc/book3e: rename interrupt_end_book3e with __end_interrupts

2013-07-09 Thread tiejun.chen

On 07/10/2013 01:17 PM, Bhushan Bharat-R65777 wrote:
>
>
>> -Original Message-
>> From: Linuxppc-dev [mailto:linuxppc-dev-
>> bounces+bharat.bhushan=freescale@lists.ozlabs.org] On Behalf Of Tiejun Chen
>> Sent: Tuesday, July 09, 2013 1:33 PM
>> To: b...@kernel.crashing.org
>> Cc: linuxppc-...@lists.ozlabs.org; linux-kernel@vger.kernel.org
>> Subject: [v3][PATCH 1/8] powerpc/book3e: rename interrupt_end_book3e with
>> __end_interrupts
>>
>> We can rename 'interrupt_end_book3e' with '__end_interrupts' then book3s/book3e
>> can share this unique label to make sure we can use this conveniently.
>
> I think we can be consistent with start and end names, no?


Are you suggesting renaming 'interrupt_base_book3e' to '__base_interrupts' here?
That seems optional, since book3s has no similar start label, so I'd like to
keep it as it is for now.


Let's wait for other comments first.

Tiejun



Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-09 Thread Viresh Kumar
On 10 July 2013 09:42, Michael Wang  wrote:
> I'm not sure what is supposed to happen after the CPUFREQ_GOV_STOP event is
> notified. If it is meant to stop queued work and prevent further work from
> being queued, then it fails to do so, and we need some way to stop work from
> being queued again once CPUFREQ_GOV_STOP is notified, such as a flag in the
> policy that is checked before the work re-queues itself.
>
> But if the event is only meant to sync the queued work, not to prevent further
> work from happening, then things become tough... we need to confirm.

After GOV_STOP, until the time GOV_START is called, we shouldn't be
queuing any work.
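
One way to picture that rule is a flag that is set on GOV_STOP, cleared on
GOV_START, and checked before any work is queued. The sketch below is only an
illustration in plain userspace C: struct gov_state and the gov_*() helpers
are hypothetical names, not cpufreq interfaces, and a real implementation
would also need locking against a work item that is already running.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-policy governor state. */
struct gov_state {
	bool stopped; /* set on GOV_STOP, cleared on GOV_START */
	int queued;   /* number of work items accepted, for illustration */
};

static void gov_start(struct gov_state *g)
{
	g->stopped = false;
}

static void gov_stop(struct gov_state *g)
{
	g->stopped = true;
}

/* Returns true if work was queued; refuses between GOV_STOP and GOV_START. */
static bool gov_queue_work(struct gov_state *g)
{
	if (g->stopped)
		return false;
	g->queued++;
	return true;
}
```

The work function itself would call gov_queue_work() when it wants to re-arm,
so a stop that lands while the work is running still prevents the next round.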


[GIT PULL] arch/arc updates #2 for 3.11-rc1

2013-07-09 Thread Vineet Gupta
Hi Linus,

Please pull.

Thx,
-Vineet

-->
The following changes since commit 7d132055814ef17a6c7b69f342244c410a5e000f:

  Linux 3.10-rc6 (2013-06-15 11:51:07 -1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc.git/ tags/arc-v3.11-rc1-part2

for you to fetch changes up to 723e2b801d803035ec7a7c0fe162a6c9fc118164:

  ARC: [TB10x] Updates for irqchip driver (2013-06-28 15:07:42 +0530)


Couple of Platform updates (Device Tree files primarily) given that the
corresponding drivers (net/ethernet/arc/*, irqctl/irq-tb10x.c) have now
been merged into your tree.

Ideally these should have been part of the same submissions, oh well...


Alexey Brodkin (1):
  ARC: [plat-arcfpga] Enable arc_emac for ARCAngle4 Board

Christian Ruppert (1):
  ARC: [TB10x] Updates for irqchip driver

 arch/arc/boot/dts/abilis_tb100.dtsi         | 32 +
 arch/arc/boot/dts/abilis_tb101.dtsi         | 32 +
 arch/arc/boot/dts/abilis_tb10x.dtsi         | 32 +
 arch/arc/boot/dts/angel4.dts                | 16 +++
 arch/arc/configs/fpga_defconfig             |  3 +++
 arch/arc/plat-arcfpga/include/plat/irq.h    |  2 --
 arch/arc/plat-arcfpga/include/plat/memmap.h |  2 --
 arch/arc/plat-tb10x/Kconfig                 |  1 +
 8 files changed, 62 insertions(+), 58 deletions(-)


Re: [PATCH] usb: phy: samsung-usb2: Toggle HSIC GPIO from device tree

2013-07-09 Thread Felipe Balbi
On Tue, Jul 09, 2013 at 05:34:15PM -0700, Julius Werner wrote:
> This patch adds support for a new 'samsung,hsic-reset-gpio' in the
> device tree, which will be interpreted as an active-low reset pin during
> PHY initialization when it exists. Useful for integrated HSIC devices
> like an SMSC 3503 hub. It is necessary to add this directly to the PHY
> initialization to get the timing right, since resetting a HSIC device
> after it has already been enumerated can confuse the USB stack.
> 
> Also fixes PHY semaphore code to make sure we always go through the
> setup at least once, even if it was already turned on (e.g. by
> firmware), and changes a spinlock to a mutex to allow sleeping in the
> critical section.
> 
> Change-Id: Ieecac52c27daa7a17a7ed3b2863ddba3aeb8d16f
> Signed-off-by: Julius Werner 
> ---
>  .../devicetree/bindings/usb/samsung-usbphy.txt | 10 ++
>  drivers/usb/phy/phy-samsung-usb.c  | 17 ++
>  drivers/usb/phy/phy-samsung-usb.h  |  7 ++--
>  drivers/usb/phy/phy-samsung-usb2.c | 38 
> ++
>  drivers/usb/phy/phy-samsung-usb3.c | 12 +++
>  5 files changed, 55 insertions(+), 29 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt 
> b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
> index 33fd354..82e2e16 100644
> --- a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
> +++ b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
> @@ -31,6 +31,12 @@ Optional properties:
>  - ranges: allows valid translation between child's address space and parent's
> address space.
>  
> +- samsung,hsic-reset-gpio: an active low GPIO pin that resets a device
> + connected to the HSIC port. Useful for things like
> + an on-board SMSC3503 hub.
> +- pinctrl-0: Pin control group containing the HSIC reset GPIO pin.
> +- pinctrl-names: Should contain only one value - "default".
> +
>  - The child node 'usbphy-sys' to the node 'usbphy' is for the system 
> controller
>interface for usb-phy. It should provide the following information 
> required by
>usb-phy controller to control phy.
> @@ -56,6 +62,10 @@ Example:
>   clocks = < 2>, < 305>;
>   clock-names = "xusbxti", "otg";
>  
> + samsung,hsic-reset-gpio = < 4 1>;

looks like this should be modeled as a fixed-regulator ?

-- 
balbi




Re: [PATCH 0/5] Add phy support for AM335X platform using Generic PHy framework

2013-07-09 Thread Felipe Balbi
On Wed, Jul 10, 2013 at 10:26:25AM +0530, George Cherian wrote:
> On 7/9/2013 5:05 PM, Kishon Vijay Abraham I wrote:
> >Hi,
> >
> >On Tuesday 09 July 2013 11:10 AM, George Cherian wrote:
> >>On 7/9/2013 1:14 AM, Sebastian Andrzej Siewior wrote:
> >>>On 07/08/2013 12:43 PM, George Cherian wrote:
> >>>>This patch series adds phy support for AM335X platform.
> >>>>This patch series is based on Generic PHY framework [1].
> >>>>
> >>>>
> >>>>This series has
> >>>> - adds dual musb instances support for am335x platform (just for testing)
> >>>> - adds phy-am-usb driver used in AM platforms
> >>>> - adds dt  bindings for the phys
> >>>> - removes usb-phy and replaced with generic phy apis in glue layer
> >>>No, I don't like this all. You did the one thing I tried to avoid while
> >>>posting my quick-and-dirty phy driver recently: You duplicated a lot of
> >>>code which can be served by the nop driver and added only power
> >>>on/power off callbacks.
> >>I wanted to add phy wakeup control also, but currently phy_ops  dont have 
> >>an op
> >>for wkup_ctrl
> >>Kishon, Can we add one?
> >Since this should be a capability of the PHY, can't we have wkup_ctrl always
> >enabled if the PHY has such a capability?
> 
> No, we cant have wakeup always enabled. Normally we enable it only
> when we go to low power states and
> if the user needs USB a wakeup source.
> 
> So how about enable/disable  phy wakeup from phy suspend/resume?

you should use something like so on your ->suspend() or
->runtime_suspend() method

static int my_phy_{suspend,runtime_suspend}(struct device *dev)
{
	struct my_phy *phy = dev_get_drvdata(dev);

	if (device_may_wakeup(dev))
		my_phy_enable_wakeup(phy);

	return 0;
}

> >or if it needs more user control,
> >should we implement a sysfs entry to enable wakeup?

that already exists ;-)
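
To make the suspend/resume pairing concrete, here is a minimal sketch that
builds in userspace. Everything in it is an assumption for illustration:
struct device, device_may_wakeup() and dev_get_drvdata() are tiny mocks of
the kernel facilities with the same names, and struct my_phy,
my_phy_enable_wakeup() and my_phy_disable_wakeup() are hypothetical driver
helpers.

```c
#include <stdbool.h>

/* Minimal userspace mocks; the real struct device lives in the kernel. */
struct device {
	bool may_wakeup; /* mock of the state device_may_wakeup() reports */
	void *drvdata;
};

/* Hypothetical driver state. */
struct my_phy {
	bool wakeup_enabled;
};

static bool device_may_wakeup(struct device *dev)
{
	return dev->may_wakeup;
}

static void *dev_get_drvdata(struct device *dev)
{
	return dev->drvdata;
}

/* Hypothetical driver helpers that would program the PHY wakeup control. */
static void my_phy_enable_wakeup(struct my_phy *phy)
{
	phy->wakeup_enabled = true;
}

static void my_phy_disable_wakeup(struct my_phy *phy)
{
	phy->wakeup_enabled = false;
}

/* Arm wakeup only when the user has allowed this device to wake the system. */
static int my_phy_suspend(struct device *dev)
{
	struct my_phy *phy = dev_get_drvdata(dev);

	if (device_may_wakeup(dev))
		my_phy_enable_wakeup(phy);

	return 0;
}

/* Undo it on the way back up, so wakeup is not left enabled while running. */
static int my_phy_resume(struct device *dev)
{
	struct my_phy *phy = dev_get_drvdata(dev);

	if (device_may_wakeup(dev))
		my_phy_disable_wakeup(phy);

	return 0;
}
```

In the kernel the user-facing switch is the power/wakeup attribute in the
device's sysfs directory, which is what device_may_wakeup() reflects, so no
new sysfs entry is needed for this.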

-- 
balbi




Re: [PATCH net-next] net: rename low latency sockets functions to busy poll

2013-07-09 Thread Eliezer Tamir
On 10/07/2013 07:41, David Miller wrote:
> From: Eliezer Tamir 
> Date: Wed, 10 Jul 2013 06:29:16 +0300
> 
>> If the following names changes are acceptable I will try to send out
>> a patch today.

>> 2. ndo_ll_poll -> ndo_busy_poll
>>
>> - not technically accurate since the ndo callback does not itself busy
>> poll, it's just used to implement it.
> 
> I think this name change is accurate, it expresses the two elements of
> what it does.  It's busy waiting, in that it's doing a synchronous
> scan of the device's RX queue, and it's polling just like NAPI polling
> does.

OK

> Well... what would be great would be to come up with some single
> interface that drivers can implement rather than having to have
> both napi->poll and netdevice_ops->ndo_ll_poll().  But that's a task
> for a later date.
> 
> Therefore, ndo_busy_poll is probably best for now.

I will think about this, maybe we could even unify ndo_poll_controller.
It seems like said unified method would have to have an extra parameter
that would indicate from which context it was called:
1. from napi poll (bh)
2. from poll controller (with interrupts disabled)
3. from busy poll (user context)
-of course not for today.

> Nope, looks complete.  And also do the manpage update.

Where do I find the repository for the manpages?




RE: [v3][PATCH 7/8] book3e/kexec/kdump: redefine VIRT_PHYS_OFFSET

2013-07-09 Thread Bhushan Bharat-R65777


> -Original Message-
> From: Linuxppc-dev [mailto:linuxppc-dev-
> bounces+bharat.bhushan=freescale@lists.ozlabs.org] On Behalf Of Tiejun 
> Chen
> Sent: Tuesday, July 09, 2013 1:33 PM
> To: b...@kernel.crashing.org
> Cc: linuxppc-...@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: [v3][PATCH 7/8] book3e/kexec/kdump: redefine VIRT_PHYS_OFFSET
> 
> Book3e is always aligned 1GB to create TLB so we should
> use (KERNELBASE - MEMORY_START) as VIRT_PHYS_OFFSET to
> get __pa/__va properly while boot kdump.
> 
> Signed-off-by: Tiejun Chen 
> ---
>  arch/powerpc/include/asm/page.h |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
> index 988c812..5b00081 100644
> --- a/arch/powerpc/include/asm/page.h
> +++ b/arch/powerpc/include/asm/page.h
> @@ -112,6 +112,8 @@ extern long long virt_phys_offset;
>  /* See Description below for VIRT_PHYS_OFFSET */
>  #ifdef CONFIG_RELOCATABLE_PPC32
>  #define VIRT_PHYS_OFFSET virt_phys_offset
> +#elif defined(CONFIG_PPC_BOOK3E_64)
> +#define VIRT_PHYS_OFFSET (KERNELBASE - MEMORY_START)

Can you please explain this code a bit more. I am not understanding this part:)

-Bharat

>  #else
>  #define VIRT_PHYS_OFFSET (KERNELBASE - PHYSICAL_START)
>  #endif
> --
> 1.7.9.5
> 
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev




RE: [v3][PATCH 1/8] powerpc/book3e: rename interrupt_end_book3e with __end_interrupts

2013-07-09 Thread Bhushan Bharat-R65777


> -Original Message-
> From: Linuxppc-dev [mailto:linuxppc-dev-
> bounces+bharat.bhushan=freescale@lists.ozlabs.org] On Behalf Of Tiejun 
> Chen
> Sent: Tuesday, July 09, 2013 1:33 PM
> To: b...@kernel.crashing.org
> Cc: linuxppc-...@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: [v3][PATCH 1/8] powerpc/book3e: rename interrupt_end_book3e with
> __end_interrupts
> 
> We can rename 'interrupt_end_book3e' with '__end_interrupts' then 
> book3s/book3e
> can share this unique label to make sure we can use this conveniently.

I think we can be consistent with start and end names, no?

-Bharat

> 
> Signed-off-by: Tiejun Chen 
> ---
>  arch/powerpc/kernel/exceptions-64e.S |8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64e.S
> b/arch/powerpc/kernel/exceptions-64e.S
> index 645170a..a518e48 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -309,8 +309,8 @@ interrupt_base_book3e:
> /* fake
> trap */
>   EXCEPTION_STUB(0x300, hypercall)
>   EXCEPTION_STUB(0x320, ehpriv)
> 
> - .globl interrupt_end_book3e
> -interrupt_end_book3e:
> + .globl __end_interrupts
> +__end_interrupts:
> 
>  /* Critical Input Interrupt */
>   START_EXCEPTION(critical_input);
> @@ -493,7 +493,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
>   beq+1f
> 
>   LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
> - LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
> + LOAD_REG_IMMEDIATE(r15,__end_interrupts)
>   cmpld   cr0,r10,r14
>   cmpld   cr1,r10,r15
>   blt+cr0,1f
> @@ -559,7 +559,7 @@ kernel_dbg_exc:
>   beq+1f
> 
>   LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
> - LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
> + LOAD_REG_IMMEDIATE(r15,__end_interrupts)
>   cmpld   cr0,r10,r14
>   cmpld   cr1,r10,r15
>   blt+cr0,1f
> --
> 1.7.9.5
> 
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev




Re: [PATCH net 2/2] macvtap: correctly linearize skb when zerocopy is used

2013-07-09 Thread Jason Wang
On 07/09/2013 06:35 PM, Michael S. Tsirkin wrote:
> On Tue, Jul 09, 2013 at 06:10:51PM +0800, Jason Wang wrote:
>> Userspace may produce more vectors than MAX_SKB_FRAGS. When we
>> linearize part of the skb so that the rest of the iov fits in the
>> frags, we need to count copylen into linear when calling macvtap_alloc_skb()
>> instead of partly counting it into data_len. Otherwise this breaks
>> zerocopy_sg_from_iovec(), since its inner counter assumes nr_frags is
>> zero at the beginning. This causes nr_frags to be increased wrongly
>> without setting the correct frags.
>>
>> This bug was introduced by b92946e2919134ebe2a4083e4302236295ea2a73
>> (macvtap: zerocopy: validate vectors before building skb).
>>
>> Cc: Michael S. Tsirkin 
>> Signed-off-by: Jason Wang 
>> ---
>> This patch is needed for stable.
>> ---
>>  drivers/net/macvtap.c |3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
>> index f2c4a3b..b213020 100644
>> --- a/drivers/net/macvtap.c
>> +++ b/drivers/net/macvtap.c
>> @@ -770,7 +770,8 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>>  copylen = len;
>>  
>>  skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
>> -vnet_hdr.hdr_len, noblock, &err);
>> +zerocopy ? copylen : vnet_hdr.hdr_len,
>> +noblock, &err);
>>  if (!skb)
>>  goto err;
> Same comment as for tun - let's add code for the if statement above.
> Thanks!
>

Sure, I will post v2.

Thanks
>> -- 
>> 1.7.1
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
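
The sizing decision discussed in this thread can be sketched as a small userspace model. This is not the driver code: MODEL_MAX_FRAGS, choose_copylen() and linear_len() are hypothetical names, the fragment limit is an assumed value, and any header-length adjustments the driver may also apply are left out. Only the "zerocopy ? copylen : vnet_hdr.hdr_len" expression is taken from the diff.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for MAX_SKB_FRAGS; the real value is config dependent. */
#define MODEL_MAX_FRAGS 17

/*
 * Bytes that must be copied into the linear area so that the remaining
 * segments fit into MODEL_MAX_FRAGS fragments: the total length of the
 * first (count - MODEL_MAX_FRAGS) segments, or 0 if everything fits.
 */
size_t choose_copylen(const size_t *seg_len, size_t count)
{
	size_t copylen = 0;
	size_t i;

	if (count <= MODEL_MAX_FRAGS)
		return 0;
	for (i = 0; i < count - MODEL_MAX_FRAGS; i++)
		copylen += seg_len[i];
	return copylen;
}

/*
 * Linear length requested from the allocator. With zerocopy the whole
 * copied prefix is counted as linear (the behaviour the patch introduces);
 * otherwise only the header length hint is used.
 */
size_t linear_len(bool zerocopy, size_t copylen, size_t hdr_len)
{
	return zerocopy ? copylen : hdr_len;
}
```

With 20 segments of 100 bytes each, the model copies the first three segments (300 bytes) and requests 300 bytes of linear space in the zerocopy case, so no fragment slots are consumed before the zerocopy mapping starts.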



Re: [PATCH 6/8] KVM: PPC: Add support for multiple-TCE hcalls

2013-07-09 Thread Alexey Kardashevskiy
On 07/10/2013 03:02 AM, Alexander Graf wrote:
> On 07/06/2013 05:07 PM, Alexey Kardashevskiy wrote:
>> This adds real mode handlers for the H_PUT_TCE_INDIRECT and
>> H_STUFF_TCE hypercalls for QEMU emulated devices such as IBMVIO
>> devices or emulated PCI.  These calls allow adding multiple entries
>> (up to 512) into the TCE table in one call which saves time on
>> transition to/from real mode.
> 
> We don't mention QEMU explicitly in KVM code usually.
> 
>> This adds a tce_tmp cache to kvm_vcpu_arch to save valid TCEs
>> (copied from user and verified) before writing the whole list into
>> the TCE table. This cache will be utilized more in the upcoming
>> VFIO/IOMMU support to continue TCE list processing in the virtual
>> mode in the case if the real mode handler failed for some reason.
>>
>> This adds a guest physical to host real address converter
>> and calls the existing H_PUT_TCE handler. The converting function
>> is going to be fully utilized by upcoming VFIO supporting patches.
>>
>> This also implements the KVM_CAP_PPC_MULTITCE capability,
>> so in order to support the functionality of this patch, QEMU
>> needs to query for this capability and set the "hcall-multi-tce"
>> hypertas property only if the capability is present, otherwise
>> there will be serious performance degradation.
> 
> Same as above. But really you're only giving recommendations here. What's
> the point? Please describe what the benefit of this patch is, not what some
> other random subsystem might do with the benefits it brings.
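
The "validate the whole list into tce_tmp, then write the table" approach from the commit message can be sketched in plain C. This is an illustrative model under stated assumptions, not the KVM handler: model_tce_valid() uses a made-up validity rule, and model_put_tce_indirect() and its parameters are hypothetical names. Only the 512-entry limit and the two-phase idea come from the patch description.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

#define MODEL_MAX_TCES 512	/* per-call limit quoted in the commit message */

/* Hypothetical validity rule for this sketch: bits 2..11 must be clear. */
bool model_tce_valid(uint64_t tce)
{
	return (tce & 0xffcULL) == 0;
}

/*
 * Phase 1 checks every entry and copies it into a scratch buffer (the
 * role the commit message gives to tce_tmp); phase 2 writes the table
 * only if all entries passed. Returns 0 on success, -1 if nothing was
 * written.
 */
int model_put_tce_indirect(uint64_t *table, size_t table_len, size_t first,
			   const uint64_t *list, size_t npages,
			   uint64_t *scratch)
{
	size_t i;

	if (npages > MODEL_MAX_TCES || first > table_len ||
	    npages > table_len - first)
		return -1;

	for (i = 0; i < npages; i++) {
		if (!model_tce_valid(list[i]))
			return -1;	/* table is still untouched */
		scratch[i] = list[i];
	}
	for (i = 0; i < npages; i++)
		table[first + i] = scratch[i];
	return 0;
}
```

The point of the split is that a bad entry in the middle of the list never leaves the table half updated, which is the property the 2013/05/21 changelog entry describes.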
> 
>>
>> Signed-off-by: Paul Mackerras
>> Signed-off-by: Alexey Kardashevskiy
>>
>> ---
>> Changelog:
>> 2013/07/06:
>> * fixed number of wrong get_page()/put_page() calls
>>
>> 2013/06/27:
>> * fixed clear of BUSY bit in kvmppc_lookup_pte()
>> * H_PUT_TCE_INDIRECT does realmode_get_page() now
>> * KVM_CAP_SPAPR_MULTITCE now depends on CONFIG_PPC_BOOK3S_64
>> * updated doc
>>
>> 2013/06/05:
>> * fixed mistype about IBMVIO in the commit message
>> * updated doc and moved it to another section
>> * changed capability number
>>
>> 2013/05/21:
>> * added kvm_vcpu_arch::tce_tmp
>> * removed cleanup if put_indirect failed, instead we do not even start
>> writing to TCE table if we cannot get TCEs from the user and they are
>> invalid
>> * kvmppc_emulated_h_put_tce is split to kvmppc_emulated_put_tce
>> and kvmppc_emulated_validate_tce (for the previous item)
>> * fixed bug with failthrough for H_IPI
>> * removed all get_user() from real mode handlers
>> * kvmppc_lookup_pte() added (instead of making lookup_linux_pte public)
>>
>> Signed-off-by: Alexey Kardashevskiy
>> ---
>>   Documentation/virtual/kvm/api.txt   |  25 +++
>>   arch/powerpc/include/asm/kvm_host.h |   9 ++
>>   arch/powerpc/include/asm/kvm_ppc.h  |  16 +-
>>   arch/powerpc/kvm/book3s_64_vio.c| 154 ++-
>>   arch/powerpc/kvm/book3s_64_vio_hv.c | 260
>>   arch/powerpc/kvm/book3s_hv.c|  41 -
>>   arch/powerpc/kvm/book3s_hv_rmhandlers.S |   6 +
>>   arch/powerpc/kvm/book3s_pr_papr.c   |  37 -
>>   arch/powerpc/kvm/powerpc.c  |   3 +
>>   9 files changed, 517 insertions(+), 34 deletions(-)
>>
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index 6365fef..762c703 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -2362,6 +2362,31 @@ calls by the guest for that service will be passed to userspace to be
>>   handled.
>>
>>
>> +4.86 KVM_CAP_PPC_MULTITCE
>> +
>> +Capability: KVM_CAP_PPC_MULTITCE
>> +Architectures: ppc
>> +Type: vm
>> +
>> +This capability means the kernel is capable of handling hypercalls
>> +H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
>> +space. This significanly accelerates DMA operations for PPC KVM guests.
> 
> significanly? Please run this through a spell checker.
> 
>> +The user space should expect that its handlers for these hypercalls
> 
> s/The//
> 
>> +are not going to be called.
> 
> Is user space guaranteed they will not be called? Or can it still happen?

... if user space previously registered LIOBN in KVM (via
KVM_CREATE_SPAPR_TCE or similar calls).

ok?

There is also KVM_CREATE_SPAPR_TCE_IOMMU but it is not in the kernel yet
and may never get there.


>> +In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
>> +the user space might have to advertise it for the guest. For example,
>> +IBM pSeries guest starts using them if "hcall-multi-tce" is present in
>> +the "ibm,hypertas-functions" device-tree property.
> 
> This paragraph describes sPAPR. That's fine, but please document it as
> such. Also please check your grammar.

>> +
>> +Without this capability, only H_PUT_TCE is handled by the kernel and
>> +therefore the use of H_PUT_TCE_INDIRECT and H_STUFF_TCE is not recommended
>> +unless the capability is present as passing hypercalls to the userspace
>> +slows 

Re: [PATCH 0/5] Add phy support for AM335X platform using Generic PHy framework

2013-07-09 Thread George Cherian

On 7/9/2013 5:05 PM, Kishon Vijay Abraham I wrote:

Hi,

On Tuesday 09 July 2013 11:10 AM, George Cherian wrote:

On 7/9/2013 1:14 AM, Sebastian Andrzej Siewior wrote:

On 07/08/2013 12:43 PM, George Cherian wrote:

This patch series adds phy support for AM335X platform.
This patch series is based on Generic PHY framework [1].


This series has
 - adds dual musb instances support for am335x platform (just for testing)
 - adds phy-am-usb driver used in AM platforms
 - adds dt  bindings for the phys
 - removes usb-phy and replaced with generic phy apis in glue layer

No, I don't like this all. You did the one thing I tried to avoid while
posting my quick-and-dirty phy driver recently: You duplicated a lot of
code which can be served by the nop driver and added only power
on/power off callbacks.

I wanted to add phy wakeup control as well, but currently phy_ops doesn't have
an op for wkup_ctrl.
Kishon, can we add one?

Since this should be a capability of the PHY, can't we have wkup_ctrl always
enabled if the PHY has such a capability?


No, we can't have wakeup always enabled. Normally we enable it only when
we go to low power states and if the user needs USB as a wakeup source.

So how about enable/disable  phy wakeup from phy suspend/resume?

or if it needs more user control,
should we implement a sysfs entry to enable wakeup?


Nope, I don't want to create another sysfs entry.


Thanks
Kishon



--
-George



Re: [PATCH RFC] fsio: filesystem io accounting cgroup

2013-07-09 Thread Sha Zhengju
Hi,

On Mon, Jul 8, 2013 at 5:59 PM, Konstantin Khlebnikov
 wrote:
> This is a proof of concept, just basic functionality for an IO controller.
> This cgroup will control filesystem usage at the vfs layer; its main goal is
> bandwidth control. It's supposed to be much more lightweight than memcg/blkio.
>
> This patch shows an easy way of accounting pages in dirty/writeback state in a
> per-inode manner. This is easier than doing it in memcg in a per-page manner.
> The main idea is to keep on each inode a pointer (->i_fsio) to the cgroup which
> owns the dirty data in that inode. It is set by fsio_account_page_dirtied() when
> the first dirty tag appears in the inode. Relying on mapping tags gives us locking
> for free; this patch doesn't add any new locks to hot paths.

When looking at dirty/writeback numbers, what I care about is how many
dirty pages there are relative to how much memory, since that proportion
may later be used to decide throttling or something else. So if you are
talking about the number of dirty pages without memcg's amount of memory,
I don't see what a single number means.

What's more, counting dirty/writeback stats in a per-inode manner can
be inaccurate in some situations: consider two tasks from different
fsio cgroups dirtying one file concurrently, where the pages may be
counted in only one cgroup's stats, or a task that is moved to another
fsio cgroup after dirtying a file. Task moving is the root cause of
adding memcg locks in the page stat routines, since there's a race
window between 'modify cgroup owner' and 'update stats using cgroup
pointer'. If you are going to handle task moves or take care of
->i_fsio for better accuracy in the future, I'm afraid you will also
need some synchronization mechanism in hot paths. Maybe also
a new lock or mapping->tree_lock(which is already hot enough) IMHO.
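
The inaccuracy with two cgroups dirtying one file can be seen in a toy model of per-inode ownership. The struct and function names here are hypothetical, and clearing the owner once the last dirty page is cleaned is an assumption of this sketch rather than something the posted patch is known to do.

```c
#include <stddef.h>

struct model_cgroup {
	long nr_dirty;			/* dirty pages charged to this group */
};

struct model_inode {
	struct model_cgroup *owner;	/* plays the role of ->i_fsio */
	long nr_dirty;
};

/* Charge one dirtied page; the first dirtier becomes the inode's owner. */
void model_account_page_dirtied(struct model_inode *inode,
				struct model_cgroup *current_cg)
{
	if (inode->nr_dirty == 0)
		inode->owner = current_cg;
	inode->nr_dirty++;
	inode->owner->nr_dirty++;
}

/* Uncharge one page once it is no longer dirty. */
void model_account_page_cleaned(struct model_inode *inode)
{
	inode->nr_dirty--;
	inode->owner->nr_dirty--;
	if (inode->nr_dirty == 0)
		inode->owner = NULL;
}
```

If group A dirties the first page and group B the second, both pages end up charged to A. The model is single-threaded on purpose: it shows the accounting skew, not the locking question raised above.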



Thanks,
Sha


>
> Unlike blkio, this method works for all filesystems, not just disk-backed ones.
> It is also able to handle writeback, because each inode has a context which can
> be used in the writeback thread to account io operations.
>
> This is an early prototype. I have some plans for extra functionality, because
> this accounting by itself is mostly useless, but it can be used as a basis for
> more useful features.
>
> Planned improvements:
> * Split bdi into several tiers and account them separately. For example:
>   hdd/ssd/usb/nfs. In complicated containerized environments there might be
>   different kinds of storage with different limits and billing. This is more
>   useful than independent per-disk accounting and much easier to implement,
>   because all per-tier structures are allocated before a disk appears.
> * Add some hooks for accounting actually issued IO requests (iops).
> * Implement bandwidth throttlers for each tier individually (bps and iops).
>   This will be the most tasty feature. I already have a very effective prototype.
> * Add a hook into balance_dirty_pages to limit the amount of dirty pages for each
>   cgroup in each tier individually. This is required for accurate throttling,
>   because if we want to limit the speed of writeback we also must limit the amount
>   of dirty pages, otherwise we have to inject an enormous delay after each sync().
> * Implement filtered writeback requests for writing only data which belongs to
>   a particular fsio cgroup (or cgroup tree) to keep the dirty balance in background.
> * Implement filtered 'sync', a special mode for sync() which syncs only
>   filesystems which 'belong' to the current fsio cgroup. Each container should sync
>   only its own filesystems. This also can be done in terms of 'visibility' in
>   vfsmount namespaces.
>
> This patch applies on top of these:
> b26008c page_writeback: put account_page_redirty() after set_page_dirty()
> 80979bd page_writeback: get rid of account_size argument in cancel_dirty_page()
> c575ef6 hugetlbfs: remove cancel_dirty_page() from truncate_huge_page()
> b720923 nfs: remove redundant cancel_dirty_page() from nfs_wb_page_cancel()
> 4c21e52 mm: remove redundant dirty pages check from __delete_from_page_cache()
>
> Signed-off-by: Konstantin Khlebnikov 
> Cc: cgro...@vger.kernel.org
> Cc: de...@openvz.org
> Cc: Michal Hocko 
> Cc: Sha Zhengju 
> ---
>  block/blk-core.c  |2 +
>  fs/Makefile   |2 +
>  fs/direct-io.c|2 +
>  fs/fsio_cgroup.c  |  137 +
>  fs/nfs/direct.c   |2 +
>  include/linux/cgroup_subsys.h |6 ++
>  include/linux/fs.h|3 +
>  include/linux/fsio_cgroup.h   |  136 +
>  init/Kconfig  |3 +
>  mm/page-writeback.c   |8 ++
>  mm/readahead.c|2 +
>  mm/truncate.c |2 +
>  12 files changed, 304 insertions(+), 1 deletion(-)
>  create mode 100644 fs/fsio_cgroup.c
>  create mode 100644 include/linux/fsio_cgroup.h
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 

Re: [PATCH net-next] net: rename low latency sockets functions to busy poll

2013-07-09 Thread David Miller
From: Eliezer Tamir 
Date: Wed, 10 Jul 2013 06:29:16 +0300

> If the following names changes are acceptable I will try to send out
> a patch today.
> 
> 1. include/net/ll_poll.h -> include/net/busy_poll.h

Agreed.

> 2. ndo_ll_poll -> ndo_busy_poll
> 
> - not technically accurate since the ndo callback does not itself busy
> poll, it's just used to implement it.

I think this name change is accurate, it expresses the two elements of
what it does.  It's busy waiting, in that it's doing a synchronous
scan of the device's RX queue, and it's polling just like NAPI polling
does.

> maybe ndo_napi_id_poll? or ndo_id_poll? I don't really like any of them,
> so a suggestion would be nice.

This would make it sound like it's some new version of the existing
NAPI poll.

Well... what would be great would be to come up with some single
interface that drivers can implement rather than having to have
both napi->poll and netdevice_ops->ndo_ll_poll().  But that's a task
for a later date.

Therefore, ndo_busy_poll is probably best for now.

> 3. sysctl_net_ll_{read,poll} -> sysctl_net_busy_{read,poll}
> - along with matching file name changes.

Agreed.

> 4. {sk,skb}_mark_ll -> {sk,skb}_mark_napi_id

Agreed.

> 5. LL_SO -> BUSY_POLL_SO

Agreed.

> Did I miss anything?

Nope, looks complete.  And also do the manpage update.


Re: [PATCH 2/2] virtio_net: fix race in RX VQ processing

2013-07-09 Thread Asias He
On Tue, Jul 09, 2013 at 11:28:34AM +0800, Jason Wang wrote:
> On 07/08/2013 05:04 PM, Michael S. Tsirkin wrote:
> > virtio net called virtqueue_enable_cb on RX path after napi_complete, so
> > with NAPI_STATE_SCHED clear - outside the implicit napi lock.
> > This violates the requirement to synchronize virtqueue_enable_cb wrt
> > virtqueue_add_buf.  In particular, used event can move backwards,
> > causing us to lose interrupts.
> > In a debug build, this can trigger panic within START_USE.
> >
> > Jason Wang reports that he can trigger the races artificially,
> > by adding udelay() in virtqueue_enable_cb() after virtio_mb().
> >
> > However, we must call napi_complete to clear NAPI_STATE_SCHED before
> > polling the virtqueue for used buffers, otherwise napi_schedule_prep in
> > a callback will fail, causing us to lose RX events.
> >
> > To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
> > set (under napi lock), later call virtqueue_poll with
> > NAPI_STATE_SCHED clear (outside the lock).
> >
> > Reported-by: Jason Wang 
> > Signed-off-by: Michael S. Tsirkin 

Acked-by: Asias He 

> > ---
> 
> Tested-by: Jason Wang 
> Acked-by: Jason Wang 
> >  drivers/net/virtio_net.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 5305bd1..fbdd79a 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -622,8 +622,9 @@ again:
> >  
> > /* Out of packets? */
> > if (received < budget) {
> > +   unsigned r = virtqueue_enable_cb_prepare(rq->vq);
> > napi_complete(napi);
> > -   if (unlikely(!virtqueue_enable_cb(rq->vq)) &&
> > +   if (unlikely(virtqueue_poll(rq->vq, r)) &&
> > napi_schedule_prep(napi)) {
> > virtqueue_disable_cb(rq->vq);
> > __napi_schedule(napi);
> 
> ___
> Virtualization mailing list
> virtualizat...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
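
The ordering that the fix establishes can be walked through with a single-threaded model. This is a sketch only: struct model_rx and model_rx_complete() are hypothetical, napi_sched merely stands in for NAPI_STATE_SCHED, and memory barriers and real concurrency are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_rx {
	uint16_t last_used_idx;	/* driver's consumption point */
	uint16_t used_idx;	/* advanced by the device */
	bool cb_enabled;
	bool napi_sched;	/* stands in for NAPI_STATE_SCHED */
};

/*
 * Tail of a poll routine that ran out of packets. Returns true if the
 * poll was rescheduled because buffers were pending.
 */
bool model_rx_complete(struct model_rx *rq)
{
	/* Step 1: snapshot and enable callbacks while still scheduled. */
	uint16_t snap = rq->last_used_idx;
	rq->cb_enabled = true;

	/* Step 2: the napi_complete() equivalent clears the scheduled state. */
	rq->napi_sched = false;

	/* Step 3: unlocked check against the snapshot, then reschedule. */
	if (snap != rq->used_idx && !rq->napi_sched) {
		rq->napi_sched = true;	/* napi_schedule_prep() succeeded */
		rq->cb_enabled = false;	/* virtqueue_disable_cb() */
		return true;
	}
	return false;
}
```

The state that needs serialization (the snapshot and the callback enable) is touched only in step 1, while the check in step 3 reads a value the device publishes and so does not need the lock.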

-- 
Asias


Re: [PATCH 1/2] virtio: support unlocked queue poll

2013-07-09 Thread Asias He
On Mon, Jul 08, 2013 at 12:04:36PM +0300, Michael S. Tsirkin wrote:
> This adds a way to check ring empty state after enable_cb outside any
> locks. Will be used by virtio_net.
> 
> Note: there's room for more optimization: caller is likely to have a
> memory barrier already, which means we might be able to get rid of a
> barrier here.  Deferring this optimization until we do some
> benchmarking.
> 
> Signed-off-by: Michael S. Tsirkin 

Acked-by: Asias He 

> ---
>  drivers/virtio/virtio_ring.c | 56 ++--
>  include/linux/virtio.h   |  4 
>  2 files changed, 48 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 5217baf..37d58f8 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -607,19 +607,21 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
>  EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
>  
>  /**
> - * virtqueue_enable_cb - restart callbacks after disable_cb.
> + * virtqueue_enable_cb_prepare - restart callbacks after disable_cb
>   * @vq: the struct virtqueue we're talking about.
>   *
> - * This re-enables callbacks; it returns "false" if there are pending
> - * buffers in the queue, to detect a possible race between the driver
> - * checking for more work, and enabling callbacks.
> + * This re-enables callbacks; it returns current queue state
> + * in an opaque unsigned value. This value should be later tested by
> + * virtqueue_poll, to detect a possible race between the driver checking for
> + * more work, and enabling callbacks.
>   *
>   * Caller must ensure we don't call this with other virtqueue
>   * operations at the same time (except where noted).
>   */
> -bool virtqueue_enable_cb(struct virtqueue *_vq)
> +unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
>  {
>   struct vring_virtqueue *vq = to_vvq(_vq);
> + u16 last_used_idx;
>  
>   START_USE(vq);
>  
> @@ -629,15 +631,45 @@ bool virtqueue_enable_cb(struct virtqueue *_vq)
>* either clear the flags bit or point the event index at the next
>* entry. Always do both to keep code simple. */
>   vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
> - vring_used_event(&vq->vring) = vq->last_used_idx;
> + vring_used_event(&vq->vring) = last_used_idx = vq->last_used_idx;
> + END_USE(vq);
> + return last_used_idx;
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
> +
> +/**
> + * virtqueue_poll - query pending used buffers
> + * @vq: the struct virtqueue we're talking about.
> + * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
> + *
> + * Returns "true" if there are pending used buffers in the queue.
> + *
> + * This does not need to be serialized.
> + */
> +bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> +
>   virtio_mb(vq->weak_barriers);
> - if (unlikely(more_used(vq))) {
> - END_USE(vq);
> - return false;
> - }
> + return (u16)last_used_idx != vq->vring.used->idx;
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_poll);
>  
> - END_USE(vq);
> - return true;
> +/**
> + * virtqueue_enable_cb - restart callbacks after disable_cb.
> + * @vq: the struct virtqueue we're talking about.
> + *
> + * This re-enables callbacks; it returns "false" if there are pending
> + * buffers in the queue, to detect a possible race between the driver
> + * checking for more work, and enabling callbacks.
> + *
> + * Caller must ensure we don't call this with other virtqueue
> + * operations at the same time (except where noted).
> + */
> +bool virtqueue_enable_cb(struct virtqueue *_vq)
> +{
> + unsigned last_used_idx = virtqueue_enable_cb_prepare(_vq);
> + return !virtqueue_poll(_vq, last_used_idx);
>  }
>  EXPORT_SYMBOL_GPL(virtqueue_enable_cb);
>  
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index 9ff8645..72398ee 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -70,6 +70,10 @@ void virtqueue_disable_cb(struct virtqueue *vq);
>  
>  bool virtqueue_enable_cb(struct virtqueue *vq);
>  
> +unsigned virtqueue_enable_cb_prepare(struct virtqueue *vq);
> +
> +bool virtqueue_poll(struct virtqueue *vq, unsigned);
> +
>  bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
>  
>  void *virtqueue_detach_unused_buf(struct virtqueue *vq);
> -- 
> MST
> 
> ___
> Virtualization mailing list
> virtualizat...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
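
The split into a "prepare" half and a "poll" half can be illustrated outside the kernel. The following is a simplified model with hypothetical model_* names; it omits the memory barrier, the START_USE/END_USE debugging hooks and the event index write, and keeps only the snapshot-and-compare logic, including the 16-bit truncation used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_vq {
	uint16_t last_used_idx;	/* next used entry the driver will consume */
	uint16_t used_idx;	/* index published by the device, wraps at 65536 */
	bool cb_enabled;
};

/* Re-enable callbacks and return an opaque snapshot of the queue state. */
unsigned model_enable_cb_prepare(struct model_vq *vq)
{
	vq->cb_enabled = true;
	return vq->last_used_idx;
}

/* True if the device has published used buffers since the snapshot. */
bool model_poll(const struct model_vq *vq, unsigned snapshot)
{
	return (uint16_t)snapshot != vq->used_idx;
}

/* The old single-call interface, expressed through the two halves. */
bool model_enable_cb(struct model_vq *vq)
{
	unsigned snapshot = model_enable_cb_prepare(vq);

	return !model_poll(vq, snapshot);
}
```

Because the comparison is done on 16-bit values, a snapshot of 65535 followed by the device index wrapping to 0 is still reported as pending work.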

-- 
Asias


Re: [PATCH] virtio-net: put virtio net header inline with data

2013-07-09 Thread David Miller
From: Rusty Russell 
Date: Tue, 09 Jul 2013 17:38:51 +0930

> If you convince DaveM, I won't object :)

Simplifications are great, but not when the merge window opens up.

Sorry, this isn't appropriate now.


Re: [off topic] [research] Interviews for contributors over 50 for Oregon State University research

2013-07-09 Thread Jen D
I apologize, I sent out the incorrect recruitment email.

Here is the correct one:

Hello,

Researchers at Oregon State University are striving to conduct
research to learn more about the free/open source software community
landscape as it relates to older adults. We’re looking for older
adults who are older than 50 and are currently involved with a
free/open source software project. You will be excluded from the study
if you are younger than 50, have not contributed to a free/open source
software project, or are not fluent in English. If you’re interested,
we will either do an in-person interview (if you are local to the
Corvallis or Portland area), or an interview over the phone (if you
are not local). The interview is expected to last no longer than an
hour. You will not be compensated for participating in this study.

The study title is Involving Older Adults in the Design and
Development of Free/Open Source Software – Part 2. The principal
investigator is Dr. Carlos Jensen.

If you would like to participate in the study, please read through the
eligibility document and consent document [1]. Please email us at
david...@onid.orst.edu to set up a time to determine your eligibility
and to set up a time/location to do an interview.

Thank you,

Jennifer Davidson          Carlos Jensen

david...@onid.orst.edu     cjen...@eecs.oregonstate.edu

[1] people.oregonstate.edu/~davidsje/researchForms/groupB/

If you are still interested in participating, please let me know.



On Tue, Jul 9, 2013 at 4:55 PM, Jen D  wrote:
> Hello,
>
> Researchers at Oregon State University are striving to conduct
> research to learn more about the free/open source software community
> landscape as it relates to older adults. We have identified you as a
> leader for a free/open source software community. If you’re
> interested, we will either do an in-person interview (if you are local
> to the Corvallis or Portland area), or an interview over the phone (if
> you are not local). The interview is expected to last no longer than
> an hour. You will not be compensated for participating in this study.
>
> The study title is Involving Older Adults in the Design and
> Development of Free/Open Source Software – Part 2. The principal
> investigator is Dr. Carlos Jensen.
>
> If you would like to participate in the study, please read through the
> consent document [1]. Please email us at david...@onid.orst.edu to set
> up a time/location to do an interview.
>
> Thank you,
>
> Jennifer Davidson          Carlos Jensen
>
> david...@onid.orst.edu     cjen...@eecs.oregonstate.edu
>
>  [1] people.oregonstate.edu/~davidsje/researchForms/groupA/
>
> --
> Jennifer Davidson
> Human-Computer Interaction PhD Student
> IGERT in Aging Sciences Program
> Center for Healthy Aging Research
> Department of Electrical Engineering and Computer Science
> Oregon State University



-- 
Jennifer Davidson
Human-Computer Interaction PhD Student
IGERT in Aging Sciences Program
Center for Healthy Aging Research
Department of Electrical Engineering and Computer Science
Oregon State University


Re: [GIT] Networking

2013-07-09 Thread David Miller
From: Linus Torvalds 
Date: Tue, 9 Jul 2013 19:50:41 -0700

> Now, I'm all for making descriptive merge commit messages, including
> improving on the summary line. So by all means write those nice merge
> messages with explanations. I think something like
> 
> dc3d807d6fd9 Merge "openvswitch: gre tunneling support."
> 
> would have been a *fine* summary line, for example, and quite possibly
> better than the default kind of git merge summary lines (ie "Merge
> branch 'openswitch'"). So I'm not against playing with merge messages
> per se, it's literally this "cannot tell it's a merge any more in the
> summary" that I think is a problem.

Ok, I'll use that format in the future.  I was actually trying to add
more information, not less. :-) But yeah that header line has to
mention that it's a merge, for sure.


[PATCH] pps-gpio: add pinctrl suppport

2013-07-09 Thread Matt Ranostay
Add pinctrl support to the pps-gpio driver for selecting the
respective GPIO muxing, if applicable.

Signed-off-by: Matt Ranostay 
---
 drivers/pps/clients/pps-gpio.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/pps/clients/pps-gpio.c b/drivers/pps/clients/pps-gpio.c
index eae0eda..8d51d10 100644
--- a/drivers/pps/clients/pps-gpio.c
+++ b/drivers/pps/clients/pps-gpio.c
@@ -33,6 +33,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 
@@ -93,6 +96,7 @@ static int pps_gpio_probe(struct platform_device *pdev)
const char *gpio_label;
int ret;
int pps_default_params;
+   struct pinctrl *pinctrl;
const struct pps_gpio_platform_data *pdata = pdev->dev.platform_data;
struct device_node *np = pdev->dev.of_node;
 
@@ -121,6 +125,11 @@ static int pps_gpio_probe(struct platform_device *pdev)
data->assert_falling_edge = true;
}
 
+   /* PINCTL setup */
+   pinctrl = devm_pinctrl_get_select_default(&pdev->dev);
+   if (IS_ERR(pinctrl))
+   pr_warn("pins are not configured from the driver\n");
+
/* GPIO setup */
ret = devm_gpio_request(&pdev->dev, data->gpio_pin, gpio_label);
if (ret) {
-- 
1.8.2.rc3.6.g407929c



Re: /sys/module/pcie_aspm/parameters/policy not writable?

2013-07-09 Thread Robert Hancock

On 07/09/2013 03:49 AM, Pavel Machek wrote:

On Mon 2013-07-08 21:13:21, Greg KH wrote:

On Tue, Jul 09, 2013 at 03:26:11AM +0200, Pavel Machek wrote:

Hi!

My thinkpad has rather high ping latencies... and perhaps it is due to
PCIE ASPM.


Why would that be the problem?  The odds that the PCIE bus is the issue
seems strange to me.


Aha: I guess that's why the file is not writable:

pavel@amd:~$ dmesg | grep -i aspm
ACPI FADT declares the system doesn't support PCIe ASPM, so disable it


IIRC, this message is somewhat misleading. When that FADT flag is set by 
the BIOS, the kernel doesn't so much disable ASPM as disable the 
kernel's control over ASPM. I believe this was to match Windows behavior.



e1000e :02:00.0: Disabling ASPM L0s L1


And given that, I think this message may also be misleading, as the 
kernel won't touch the device's ASPM state. Force-enabling ASPM may 
actually be allowing the driver to disable ASPM on the device.


I seem to recall a recent thread on this about another device.. maybe we 
need to allow drivers to explicitly disable ASPM if it's enabled even if 
the FADT flag is set?



pavel@amd:~$ cat /sys/module/pcie_aspm/parameters/policy
[default] performance powersave
pavel@amd:~$
root@amd:~# echo -n performance >
/sys/module/pcie_aspm/parameters/policy
-su: echo: write error: Operation not permitted
root@amd:~#

But:
1) it should not list unavailable options

2) operation not permitted seems like wrong error code for
operation not supported.

Pavel





Suspend-to-disk issue with identifying swap partition

2013-07-09 Thread Robert Hancock
I recently ran into a problem with suspend to disk on Fedora 19, which I 
reported here:


https://bugzilla.redhat.com/show_bug.cgi?id=981841

In this case swap and /home are encrypted volumes. Essentially (from 
what I understand, correct me if I'm wrong) what happens is that when 
dracut boots up, unlocks the encrypted swap and writes the major/minor 
number of the swap partition to /sys/power/resume to try to resume from 
it, and fails as there's no hibernate image present, the kernel still 
stashes away the major/minor number of the device into 
swsusp_resume_device (see resume_store in kernel/power/hibernate.c). For 
whatever reason those dm-crypt mappings get torn down after dracut 
finishes and recreated afterwards. As it turned out, because of the 
order of the LUKS entries on the kernel command line versus the order of 
the lines in /etc/fstab, the mappings were being recreated in the 
opposite order during the main boot sequence. This resulted in that 
stored major/minor device in swsusp_resume_device now pointing at the 
/home partition instead of the swap partition. When you go to hibernate, 
it fails as obviously that device isn't a swap partition.


It seems to me that it's not a great idea to stash away major/minor 
numbers at attempted resume and try to use them later on. There's no 
guarantee that they will still point at the same device or even exist at 
all. It appears that if the resume device was never explicitly set at 
hibernate time, then the kernel will just pick a usable swap partition 
to hibernate to, but once userspace has set a resume device, there's no 
way to get the kernel to forget about that device and just auto-detect 
at hibernate time again. And if that device no longer exists or isn't a 
swap device anymore, it seems like you're pretty much screwed.
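
A small sketch of why a stored major/minor pair goes stale when the mappings are recreated in a different order. The device numbers, the tables and the model_* helpers are made up for illustration, and the "major:minor" text form is an assumption about what userspace writes to /sys/power/resume.

```c
#include <stddef.h>
#include <stdio.h>

struct model_dev {
	unsigned major, minor;
	const char *role;	/* what currently lives on this device node */
};

/* Parse a "major:minor" string of the kind written to /sys/power/resume. */
int model_parse_devt(const char *s, unsigned *major, unsigned *minor)
{
	return sscanf(s, "%u:%u", major, minor) == 2 ? 0 : -1;
}

/* Look up which role a device number maps to right now; NULL if absent. */
const char *model_lookup(const struct model_dev *tbl, size_t n,
			 unsigned major, unsigned minor)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].major == major && tbl[i].minor == minor)
			return tbl[i].role;
	return NULL;
}
```

If the initramfs creates swap as 253:0 and /home as 253:1, and the main boot recreates them in the opposite order, a lookup of the stored 253:0 returns the /home mapping, which matches the failure described above.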


Any thoughts?


Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-09 Thread Michael Wang
On 07/09/2013 09:07 PM, Srivatsa S. Bhat wrote:
[snip]
> 
> But this still doesn't immediately explain how we can end up trying to
> queue work items on offline CPUs (since policy->cpus is supposed to always
> contain online cpus only, and this does look correct in the code as well,
> at a first glance). But I just wanted to share this finding, in case it
> helps us find out the real root-cause.

The earlier info shows that policy->cpus won't contain an offline CPU, but
once you have read a CPU id from it, that CPU can go offline at any time.

I'm not sure what is supposed to happen after the CPUFREQ_GOV_STOP event is
notified. If it is meant to stop the queued work and prevent follow-up work
from being queued again, then it fails to do so, and we need some way to
stop the work from being re-queued once CPUFREQ_GOV_STOP has been notified,
such as a flag in the policy that is checked before the work re-queues
itself.

But if the event is only meant to sync the queued work and not to prevent
follow-up work, then things become tough... we need to confirm.
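
The flag idea could look roughly like the following. It is a userspace
sketch with hypothetical names (demo_policy, demo_queue_work,
demo_gov_stop), not the cpufreq code, and it leaves out the locking a real
implementation would need around the flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-policy governor state; not a kernel type. */
struct demo_policy {
	bool stopping;	/* set once CPUFREQ_GOV_STOP has been notified */
	int queued;	/* how many times work was (re)queued */
};

/* Queue work unless the governor is being stopped; true if it was queued. */
bool demo_queue_work(struct demo_policy *p)
{
	if (p->stopping)
		return false;
	p->queued++;
	return true;
}

/* Body of the work item: sample the load, then try to re-arm itself. */
bool demo_work_fn(struct demo_policy *p)
{
	/* ... load evaluation would happen here ... */
	return demo_queue_work(p);
}

/* GOV_STOP handler: raise the flag before cancelling pending work. */
void demo_gov_stop(struct demo_policy *p)
{
	p->stopping = true;
}
```

With the flag raised first, a work item that is already running can no
longer re-queue itself behind the cancel.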

What's your opinion?

Regards,
Michael Wang

> 
> Also, you might perhaps want to try the (untested) patch shown below, and
> see if it resolves your problem. It basically makes work-items requeue
> themselves on only their respective CPUs and not others, so that
> gov_cancel_work succeeds in its mission. However, I guess the patch is
> wrong from a cpufreq perspective, in case cpufreq really depends on the
> "requeue-work-on-everybody" model.
> 
> Regards,
> Srivatsa S. Bhat
> 
> 
> 
>  drivers/cpufreq/cpufreq_conservative.c |2 +-
>  drivers/cpufreq/cpufreq_governor.c |2 --
>  drivers/cpufreq/cpufreq_ondemand.c |2 +-
>  3 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index 0ceb2ef..bbfc1dd 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -120,7 +120,7 @@ static void cs_dbs_timer(struct work_struct *work)
>   struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
>   struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
>   int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
> - bool modify_all = true;
> + bool modify_all = false;
> 
>   mutex_lock(&core_dbs_info->cdbs.timer_mutex);
>   if (!need_load_eval(&core_dbs_info->cdbs, cs_tuners->sampling_rate))
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 4645876..ec4baeb 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -137,10 +137,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>   if (!all_cpus) {
>   __gov_queue_work(smp_processor_id(), dbs_data, delay);
>   } else {
> - get_online_cpus();
>   for_each_cpu(i, policy->cpus)
>   __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
>   }
>  }
>  EXPORT_SYMBOL_GPL(gov_queue_work);
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index 93eb5cb..241ebc0 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -230,7 +230,7 @@ static void od_dbs_timer(struct work_struct *work)
>   struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
>   struct od_dbs_tuners *od_tuners = dbs_data->tuners;
>   int delay = 0, sample_type = core_dbs_info->sample_type;
> - bool modify_all = true;
> + bool modify_all = false;
> 
>   mutex_lock(&core_dbs_info->cdbs.timer_mutex);
>   if (!need_load_eval(&core_dbs_info->cdbs, od_tuners->sampling_rate)) {
> 
> 
> 
> 



[PATCH 4/4] platform: Convert apple-gmux driver to dev_pm_ops from legacy pm_ops

2013-07-09 Thread Shuah Khan
Convert drivers/platform/x86/apple-gmux to use dev_pm_ops instead of
legacy pm_ops. This patch depends on pnp driver bus ops change to invoke
pnp_driver dev_pm_ops.

Signed-off-by: Shuah Khan 
---
 drivers/platform/x86/apple-gmux.c |   18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/platform/x86/apple-gmux.c b/drivers/platform/x86/apple-gmux.c
index f74bfcb..8eea2ef 100644
--- a/drivers/platform/x86/apple-gmux.c
+++ b/drivers/platform/x86/apple-gmux.c
@@ -393,17 +393,21 @@ static void gmux_notify_handler(acpi_handle device, u32 value, void *context)
complete(&gmux_data->powerchange_done);
 }
 
-static int gmux_suspend(struct pnp_dev *pnp, pm_message_t state)
+static int gmux_suspend(struct device *dev)
 {
+   struct pnp_dev *pnp = to_pnp_dev(dev);
struct apple_gmux_data *gmux_data = pnp_get_drvdata(pnp);
+
gmux_data->resume_client_id = gmux_active_client(gmux_data);
gmux_disable_interrupts(gmux_data);
return 0;
 }
 
-static int gmux_resume(struct pnp_dev *pnp)
+static int gmux_resume(struct device *dev)
 {
+   struct pnp_dev *pnp = to_pnp_dev(dev);
struct apple_gmux_data *gmux_data = pnp_get_drvdata(pnp);
+
gmux_enable_interrupts(gmux_data);
gmux_switchto(gmux_data->resume_client_id);
if (gmux_data->power_state == VGA_SWITCHEROO_OFF)
@@ -605,13 +609,19 @@ static const struct pnp_device_id gmux_device_ids[] = {
{"", 0}
 };
 
+static const struct dev_pm_ops gmux_dev_pm_ops = {
+   .suspend = gmux_suspend,
+   .resume = gmux_resume,
+};
+
 static struct pnp_driver gmux_pnp_driver = {
.name   = "apple-gmux",
.probe  = gmux_probe,
.remove = gmux_remove,
.id_table   = gmux_device_ids,
-   .suspend= gmux_suspend,
-   .resume = gmux_resume
+   .driver = {
+   .pm = &gmux_dev_pm_ops,
+   },
 };
 
 static int __init apple_gmux_init(void)
-- 
1.7.10.4



[PATCH 0/4] pnp: Change pnp bus pm_ops to invoke pnp driver dev_pm_ops if specified

2013-07-09 Thread Shuah Khan
pnp_bus_suspend() and pnp_bus_resume() invoke legacy pm_ops from
pnp_driver. Change pnp_bus_suspend() and pnp_bus_resume() to check
whether the pnp driver has dev_pm_ops and, if so, call them. If dev_pm_ops
don't exist, use the legacy pm_ops. Without this change, pnp_driver
dev_pm_ops will not get called.

In addition to the pnp driver bus pm_ops change to invoke driver dev_pm_ops,
this patch set contains changes to rtc-cmos, tpm_tis, and apple-gmux pnp
drivers to convert from legacy pm_ops to dev_pm_ops.
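
As a rough illustration of the bus-level dispatch, here is a userspace
sketch that follows the order of the checks in patch 1/4. The demo_* types
and functions are hypothetical simplifications, not the real struct
pnp_driver or struct dev_pm_ops.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_device {
	int suspended_by;	/* records which callback ran last */
};

struct demo_pm_ops {
	int (*suspend)(struct demo_device *dev);
};

struct demo_pnp_driver {
	int (*legacy_suspend)(struct demo_device *dev);	/* models pnp_driver.suspend */
	const struct demo_pm_ops *pm;			/* models pnp_driver.driver.pm */
};

/* dev_pm_ops callback first, then the legacy hook if the driver sets one. */
int demo_bus_suspend(const struct demo_pnp_driver *drv, struct demo_device *dev)
{
	int error;

	if (!drv)
		return 0;

	if (drv->pm && drv->pm->suspend) {
		error = drv->pm->suspend(dev);
		if (error)
			return error;
	}

	if (drv->legacy_suspend) {
		error = drv->legacy_suspend(dev);
		if (error)
			return error;
	}

	return 0;
}

/* Two toy callbacks so the dispatch can be observed. */
int demo_new_suspend(struct demo_device *dev) { dev->suspended_by = 1; return 0; }
int demo_old_suspend(struct demo_device *dev) { dev->suspended_by = 2; return 0; }
```

A converted driver sets only pm and an unconverted one sets only
legacy_suspend, so each takes exactly one of the two branches.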

Shuah Khan (4):
pnp: Change pnp bus pm_ops to invoke pnp driver dev_pm_ops
rtc: convert rtc-cmos to dev_pm_ops from legacy pm_ops
tpm: Convert tpm_tis driver to use dev_pm_ops from legacy pm_ops
platform: Convert apple-gmux driver to dev_pm_ops from legacy pm_ops

 drivers/pnp/driver.c |   13 +
 1 file changed, 13 insertions(+)

 drivers/rtc/rtc-cmos.c |   24 +---
 1 file changed, 5 insertions(+), 19 deletions(-)

 drivers/char/tpm/tpm_tis.c |   60 ++--
 1 file changed, 24 insertions(+), 36 deletions(-)

 drivers/platform/x86/apple-gmux.c |   18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)


[PATCH 3/4] tpm: Convert tpm_tis driver to use dev_pm_ops from legacy pm_ops

2013-07-09 Thread Shuah Khan
Convert drivers/char/tpm/tpm_tis.c to use dev_pm_ops instead of legacy pm_ops.
This patch depends on pnp driver bus ops change to invoke pnp_driver
dev_pm_ops.

Signed-off-by: Shuah Khan 
---
 drivers/char/tpm/tpm_tis.c |   60 ++--
 1 file changed, 24 insertions(+), 36 deletions(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 4519cb3..5796d01 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -766,6 +766,25 @@ static void tpm_tis_reenable_interrupts(struct tpm_chip *chip)
 }
 #endif
 
+#ifdef CONFIG_PM_SLEEP
+static int tpm_tis_resume(struct device *dev)
+{
+   struct tpm_chip *chip = dev_get_drvdata(dev);
+   int ret;
+
+   if (chip->vendor.irq)
+   tpm_tis_reenable_interrupts(chip);
+
+   ret = tpm_pm_resume(dev);
+   if (!ret)
+   tpm_do_selftest(chip);
+
+   return ret;
+}
+#endif
+
+static SIMPLE_DEV_PM_OPS(tpm_tis_pm, tpm_pm_suspend, tpm_tis_resume);
+
 #ifdef CONFIG_PNP
 static int tpm_tis_pnp_init(struct pnp_dev *pnp_dev,
  const struct pnp_device_id *pnp_id)
@@ -787,26 +806,6 @@ static int tpm_tis_pnp_init(struct pnp_dev *pnp_dev,
return tpm_tis_init(&pnp_dev->dev, start, len, irq);
 }
 
-static int tpm_tis_pnp_suspend(struct pnp_dev *dev, pm_message_t msg)
-{
-   return tpm_pm_suspend(&dev->dev);
-}
-
-static int tpm_tis_pnp_resume(struct pnp_dev *dev)
-{
-   struct tpm_chip *chip = pnp_get_drvdata(dev);
-   int ret;
-
-   if (chip->vendor.irq)
-   tpm_tis_reenable_interrupts(chip);
-
-   ret = tpm_pm_resume(&dev->dev);
-   if (!ret)
-   tpm_do_selftest(chip);
-
-   return ret;
-}
-
 static struct pnp_device_id tpm_pnp_tbl[] = {
{"PNP0C31", 0}, /* TPM */
{"ATM1200", 0}, /* Atmel */
@@ -835,9 +834,12 @@ static struct pnp_driver tis_pnp_driver = {
.name = "tpm_tis",
.id_table = tpm_pnp_tbl,
.probe = tpm_tis_pnp_init,
-   .suspend = tpm_tis_pnp_suspend,
-   .resume = tpm_tis_pnp_resume,
.remove = tpm_tis_pnp_remove,
+#ifdef CONFIG_PM_SLEEP
+   .driver = {
+   .pm = &tpm_tis_pm,
+   },
+#endif
 };
 
 #define TIS_HID_USR_IDX sizeof(tpm_pnp_tbl)/sizeof(struct pnp_device_id) -2
@@ -846,20 +848,6 @@ module_param_string(hid, tpm_pnp_tbl[TIS_HID_USR_IDX].id,
 MODULE_PARM_DESC(hid, "Set additional specific HID for this driver to probe");
 #endif
 
-#ifdef CONFIG_PM_SLEEP
-static int tpm_tis_resume(struct device *dev)
-{
-   struct tpm_chip *chip = dev_get_drvdata(dev);
-
-   if (chip->vendor.irq)
-   tpm_tis_reenable_interrupts(chip);
-
-   return tpm_pm_resume(dev);
-}
-#endif
-
-static SIMPLE_DEV_PM_OPS(tpm_tis_pm, tpm_pm_suspend, tpm_tis_resume);
-
 static struct platform_driver tis_drv = {
.driver = {
.name = "tpm_tis",
-- 
1.7.10.4



[PATCH 1/4] pnp: Change pnp bus pm_ops to invoke pnp driver dev_pm_ops if specified

2013-07-09 Thread Shuah Khan
pnp_bus_suspend() and pnp_bus_resume() invoke legacy pm_ops from
pnp_driver. Change pnp_bus_suspend() and pnp_bus_resume() to check
whether the pnp driver has dev_pm_ops and, if so, call them. If dev_pm_ops
don't exist, use the legacy pm_ops. Without this change, pnp_driver
dev_pm_ops will not get called.

Signed-off-by: Shuah Khan 
---
 drivers/pnp/driver.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/pnp/driver.c b/drivers/pnp/driver.c
index 00e9403..c3f9e89 100644
--- a/drivers/pnp/driver.c
+++ b/drivers/pnp/driver.c
@@ -163,6 +163,13 @@ static int pnp_bus_suspend(struct device *dev, pm_message_t state)
if (!pnp_drv)
return 0;
 
+   if (pnp_drv->driver.pm && pnp_drv->driver.pm->suspend) {
+   error = pnp_drv->driver.pm->suspend(dev);
+   suspend_report_result(pnp_drv->driver.pm->suspend, error);
+   if (error)
+   return error;
+   }
+
if (pnp_drv->suspend) {
error = pnp_drv->suspend(pnp_dev, state);
if (error)
@@ -201,6 +208,12 @@ static int pnp_bus_resume(struct device *dev)
return error;
}
 
+   if (pnp_drv->driver.pm && pnp_drv->driver.pm->resume) {
+   error = pnp_drv->driver.pm->resume(dev);
+   if (error)
+   return error;
+   }
+
if (pnp_drv->resume) {
error = pnp_drv->resume(pnp_dev);
if (error)
-- 
1.7.10.4



[PATCH 2/4] rtc: convert rtc-cmos to dev_pm_ops from legacy pm_ops

2013-07-09 Thread Shuah Khan
Convert drivers/rtc/rtc-cmos to use dev_pm_ops instead of legacy pm_ops.
This patch depends on pnp driver bus ops change to invoke pnp_driver
dev_pm_ops.

Signed-off-by: Shuah Khan 
---
 drivers/rtc/rtc-cmos.c |   24 +---
 1 file changed, 5 insertions(+), 19 deletions(-)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index be06d71..24e733c 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -1018,23 +1018,6 @@ static void __exit cmos_pnp_remove(struct pnp_dev *pnp)
cmos_do_remove(&pnp->dev);
 }
 
-#ifdef CONFIG_PM
-
-static int cmos_pnp_suspend(struct pnp_dev *pnp, pm_message_t mesg)
-{
-   return cmos_suspend(&pnp->dev);
-}
-
-static int cmos_pnp_resume(struct pnp_dev *pnp)
-{
-   return cmos_resume(&pnp->dev);
-}
-
-#else
-#define cmos_pnp_suspend NULL
-#define cmos_pnp_resume NULL
-#endif
-
 static void cmos_pnp_shutdown(struct pnp_dev *pnp)
 {
if (system_state == SYSTEM_POWER_OFF && !cmos_poweroff(&pnp->dev))
@@ -1060,8 +1043,11 @@ static struct pnp_driver cmos_pnp_driver = {
 
/* flag ensures resume() gets called, and stops syslog spam */
.flags  = PNP_DRIVER_RES_DO_NOT_CHANGE,
-   .suspend= cmos_pnp_suspend,
-   .resume = cmos_pnp_resume,
+#ifdef CONFIG_PM_SLEEP
+   .driver = {
+   .pm = &cmos_pm_ops,
+   },
+#endif
 };
 
 #endif /* CONFIG_PNP */
-- 
1.7.10.4



Re: [RESEND][PATCH] mtd: refactor call to request_module

2013-07-09 Thread Rusty Russell
Kees Cook  writes:
> This reduces the size of the stack frame when calling request_module().
> Performing the sprintf before the call is not needed.
>
> Signed-off-by: Kees Cook 

Acked-by: Rusty Russell 

Thanks,
Rusty.
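
For readers following along, a small userspace sketch of why the caller's
buffer is redundant when the callee already takes a printf-style format.
demo_request_module and demo_last_request are hypothetical stand-ins and
do not reproduce the kernel's request_module() implementation.

```c
#include <stdarg.h>
#include <stdio.h>

#define DEMO_MODNAME_LEN 64

/* Hypothetical record of the last requested name, standing in for the
 * real module loader. */
char demo_last_request[DEMO_MODNAME_LEN];

/* printf-style front end: the name is formatted once, inside the callee,
 * so callers do not need a buffer of their own on the stack. */
int demo_request_module(const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(demo_last_request, sizeof(demo_last_request), fmt, args);
	va_end(args);

	/* Reject a formatting error or a name that did not fit. */
	if (len < 0 || (size_t)len >= sizeof(demo_last_request))
		return -1;
	return 0;
}
```

A call such as demo_request_module("cfi_cmdset_%4.4X", type) then replaces
the declare, sprintf and call sequence with a single line.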

> ---
>  drivers/mtd/chips/gen_probe.c |4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/mtd/chips/gen_probe.c b/drivers/mtd/chips/gen_probe.c
> index 74dbb6b..ffb36ba 100644
> --- a/drivers/mtd/chips/gen_probe.c
> +++ b/drivers/mtd/chips/gen_probe.c
> @@ -211,9 +211,7 @@ static inline struct mtd_info *cfi_cmdset_unknown(struct map_info *map,
>  
>   probe_function = __symbol_get(probename);
>   if (!probe_function) {
> - char modname[sizeof("cfi_cmdset_%4.4X")];
> - sprintf(modname, "cfi_cmdset_%4.4X", type);
> - request_module(modname);
> + request_module("cfi_cmdset_%4.4X", type);
>   probe_function = __symbol_get(probename);
>   }
>  
> -- 
> 1.7.9.5
>
>
> -- 
> Kees Cook
> Chrome OS Security


[PULL] modules-next

2013-07-09 Thread Rusty Russell
The following changes since commit 06df44ee41442d83be061c5fd1b1de4f5fc6fbbf:

  modpost.c: Add .text.unlikely to TEXT_SECTIONS (2013-05-20 12:08:45 +0930)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux.git tags/modules-next-for-linus

for you to fetch changes up to 9eb76d7797b892a1dad4f2efb6f786681306dd13:

  module: cleanup call chain. (2013-07-03 10:15:10 +0930)


Nothing interesting.  Except the most embarrassing bugfix ever.  But let's
ignore that.

Cheers,
Rusty.


Jean Delvare (2):
  There is no /sys/parameters
  ABI: Clarify when /sys/module/MODULENAME is created

Mathias Krause (1):
  module: don't modify argument of module_kallsyms_lookup_name()

Rusty Russell (3):
  modules: don't fail to load on unknown parameters.
  module: do percpu allocation after uniqueness check.  No, really!
  module: cleanup call chain.

 Documentation/ABI/stable/sysfs-module | 10 +++--
 include/linux/moduleparam.h   |  2 +-
 kernel/module.c   | 77 +++
 kernel/params.c   |  2 +-
 4 files changed, 52 insertions(+), 39 deletions(-)


Re: [PATCH 1/2] virtio: support unlocked queue poll

2013-07-09 Thread Rusty Russell
"Michael S. Tsirkin"  writes:
> This adds a way to check ring empty state after enable_cb outside any
> locks. Will be used by virtio_net.
>
> Note: there's room for more optimization: caller is likely to have a
> memory barrier already, which means we might be able to get rid of a
> barrier here.  Deferring this optimization until we do some
> benchmarking.
>
> Signed-off-by: Michael S. Tsirkin 

Acked-by: Rusty Russell 

Thanks,
Rusty.


[PULL] virtio-next

2013-07-09 Thread Rusty Russell
The following changes since commit b3087e48ce20be784fae1dbabc2e42e2ad0f21bc:

  virtio: remove virtqueue_add_buf(). (2013-05-20 12:16:01 +0930)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux.git tags/virtio-next-for-linus

for you to fetch changes up to c893c8d763d8a8a757028a48ace7d1bb2dd8373f:

  MAINTAINERS: add tools/virtio/ under virtio (2013-07-09 10:47:54 +0930)


No real surprises.

Thanks,
Rusty.


Andrew Vagin (1):
  virtio-pci: fix leaks of msix_affinity_masks

Jason Wang (1):
  virtio-net: fix the race between channels setting and refill

Luiz Capitulino (1):
  virtio_balloon: leak_balloon(): only tell host if we got pages deflated

Michael S. Tsirkin (3):
  virtio: include asm/barrier explicitly
  tools/virtio: move module license stub to module.h
  MAINTAINERS: add tools/virtio/ under virtio

Paul Bolle (1):
  Fix comment typo "CONFIG_PAE"

Rusty Russell (4):
  tools/lguest: fix missing rmb().
  tools/lguest: real barriers.
  lguest: fix example launcher compilation for broken glibc headers.
  virtio: VIRTIO_F_ANY_LAYOUT feature

 MAINTAINERS|  1 +
 drivers/lguest/page_tables.c   |  2 +-
 drivers/net/virtio_net.c   |  5 +
 drivers/virtio/virtio_balloon.c|  3 ++-
 drivers/virtio/virtio_pci.c|  5 +++--
 include/linux/virtio_ring.h|  1 +
 include/uapi/linux/virtio_config.h |  3 +++
 tools/lguest/Makefile  |  1 -
 tools/lguest/lguest.c  | 32 +++-
 tools/virtio/linux/module.h|  5 +
 tools/virtio/linux/virtio.h|  3 ---
 11 files changed, 40 insertions(+), 21 deletions(-)


Re: early microcode on amd is broken when no initramfs provided

2013-07-09 Thread Borislav Petkov
On Tue, Jul 09, 2013 at 10:53:31PM -0500, Jacob Shin wrote:
> I won't have access to a box for a while, Boris or Suravee, could you
> please try and reproduce it and get the stack trace when you get the
> chance?
>
> So sorry,

No worries, Jacob, I'm on it. Take your time. :)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [Xen-devel] [PATCH] xen: remove unused Kconfig parameter

2013-07-09 Thread H. Peter Anvin
On 07/09/2013 03:34 PM, Sander Eikelenboom wrote:
> 
> Grub does this in its update script to prevent adding a xen + kernel
> combination that has no chance of booting when dom0 support has not been
> configured in the kernel.
> That doesn't seem to be an unreasonable thought.
> 

Except it does it backwards.  The test it uses is sufficient but not
necessary.

-hpa




Re: [PATCH V2] relay: fix timer madness

2013-07-09 Thread Andrew Morton
On Wed, 10 Jul 2013 11:37:26 +0800 Li Zefan  wrote:

> On 2013/7/10 10:18, zhangwei(Jovi) wrote:
> > When I'm using ktap script to tracing all event tracepoints by relay
> > transport, without this patch, the system will hang in few seconds.
> > 
> > I found the original patch discussion in 2007.
> > > http://marc.info/?l=linux-kernel&m=118544794717162&w=2
> > (In that mail thread, the patch didn't fix that problem, but it fix
> > the problem I encountered now)
> > 
> > Changed from v1:
> > mod timer interval changed from jiffies+1 to HZ/10, as Ingo suggested.
> > 
> > Original patch changelog from Ingo in 2007:
> > 
> > Remove timer calls (!!!) from deep within the tracing infrastructure.
> > This was totally bogus code that can cause lockups and worse.
> > Poll the buffer every 2 jiffies for now.
> > 
> > Signed-off-by: Ingo Molnar 
> > Signed-off-by: "zhangwei(Jovi)" 
> > Cc: Steven Rostedt 
> > Cc: Jens Axboe 
> > Cc: Al Viro 
> > Cc: Eric Dumazet 
> > Signed-off-by: Andrew Morton 
> 
> I don't think this patch should have Andrew's signed-off-by?

I guess not, unless it was taken from -mm, which would be odd, as I
have the old version.

v1 has been in my tree for a few months - Ingo requested some updates
but nothing happened and I have not checked whether v2 addresses his
requests.



Re: linux-next: manual merge of the ceph tree with Linus' tree

2013-07-09 Thread Sage Weil
On Wed, 10 Jul 2013, Stephen Rothwell wrote:
> Hi Sage,
> 
> Today's linux-next merge of the ceph tree got conflicts in
> drivers/block/rbd.c and net/ceph/osd_client.c because the ceph tree was
> rebased before being sent to Linus and it looks like one patch
> was dropped and several more added.
> 
> I just used the upstream version of the ceph tree for today - please
> clean up.

Fixed, thanks!

sage


Re: WARNING: at drivers/iommu/dmar.c:484 warn_invalid_dmar with Intel Motherboard

2013-07-09 Thread Guenter Roeck
On Wed, Jul 10, 2013 at 01:53:15AM +0100, David Woodhouse wrote:
> On Tue, 2013-07-09 at 17:18 -0700, Guenter Roeck wrote:
> > 
> > I meant warning as in pr_warn or dev_warn, not WARNING as in traceback.
> > > Keep in mind that a casual user doesn't expect to see a traceback and will tend
> > to get alarmed. Several bugs have been filed against this "issue" in various
> > distributions, which is not surprising given the alarmist message.
> > What is the point of that ?
> 
> It is warning you that your hardware is broken. Take it back to the
> place from which you purchased it, and ask for your money back if it
> isn't fixed.
> 
> (Slightly) more seriously, this level of warning *does* get things
> fixed, and when kerneloops was running it made it very easy to track
> this kind of issue and apply pressure where it was needed to improve
> quality.
> 
> Any user who has taken the trouble to file bugs has *also* taken it up
> with their firmware vendor, I hope?
> 
No idea ... but have you ever tried that as a private entity ?

If there is a secret list of people to contact at vendor X to get things
like this one fixed, please let me know ;).

Thanks,
Guenter


linux-next: manual merge of the ftrace tree with Linus' tree

2013-07-09 Thread Stephen Rothwell
Hi Steven,

Today's linux-next merge of the ftrace tree got a conflict in
kernel/panic.c between commit dcb6b45254e2 ("panic: add cpu/pid to
warn_slowpath_common in WARNING printk()s") from Linus' tree and commit
de7edd31457b ("tracing: Disable tracing on warning") from the ftrace tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc kernel/panic.c
index 9771231,4cea6cc..000
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@@ -399,9 -400,10 +400,11 @@@ struct slowpath_args 
  static void warn_slowpath_common(const char *file, int line, void *caller,
 unsigned taint, struct slowpath_args *args)
  {
+   disable_trace_on_warning();
+ 
 -  printk(KERN_WARNING "------------[ cut here ]------------\n");
 -  printk(KERN_WARNING "WARNING: at %s:%d %pS()\n", file, line, caller);
 +  pr_warn("------------[ cut here ]------------\n");
 +  pr_warn("WARNING: CPU: %d PID: %d at %s:%d %pS()\n",
 +  raw_smp_processor_id(), current->pid, file, line, caller);
  
if (args)
vprintk(args->fmt, args->args);




Re: [PATCH V2] relay: fix timer madness

2013-07-09 Thread Li Zefan
On 2013/7/10 10:18, zhangwei(Jovi) wrote:
> When I'm using ktap script to tracing all event tracepoints by relay
> transport, without this patch, the system will hang in few seconds.
> 
> I found the original patch discussion in 2007.
> http://marc.info/?l=linux-kernel&m=118544794717162&w=2
> (In that mail thread, the patch didn't fix that problem, but it fix
> the problem I encountered now)
> 
> Changed from v1:
> mod timer interval changed from jiffies+1 to HZ/10, as Ingo suggested.
> 
> Original patch changelog from Ingo in 2007:
> 
> Remove timer calls (!!!) from deep within the tracing infrastructure.
> This was totally bogus code that can cause lockups and worse.
> Poll the buffer every 2 jiffies for now.
> 
> Signed-off-by: Ingo Molnar 
> Signed-off-by: "zhangwei(Jovi)" 
> Cc: Steven Rostedt 
> Cc: Jens Axboe 
> Cc: Al Viro 
> Cc: Eric Dumazet 
> Signed-off-by: Andrew Morton 

I don't think this patch should have Andrew's signed-off-by?



Re: [PATCH RFC] fsio: filesystem io accounting cgroup

2013-07-09 Thread Tejun Heo
Hello,

On Tue, Jul 09, 2013 at 11:09:55PM -0400, Vivek Goyal wrote:
> Stacking drivers are pretty important ones and we expect throttling
> to work with them. By throttling bio, a single hook worked both for
> request based drivers and bio based drivers.

Oh yeah, sure, we have them working now, so there's no way to break
them but that doesn't mean it's a good overall design.  I don't have a
good answer for this one.  The root cause is having the distinction
between bio and rq based drivers.  With the right constructs, I
suspect we probably could have done away with bio based drivers, but,
well, that's all history now.

> So looks like for bio based drivers you want bio throttling and for
> request based drviers, request throttling and define a separate hook
> in blk_queue_bio(). A generic hook probably can check the type of request
> queue and not throttle bio if it is request based queue and ultimately
> request queue based hook will throttle it.
> 
> So in a cgroup we blkio.throttle.io_serviced will have stats for
> bio/request depending on type of device.
> 
> And we will need to modify throttling logic so that it can handle
> both bio and request throttling. Not sure how much of code can be
> shared for bio/request throttling.

I'm not sure how much (de)multiplexing and sharing we'd be doing but
I'm afraid there's gonna need to be some.  We really can't use the
same logic for SSDs and rotating rusts after all and it probably would
be best to avoid contaminating SSD paths with lots of guesstimating
logics necessary for rotating rusts.

> I am not sure about request based multipath driver and it might
> require some special handling.

If it's not supported now, I'll be happy with just leaving it alone
and telling mp users to configure the underlying queues.

> Is it roughly inline with what you have been thinking.

I'm hoping to keep it somewhat manageable at least.  I wouldn't mind
leaving stacking driver and cfq-iosched support as they are while only
supporting SSD devices with new code.  It's all pie in the sky at this
point and none of this matters before we fix the bdi and writeback
issue anyway.

Thanks.

-- 
tejun


Re: [PATCH 5/8] powerpc: add real mode support for dma operations on powernv

2013-07-09 Thread Benjamin Herrenschmidt
On Tue, 2013-07-09 at 18:02 +0200, Alexander Graf wrote:
> On 07/06/2013 05:07 PM, Alexey Kardashevskiy wrote:
> > The existing TCE machine calls (tce_build and tce_free) only support
> > virtual mode as they call __raw_writeq for TCE invalidation what
> > fails in real mode.
> >
> > This introduces tce_build_rm and tce_free_rm real mode versions
> > which do mostly the same but use "Store Doubleword Caching Inhibited
> > Indexed" instruction for TCE invalidation.
> 
> So would always using stdcix have any bad side effects?

Yes. Those instructions are only supposed to be used in hypervisor real
mode as per the architecture spec.

Cheers,
Ben.

> 
> Alex
> 
> >
> > This new feature is going to be utilized by real mode support of VFIO.
> >
> > Signed-off-by: Alexey Kardashevskiy
> > ---
> >   arch/powerpc/include/asm/machdep.h| 12 ++
> >   arch/powerpc/platforms/powernv/pci-ioda.c | 26 +++--
> >   arch/powerpc/platforms/powernv/pci.c  | 38 ++-
> >   arch/powerpc/platforms/powernv/pci.h  |  2 +-
> >   4 files changed, 64 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> > index 92386fc..0c19eef 100644
> > --- a/arch/powerpc/include/asm/machdep.h
> > +++ b/arch/powerpc/include/asm/machdep.h
> > @@ -75,6 +75,18 @@ struct machdep_calls {
> > long index);
> > void(*tce_flush)(struct iommu_table *tbl);
> >
> > +   /* _rm versions are for real mode use only */
> > +   int (*tce_build_rm)(struct iommu_table *tbl,
> > +long index,
> > +long npages,
> > +unsigned long uaddr,
> > +enum dma_data_direction direction,
> > +struct dma_attrs *attrs);
> > +   void(*tce_free_rm)(struct iommu_table *tbl,
> > +   long index,
> > +   long npages);
> > +   void(*tce_flush_rm)(struct iommu_table *tbl);
> > +
> > void __iomem *  (*ioremap)(phys_addr_t addr, unsigned long size,
> >unsigned long flags, void *caller);
> > void(*iounmap)(volatile void __iomem *token);
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 2931d97..2797dec 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -68,6 +68,12 @@ define_pe_printk_level(pe_err, KERN_ERR);
> >   define_pe_printk_level(pe_warn, KERN_WARNING);
> >   define_pe_printk_level(pe_info, KERN_INFO);
> >
> > +static inline void rm_writed(unsigned long paddr, u64 val)
> > +{
> > +   __asm__ __volatile__("sync; stdcix %0,0,%1"
> > +   : : "r" (val), "r" (paddr) : "memory");
> > +}
> > +
> >   static int pnv_ioda_alloc_pe(struct pnv_phb *phb)
> >   {
> > unsigned long pe;
> > @@ -442,7 +448,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb 
> > *phb, struct pci_dev *pdev
> >   }
> >
> >   static void pnv_pci_ioda1_tce_invalidate(struct iommu_table *tbl,
> > -u64 *startp, u64 *endp)
> > +u64 *startp, u64 *endp, bool rm)
> >   {
> > u64 __iomem *invalidate = (u64 __iomem *)tbl->it_index;
> > unsigned long start, end, inc;
> > @@ -471,7 +477,10 @@ static void pnv_pci_ioda1_tce_invalidate(struct 
> > iommu_table *tbl,
> >
> >   mb(); /* Ensure above stores are visible */
> >   while (start<= end) {
> > -__raw_writeq(start, invalidate);
> > +   if (rm)
> > +   rm_writed((unsigned long) invalidate, start);
> > +   else
> > +   __raw_writeq(start, invalidate);
> >   start += inc;
> >   }
> >
> > @@ -483,7 +492,7 @@ static void pnv_pci_ioda1_tce_invalidate(struct 
> > iommu_table *tbl,
> >
> >   static void pnv_pci_ioda2_tce_invalidate(struct pnv_ioda_pe *pe,
> >  struct iommu_table *tbl,
> > -u64 *startp, u64 *endp)
> > +u64 *startp, u64 *endp, bool rm)
> >   {
> > unsigned long start, end, inc;
> > u64 __iomem *invalidate = (u64 __iomem *)tbl->it_index;
> > @@ -502,22 +511,25 @@ static void pnv_pci_ioda2_tce_invalidate(struct 
> > pnv_ioda_pe *pe,
> > mb();
> >
> > while (start<= end) {
> > -   __raw_writeq(start, invalidate);
> > +   if (rm)
> > +   rm_writed((unsigned long) invalidate, start);
> > +   else
> > +   __raw_writeq(start, invalidate);
> > start += inc;
> > }
> >   }
> >
> >   void pnv_pci_ioda_tce_invalidate(struct iommu_table *tbl,
> > -

Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-09 Thread Michael Wang
On 07/09/2013 09:07 PM, Srivatsa S. Bhat wrote:
[snip]
>>
> 
> Yeah, exactly!
> 
> So I had proposed doing an asynchronous cancel-work or doing the
> synchronous cancel-work in the CPU_POST_DEAD phase, where the
> cpu_hotplug.lock is not held. See this thread:
> 
> http://marc.info/?l=linux-kernel&m=137241212616799&w=2
> http://marc.info/?l=linux-pm&m=137242906622537&w=2
> 
> But now that I look at commit 2f7021a8 again, I still think we should
> revert it and fix the _actual_ root-cause of the bug.

Agreed, or we could revert it together with some better fix; otherwise the
previous bug report will come back again...

> 
> Cpufreq subsystem has enough synchronization to ensure that policy->cpus
> always contains online CPUs. And it also has the concept of cancelling
> queued work items, *before* that CPU is taken offline.
> So, where is the chance that we try to queue work items on offline CPUs?
> 
> To answer that question, I was looking at the cpufreq code yesterday
> and found something very interesting: the gov_cancel_work() that is
> invoked before a CPU goes offline, can actually end up effectively
> *NOT* cancelling the queued work item!
> 
> The reason is, the per-cpu work items are not just self-queueing (if
> that was the case, gov_cancel_work would have been successful without
> any doubt), but instead, they can also queue work items on *other* CPUs!
> 
> Example from ondemand governor's per-cpu work item:
> 
> static void od_dbs_timer(struct work_struct *work)
> {
>   ...
>   bool modify_all = true;
>   ...
>   gov_queue_work(dbs_data, dbs_info->cdbs.cur_policy, delay, modify_all);
> }
> 
> So, every per-cpu work item can re-queue the work item on *many other*
> CPUs, and not just itself!
> 
> So that leads to a race which makes gov_cancel_work() ineffective.
> The call to cancel_delayed_work_sync() will cancel all pending work items
> on say CPU 3 (which is going down), but immediately after that, say CPU4's
> work item fires and queues the work item on CPU4 as well as CPU3. Thus,
> gov_cancel_work() _effectively_ didn't do anything useful.

That's interesting, it seems a little closer to the root cause: the timer is
supposed to stop but fails to... I need to investigate this some more...
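
As a rough illustration of the race Srivatsa describes (this is not kernel
code), the toy user-space model below keeps one "pending" flag per CPU. All
names here (gov_sim, sim_cancel_work, sim_work_fires) are hypothetical, and
the model assumes the CPU going down is still in policy->cpus at the moment
its neighbour's work item fires:

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8

/* Hypothetical stand-in for the governor's per-CPU delayed work items. */
struct gov_sim {
	bool pending[NR_CPUS];   /* is a work item queued on this CPU? */
	bool in_policy[NR_CPUS]; /* stand-in for policy->cpus */
};

/* Models cancel_delayed_work_sync() for a single CPU. */
static void sim_cancel_work(struct gov_sim *s, int cpu)
{
	s->pending[cpu] = false;
}

/* Models a work item firing on 'cpu' and then requeueing. */
static void sim_work_fires(struct gov_sim *s, int cpu, bool modify_all)
{
	int i;

	s->pending[cpu] = false; /* the item is running now */

	if (!modify_all) {
		/* Requeue only on the CPU the item ran on. */
		s->pending[cpu] = true;
		return;
	}

	/* Requeue on every CPU in the policy, not just this one. */
	for (i = 0; i < NR_CPUS; i++)
		if (s->in_policy[i])
			s->pending[i] = true;
}

/* Returns true if the dying CPU has work queued again after the cancel. */
static bool sim_cancel_then_neighbour_fires(bool modify_all)
{
	struct gov_sim s = { { false }, { false } };
	const int dying = 3, neighbour = 4;

	s.in_policy[dying] = s.in_policy[neighbour] = true;
	s.pending[dying] = s.pending[neighbour] = true;

	sim_cancel_work(&s, dying);                /* CPU 3 is going down */
	sim_work_fires(&s, neighbour, modify_all); /* CPU 4's item fires  */

	return s.pending[dying];
}
```

With modify_all set, the cancel on CPU 3 is undone by CPU 4's work item; with
modify_all cleared (as in the untested patch) the cancel sticks. Whether
cpufreq can live without the requeue-on-everybody behaviour is a separate
question that this model does not answer.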

Regards,
Michael Wang

> 
> But this still doesn't immediately explain how we can end up trying to
> queue work items on offline CPUs (since policy->cpus is supposed to always
> contain online cpus only, and this does look correct in the code as well,
> at a first glance). But I just wanted to share this finding, in case it
> helps us find out the real root-cause.
> 
> Also, you might perhaps want to try the (untested) patch shown below, and
> see if it resolves your problem. It basically makes work-items requeue
> themselves on only their respective CPUs and not others, so that
> gov_cancel_work succeeds in its mission. However, I guess the patch is
> wrong from a cpufreq perspective, in case cpufreq really depends on the
> "requeue-work-on-everybody" model.
> 
> Regards,
> Srivatsa S. Bhat
> 
> 
> 
>  drivers/cpufreq/cpufreq_conservative.c |2 +-
>  drivers/cpufreq/cpufreq_governor.c |2 --
>  drivers/cpufreq/cpufreq_ondemand.c |2 +-
>  3 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c 
> b/drivers/cpufreq/cpufreq_conservative.c
> index 0ceb2ef..bbfc1dd 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -120,7 +120,7 @@ static void cs_dbs_timer(struct work_struct *work)
>   struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
>   struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
>   int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
> - bool modify_all = true;
> + bool modify_all = false;
> 
>   mutex_lock(&core_dbs_info->cdbs.timer_mutex);
>   if (!need_load_eval(&core_dbs_info->cdbs, cs_tuners->sampling_rate))
> diff --git a/drivers/cpufreq/cpufreq_governor.c 
> b/drivers/cpufreq/cpufreq_governor.c
> index 4645876..ec4baeb 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -137,10 +137,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct 
> cpufreq_policy *policy,
>   if (!all_cpus) {
>   __gov_queue_work(smp_processor_id(), dbs_data, delay);
>   } else {
> - get_online_cpus();
>   for_each_cpu(i, policy->cpus)
>   __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
>   }
>  }
>  EXPORT_SYMBOL_GPL(gov_queue_work);
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c 
> b/drivers/cpufreq/cpufreq_ondemand.c
> index 93eb5cb..241ebc0 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -230,7 +230,7 @@ static void od_dbs_timer(struct work_struct *work)
>   struct dbs_data *dbs_data = 

Re: [PATCH net-next] net: rename low latency sockets functions to busy poll

2013-07-09 Thread Eliezer Tamir
On 10/07/2013 02:06, David Miller wrote:
> From: Jonathan Corbet 
> Date: Tue, 9 Jul 2013 16:25:14 -0600
> 
>> On Mon, 08 Jul 2013 16:20:34 +0300
>> Eliezer Tamir  wrote:
>>
>>> Rename POLL_LL to POLL_BUSY_LOOP.
>>
>> So pardon me if I speak out of turn, but it occurs to me to
>> wonder...should the SO_LL socket option be renamed in a similar fashion
>> before this interface escapes into the wild?
> 
> Sure and we can rename include/net/ll_poll.h to something more
> fitting as well.
> 
> I'll make sure this happens before 3.11 gets even close to release.

David,

If the following name changes are acceptable I will try to send out
a patch today.

1. include/net/ll_poll.h -> include/net/busy_poll.h

2. ndo_ll_poll -> ndo_busy_poll

- not technically accurate since the ndo callback does not itself busy
poll, it's just used to implement it.

maybe ndo_napi_id_poll? or ndo_id_poll? I don't really like any of them,
so a suggestion would be nice.

3. sysctl_net_ll_{read,poll} -> sysctl_net_busy_{read,poll}
- along with matching file name changes.

4. {sk,skb}_mark_ll -> {sk,skb}_mark_napi_id

5. LL_SO -> BUSY_POLL_SO

Are you OK with the names?
Did I miss anything?

Thanks,
Eliezer
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/8] powerpc: add real mode support for dma operations on powernv

2013-07-09 Thread Alexey Kardashevskiy
On 07/10/2013 02:02 AM, Alexander Graf wrote:
> On 07/06/2013 05:07 PM, Alexey Kardashevskiy wrote:
>> The existing TCE machine calls (tce_build and tce_free) only support
>> virtual mode as they call __raw_writeq for TCE invalidation what
>> fails in real mode.
>>
>> This introduces tce_build_rm and tce_free_rm real mode versions
>> which do mostly the same but use "Store Doubleword Caching Inhibited
>> Indexed" instruction for TCE invalidation.
> 
> So would always using stdcix have any bad side effects?


PowerISA says "They must be executed only when MSR[DR]=0" about stdcix.



> 
> 
> Alex
> 
>>
>> This new feature is going to be utilized by real mode support of VFIO.
>>
>> Signed-off-by: Alexey Kardashevskiy
>> ---
>>   arch/powerpc/include/asm/machdep.h| 12 ++
>>   arch/powerpc/platforms/powernv/pci-ioda.c | 26 +++--
>>   arch/powerpc/platforms/powernv/pci.c  | 38
>> ++-
>>   arch/powerpc/platforms/powernv/pci.h  |  2 +-
>>   4 files changed, 64 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/machdep.h
>> b/arch/powerpc/include/asm/machdep.h
>> index 92386fc..0c19eef 100644
>> --- a/arch/powerpc/include/asm/machdep.h
>> +++ b/arch/powerpc/include/asm/machdep.h
>> @@ -75,6 +75,18 @@ struct machdep_calls {
>>   long index);
>>   void(*tce_flush)(struct iommu_table *tbl);
>>
>> +/* _rm versions are for real mode use only */
>> +int(*tce_build_rm)(struct iommu_table *tbl,
>> + long index,
>> + long npages,
>> + unsigned long uaddr,
>> + enum dma_data_direction direction,
>> + struct dma_attrs *attrs);
>> +void(*tce_free_rm)(struct iommu_table *tbl,
>> +long index,
>> +long npages);
>> +void(*tce_flush_rm)(struct iommu_table *tbl);
>> +
>>   void __iomem *(*ioremap)(phys_addr_t addr, unsigned long size,
>>  unsigned long flags, void *caller);
>>   void(*iounmap)(volatile void __iomem *token);
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
>> b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index 2931d97..2797dec 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -68,6 +68,12 @@ define_pe_printk_level(pe_err, KERN_ERR);
>>   define_pe_printk_level(pe_warn, KERN_WARNING);
>>   define_pe_printk_level(pe_info, KERN_INFO);
>>
>> +static inline void rm_writed(unsigned long paddr, u64 val)
>> +{
>> +__asm__ __volatile__("sync; stdcix %0,0,%1"
>> +: : "r" (val), "r" (paddr) : "memory");
>> +}
>> +
>>   static int pnv_ioda_alloc_pe(struct pnv_phb *phb)
>>   {
>>   unsigned long pe;
>> @@ -442,7 +448,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb
>> *phb, struct pci_dev *pdev
>>   }
>>
>>   static void pnv_pci_ioda1_tce_invalidate(struct iommu_table *tbl,
>> - u64 *startp, u64 *endp)
>> + u64 *startp, u64 *endp, bool rm)
>>   {
>>   u64 __iomem *invalidate = (u64 __iomem *)tbl->it_index;
>>   unsigned long start, end, inc;
>> @@ -471,7 +477,10 @@ static void pnv_pci_ioda1_tce_invalidate(struct
>> iommu_table *tbl,
>>
>>   mb(); /* Ensure above stores are visible */
>>   while (start<= end) {
>> -__raw_writeq(start, invalidate);
>> +if (rm)
>> +rm_writed((unsigned long) invalidate, start);
>> +else
>> +__raw_writeq(start, invalidate);
>>   start += inc;
>>   }
>>
>> @@ -483,7 +492,7 @@ static void pnv_pci_ioda1_tce_invalidate(struct
>> iommu_table *tbl,
>>
>>   static void pnv_pci_ioda2_tce_invalidate(struct pnv_ioda_pe *pe,
>>struct iommu_table *tbl,
>> - u64 *startp, u64 *endp)
>> + u64 *startp, u64 *endp, bool rm)
>>   {
>>   unsigned long start, end, inc;
>>   u64 __iomem *invalidate = (u64 __iomem *)tbl->it_index;
>> @@ -502,22 +511,25 @@ static void pnv_pci_ioda2_tce_invalidate(struct
>> pnv_ioda_pe *pe,
>>   mb();
>>
>>   while (start<= end) {
>> -__raw_writeq(start, invalidate);
>> +if (rm)
>> +rm_writed((unsigned long) invalidate, start);
>> +else
>> +__raw_writeq(start, invalidate);
>>   start += inc;
>>   }
>>   }
>>
>>   void pnv_pci_ioda_tce_invalidate(struct iommu_table *tbl,
>> - u64 *startp, u64 *endp)
>> + u64 *startp, u64 *endp, bool rm)
>>   {
>>   struct pnv_ioda_pe *pe = container_of(tbl, struct pnv_ioda_pe,
>> tce32_table);
>>   struct pnv_phb *phb = pe->phb;
>>
>>   if (phb->type == PNV_PHB_IODA1)
>> -pnv_pci_ioda1_tce_invalidate(tbl, startp, endp);
>> +pnv_pci_ioda1_tce_invalidate(tbl, 

Re: [Xen-devel] [PATCH] xen: remove unused Kconfig parameter

2013-07-09 Thread Borislav Petkov
On Wed, Jul 10, 2013 at 12:34:58AM +0200, Sander Eikelenboom wrote:
> Grub does this in it's update script to prevent adding a xen + kernel
> combination that has no chance of booting when dom0 support has not
> been configured in the kernel. That doesn't seem to be a unreasonable
> thought.

Actually, I don't see the problem - if there's no such support, then the
boot fails - plain and simple. That's like not building in or adding
into the initrd, support for your root filesystem - it is your own damn
fault. It is not the bootloader's job to sanity check whether your
kernel boots or not.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH v2 2/2] perf tools: Make Power7 events available for perf

2013-07-09 Thread Michael Ellerman
On Tue, Jul 09, 2013 at 10:14:34AM +0200, Peter Zijlstra wrote:
> On Mon, Jul 08, 2013 at 10:24:34PM -0400, Vince Weaver wrote:
> > 
> > So something like they have on ARM?
> > 
> > vince@pandaboard:/sys/bus/event_source/devices$ ls -l
> > lrwxrwxrwx 1 root root 0 Jul  8 21:57 ARMv7 Cortex-A9 -> 
> > ../../../devices/ARMv7 Cortex-A9
> > lrwxrwxrwx 1 root root 0 Jul  8 21:57 breakpoint -> 
> > ../../../devices/breakpoint
> > lrwxrwxrwx 1 root root 0 Jul  8 21:57 software -> ../../../devices/software
> > lrwxrwxrwx 1 root root 0 Jul  8 21:57 tracepoint -> 
> > ../../../devices/tracepoint
> 
> Right so what I remember of the ARM case is that their /proc/cpuinfo isn't
> sufficient to identify their PMU. And they don't have a cpuid like instruction
> at all.
> 
> > > For the cpu you can obviously just detect what processor you're on with
> > > cpuid or whatever, but it's a bit of a hack. And that really doesn't
> > > work for non-cpu PMUs.
> > 
> > why is it a hack to use cpuid?
> 
> I agree, for x86 cpuid is perfectly fine, as would /proc/cpuinfo be, I suspect
> that just the model number is sufficient in most cases, even for uncore stuff.
 
What about things on PCI? Other strange buses?

As long as everything's in /sys then it should be _possible_ for
userspace to work out what's what, but it's going to end up with a bunch
of detection logic and heuristics in the library.

At which point you've just rewritten libpfm4.

cheers


nfsd changes for 3.11

2013-07-09 Thread J. Bruce Fields
Please pull nfsd changes for 3.11 from the for-3.11 branch at

  git://linux-nfs.org/~bfields/linux.git for-3.11

Changes this time include:

- 4.1 enabled on the server by default: the last 4.1-specific
  issues I know of are fixed, so we're not going to find the
  rest of the bugs without more exposure.
- Experimental support for NFSv4.2 MAC Labeling (to allow
  running selinux over NFS), from Dave Quigley.
- Fixes for some delicate cache/upcall races that could cause
  rare server hangs; thanks to Neil Brown and Bodo Stroesser for
  extreme debugging persistence.
- Fixes for some bugs found at the recent NFS bakeathon, mostly
  v4 and v4.1-specific, but also a generic bug handling
  fragmented rpc calls.

--b.

David Quigley (1):
  NFSD: Server implementation of MAC Labeling

J. Bruce Fields (21):
  nfsd4: store correct client minorversion for >=4.2
  security: cap_inode_getsecctx returning garbage
  sunrpc: server back channel needs no rpcbind method
  nfsd4: fix compile in !CONFIG_NFSD_V4_SECURITY_LABEL case
  Merge branch 'for-3.10' into 'for-3.11'
  svcrpc: introduce init_svc_cred
  svcrpc: store gss mech in svc_cred
  nfsd4: implement minimal SP4_MACH_CRED
  nfsd4: fail attempts to request gss on the backchannel
  nfsd4: allow client to send no cb_sec flavors
  nfsd4: clean up nfs4_open_delegation
  nfsd4: fix decoding of compounds across page boundaries
  nfsd4: minor read_buf cleanup
  svcrpc: fix handling of too-short rpc's
  svcrpc: don't error out on small tcp fragment
  nfsd4: delegation-based open reclaims should bypass permissions
  nfsd4: do not throw away 4.1 lock state on last unlock
  nfsd4: return delegation immediately if lease fails
  svcrpc: fix failures to handle -1 uid's
  nfsd4: allow destroy_session over destroyed session
  nfsd4: support minorversion 1 by default

Jim Rees (1):
  nfsd: avoid undefined signed overflow

NeilBrown (5):
  sunrpc/cache: remove races with queuing an upcall.
  sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
  sunrpc/cache: ensure items removed from cache do not have pending upcalls.
  net/sunrpc: xpt_auth_cache should be ignored when expired.
  sunrpc: Don't schedule an upcall on a replaced cache entry.

Steve Dickson (3):
  NFS: Add NFSv4.2 protocol constants
  NFSDv4.2: Add NFS v4.2 support to the NFS server
  NFSD: Don't give out read delegations on creates

Zhao Hongjiang (1):
  nfsd: get rid of the unused functions in vfs

chaoting fan (1):
  sunrpc: the cache_detail in cache_is_valid is unused any more

 fs/nfsd/Kconfig   |   16 +++
 fs/nfsd/nfs4proc.c|   44 ++-
 fs/nfsd/nfs4state.c   |  225 ++---
 fs/nfsd/nfs4xdr.c |  169 ++---
 fs/nfsd/nfsd.h|   26 +++-
 fs/nfsd/nfssvc.c  |2 +-
 fs/nfsd/state.h   |1 +
 fs/nfsd/vfs.c |   28 
 fs/nfsd/vfs.h |7 +-
 fs/nfsd/xdr4.h|4 +
 include/linux/nfs4.h  |9 ++
 include/linux/sunrpc/cache.h  |   49 ---
 include/linux/sunrpc/gss_api.h|2 +
 include/linux/sunrpc/svcauth.h|   11 ++
 net/sunrpc/auth_gss/gss_mech_switch.c |5 +-
 net/sunrpc/auth_gss/svcauth_gss.c |   10 +-
 net/sunrpc/cache.c|   83 ++--
 net/sunrpc/svcauth_unix.c |6 +-
 net/sunrpc/svcsock.c  |9 +-
 net/sunrpc/xprtsock.c |1 -
 security/capability.c |2 +-
 21 files changed, 527 insertions(+), 182 deletions(-)


Re: [PATCH] kernel/params.c: print failure information instead of 'KOBJ_ADD' to user space, when sysfs_create_file() fails.

2013-07-09 Thread Chen Gang F T
On 07/10/2013 10:35 AM, Chen Gang wrote:
> On 07/10/2013 10:17 AM, Chen Gang F T wrote:
>> On 07/09/2013 04:07 PM, Rusty Russell wrote:
>>> Chen Gang  writes:
>>>> When sysfs_create_file() fails, recommend to print the related failure
>>>> information. And it is useless to still 'KOBJ_ADD' to user space.
>>>>
>>>> Signed-off-by: Chen Gang 
>>>
>>> sysfs_create_file() should not fail during boot, should it?
>>>
>>
>> Hmm..., please reference locate_module_kobject() in "kernel/params.c",
>> which is an '__init' function, and also call sysfs_create_file(), it
>> processes the related error.
>>
>> So I recommend to get the check too in version_sysfs_builtin().
>>
> 
> Oh, also for locate_module_kobject(), if !CONFIG_MODULES, when error
> occurs, it still print the information about "Adding module".
> 


> Hmm..., do we need call kobject_get() before kobject_put() in failure
> processing block ?
> 

Oh, sorry, what I said above about kobject_get()/kobject_put() is
incorrect.

What about the diff below for kobject_get()?

---diff begin---

diff --git a/kernel/params.c b/kernel/params.c
index 440e65d..ef8d720 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -754,11 +754,11 @@ static struct module_kobject * __init 
locate_module_kobject(const char *name)
name, err);
return NULL;
}
-
-   /* So that we hold reference in both cases. */
-   kobject_get(&mk->kobj);
}
 
+   /* So that we hold reference in both cases. */
+   kobject_get(&mk->kobj);
+
return mk;
 }

---diff end-

We would also need to add an additional kobject_put() if we really need
to handle the failure in version_sysfs_builtin().

Thanks.

> 
>         mk = kzalloc(sizeof(struct module_kobject), GFP_KERNEL);
>         BUG_ON(!mk);
> 
>         mk->mod = THIS_MODULE;
>         mk->kobj.kset = module_kset;
>         err = kobject_init_and_add(&mk->kobj, &module_ktype, NULL,
>                                    "%s", name);
> #ifdef CONFIG_MODULES
>         if (!err)
>                 err = sysfs_create_file(&mk->kobj, &module_uevent.attr);
> #endif
>         if (err) {
>                 kobject_put(&mk->kobj);
>                 pr_crit("Adding module '%s' to sysfs failed (%d), the system may be unstable.\n",
>                         name, err);
>                 return NULL;
>         }
> 
>         /* So that we hold reference in both cases. */
>         kobject_get(&mk->kobj);
>     }
> 
>     return mk;
> }
> 
> 
>> Thanks.
>>
>>> Cheers,
>>> Rusty.
>>>
>>>> ---
>>>>  kernel/params.c |8 +++-
>>>>  1 files changed, 7 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/kernel/params.c b/kernel/params.c
>>>> index 440e65d..f5299c1 100644
>>>> --- a/kernel/params.c
>>>> +++ b/kernel/params.c
>>>> @@ -845,7 +845,13 @@ static void __init version_sysfs_builtin(void)
>>>>    mk = locate_module_kobject(vattr->module_name);
>>>>    if (mk) {
>>>>    err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
>>>> -  kobject_uevent(&mk->kobj, KOBJ_ADD);
>>>> +  if (err)
>>>> +  printk(KERN_WARNING
>>>> + "%s (%d): sysfs_create_file fail for %s, 
>>>> err: %d\n",
>>>> + __FILE__, __LINE__,
>>>> + vattr->module_name, err);
>>>> +  else
>>>> +  kobject_uevent(&mk->kobj, KOBJ_ADD);
>>>>    kobject_put(&mk->kobj);
>>>>    }
>>>>    }
>>>> -- 
>>>> 1.7.7.6
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> Please read the FAQ at  http://www.tux.org/lkml/
>>>
>>
>>
> 
> 


-- 
Chen Gang


Re: [PATCH 2/2] PCI,pciehp: avoid add a device already exist during pciehp_resume

2013-07-09 Thread Yijing Wang
Hi Bjorn,
   Thanks for your review and comments!

>> We can use PCIe Device Serial Number to identify the device if
>> device support DSN.
> 
> I think I like the idea of this, especially because the Microsoft PCI
> Hardware Compliance Test apparently requires DSN for hot-pluggable
> PCIe devices [1], so it should be pretty universal.
> 
> [1] 
> http://www.techtalkz.com/microsoft-device-drivers-dtm/341362-dtm-pcihct-test-violates-pci-express-base-specification-revision-1-a.html
> 
>> currently:
>> 1. slot is empty before suspend, insert card during suspend.
>> In this case, is safe, pciehp will add device by check adapter
>> status register in pciehp_resume.
> 
> Your patch doesn't change anything here.

Yes, I only make changes for cases 3 and 4.

> 
>> 2. slot is non empty before suspend, remove card during suspend.
>> Also be safe, pciehp will remove device by check adapter
>> status register in pciehp_resume.
> 
> Your patch doesn't change anything here.  (But I think the driver
> .remove() method will try to poke at the non-existent device; see
> below.)

I'm not sure what happens when the driver's .remove() method pokes at the
non-existent device. If .remove() cannot reach the real device, will the
remove action block? If the slot supports surprise hot removal, this should
be safe, right?

If the slot does not support surprise hot removal but the device has already
been removed, we seem to have no other way to clean up the stale data related
to the old device.

Currently, if pciehp_resume() checks the adapter status and finds that the
adapter is not present, it calls pciehp_disable_slot(). That function checks
the latch status; I guess the latch is open in this case (because the slot is
empty), so the action will abort. But I have no platform to test this on.

> 
>> 3. slot is non empty before suspend, remove card during suspend
>> and insert a new card.
>> Now pciehp just call pciehp_enable_slot() roughly. We should
>> remove the old card firstly, then add the new card.
> 
> With your patch, I think we'll call the old driver's .remove() method
> on the new device.  This seems bad; see below.

Ah, this is an issue.
What about powering off the slot first, so that the old driver's .remove()
method does not touch the new physical device? After the old driver's
.remove() cleanup, we call pciehp_enable_slot() to power on and enable the
new device.

> 
> With your patch, if we remove and reinsert the same device while
> suspended, we do nothing because the DSN didn't change.  Previously we
> called pciehp_enable_slot().  I don't know if we need to do anything
> here or not.

It is mainly to avoid the redundant device add, the same as the change for case 4.

> 
>> 4. slot is non empty before suspend, no action during suspend.
>> We should do nothing in pciehp_resume, but we call
>> pciehp_enable_slot(), so some uncomfortable messages show like above.
>> In this case, we can improve it a little by add a guard
>> if (!list_empty(bus->devices)).
> 
> This is the common case.  Previously we called pciehp_enable_slot(),
> and with your patch we do nothing.  I think that seems sensible, but
> this part should be split into a separate patch.  That way we can keep
> the benefit of this change even if we trip over something with the
> other changes.

OK, I will split these changes into a new patch.

> 
>> Reported-by: Paul Bolle 
>> Signed-off-by: Yijing Wang 
>> Cc: Paul Bolle 
>> Cc: "Rafael J. Wysocki" 
>> Cc: Oliver Neukum 
>> Cc: Gu Zheng 
>> Cc: linux-...@vger.kernel.org
>> ---
>>  drivers/pci/hotplug/pciehp_core.c |   38 
>> +---
>>  1 files changed, 34 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/pci/hotplug/pciehp_core.c 
>> b/drivers/pci/hotplug/pciehp_core.c
>> index 7d72c5e..d01e093 100644
>> --- a/drivers/pci/hotplug/pciehp_core.c
>> +++ b/drivers/pci/hotplug/pciehp_core.c
>> @@ -291,6 +291,28 @@ static void pciehp_remove(struct pcie_device *dev)
>>  }
>>
>>  #ifdef CONFIG_PM
>> +
>> +/* If device support Device Serial Numner, use DSN
> 
> s/support/supports/
> s/Numner/Number/
> Use conventional comment style:
>   /*
>* If device ...
>*/
> 

Will update, thanks.


>> + * to identify the device
>> + */
>> +static bool device_in_slot_is_changed(struct pci_bus *pbus)
>> +{
>> +   u64 old_dsn, new_dsn;
>> +   struct pci_dev *pdev;
>> +
>> +   pdev = pci_get_slot(pbus, PCI_DEVFN(0, 0));
> 
> pci_get_slot() can fail.

Will add failure return check, thanks.

> 
>> +   old_dsn = pdev->sn;
>> +
>> +   /* get func 0 device serial number */
>> +   pci_get_dsn(pdev, &new_dsn);
>> +   if (status) {
>> +   if (list_empty(>devices))
>> +   pciehp_enable_slot(slot);
>> +   else if (device_in_slot_is_changed(pbus)) {
>> +   pciehp_disable_slot(slot);
> 
> pciehp_disable_slot() ultimately calls the .remove() method for the
> device that has already been removed.  

linux-next: build failure after merge of the slab tree

2013-07-09 Thread Stephen Rothwell
Hi all,

After merging the slab tree, today's linux-next build (x86_64
allmodconfig) failed like this:

In file included from include/linux/slab.h:17:0,
 from include/linux/crypto.h:24,
 from arch/x86/kernel/asm-offsets.c:8:
include/linux/kmemleak.h: In function 'kmemleak_alloc_recursive':
include/linux/kmemleak.h:44:16: error: 'SLAB_NOLEAKTRACE' undeclared (first use 
in this function)
  if (!(flags & SLAB_NOLEAKTRACE))
^
include/linux/kmemleak.h: In function 'kmemleak_free_recursive':
include/linux/kmemleak.h:50:16: error: 'SLAB_NOLEAKTRACE' undeclared (first use 
in this function)
  if (!(flags & SLAB_NOLEAKTRACE))
^

Probably caused by commit 590a63973e36 ("mm/sl[aou]b: Move kmalloc
definitions to slab.h").

I have used the slab tree from next-20130709 for today.

And, yes, I am a little annoyed by this.
-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au




Re: [PATCH] of: match the compatible in the order set by the dts file

2013-07-09 Thread Huang Shijie
On 2013-07-09 20:03, Rob Herring wrote:
> the same and that is the assumption. Matching is not just based on
> compatible properties and your patch does not handle the other cases.
Could you show an example of "the other cases"?

After this patch,

[1] the matching will first check the @match->type and @match->name,
if we can match the @match->type or @match->name, return the match
immediately.

[2] If [1] fails, we will check the compatible properties.
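
The two steps above can be sketched in user space as follows. This is only a
simplified reading of the summary in this mail, not the patch itself:
struct sim_match, struct sim_node and sim_match_node() are hypothetical
stand-ins for the real of_device_id table and device node, and compat[] is
assumed to hold the node's compatible strings in dts order.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one match table entry. */
struct sim_match {
	const char *name;
	const char *type;
	const char *compatible;
};

/* Hypothetical, simplified device node; compat[] is in dts order. */
struct sim_node {
	const char *name;
	const char *type;
	const char *const *compat;
	size_t ncompat;
};

/* The table ends with an entry whose fields are all NULL. */
static int sim_entry_is_end(const struct sim_match *m)
{
	return !m->name && !m->type && !m->compatible;
}

/* NULL-safe string comparison: a NULL side never matches. */
static int sim_streq(const char *a, const char *b)
{
	return a && b && strcmp(a, b) == 0;
}

static const struct sim_match *sim_match_node(const struct sim_match *table,
					      const struct sim_node *node)
{
	const struct sim_match *m;
	size_t i;

	/* [1] A hit on name or type returns immediately. */
	for (m = table; !sim_entry_is_end(m); m++)
		if (sim_streq(m->name, node->name) ||
		    sim_streq(m->type, node->type))
			return m;

	/* [2] Otherwise try the compatible strings, in dts order. */
	for (i = 0; i < node->ncompat; i++)
		for (m = table; !sim_entry_is_end(m); m++)
			if (sim_streq(m->compatible, node->compat[i]))
				return m;

	return NULL;
}
```

In step [2] the outer loop runs over the node's compatible list, so the most
specific string in the dts wins regardless of where its entry sits in the
match table.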



thanks
Huang Shijie





Re: [GIT] Networking

2013-07-09 Thread Linus Torvalds
On Tue, Jul 9, 2013 at 2:53 PM, David Miller  wrote:
>
> This is a re-do of the net-next pull request for the current merge
> window.  The only difference from the one I made the other day is that
> this has Eliezer's interface renames and the timeout handling changes
> made based upon your feedback, as well as a few bug fixes that have
> trickeled in.

David, what the heck are you doing?

Take a look at commit e1d6fbc3dedb, for example.

That's a *merge* commit that you have done using "git merge --no-ff"
or something equivalent. Fine.

But what is *not* fine is how you've then edited the message to make
the commit log look like it's not a merge at all!

There's another one in dc3d807d6fd9. Again, it's a merge, but you
wouldn't know it from the commit message.

You seem to do this non-ff thing on purpose, since there are also
things like commit b0b02c77d7aa, but there at least you make it clear
it's a merge. I'm not a huge fan of non-ff merges, but I do see the
advantages of maintainers being able to separate that part of the
history and giving an added "overall summary" merge commit message, so
I'm ok with that part, and I've considered it myself. We can even
discuss making it some recommended thing if people really like it
widely.

But the summary line absolutely needs to spell out that it's a merge.
You can't just make a merge look like it's some kind of normal commit.
Because in many contexts it really is not otherwise all that
noticeable that they are merges.

Seriously. Those commits now have TOTALLY MISLEADING summary messages.
Think about what they look like in shortlogs etc one-liner summary
formats ("git log --oneline" etc).

So the summary message for a merge needs to mention that it's a merge.
Not this insane "try to make merges look like non-merge commits" thing
you've done. There is zero upside to editing away the merge part of
the message. Plus now they look totally different from your other
merges, for no good reasons.

Looking around, you've apparently done this before: commit
912df2628bd1 back in January did it too. I didn't catch it back then.
But now there's two new ones. See which ones stand out by doing

git log --oneline --merges --author=davem

(There's a few oddball ones by other people too, most from the early
days in 2005 when we didn't have good git workflows. Some much more
recent dubious ones too, so you're not _entirely_ alone here, there's
one from Linville too, for example)

Now, I'm all for making descriptive merge commit messages, including
improving on the summary line. So by all means write those nice merge
messages with explanations. I think something like

dc3d807d6fd9 Merge "openvswitch: gre tunneling support."

would have been a *fine* summary line, for example, and quite possibly
better than the default kind of git merge summary lines (ie "Merge
branch 'openswitch'"). So I'm not against playing with merge messages
per se, it's literally this "cannot tell it's a merge any more in the
summary" that I think is a problem.

I'm going to pull it, because trying to fix this is too damn painful,
but I really *really* want to see merges with summary messages that
make it clear that they are merges (ie they spell out that "Merge"
part). If you want to improve them by extending on the branch name
etc, go wild.  But don't break "git shortlog" or "git log --oneline"
etc.

  Linus


Re: [v3.10 regression] deadlock on cpu hotplug

2013-07-09 Thread Michael Wang
On 07/09/2013 07:51 PM, Bartlomiej Zolnierkiewicz wrote:
[snip]
> 
> It doesn't help, and unfortunately it can't help, as it only
> addresses lockdep functionality while the issue is not a lockdep
> problem but a genuine locking problem. CPU hot-unplug invokes
> _cpu_down(), which calls cpu_hotplug_begin(), which in turn takes
> cpu_hotplug.lock. The lock is then held during the __cpu_notify()
> call. The notifier chain goes up to cpufreq_governor_dbs(), which for
> the CPUFREQ_GOV_STOP event does gov_cancel_work(). This function
> flushes pending work and waits for it to finish. All of the above
> happens in one kernel thread. At the same time, the other kernel
> thread is doing the work we are waiting to complete, and it also
> happens to do gov_queue_work(), which calls get_online_cpus().
> Then the code tries to take cpu_hotplug.lock, which is already
> held by the first thread, and deadlocks.

Hmm... I think I get your point: one thread holds the lock and
flushes some work that also tries to take the same lock, correct?

Ok, that's a problem, let's figure out a good way to solve it :)
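
For illustration only, the lock cycle described above can be modelled in
plain userspace C. This is a sketch under loud assumptions: model_lock,
work_can_finish() and cpu_down_flush_completes() are hypothetical names, not
kernel APIs, and a failed trylock stands in for "would block forever".

```c
#include <stdbool.h>

/* Hypothetical stand-in for cpu_hotplug.lock: only records whether it is held. */
struct model_lock {
	bool held;
};

static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	l->held = false;
}

/*
 * Models the queued governor work: it needs the hotplug lock (the
 * get_online_cpus() step). The real code would block here instead of
 * returning false.
 */
static bool work_can_finish(struct model_lock *hotplug)
{
	if (!model_trylock(hotplug))
		return false;
	model_unlock(hotplug);
	return true;
}

/*
 * Models the CPU-down path: take the hotplug lock, then flush the work
 * while still holding it. Returns true only if the flush could complete.
 */
static bool cpu_down_flush_completes(struct model_lock *hotplug)
{
	bool ok;

	if (!model_trylock(hotplug))
		return false;
	ok = work_can_finish(hotplug);	/* flush waits on the work */
	model_unlock(hotplug);
	return ok;
}
```

With the lock free, work_can_finish() succeeds; called from inside
cpu_down_flush_completes() it cannot, which is the cycle: the flusher waits
for work that waits for the flusher's lock.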

Regards,
Michael Wang




> 
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
> 
>> diff --git a/drivers/cpufreq/cpufreq_governor.c 
>> b/drivers/cpufreq/cpufreq_governor.c
>> index 5af40ad..aa05eaa 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -229,6 +229,8 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
>> }
>>  }
>>  
>> +static struct lock_class_key j_cdbs_key;
>> +
>>  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>> struct common_dbs_data *cdata, unsigned int event)
>>  {
>> @@ -366,6 +368,8 @@ int (struct cpufreq_policy *policy,
>> 
>> kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>>  
>> mutex_init(&j_cdbs->timer_mutex);
>> +   lockdep_set_class(&j_cdbs->timer_mutex, &j_cdbs_key);
>> +
>> INIT_DEFERRABLE_WORK(&j_cdbs->work,
>>  dbs_data->cdata->gov_dbs_timer);
>> }
>>
>> Regards,
>> Michael Wang
> 



Re: [PATCH v2 2/2] perf tools: Make Power7 events available for perf

2013-07-09 Thread Michael Ellerman
On Tue, Jul 09, 2013 at 11:20:50AM -0400, Vince Weaver wrote:
> On Tue, 9 Jul 2013, Michael Ellerman wrote:
> 
> > On Mon, Jul 08, 2013 at 10:24:34PM -0400, Vince Weaver wrote:
> > > why is it a hack to use cpuid?
> > 
> > Because you're assuming that the PMU the kernel has exposed is for the
> > cpu you happen to be executing on.
> > 
> > But the real issue is with PMUs that are not in the CPU - there is no
> > easy way for userspace to detect them and determine which event list it
> > should be consulting.
> 
> what kind of devices are you talking about?  

GPUs, PCI host bridges, memory controllers, PCI attached accelerators,
strange devices on non standard buses, you name it.

> If they have kernel/perf_event support then they'd be putting a
> directory entry with a unique name into
> /sys/bus/event_source/devices/, right?

Yes. But although the name is unique it's not sufficient to actually
identify the list of events.

For example the CPU PMU is called "cpu" on most architectures, so userspace
needs to work out which exact CPU it is - and I know that's possible,
but it means the "simple little" event parsing library is not so simple
anymore.

Then imagine you have a GPU on PCI which registers its PMU as "gpu" -
how do you work out which GPU it is? Userspace can probably work it out
by trawling through sysfs and finding the vendor and device ids and
matching that with a lookup table. The library just got less simple
again.

Now say you have a PMU in your memory controller, it's not represented
in sysfs except for the PMU. Which memory controller is it? Maybe you
can infer it from the CPU you're on, but maybe you can't.

> > This whole thread is about making the event list not the kernel's job?
> 
> Yes.  This has been debated forever here; I'm firmly in the "event lists 
> should be entirely in userspace" camp but that's not the majority 
> position.

Yes we agree on the event list being in userspace, you can stop trying
to convince me.

What shouldn't be in userspace is the logic to detect which PMUs are
available on the system.

cheers


Re: [PATCH] kernel/params.c: print failure information instead of 'KOBJ_ADD' to user space, when sysfs_create_file() fails.

2013-07-09 Thread Chen Gang
On 07/10/2013 10:17 AM, Chen Gang F T wrote:
> On 07/09/2013 04:07 PM, Rusty Russell wrote:
>> Chen Gang  writes:
>>> When sysfs_create_file() fails, recommend to print the related failure
>>> information. And it is useless to still 'KOBJ_ADD' to user space.
>>>
>>> Signed-off-by: Chen Gang 
>>
>> sysfs_create_file() should not fail during boot, should it?
>>
> 
> Hmm..., please reference locate_module_kobject() in "kernel/params.c",
> which is an '__init' function, and also call sysfs_create_file(), it
> processes the related error.
> 
> So I recommend to get the check too in version_sysfs_builtin().
> 

Oh, also for locate_module_kobject(): if !CONFIG_MODULES, when an error
occurs, it still prints the "Adding module" message.

Hmm..., do we need to call kobject_get() before kobject_put() in the failure
processing block?


		mk = kzalloc(sizeof(struct module_kobject), GFP_KERNEL);
		BUG_ON(!mk);

		mk->mod = THIS_MODULE;
		mk->kobj.kset = module_kset;
		err = kobject_init_and_add(&mk->kobj, &module_ktype, NULL,
					   "%s", name);
#ifdef CONFIG_MODULES
		if (!err)
			err = sysfs_create_file(&mk->kobj, &module_uevent.attr);
#endif
		if (err) {
			kobject_put(&mk->kobj);
			pr_crit("Adding module '%s' to sysfs failed (%d), the system may be unstable.\n",
				name, err);
			return NULL;
		}

		/* So that we hold reference in both cases. */
		kobject_get(&mk->kobj);
	}

	return mk;
}


> Thanks.
> 
>> Cheers,
>> Rusty.
>>
>>> ---
>>>  kernel/params.c |8 +++-
>>>  1 files changed, 7 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/kernel/params.c b/kernel/params.c
>>> index 440e65d..f5299c1 100644
>>> --- a/kernel/params.c
>>> +++ b/kernel/params.c
>>> @@ -845,7 +845,13 @@ static void __init version_sysfs_builtin(void)
>>> mk = locate_module_kobject(vattr->module_name);
>>> if (mk) {
>>> err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
>>> -   kobject_uevent(&mk->kobj, KOBJ_ADD);
>>> +   if (err)
>>> +   printk(KERN_WARNING
>>> +  "%s (%d): sysfs_create_file fail for %s, 
>>> err: %d\n",
>>> +  __FILE__, __LINE__,
>>> +  vattr->module_name, err);
>>> +   else
>>> +   kobject_uevent(&mk->kobj, KOBJ_ADD);
>>> kobject_put(&mk->kobj);
>>> }
>>> }
>>> -- 
>>> 1.7.7.6
>>
> 
> 


-- 
Chen Gang


Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92

2013-07-09 Thread Dave Chinner
On Mon, Jul 08, 2013 at 02:53:52PM +0200, Michal Hocko wrote:
> On Thu 04-07-13 18:36:43, Michal Hocko wrote:
> > On Wed 03-07-13 21:24:03, Dave Chinner wrote:
> > > On Tue, Jul 02, 2013 at 02:44:27PM +0200, Michal Hocko wrote:
> > > > On Tue 02-07-13 22:19:47, Dave Chinner wrote:
> > > > [...]
> > > > > Ok, so it's been leaked from a dispose list somehow. Thanks for the
> > > > > info, Michal, it's time to go look at the code
> > > > 
> > > > OK, just in case we will need it, I am keeping the machine in this state
> > > > for now. So we still can play with crash and check all the juicy
> > > > internals.
> > > 
> > > My current suspect is the LRU_RETRY code. I don't think what it is
> > > doing is at all valid - list_for_each_safe() is not safe if you drop
> > > the lock that protects the list. i.e. there is nothing that protects
> > > the stored next pointer from being removed from the list by someone
> > > else. Hence what I think is occurring is this:
> > > 
> > > 
> > > thread 1  thread 2
> > > lock(lru)
> > > list_for_each_safe(lru)   lock(lru)
> > >   isolate ..
> > > lock(i_lock)
> > > has buffers
> > >   __iget
> > >   unlock(i_lock)
> > >   unlock(lru)
> > >   .   (gets lru lock)
> > >   list_for_each_safe(lru)
> > > walks all the inodes
> > > finds inode being isolated by other thread
> > > isolate
> > >   i_count > 0
> > > list_del_init(i_lru)
> > > return LRU_REMOVED;
> > >  moves to next inode, inode that
> > >  other thread has stored as next
> > >  isolate
> > >i_state |= I_FREEING
> > >list_move(dispose_list)
> > >return LRU_REMOVED
> > >
> > >unlock(lru)
> > >   lock(lru)
> > >   return LRU_RETRY;
> > >   if (!first_pass)
> > > 
> > >   --nr_to_scan
> > >   (loop again using next, which has already been removed from the
> > >   LRU by the other thread!)
> > >   isolate
> > > lock(i_lock)
> > > if (i_state & ~I_REFERENCED)
> > >   list_del_init(i_lru)< inode is on dispose list!
> > >   < inode is now isolated, with I_FREEING set
> > >   return LRU_REMOVED;
> > > 
> > > That fits the corpse left on your machine, Michal. One thread has
> > > moved the inode to a dispose list, the other thread thinks it is
> > > still on the LRU and should be removed, and removes it.
> > > 
> > > This also explains the lru item count going negative - the same item
> > > is being removed from the lru twice. So it seems like all the
> > > problems you've been seeing are caused by this one problem
> > > 
> > > Patch below that should fix this.
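
The hazard described in the trace above (a saved next pointer going stale
while the lock is dropped) can be reproduced in a few lines of userspace C.
This is only a sketch: the list helpers below are simplified re-implementations
written for this example, not the kernel's <linux/list.h>, and
saved_next_is_stale() is a hypothetical name. The "other thread" is simulated
by a direct call.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *n)
{
	n->prev = n;
	n->next = n;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and leave it pointing at itself, like list_del_init(). */
static void list_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static bool on_list(const struct node *head, const struct node *n)
{
	const struct node *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		if (pos == n)
			return true;
	return false;
}

/*
 * Stash pos->next the way a "safe" iterator does, then let someone else
 * remove `victim` while the lock is notionally dropped. Returns true if
 * the stashed pointer no longer refers to an entry on the list.
 */
static bool saved_next_is_stale(struct node *head, struct node *pos,
				struct node *victim)
{
	struct node *next = pos->next;	/* saved before dropping the lock */

	list_del_init(victim);		/* done by the other thread */

	return !on_list(head, next);	/* checked after retaking the lock */
}
```

Removing the current entry (victim == pos) leaves the saved pointer valid,
which is the only case the "safe" iterator is built for. Removing the entry
after it leaves the walker holding a self-linked node that is no longer on
the list.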
> > 
> > Good news! The test has been running since morning and it neither hung nor
> > crashed. So this really looks like the right fix. It will also run
> > during the weekend to be 100% sure. But I guess it is safe to say
> 
> Hmm, it seems I was too optimistic or we have yet another issue here (I
> guess the latter is more probable).
> 
> The weekend testing got stuck as well. 

> 20761 [] xlog_grant_head_wait+0xdd/0x1a0 [xfs]
> [] xlog_grant_head_check+0xc6/0xe0 [xfs]
> [] xfs_log_reserve+0xff/0x240 [xfs]
> [] xfs_trans_reserve+0x234/0x240 [xfs]
> [] xfs_create+0x1a9/0x5c0 [xfs]
> [] xfs_vn_mknod+0x8a/0x1a0 [xfs]
> [] xfs_vn_create+0xe/0x10 [xfs]
> [] vfs_create+0xad/0xd0
> [] lookup_open+0x1b8/0x1d0
> [] do_last+0x2de/0x780
> [] path_openat+0xda/0x400
> [] do_filp_open+0x43/0xa0
> [] do_sys_open+0x160/0x1e0
> [] sys_open+0x1c/0x20
> [] system_call_fastpath+0x16/0x1b
> [] 0x

That's an XFS log space issue, indicating that it has run out of
space in the log and is waiting for more to come free. That
requires IO completion to occur.

> [276962.652076] INFO: task xfs-data/sda9:930 blocked for more than 480 
> seconds.
> [276962.652087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [276962.652093] xfs-data/sda9   D 88001ffb9cc8 0   930  2 
> 0x

Oh, that's why. This is the IO completion worker...

> [276962.652102]  88003794d198 0046 8800325f4480 
> 
> [276962.652113]  88003794c010 00012dc0 00012dc0 
> 00012dc0
> [276962.652121]  00012dc0 88003794dfd8 88003794dfd8 
> 00012dc0
> [276962.652128] Call Trace:
> [276962.652151]  [] ? __blk_run_queue+0x32/0x40
> [276962.652160]  [] ? queue_unplugged+0x78/0xb0
> [276962.652171]  [] schedule+0x24/0x70
> [276962.652178]  [] io_schedule+0x9c/0xf0
> [276962.652187]  [] 

[PATCH V2] relay: fix timer madness

2013-07-09 Thread zhangwei(Jovi)
When I use a ktap script to trace all event tracepoints via the relay
transport, the system hangs within a few seconds without this patch.

I found the original patch discussion from 2007:
http://marc.info/?l=linux-kernel&m=118544794717162&w=2
(In that mail thread the patch didn't fix the problem being discussed, but
it does fix the problem I am encountering now.)

Changed from v1:
The mod_timer() interval changed from jiffies+1 to HZ/10, as Ingo suggested.

Original patch changelog from Ingo in 2007:

Remove timer calls (!!!) from deep within the tracing infrastructure.
This was totally bogus code that can cause lockups and worse.
Poll the buffer every 2 jiffies for now.

Signed-off-by: Ingo Molnar 
Signed-off-by: "zhangwei(Jovi)" 
Cc: Steven Rostedt 
Cc: Jens Axboe 
Cc: Al Viro 
Cc: Eric Dumazet 
Signed-off-by: Andrew Morton 
---
 kernel/relay.c |   14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/kernel/relay.c b/kernel/relay.c
index b91488b..87af4ce 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -339,6 +339,10 @@ static void wakeup_readers(unsigned long data)
 {
struct rchan_buf *buf = (struct rchan_buf *)data;
wake_up_interruptible(&buf->read_wait);
+   /*
+* Stupid polling for now:
+*/
+   mod_timer(&buf->timer, HZ / 10);
 }

 /**
@@ -356,6 +360,7 @@ static void __relay_reset(struct rchan_buf *buf, unsigned 
int init)
init_waitqueue_head(&buf->read_wait);
kref_init(&buf->kref);
setup_timer(&buf->timer, wakeup_readers, (unsigned long)buf);
+   mod_timer(&buf->timer, HZ / 10);
} else
del_timer_sync(&buf->timer);

@@ -739,15 +744,6 @@ size_t relay_switch_subbuf(struct rchan_buf *buf, size_t 
length)
else
buf->early_bytes += buf->chan->subbuf_size -
buf->padding[old_subbuf];
-   smp_mb();
-   if (waitqueue_active(&buf->read_wait))
-   /*
-* Calling wake_up_interruptible() from here
-* will deadlock if we happen to be logging
-* from the scheduler (trying to re-grab
-* rq->lock), so defer it.
-*/
-   mod_timer(&buf->timer, jiffies + 1);
}

old = buf->data;
-- 
1.7.9.7




Re: [PATCH] kernel/params.c: print failure information instead of 'KOBJ_ADD' to user space, when sysfs_create_file() fails.

2013-07-09 Thread Chen Gang F T
On 07/09/2013 04:07 PM, Rusty Russell wrote:
> Chen Gang  writes:
>> When sysfs_create_file() fails, recommend to print the related failure
>> information. And it is useless to still 'KOBJ_ADD' to user space.
>>
>> Signed-off-by: Chen Gang 
> 
> sysfs_create_file() should not fail during boot, should it?
> 

Hmm..., please refer to locate_module_kobject() in "kernel/params.c",
which is also an '__init' function that calls sysfs_create_file(), and it
handles the related error.

So I recommend adding the same check to version_sysfs_builtin().

Thanks.

> Cheers,
> Rusty.
> 
>> ---
>>  kernel/params.c |8 +++-
>>  1 files changed, 7 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/params.c b/kernel/params.c
>> index 440e65d..f5299c1 100644
>> --- a/kernel/params.c
>> +++ b/kernel/params.c
>> @@ -845,7 +845,13 @@ static void __init version_sysfs_builtin(void)
>>  mk = locate_module_kobject(vattr->module_name);
>>  if (mk) {
>>  err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
>> -kobject_uevent(&mk->kobj, KOBJ_ADD);
>> +if (err)
>> +printk(KERN_WARNING
>> +   "%s (%d): sysfs_create_file fail for %s, 
>> err: %d\n",
>> +   __FILE__, __LINE__,
>> +   vattr->module_name, err);
>> +else
>> +kobject_uevent(&mk->kobj, KOBJ_ADD);
>>  kobject_put(&mk->kobj);
>>  }
>>  }
>> -- 
>> 1.7.7.6
> 


-- 
Chen Gang


Re: [PATCH 2/4] extcon: palmas: enable ID_GND and ID_FLOAT detection always

2013-07-09 Thread Chanwoo Choi
Hi Laxman,

On 07/09/2013 10:04 PM, Laxman Dewangan wrote:
> When integrating driver with Tegra platform, it is found that
> the ID pins get detected only once after booting system and
> further removal and re-insert does not detect the ID pin.
> 
> Fixing this issue with enabling interrupt on ID_GND and ID_FLOAT
> always  and clearing the status on LATCH register which actually
> occurred.
> 
> Also if interrupt occurs with line status as zero then based on
> previous status, set the cable state.
> 
> Add debug prints to display the cable state when any cable
> insertion/removal happen.
> 
> Signed-off-by: Laxman Dewangan 
> ---
>  drivers/extcon/extcon-palmas.c |   24 +++-
>  1 files changed, 11 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/extcon/extcon-palmas.c b/drivers/extcon/extcon-palmas.c
> index b752a0a..587034b 100644
> --- a/drivers/extcon/extcon-palmas.c
> +++ b/drivers/extcon/extcon-palmas.c
> @@ -57,6 +57,7 @@ static irqreturn_t palmas_vbus_irq_handler(int irq, void 
> *_palmas_usb)
>   if (palmas_usb->linkstat != PALMAS_USB_STATE_VBUS) {
>   palmas_usb->linkstat = PALMAS_USB_STATE_VBUS;
>   extcon_set_cable_state(&palmas_usb->edev, "USB", true);
> + dev_info(palmas_usb->dev, "USB cable state: TRUE\n");

I prefer the following info message when the USB cable is inserted.
dev_info(palmas_usb->dev, "USB cable is attached\n");


>   } else {
>   dev_dbg(palmas_usb->dev,
>   "Spurious connect event detected\n");
> @@ -65,6 +66,7 @@ static irqreturn_t palmas_vbus_irq_handler(int irq, void 
> *_palmas_usb)
>   if (palmas_usb->linkstat == PALMAS_USB_STATE_VBUS) {
>   palmas_usb->linkstat = PALMAS_USB_STATE_DISCONNECT;
>   extcon_set_cable_state(&palmas_usb->edev, "USB", false);
> + dev_info(palmas_usb->dev, "USB cable state: FALSE\n");

ditto.
dev_info(palmas_usb->dev, "USB cable is detached\n");

>   } else {
>   dev_dbg(palmas_usb->dev,
>   "Spurious disconnect event detected\n");
> @@ -84,28 +86,23 @@ static irqreturn_t palmas_id_irq_handler(int irq, void 
> *_palmas_usb)
>  
>   if (set & PALMAS_USB_ID_INT_SRC_ID_GND) {
>   palmas_write(palmas_usb->palmas, PALMAS_USB_OTG_BASE,
> - PALMAS_USB_ID_INT_EN_HI_SET,
> - PALMAS_USB_ID_INT_EN_HI_SET_ID_FLOAT);
> - palmas_write(palmas_usb->palmas, PALMAS_USB_OTG_BASE,
> - PALMAS_USB_ID_INT_EN_HI_CLR,
> - PALMAS_USB_ID_INT_EN_HI_CLR_ID_GND);
> - palmas_write(palmas_usb->palmas, PALMAS_USB_OTG_BASE,
>   PALMAS_USB_ID_INT_LATCH_CLR,
>   PALMAS_USB_ID_INT_EN_HI_CLR_ID_GND);
>   palmas_usb->linkstat = PALMAS_USB_STATE_ID;
>   extcon_set_cable_state(&palmas_usb->edev, "USB-HOST", true);
> + dev_info(palmas_usb->dev, "HOST cable state: TRUE\n");

ditto.
dev_info(palmas_usb->dev, "USB-HOST cable is attached\n");

>   } else if (set & PALMAS_USB_ID_INT_SRC_ID_FLOAT) {
>   palmas_write(palmas_usb->palmas, PALMAS_USB_OTG_BASE,
> - PALMAS_USB_ID_INT_EN_HI_SET,
> - PALMAS_USB_ID_INT_EN_HI_SET_ID_GND);
> - palmas_write(palmas_usb->palmas, PALMAS_USB_OTG_BASE,
> - PALMAS_USB_ID_INT_EN_HI_CLR,
> - PALMAS_USB_ID_INT_EN_HI_CLR_ID_FLOAT);
> - palmas_write(palmas_usb->palmas, PALMAS_USB_OTG_BASE,
>   PALMAS_USB_ID_INT_LATCH_CLR,
>   PALMAS_USB_ID_INT_EN_HI_CLR_ID_FLOAT);
>   palmas_usb->linkstat = PALMAS_USB_STATE_DISCONNECT;
>   extcon_set_cable_state(&palmas_usb->edev, "USB-HOST", false);
> + dev_info(palmas_usb->dev, "HOST cable state: FALSE\n");

ditto.
dev_info(palmas_usb->dev, "USB-HOST cable is detached\n");

> + } else if ((palmas_usb->linkstat == PALMAS_USB_STATE_ID) &&
> + (!(set & PALMAS_USB_ID_INT_SRC_ID_GND))) {
> + palmas_usb->linkstat = PALMAS_USB_STATE_DISCONNECT;
> + extcon_set_cable_state(&palmas_usb->edev, "USB-HOST", false);
> + dev_info(palmas_usb->dev, "HOST cable state: FALSE\n");

dev_info(palmas_usb->dev, "USB-HOST cable is detached\n");

>   }
>  
>   return IRQ_HANDLED;
> @@ -122,7 +119,8 @@ static void palmas_enable_irq(struct palmas_usb 
> *palmas_usb)
>  
>   palmas_write(palmas_usb->palmas, PALMAS_USB_OTG_BASE,
>   PALMAS_USB_ID_INT_EN_HI_SET,
> - PALMAS_USB_ID_INT_EN_HI_SET_ID_GND);
> + PALMAS_USB_ID_INT_EN_HI_SET_ID_GND |
> + PALMAS_USB_ID_INT_EN_HI_SET_ID_FLOAT);
>  
>   palmas_vbus_irq_handler(palmas_usb->vbus_irq, palmas_usb);
>  
> 

If you would modify info message, anything 

Re: virtio indirect with lots of descriptors

2013-07-09 Thread Rusty Russell
Dave Airlie  writes:
> Hi Rusty,
>
> playing with my virtio gpu, I started hitting the qemu
> error_report("Too many read descriptors in indirect table");
>
> Now I'm not sure but this doesn't seem to be a virtio limit that the
> guest catches from what I can see, since my host dies quite quickly,
> when I'm doing transfers in/out of a 5MB object with an sg entry per
> page.
>
> Just wondering if you can confirm if this is only a qemu limitation or
> if I should just work around it at a bit of a higher level in my
> driver/device?

You're not allowed to place more descriptors in a single request than
there are elements in the ring *even* if you use indirects, which are
seen as an optimization (thus you can always fall back to direct
descriptors if OOM).

We could change this rule in the 1.0 spec if required, or even make a
special rule for your device, but for the moment that's how it is.
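
As a rough sketch of the arithmetic behind this rule, assuming 4096-byte
pages and a page-aligned buffer (both assumptions, as is the ring size used
in the usage note below), the descriptor count for a one-sg-entry-per-page
transfer can be checked against the ring size like this. The function names
are hypothetical, not part of any virtio API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of per-page sg entries needed for a page-aligned buffer. */
static size_t sg_entries_needed(size_t bytes, size_t page_size)
{
	return (bytes + page_size - 1) / page_size;
}

/*
 * The rule stated above: one request may not carry more descriptors
 * than the ring has elements, indirect table or not.
 */
static bool request_fits_ring(size_t descriptors, size_t ring_size)
{
	return descriptors <= ring_size;
}
```

For a 5MB object, sg_entries_needed(5 * 1024 * 1024, 4096) gives 1280
entries, which would not fit a ring of, say, 256 elements, so the transfer
would have to be split into several requests.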

Hope that helps,
Rusty.


Re: [PATCH v3 1/2] sched: smart wake-affine foundation

2013-07-09 Thread Michael Wang
On 07/10/2013 09:52 AM, Sam Ben wrote:
> On 07/08/2013 10:36 AM, Michael Wang wrote:
>> Hi, Sam
>>
>> On 07/07/2013 09:31 AM, Sam Ben wrote:
>>> On 07/04/2013 12:55 PM, Michael Wang wrote:
 wake-affine stuff is always trying to pull wakee close to waker, by
 theory,
 this will bring benefit if waker's cpu cached hot data for wakee, or
 the
 extreme ping-pong case.
>>> What's the meaning of ping-pong case?
>> PeterZ explained it well in here:
>>
>> https://lkml.org/lkml/2013/3/7/332
>>
>> And you could try to compare:
>> taskset 1 perf bench sched pipe
>> with
>> perf bench sched pipe
> 
> Why sched pipe is special?

I think the link already explains the reason well. Or you can read the
code of that pipe implementation, and you will find there is a high
chance of hitting the ping-pong case :)

Regards,
Michael Wang

> 
>>
>> to confirm it ;-)
>>
>> Regards,
>> Michael Wang
>>
 And testing show it could benefit hackbench 15% at most.

 However, the whole stuff is somewhat blindly and time-consuming, some
 workload therefore suffer.

 And testing show it could damage pgbench 50% at most.

 Thus, wake-affine stuff should be more smart, and realise when to stop
 it's thankless effort.

 This patch introduced 'nr_wakee_switch', which will be increased each
 time the task switch it's wakee.

 So a high 'nr_wakee_switch' means the task has more than one wakee, and
 bigger the number, higher the wakeup frequency.

 Now when making the decision on whether to pull or not, pay
 attention on
 the wakee with a high 'nr_wakee_switch', pull such task may benefit
 wakee,
 but also imply that waker will face cruel competition later, it
 could be
 very cruel or very fast depends on the story behind 'nr_wakee_switch',
 whatever, waker therefore suffer.

 Furthermore, if waker also has a high 'nr_wakee_switch', imply that
 multiple
 tasks rely on it, then waker's higher latency will damage all of them,
 pull
 wakee seems to be a bad deal.

 Thus, when 'waker->nr_wakee_switch / wakee->nr_wakee_switch' become
 higher
 and higher, the deal seems to be worse and worse.

 The patch therefore help wake-affine stuff to stop it's work when:

  wakee->nr_wakee_switch > factor &&
  waker->nr_wakee_switch > (factor * wakee->nr_wakee_switch)

 The factor here is the node-size of current-cpu, so bigger node will
 lead
 to more pull since the trial become more severe.
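
The stop condition quoted above can be written out as a small standalone
predicate. This is a userspace sketch, not the patch's code: struct
task_counts and skip_wake_affine() are hypothetical names, and only the
nr_wakee_switch field and the factor come from the description.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one task_struct field the check reads. */
struct task_counts {
	unsigned long nr_wakee_switch;
};

/*
 * Returns true when the wake-affine pull should be skipped: the wakee
 * switches wakees often, and the waker does so far more often still.
 */
static bool skip_wake_affine(const struct task_counts *waker,
			     const struct task_counts *wakee,
			     unsigned long factor)
{
	return wakee->nr_wakee_switch > factor &&
	       waker->nr_wakee_switch > factor * wakee->nr_wakee_switch;
}
```

With factor = 12, for example, a wakee count of 20 and a waker count of 300
skips the pull (300 > 240), while a waker count of 200 does not.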

 After applied the patch, pgbench show 40% improvement at most.

 Test:
  Tested with 12 cpu X86 server and tip 3.10.0-rc7.

   pgbench               base        smart

   | db_size | clients |  tps  |   |  tps  |
   +---------+---------+-------+   +-------+
   | 22 MB   |       1 | 10598 |   | 10796 |
   | 22 MB   |       2 | 21257 |   | 21336 |
   | 22 MB   |       4 | 41386 |   | 41622 |
   | 22 MB   |       8 | 51253 |   | 57932 |
   | 22 MB   |      12 | 48570 |   | 54000 |
   | 22 MB   |      16 | 46748 |   | 55982 | +19.75%
   | 22 MB   |      24 | 44346 |   | 55847 | +25.93%
   | 22 MB   |      32 | 43460 |   | 54614 | +25.66%
   | 7484 MB |       1 |  8951 |   |  9193 |
   | 7484 MB |       2 | 19233 |   | 19240 |
   | 7484 MB |       4 | 37239 |   | 37302 |
   | 7484 MB |       8 | 46087 |   | 50018 |
   | 7484 MB |      12 | 42054 |   | 48763 |
   | 7484 MB |      16 | 40765 |   | 51633 | +26.66%
   | 7484 MB |      24 | 37651 |   | 52377 | +39.11%
   | 7484 MB |      32 | 37056 |   | 51108 | +37.92%
   | 15 GB   |       1 |  8845 |   |  9104 |
   | 15 GB   |       2 | 19094 |   | 19162 |
   | 15 GB   |       4 | 36979 |   | 36983 |
   | 15 GB   |       8 | 46087 |   | 49977 |
   | 15 GB   |      12 | 41901 |   | 48591 |
   | 15 GB   |      16 | 40147 |   | 50651 | +26.16%
   | 15 GB   |      24 | 37250 |   | 52365 | +40.58%
   | 15 GB   |      32 | 36470 |   | 50015 | +37.14%

 CC: Ingo Molnar 
 CC: Peter Zijlstra 
 CC: Mike Galbraith 
 Signed-off-by: Michael Wang 
 ---
include/linux/sched.h |3 +++
kernel/sched/fair.c   |   47
 +++
2 files changed, 50 insertions(+), 0 deletions(-)

 diff --git a/include/linux/sched.h b/include/linux/sched.h
 index 178a8d9..1c996c7 100644
 --- a/include/linux/sched.h
 +++ b/include/linux/sched.h
 @@ -1041,6 +1041,9 @@ struct task_struct {
#ifdef CONFIG_SMP
struct llist_node wake_entry;
int on_cpu;
 +struct task_struct *last_wakee;
 +unsigned long nr_wakee_switch;
 +unsigned long last_switch_decay;
#endif
int 

Re: [RFC][PATCH 1/9] sched: Introduce power scheduler

2013-07-09 Thread Arjan van de Ven

On 7/9/2013 8:55 AM, Morten Rasmussen wrote:

+   mod_delayed_work_on(schedule_cpu(), system_wq, ,
+   msecs_to_jiffies(INTERVAL));


so thinking about this more, this really really should not be a work queue.

a work queue will cause a large number of context switches for no reason
(on Intel and AMD you can switch P state from interrupt context, and I'm pretty 
sure
that holds for many ARM as well)

and in addition, it causes some really nasty cases, especially around real time 
tasks.
Your workqueue will schedule a kernel thread, which will run
BEHIND real time tasks, and such real time task will then never be able to 
start running at a higher performance.

(and with the delta between lowest and highest performance sometimes being 10x 
or more,
the real time task will be running SLOW... quite possible longer than several 
milliseconds)


and all for no good reason; a normal timer running in irq context would be much 
better for this kind of thing!



[RESEND] The initmpfs patches.

2013-07-09 Thread Rob Landley

Attached, so you don't have to fish them out of:

  http://lkml.indiana.edu/hypermail/linux/kernel/1306.3/04204.html

Do they look worth applying, or should I wash it through linux-next for  
a bit? (Which I'm not sure how to do if I don't host a git tree on a  
server, or I'd have done it already.)


There was a previous post with a patch demonstrating the basic concept  
a while ago (https://lwn.net/Articles/545740/). This is the cleaned up,  
broken up, tested in as many ways as I could think of, does not have  
section mismatches, allows you to disable it at runtime, passes  
checkpatch.pl version. Still applies to a git pull from 3 minutes ago  
(two patches have offsets, but no fuzz).


Thanks,

Rob

From: Rob Landley 
Subject: [PATCH 0/5] initmpfs: use tmpfs instead of ramfs for rootfs
To: linux-kernel@vger.kernel.org
Cc: Alexander Viro 
Cc: Al Viro 
Cc: Andrew Morton 
Cc: "Eric W. Biederman" 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Jeff Layton 
Cc: Jens Axboe 
Cc: Jim Cromie 
Cc: linux-fsde...@vger.kernel.org
Cc: linux...@kvack.org
Cc: Rusty Russell 
Cc: Sam Ravnborg 
Cc: Stephen Warren 

Use tmpfs for rootfs when CONFIG_TMPFS=y and there's no root=.
Specify rootfstype=ramfs to get the old initramfs behavior.

The previous initramfs code provided a fairly crappy root filesystem:
didn't let you --bind mount directories out of it, reported zero
size/usage so it didn't show up in "df" and couldn't run things like
rpm that query available space before proceeding, would fill up all
available memory and panic the system if you wrote too much to it...

Using tmpfs instead provides a much better root filesystem.
From: Rob Landley 
Subject: [PATCH 1/5] initmpfs: replace MS_NOUSER in initramfs
To: linux-kernel@vger.kernel.org
Cc: Al Viro 
Cc: "Eric W. Biederman" 

From: Rob Landley 

Mounting MS_NOUSER prevents --bind mounts from rootfs. Prevent new rootfs
mounts with a different mechanism that doesn't affect bind mounts.

Signed-off-by: Rob Landley 
---

 fs/ramfs/inode.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
index c24f1e1..14b9c35 100644
--- a/fs/ramfs/inode.c
+++ b/fs/ramfs/inode.c
@@ -247,7 +247,14 @@ struct dentry *ramfs_mount(struct file_system_type *fs_type,
 static struct dentry *rootfs_mount(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *data)
 {
-	return mount_nodev(fs_type, flags|MS_NOUSER, data, ramfs_fill_super);
+	static int once;
+
+	if (once)
+		return ERR_PTR(-ENODEV);
+	else
+		once++;
+
+	return mount_nodev(fs_type, flags, data, ramfs_fill_super);
 }
 
 static void ramfs_kill_sb(struct super_block *sb)
From: Rob Landley 
Subject: [PATCH 2/5] initmpfs: Move bdi setup from init_rootfs to init_ramfs
To: linux-kernel@vger.kernel.org
Cc: Al Viro 
Cc: "Eric W. Biederman" 

From: Rob Landley 

Even though ramfs hasn't got a backing device, commit e0bf68ddec4f added one
anyway, and put the initialization in init_rootfs() since that's the first
user, leaving it out of init_ramfs() to avoid duplication.

But initmpfs uses init_tmpfs() instead, so move the init into the filesystem's
init function, add a "once" guard to prevent duplicate initialization, and
call the filesystem init from rootfs init.

This goes part of the way to allowing ramfs to be built as a module.

Signed-off-by: Rob Landley 
---

 fs/ramfs/inode.c |   25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

--- initold/fs/ramfs/inode.c	2013-06-28 15:12:03.205879730 -0500
+++ initold2/fs/ramfs/inode.c	2013-06-28 15:12:12.425880115 -0500
@@ -277,21 +277,36 @@
 
 static int __init init_ramfs_fs(void)
 {
-	return register_filesystem(&ramfs_fs_type);
+	static int once;
+	int err;
+
+	if (once)
+		return 0;
+	else
+		once++;
+
+	err = bdi_init(&ramfs_backing_dev_info);
+	if (err)
+		return err;
+
+	err = register_filesystem(&ramfs_fs_type);
+	if (err)
+		bdi_destroy(&ramfs_backing_dev_info);
+
+	return err;
 }
 module_init(init_ramfs_fs)
 
 int __init init_rootfs(void)
 {
-	int err;
+	int err = register_filesystem(&rootfs_fs_type);
 
-	err = bdi_init(&ramfs_backing_dev_info);
 	if (err)
 		return err;
 
-	err = register_filesystem(&rootfs_fs_type);
+	err = init_ramfs_fs();
 	if (err)
-		bdi_destroy(&ramfs_backing_dev_info);
+		unregister_filesystem(&rootfs_fs_type);
 
 	return err;
 }
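
Note (not part of the patch): the ordering and unwinding described above can
be modelled in userspace. This is a rough sketch, not the kernel code; every
name here (model_*, fail_at, the STEP_* values and the state flags) is
hypothetical and only mirrors the structure of the two init functions.

```c
#include <errno.h>

enum { STEP_NONE, STEP_ROOTFS, STEP_BDI, STEP_RAMFS };

static int fail_at;		/* knob: which step should fail */
static int bdi_up, ramfs_reg, rootfs_reg;
static int ramfs_once;

/* Models init_ramfs_fs(): guarded by "once", undoes the bdi on failure. */
static int model_init_ramfs_fs(void)
{
	if (ramfs_once)
		return 0;
	ramfs_once++;

	if (fail_at == STEP_BDI)
		return -ENOMEM;
	bdi_up = 1;

	if (fail_at == STEP_RAMFS) {
		bdi_up = 0;	/* undo, like bdi_destroy() */
		return -EBUSY;
	}
	ramfs_reg = 1;

	return 0;
}

/* Models init_rootfs(): registers rootfs first, then calls the fs init. */
static int model_init_rootfs(void)
{
	int err;

	if (fail_at == STEP_ROOTFS)
		return -EBUSY;
	rootfs_reg = 1;

	err = model_init_ramfs_fs();
	if (err)
		rootfs_reg = 0;	/* undo, like unregister_filesystem() */

	return err;
}
```

With fail_at set to STEP_RAMFS the model returns an error and leaves nothing
registered; with STEP_NONE everything comes up, and a second call to
model_init_ramfs_fs() is a no-op because of the guard.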
From: Rob Landley 
Subject: [PATCH 3/5] initmpfs: Move rootfs code from fs/ramfs/ to init/
To: linux-kernel@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Cc: Jeff Layton 
Cc: Jens Axboe 
Cc: Stephen Warren 
Cc: Rusty Russell 
Cc: Jim Cromie 
Cc: Sam Ravnborg 
Cc: Greg Kroah-Hartman 
Cc: Andrew Morton 
Cc: "Eric W. Biederman" 
Cc: Alexander Viro 

From: Rob Landley 

When the rootfs code was a wrapper around ramfs, having them in the same
file made sense. Now that it can wrap another filesystem type, move it
in with the init code instead.

This also allows a subsequent patch to access rootfstype= command line arg.

Signed-off-by: Rob Landley 
---

 fs/namespace.c|2 

[PATCH v3] btusb: fix wrong use of PTR_ERR()

2013-07-09 Thread Adam Lee
PTR_ERR() returns a signed long. For a pointer that satisfies IS_ERR(), that
value is a negative number in the range [-MAX_ERRNO, 0).

The callers here expect negative numbers as error codes and check them with
"if (ret < 0)", but -PTR_ERR() is actually positive, so the check never
fires. The wrong use here leads to the failures below, and even to a panic.

[   12.958920] Bluetooth: hci0 command 0xfc8e tx timeout
[   14.961765] Bluetooth: hci0 command 0xfc8e tx timeout
[   16.964688] Bluetooth: hci0 command 0xfc8e tx timeout
[   20.954501] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-110)
[   22.957358] Bluetooth: hci0 command 0xfc8e tx timeout
[   30.948922] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-110)
[   32.951780] Bluetooth: hci0 command 0xfc8e tx timeout
[   40.943359] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-110)
[   42.946219] Bluetooth: hci0 command 0xfc8e tx timeout
[   50.937812] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-110)
[   52.940670] Bluetooth: hci0 command 0xfc8e tx timeout
[   60.932236] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-110)
[   62.935092] Bluetooth: hci0 command 0xfc8e tx timeout
[   70.926688] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-110)
[   72.929545] Bluetooth: hci0 command 0xfc8e tx timeout
[   80.92] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-110)
[   82.923969] Bluetooth: hci0 command 0xfc2f tx timeout
[   90.915542] Bluetooth: hci0 sending Intel patch command (0xfc2f) failed (-110)
[   92.918406] Bluetooth: hci0 command 0xfc11 tx timeout
[  100.909955] Bluetooth: hci0 sending Intel patch command (0xfc11) failed (-110)
[  102.912858] Bluetooth: hci0 command 0xfc60 tx timeout
[  110.904394] Bluetooth: hci0 sending Intel patch command (0xfc60) failed (-110)
[  112.907293] Bluetooth: hci0 command 0xfc11 tx timeout
[  120.898831] Bluetooth: hci0 exiting Intel manufacturer mode failed (-110)
[  120.904757] bluetoothd[1030]: segfault at 4 ip 7f8b2eb55236 sp 7fff53ff6920 error 4 in bluetoothd[7f8b2eaff000+cb000]
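
Note (not part of the patch): the sign problem can be reproduced in
userspace. The helpers below are simplified re-creations of ERR_PTR(),
PTR_ERR() and IS_ERR() written for this illustration, not the kernel
headers; setup_old() and setup_new() are hypothetical names that model the
return statement before and after the fix.

```c
#include <errno.h>

#define MAX_ERRNO	4095

/* Userspace re-creations of the kernel helpers, for illustration only. */
static void *ERR_PTR(long error)
{
	return (void *)error;
}

static long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Before the fix: the extra minus turns the errno into a positive value. */
static int setup_old(void *skb)
{
	if (IS_ERR(skb))
		return -PTR_ERR(skb);
	return 0;
}

/* After the fix: the negative errno is passed through unchanged. */
static int setup_new(void *skb)
{
	if (IS_ERR(skb))
		return PTR_ERR(skb);
	return 0;
}
```

For skb = ERR_PTR(-ETIMEDOUT), setup_old() returns a positive number, so a
caller testing "ret < 0" treats the failure as success, while setup_new()
returns -ETIMEDOUT and the test catches it.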

Signed-off-by: Adam Lee 
---
 drivers/bluetooth/btusb.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index 7a7e5f8..23df968 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -1092,7 +1092,7 @@ static int btusb_setup_intel_patching(struct hci_dev *hdev,
if (IS_ERR(skb)) {
BT_ERR("%s sending Intel patch command (0x%4.4x) failed (%ld)",
   hdev->name, cmd->opcode, PTR_ERR(skb));
-   return -PTR_ERR(skb);
+   return PTR_ERR(skb);
}
 
/* It ensures that the returned event matches the event data read from
@@ -1144,7 +1144,7 @@ static int btusb_setup_intel(struct hci_dev *hdev)
if (IS_ERR(skb)) {
BT_ERR("%s sending initial HCI reset command failed (%ld)",
   hdev->name, PTR_ERR(skb));
-   return -PTR_ERR(skb);
+   return PTR_ERR(skb);
}
kfree_skb(skb);
 
@@ -1158,7 +1158,7 @@ static int btusb_setup_intel(struct hci_dev *hdev)
if (IS_ERR(skb)) {
BT_ERR("%s reading Intel fw version command failed (%ld)",
   hdev->name, PTR_ERR(skb));
-   return -PTR_ERR(skb);
+   return PTR_ERR(skb);
}
 
if (skb->len != sizeof(*ver)) {
@@ -1216,7 +1216,7 @@ static int btusb_setup_intel(struct hci_dev *hdev)
BT_ERR("%s entering Intel manufacturer mode failed (%ld)",
   hdev->name, PTR_ERR(skb));
release_firmware(fw);
-   return -PTR_ERR(skb);
+   return PTR_ERR(skb);
}
 
if (skb->data[0]) {
@@ -1273,7 +1273,7 @@ static int btusb_setup_intel(struct hci_dev *hdev)
if (IS_ERR(skb)) {
BT_ERR("%s exiting Intel manufacturer mode failed (%ld)",
   hdev->name, PTR_ERR(skb));
-   return -PTR_ERR(skb);
+   return PTR_ERR(skb);
}
kfree_skb(skb);
 
@@ -1289,7 +1289,7 @@ exit_mfg_disable:
if (IS_ERR(skb)) {
BT_ERR("%s exiting Intel manufacturer mode failed (%ld)",
   hdev->name, PTR_ERR(skb));
-   return -PTR_ERR(skb);
+   return PTR_ERR(skb);
}
kfree_skb(skb);
 
@@ -1307,7 +1307,7 @@ exit_mfg_deactivate:
if (IS_ERR(skb)) {
BT_ERR("%s exiting Intel manufacturer mode failed (%ld)",
   hdev->name, PTR_ERR(skb));
-   return -PTR_ERR(skb);
+   return PTR_ERR(skb);
}
kfree_skb(skb);
 
-- 
1.8.3.2

Re: [REGRESSION] x86 vmalloc issue from recent 3.10.0+ commit

2013-07-09 Thread Dave Jones
On Tue, Jul 09, 2013 at 09:51:32PM -0400, Michael L. Semon wrote:

 > kernel: [ 2580.395592] vmap allocation for size 20480 failed: use vmalloc=<size> to increase size.
 > kernel: [ 2580.395761] vmalloc: allocation failure: 16384 bytes

I was seeing a lot of these recently too.
(Though I also saw memory corruption afterwards possibly caused by
 a broken fallback path somewhere when that vmalloc fails)

http://comments.gmane.org/gmane.linux.kernel.mm/102895

Dave



[PATCH] kexec: return error of machine_kexec() fails

2013-07-09 Thread Simon Horman
From: Stephen Warren 

Prior to commit 3ab8352 "kexec jump", if machine_kexec() returned,
sys_reboot() would return -EINVAL. This patch restores this behaviour
for the non-KEXEC_JUMP case, where machine_kexec() is not expected to
return.

This situation can occur on ARM, where kexec requires disabling all but
one CPU using CPU hotplug. However, if hotplug isn't supported by the
particular HW the kernel is running on, then kexec cannot succeed.
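
Note (not part of the patch): the control flow being restored can be
sketched in userspace as follows. This is a simplified model under stated
assumptions: MODEL_KEXEC_JUMP is a hypothetical macro standing in for
CONFIG_KEXEC_JUMP, and model_machine_kexec() stands in for a machine_kexec()
that comes back to its caller instead of jumping to the new kernel.

```c
#include <errno.h>

/* Define MODEL_KEXEC_JUMP to model a build with CONFIG_KEXEC_JUMP=y. */

/* Stand-in for machine_kexec() on hardware where it cannot succeed. */
static void model_machine_kexec(void)
{
	/* nothing happens: control simply returns to the caller */
}

static int model_kernel_kexec(void)
{
	int error = 0;

	model_machine_kexec();	/* returning here means no kexec happened */

#ifdef MODEL_KEXEC_JUMP
	/* jump case: the resume path runs here and error stays 0 */
#else
	error = -EINVAL;	/* the two lines added by the patch */
#endif

	return error;
}
```

Without MODEL_KEXEC_JUMP defined, a returning model_machine_kexec() makes
model_kernel_kexec() report -EINVAL instead of 0.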

Signed-off-by: Stephen Warren 
Acked-by: Will Deacon 
Acked-by: Zhang Yanfei 
Acked-by: Simon Horman 
---
 kernel/kexec.c | 2 ++
 1 file changed, 2 insertions(+)

Andrew, could you consider picking up this patch?

diff --git a/kernel/kexec.c b/kernel/kexec.c
index 59f7b55..bde1190 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1702,6 +1702,8 @@ int kernel_kexec(void)
pm_restore_console();
unlock_system_sleep();
}
+#else
+   error = -EINVAL;
 #endif
 
  Unlock:
-- 
1.8.2.1



Re: [GIT] Networking

2013-07-09 Thread Rob Landley

On 07/09/2013 12:32:56 PM, Linus Torvalds wrote:

On Mon, Jul 8, 2013 at 7:29 PM, Rob Landley  wrote:
>
> Um, does that mean I should have cc'd you on the initmpfs patch series
> back before the merge window opened?

So I personally don't tend to care about that kind of patch-series -
it doesn't really add new features to basic code unlike the polling
changes.


I tried to be as non-intrusive as possible while mucking about with  
early boot code on all targets. :)



But I suspect for something like that series, the people to
cc are Al Viro and Hugh Dickins because they maintain vfs and tmpfs
respectively. That said, I think neither really cares deeply, and
this looks like an "Andrew" patch-series if only because it's so random
;)


All three were cc'd by get_maintainer.pl, but none of 'em seem to have  
noticed.


The embedded community continues to have strange needs, but they make  
sense to us. (And oddly enough to the supercomputer folks, who are  
basically "embedded with money".)


Thanks,

Rob


Re: [PATCH v3 1/2] sched: smart wake-affine foundation

2013-07-09 Thread Sam Ben

On 07/08/2013 10:36 AM, Michael Wang wrote:

Hi, Sam

On 07/07/2013 09:31 AM, Sam Ben wrote:

On 07/04/2013 12:55 PM, Michael Wang wrote:

wake-affine stuff is always trying to pull wakee close to waker; in theory,
this will bring benefit if waker's cpu cached hot data for wakee, or in the
extreme ping-pong case.

What's the meaning of ping-pong case?

PeterZ explained it well in here:

https://lkml.org/lkml/2013/3/7/332

And you could try to compare:
taskset 1 perf bench sched pipe
with
perf bench sched pipe


Why is sched pipe special?



to confirm it ;-)

Regards,
Michael Wang


Testing shows it can benefit hackbench by up to 15%.

However, the whole thing is somewhat blind and time-consuming, so some
workloads suffer.

Testing shows it can hurt pgbench by up to 50%.

Thus, wake-affine stuff should be smarter, and realise when to stop
its thankless effort.

This patch introduces 'nr_wakee_switch', which is incremented each
time the task switches its wakee.

So a high 'nr_wakee_switch' means the task has more than one wakee, and
the bigger the number, the higher the wakeup frequency.
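
Note (not part of the quoted patch): the counting rule can be modelled in
userspace. This sketch follows the record_wakee() hunk of the patch but
passes the time in explicitly instead of reading jiffies; struct model_task,
MODEL_HZ and the model_* helpers are hypothetical names used only here.

```c
#define MODEL_HZ	1000UL	/* assumed tick rate, for the model only */

struct model_task {
	const struct model_task *last_wakee;
	unsigned long nr_wakee_switch;
	unsigned long last_switch_decay;
};

/* Follows record_wakee() from the patch, with 'now' replacing jiffies. */
static void model_record_wakee(struct model_task *waker,
			       const struct model_task *wakee,
			       unsigned long now)
{
	/* Rough decay: wipe the counter once more than a second has passed. */
	if (now > waker->last_switch_decay + MODEL_HZ) {
		waker->nr_wakee_switch = 0;
		waker->last_switch_decay = now;
	}

	/* Count only when the waker switches to a different wakee. */
	if (waker->last_wakee != wakee) {
		waker->last_wakee = wakee;
		waker->nr_wakee_switch++;
	}
}

/* Counter value after 'wakeups' wakeups, all within one decay period. */
static unsigned long model_run(unsigned int wakeups, int alternate)
{
	struct model_task waker = { 0 }, a = { 0 }, b = { 0 };
	unsigned int i;

	for (i = 0; i < wakeups; i++)
		model_record_wakee(&waker, (alternate && (i & 1)) ? &b : &a, 1);

	return waker.nr_wakee_switch;
}
```

A waker that keeps waking the same task ends up with a counter of 1, while
one that alternates between two tasks gets one increment per wakeup.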

Now when making the decision on whether to pull or not, pay attention to
the wakee with a high 'nr_wakee_switch': pulling such a task may benefit
the wakee, but also implies that the waker will face cruel competition
later. It could be very cruel or very fast depending on the story behind
'nr_wakee_switch'; either way, the waker suffers.

Furthermore, if the waker also has a high 'nr_wakee_switch', implying that
multiple tasks rely on it, then the waker's higher latency will damage all
of them, so pulling the wakee seems to be a bad deal.

Thus, as 'waker->nr_wakee_switch / wakee->nr_wakee_switch' becomes higher
and higher, the deal seems to be worse and worse.

The patch therefore helps wake-affine stuff to stop its work when:

 wakee->nr_wakee_switch > factor &&
 waker->nr_wakee_switch > (factor * wakee->nr_wakee_switch)

The factor here is the node-size of current-cpu, so a bigger node will lead
to more pulls since the trial becomes more severe.
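
Note (not part of the quoted patch): the stop condition quoted above can be
written as a small predicate. This is a sketch of the stated condition only;
model_skip_pull() is a hypothetical name, and the factor of 12 used in the
usage note is an arbitrary example value.

```c
/*
 * Returns 1 when the wake-affine pull should be skipped, following the
 * condition from the description:
 *
 *   wakee->nr_wakee_switch > factor &&
 *   waker->nr_wakee_switch > (factor * wakee->nr_wakee_switch)
 */
static int model_skip_pull(unsigned long waker_switch,
			   unsigned long wakee_switch,
			   unsigned long factor)
{
	return wakee_switch > factor &&
	       waker_switch > factor * wakee_switch;
}
```

With factor 12, a wakee at 13 switches and a waker at 200 gives 1 (skip the
pull), a waker at 100 gives 0, and a wakee at 12 or below always gives 0
regardless of the waker.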

After applying the patch, pgbench shows up to 40% improvement.

Test:
 Tested with 12 cpu X86 server and tip 3.10.0-rc7.

 pgbench                base        smart

 | db_size | clients |  tps  |   |  tps  |
 +---------+---------+-------+   +-------+
 | 22 MB   |       1 | 10598 |   | 10796 |
 | 22 MB   |       2 | 21257 |   | 21336 |
 | 22 MB   |       4 | 41386 |   | 41622 |
 | 22 MB   |       8 | 51253 |   | 57932 |
 | 22 MB   |      12 | 48570 |   | 54000 |
 | 22 MB   |      16 | 46748 |   | 55982 | +19.75%
 | 22 MB   |      24 | 44346 |   | 55847 | +25.93%
 | 22 MB   |      32 | 43460 |   | 54614 | +25.66%
 | 7484 MB |       1 |  8951 |   |  9193 |
 | 7484 MB |       2 | 19233 |   | 19240 |
 | 7484 MB |       4 | 37239 |   | 37302 |
 | 7484 MB |       8 | 46087 |   | 50018 |
 | 7484 MB |      12 | 42054 |   | 48763 |
 | 7484 MB |      16 | 40765 |   | 51633 | +26.66%
 | 7484 MB |      24 | 37651 |   | 52377 | +39.11%
 | 7484 MB |      32 | 37056 |   | 51108 | +37.92%
 | 15 GB   |       1 |  8845 |   |  9104 |
 | 15 GB   |       2 | 19094 |   | 19162 |
 | 15 GB   |       4 | 36979 |   | 36983 |
 | 15 GB   |       8 | 46087 |   | 49977 |
 | 15 GB   |      12 | 41901 |   | 48591 |
 | 15 GB   |      16 | 40147 |   | 50651 | +26.16%
 | 15 GB   |      24 | 37250 |   | 52365 | +40.58%
 | 15 GB   |      32 | 36470 |   | 50015 | +37.14%

CC: Ingo Molnar 
CC: Peter Zijlstra 
CC: Mike Galbraith 
Signed-off-by: Michael Wang 
---
   include/linux/sched.h |3 +++
   kernel/sched/fair.c   |   47
+++
   2 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 178a8d9..1c996c7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1041,6 +1041,9 @@ struct task_struct {
   #ifdef CONFIG_SMP
   struct llist_node wake_entry;
   int on_cpu;
+struct task_struct *last_wakee;
+unsigned long nr_wakee_switch;
+unsigned long last_switch_decay;
   #endif
   int on_rq;
   diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c61a614..a4ddbf5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2971,6 +2971,23 @@ static unsigned long cpu_avg_load_per_task(int
cpu)
   return 0;
   }
   +static void record_wakee(struct task_struct *p)
+{
+/*
+ * Rough decay(wiping) for cost saving, don't worry
+ * about the boundary, really active task won't care
+ * the loose.
+ */
+if (jiffies > current->last_switch_decay + HZ) {
+current->nr_wakee_switch = 0;
+current->last_switch_decay = jiffies;
+}
+
+if (current->last_wakee != p) {
+current->last_wakee = p;
+current->nr_wakee_switch++;
+}
+}
 static void task_waking_fair(struct task_struct *p)
   {
@@ -2991,6 +3008,7 @@ 

[REGRESSION] x86 vmalloc issue from recent 3.10.0+ commit

2013-07-09 Thread Michael L. Semon
Hi!  I'm doing volunteer testing of xfstests and was sent here to 
ask about this issue.  I apologize in advance if the problem has 
already been solved...

I've been testing XFS from various git kernels on 32-bit Pentium 4 
and Pentium III PCs.  There was an issue with xfstests test xfs/167, 
which is one of many tests that run a lot of instances of the 
program "fsstress" to try to break something.  Usually, the test 
passes, but this time, the Pentium 4 got stuck in a loop, and the 
Pentium III had processes killed but didn't seem to have resources 
released back to the system.

The solution was to bisect the kernel to find the problem commit, 
get a patch out of it, then use the patch to reverse the commit.

The kernel git used was pulled on July 7.  SGI's xfs-oss/master was 
updated as well, and these additional XFS patches were applied:

xfs: clean up unused codes at xfs_bulkstat()
xfs: dquot log reservations are too small
xfs: remove local fork format handling from xfs_bmapi_write()
xfs: update mount options documentation

Hopefully, such merging and patching won't be needed to reproduce the 
problem on your end.  The problem is 100% reproducible here.

The rest of this letter is supplementary data from the Pentium 4 PC.

Thanks!

Michael

The partition used in the test was this (from `gdisk /dev/sdb`):

   57018905690046463   9.5 GiB 8300  gScratchDev

The original/fixed test behaviors look like this to xfstests:

root@plbearer:/var/lib/xfstests# ./check xfs/167
FSTYP -- xfs (debug)
PLATFORM  -- Linux/i686 plbearer 3.10.0+
MKFS_OPTIONS  -- -f -bsize=4096 /dev/sdb5
MOUNT_OPTIONS -- /dev/sdb5 /mnt/xfstests-scratch

xfs/167 922s ... 891s
Ran: xfs/167
Passed all 1 tests

On a failing test, the hard drive light dies after a while, and it's 
impossible to switch framebuffer consoles (i915).  This is the 
beginning of the infinite loop started by hitting Alt-Shift-SysRq-e-
i-e-i-s-u-s, captured over netconsole (the first SysRq-s seems to be 
the trigger):

logger: run xfstest xfs/167
kernel: [ 2497.774818] XFS (sdb5): Version 5 superblock detected. This kernel has EXPERIMENTAL support enabled!
kernel: [ 2497.774818] Use of these features in this kernel is at your own risk!
kernel: [ 2497.862312] XFS (sdb5): Mounting Filesystem
kernel: [ 2580.395592] vmap allocation for size 20480 failed: use vmalloc=<size> to increase size.
kernel: [ 2580.395761] vmalloc: allocation failure: 16384 bytes
kernel: [ 2580.395769] fsstress: page allocation failure: order:0, mode:0x80d2
kernel: [ 2580.395776] CPU: 0 PID: 6262 Comm: fsstress Not tainted 3.10.0+ #1
kernel: [ 2580.395781] Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
kernel: [ 2580.395785]  0001 0001 c50b3bfc c14825b2 c50b3c24 c10a1c6c c15c1f70 ee319bb4
kernel: [ 2580.395802]   80d2 c50b3c38 c15c34a4 c50b3c14 fffa c50b3c54 c10c3243
kernel: [ 2580.395817]  80d2  c15c34a4 4000 c6bffb50 4000 c6bffb80 f06f
kernel: [ 2580.395832] Call Trace:
kernel: [ 2580.395847]  [] dump_stack+0x16/0x18
kernel: [ 2580.395859]  [] warn_alloc_failed+0xb4/0xe7
kernel: [ 2580.395868]  [] __vmalloc_node_range+0x16d/0x1cf
kernel: [ 2580.395875]  [] __vmalloc_node+0x48/0x4f
kernel: [ 2580.395884]  [] ? kmem_zalloc_greedy+0x21/0x2c
kernel: [ 2580.395890]  [] vzalloc+0x30/0x32
kernel: [ 2580.395897]  [] ? kmem_zalloc_greedy+0x21/0x2c
kernel: [ 2580.395904]  [] kmem_zalloc_greedy+0x21/0x2c
kernel: [ 2580.395913]  [] xfs_bulkstat+0x12a/0x94b
kernel: [ 2580.395921]  [] ? lock_release_non_nested+0xa0/0x2b7
kernel: [ 2580.395931]  [] ? might_fault+0x7c/0x9b
kernel: [ 2580.395938]  [] ? might_fault+0x49/0x9b
kernel: [ 2580.395945]  [] ? might_fault+0x93/0x9b
kernel: [ 2580.395954]  [] ? _copy_from_user+0x3f/0x57
kernel: [ 2580.395961]  [] xfs_ioc_bulkstat+0xba/0x15a
kernel: [ 2580.395968]  [] ? xfs_bulkstat_one_int+0x2ff/0x2ff
kernel: [ 2580.395975]  [] xfs_file_ioctl+0x6b9/0xa0d
kernel: [ 2580.395984]  [] ? dput+0x2d/0x263
kernel: [ 2580.395990]  [] ? dput+0x219/0x263
kernel: [ 2580.395999]  [] ? _raw_spin_unlock+0x22/0x30
kernel: [ 2580.396006]  [] ? dput+0x219/0x263
kernel: [ 2580.396013]  [] ? mntput+0x1d/0x28
kernel: [ 2580.396022]  [] ? terminate_walk+0x63/0x66
kernel: [ 2580.396030]  [] ? do_last+0x1a9/0xbfa
kernel: [ 2580.396036]  [] ? link_path_walk+0x54/0x6c2
kernel: [ 2580.396044]  [] ? path_openat+0xaf/0x515
kernel: [ 2580.396053]  [] ? __fd_install+0x1f/0x4a
kernel: [ 2580.396060]  [] ? xfs_ioc_getbmapx+0x9b/0x9b
kernel: [ 2580.396068]  [] do_vfs_ioctl+0x2f6/0x4cc
kernel: [ 2580.396076]  [] ? __fd_install+0x40/0x4a
kernel: [ 2580.396083]  [] ? _raw_spin_unlock+0x22/0x30
kernel: [ 2580.396090]  [] ? final_putname+0x1d/0x36
kernel: [ 2580.396097]  [] ? final_putname+0x1d/0x36
kernel: [ 2580.396104]  [] ? putname+0x23/0x2f
kernel: [ 2580.396112]  [] ? do_sys_open+0x17d/0x1d8
kernel: [ 2580.396120]  [] ? restore_all+0xf/0xf
kernel: [ 2580.396127]  [] 

Re: [PATCH] kernel/params.c: print failure information instead of 'KOBJ_ADD' to user space, when sysfs_create_file() fails.

2013-07-09 Thread Rusty Russell
Chen Gang  writes:
> When sysfs_create_file() fails, recommend to print the related failure
> information. And it is useless to still 'KOBJ_ADD' to user space.
>
> Signed-off-by: Chen Gang 

sysfs_create_file() should not fail during boot, should it?

Cheers,
Rusty.

> ---
>  kernel/params.c |8 +++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/params.c b/kernel/params.c
> index 440e65d..f5299c1 100644
> --- a/kernel/params.c
> +++ b/kernel/params.c
> @@ -845,7 +845,13 @@ static void __init version_sysfs_builtin(void)
>   mk = locate_module_kobject(vattr->module_name);
>   if (mk) {
>   err = sysfs_create_file(&mk->kobj, &vattr->mattr.attr);
> - kobject_uevent(&mk->kobj, KOBJ_ADD);
> + if (err)
> + printk(KERN_WARNING
> +"%s (%d): sysfs_create_file fail for %s, err: %d\n",
> +__FILE__, __LINE__,
> +vattr->module_name, err);
> + else
> + kobject_uevent(&mk->kobj, KOBJ_ADD);
>   kobject_put(&mk->kobj);
>   }
>   }
> -- 
> 1.7.7.6


Re: [PATCH] virtio-net: put virtio net header inline with data

2013-07-09 Thread Rusty Russell
"Michael S. Tsirkin"  writes:
> On Tue, Jul 09, 2013 at 11:46:23AM +0930, Rusty Russell wrote:
>> "Michael S. Tsirkin"  writes:
>> > For small packets we can simplify xmit processing
>> > by linearizing buffers with the header:
>> > most packets seem to have enough head room
>> > we can use for this purpose.
>> > Since existing hypervisors require that header
>> > is the first s/g element, we need a feature bit
>> > for this.
>> >
>> > Signed-off-by: Michael S. Tsirkin 
>> > ---
>> >
>> > Note: this needs to be applied on top of patch
>> > defining VIRTIO_F_ANY_LAYOUT - bit to be
>> > selected by Rusty.
>> >
>> > The following patch should work for any definition of
>> > VIRTIO_F_ANY_LAYOUT - I used bit 31 for testing.
>> > Rusty, could you please pick a valid bit for VIRTIO_F_ANY_LAYOUT
>> > and squeeze this patch into 3.11?
>> 
>> Sorry, it's too late.
>> 
>> First, I've been a bit lax in sending patches via DaveM, and this is
>> definitely his territory (ie. more net than virtio).
>
> Let's do this: I'll send a patch series, you ack and we see
> what happens?

If you convince DaveM, I won't object :)

>> Secondly, it needs baking and testing time.
>> 
>> Cheers,
>> Rusty.
>
> I did some testing on this.  But proper testing just isn't happening out
> of tree.

But it will get into linux-next for the *next* merge window.

Cheers,
Rusty.


Re: 3.9-rc6 ext4: free_rb_tree_fname oops

2013-07-09 Thread Zheng Liu
Hi Daniel,

On Mon, Jun 24, 2013 at 02:34:00PM +0800, Daniel J Blueman wrote:
> On 16 April 2013 15:37, Daniel J Blueman  wrote:
> > When using e4defrag on a ext4 filesystem created a month ago, I ran
> > into this fatal page fault [1]
> >  while running e4defrag on 3.9-rc6 (Ubuntu mainline).
> >
> > e2fsdump output is at http://quora.org/2012/e2fsdump.txt ; let me know
> > if you need any more info.
> 
> With 3.9.6 mainline, I got the exact same protection fault at
> free_rb_tree_fname() from ext4_htree_free_dir_info() [1]. This
> suggests use-after-free, as there's no pagetable mapping.
> 
> There is nothing special with my setups, so there is fair chance it's
> reproducible there with e4defrag on a few month old filesystem and
> recent kernels.

I have been trying to reproduce this bug recently, but unfortunately I
could not hit it.  I created/read/wrote/deleted some files on an SSD to
simulate a file system that has been used for a while.  Then I used
e4defrag to defrag this file system.  But I couldn't trigger the bug.  The
kernel version is the latest ext4/dev branch, and the e2fsprogs version is
1.42.7.  Do you have a method to easily reproduce this bug?

Thanks,
- Zheng


virtio indirect with lots of descriptors

2013-07-09 Thread Dave Airlie
Hi Rusty,

playing with my virtio gpu, I started hitting the qemu
error_report("Too many read descriptors in indirect table");

Now I'm not sure but this doesn't seem to be a virtio limit that the
guest catches from what I can see, since my host dies quite quickly,
when I'm doing transfers in/out of a 5MB object with an sg entry per
page.
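
Note (not part of the original mail): the descriptor count for such a
transfer is easy to work out. The sketch below assumes 4 KiB pages and
reads "5MB" as 5 MiB; the model_* names are hypothetical, and the limit
passed to model_fits() is a parameter because the actual qemu limit is not
stated in this thread.

```c
#define MODEL_PAGE_SIZE	4096UL	/* assumption: 4 KiB pages */

/* Number of page-sized sg entries needed to cover 'bytes'. */
static unsigned long model_sg_entries(unsigned long bytes)
{
	return (bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* 1 if a transfer of 'bytes' fits within 'max_desc' descriptors. */
static int model_fits(unsigned long bytes, unsigned long max_desc)
{
	return model_sg_entries(bytes) <= max_desc;
}
```

Under these assumptions a 5 MiB object needs 1280 entries, so any limit
below that (for example a hypothetical 1024) would be exceeded, and
splitting the transfer at a higher level is one way to stay under it.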

Just wondering if you can confirm if this is only a qemu limitation or
if I should just work around it at a bit of a higher level in my
driver/device?

Dave.


[Information] gcc '-O2' may skip some of uninitialized variables warnings (exist almost 10 years).

2013-07-09 Thread Chen Gang
Hello All:

gcc '-O2' may skip some uninitialized-variable warnings that point to
real bugs. This is a gcc issue that has existed for almost 10 years, and
it seems unlikely to be fixed any time soon.

Please see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18501; for a
kernel-specific case, see also
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57856

If we want to see the warnings (to find bugs), "EXTRA_CFLAGS=-W" alone
does not seem to be enough (the build still uses '-O2'); I recommend also
trying '-O0' (although it may also print many spurious warnings).
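
Note (not part of the original mail): a small function of the kind this
report is about is sketched below. It is a hypothetical example, not taken
from either bug report, and whether a given gcc version warns about it at a
given optimization level is something to check locally; no particular
compiler output is claimed here.

```c
/*
 * 'x' is only assigned when 'flag' is non-zero, so pick(0) reads an
 * uninitialized variable (undefined behaviour). Only pick(1) is safe
 * to call.
 */
int pick(int flag)
{
	int x;

	if (flag)
		x = 1;

	return x;
}
```

Building this file with different optimization levels and warning options
and comparing the diagnostics shows what the compiler in use actually
reports for the flag == 0 path.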


Thanks.
-- 
Chen Gang


Re: [PATCH] ocfs2/refcounttree: add the missing NULL check of the return value of find_or_create_page()

2013-07-09 Thread Gu Zheng
On 07/10/2013 06:11 AM, Joel Becker wrote:

> On Mon, Jul 08, 2013 at 03:52:53PM +0800, Gu Zheng wrote:
>> Add the missing NULL check of the return value of find_or_create_page() in
>> function ocfs2_duplicate_clusters_by_page().
>>
>> Signed-off-by: Gu Zheng 
>> ---
>>  fs/ocfs2/refcounttree.c |6 +-
>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c
>> index 998b17e..456d0e4 100644
>> --- a/fs/ocfs2/refcounttree.c
>> +++ b/fs/ocfs2/refcounttree.c
>> @@ -2965,7 +2965,11 @@ int ocfs2_duplicate_clusters_by_page(handle_t *handle,
>>  to = map_end & (PAGE_CACHE_SIZE - 1);
>>
>>  page = find_or_create_page(mapping, page_index, GFP_NOFS);
>> -
>> +if (!page) {
>> +ret = -ENOMEM;
>> +mlog_errno(ret);
>> +break;
>> +}
>>  /*
>>   * In case PAGE_CACHE_SIZE <= CLUSTER_SIZE, This page
>>   * can't be dirtied before we CoW it out.
> 
> Put a blank line between the closing brace and the comment.  Otherwise,

Got it.:)

> Acked-by: Joel Becker 

Thanks~

Regards,
Gu

> 
> Joel
>> -- 
>> 1.7.7
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> 




linux-next: manual merge of the ceph tree with Linus' tree

2013-07-09 Thread Stephen Rothwell
Hi Sage,

Today's linux-next merge of the ceph tree got conflicts in
drivers/block/rbd.c and net/ceph/osd_client.c because the ceph tree was
rebased before being sent to Linus and it looks like one patch
was dropped and several more added.

I just used the upstream version of the ceph tree for today - please
clean up.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [RFC PATCH 0/5] Support multiple pages allocation

2013-07-09 Thread Zhang Yanfei
On 2013/7/10 8:31, Joonsoo Kim wrote:
> On Thu, Jul 04, 2013 at 12:00:44PM +0200, Michal Hocko wrote:
>> On Thu 04-07-13 13:24:50, Joonsoo Kim wrote:
>>> On Thu, Jul 04, 2013 at 12:01:43AM +0800, Zhang Yanfei wrote:
>>>> On 07/03/2013 11:51 PM, Zhang Yanfei wrote:
>>>>> On 07/03/2013 11:28 PM, Michal Hocko wrote:
>>>>>> On Wed 03-07-13 17:34:15, Joonsoo Kim wrote:
>>>>>> [...]
>>>>>>> For one page allocation at once, this patchset makes allocator slower
>>>>>>> than before (-5%).
>>>>>>
>>>>>> Slowing down the most used path is a no-go. Where does this slow down
>>>>>> come from?
>>>>>
>>>>> I guess, it might be: for one page allocation at once, comparing to the
>>>>> original code, this patch adds two parameters nr_pages and pages and
>>>>> will do extra checks for the parameter nr_pages in the allocation path.
>>>>>
>>>>
>>>> If so, adding a separate path for the multiple allocations seems better.
>>>
>>> Hello, all.
>>>
>>> I modify the code for optimizing one page allocation via likely macro.
>>> I attach a new one at the end of this mail.
>>>
>>> In this case, performance degradation for one page allocation at once
>>> is -2.5%.
>>> I guess, remained overhead comes from two added parameters.
>>> Is it unreasonable cost to support this new feature?
>>
>> Which benchmark you are using for this testing?
> 
> I use my own module which do allocation repeatedly.
> 
>>
>>> I think that readahead path is one of the most used paths, so this
>>> penalty looks endurable. And after supporting this feature, we can find
>>> more use cases.
>>
>> What about page faults? I would oppose that page faults are _much_ more
>> frequent than read ahead so you really cannot slow them down.
> 
> You mean page faults for anon?
> Yes. I also think that it is much more frequent than read ahead.
> Before futher discussion, I will try to add a separate path
> for the multiple allocations.

Some days ago, I was thinking that this multiple allocation behaviour
may be useful for vmalloc allocations. So I think it is worth trying.

> 
> Thanks.
> 
>>
>> [...]
>> -- 
>> Michal Hocko
>> SUSE Labs
>>
> 


-- 
Thanks.
Zhang Yanfei


Re: [PATCH REPOST] kexec: return error of machine_kexec() fails

2013-07-09 Thread Zhang Yanfei
On 2013/7/10 0:16, Stephen Warren wrote:
> From: Stephen Warren 
> 
> Prior to commit 3ab8352 "kexec jump", if machine_kexec() returned,
> sys_reboot() would return -EINVAL. This patch restores this behaviour
> for the non-KEXEC_JUMP case, where machine_kexec() is not expected to
> return.
> 
> This situation can occur on ARM, where kexec requires disabling all but
> one CPU using CPU hotplug. However, if hotplug isn't supported by the
> particular HW the kernel is running on, then kexec cannot succeed.

I don't have an ARM machine, but from reading the code I believe you are
right. So

Acked-by: Zhang Yanfei 

> 
> Signed-off-by: Stephen Warren 
> Acked-by: Will Deacon 
> ---
>  kernel/kexec.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 59f7b55..bde1190 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1702,6 +1702,8 @@ int kernel_kexec(void)
>   pm_restore_console();
>   unlock_system_sleep();
>   }
> +#else
> + error = -EINVAL;
>  #endif
>  
>   Unlock:
> 


-- 
Thanks.
Zhang Yanfei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING: at drivers/iommu/dmar.c:484 warn_invalid_dmar with Intel Motherboard

2013-07-09 Thread David Woodhouse
On Tue, 2013-07-09 at 17:18 -0700, Guenter Roeck wrote:
> 
> I meant warning as in pr_warn or dev_warn, not WARNING as in traceback.
> Keep in mind that a casual user doesn't expect to see a traceback and will
> tend to get alarmed. Several bugs have been filed against this "issue" in
> various distributions, which is not surprising given the alarmist message.
> What is the point of that ?

It is warning you that your hardware is broken. Take it back to the
place from which you purchased it, and ask for your money back if it
isn't fixed.

(Slightly) more seriously, this level of warning *does* get things
fixed, and when kerneloops was running it made it very easy to track
this kind of issue and apply pressure where it was needed to improve
quality.

Any user who has taken the trouble to file bugs has *also* taken it up
with their firmware vendor, I hope?

-- 
dwmw2





[PATCH] usb: phy: samsung-usb2: Toggle HSIC GPIO from device tree

2013-07-09 Thread Julius Werner
This patch adds support for a new 'samsung,hsic-reset-gpio' in the
device tree, which will be interpreted as an active-low reset pin during
PHY initialization when it exists. Useful for integrated HSIC devices
like an SMSC 3503 hub. It is necessary to add this directly to the PHY
initialization to get the timing right, since resetting a HSIC device
after it has already been enumerated can confuse the USB stack.

Also fixes PHY semaphore code to make sure we always go through the
setup at least once, even if it was already turned on (e.g. by
firmware), and changes a spinlock to a mutex to allow sleeping in the
critical section.

Change-Id: Ieecac52c27daa7a17a7ed3b2863ddba3aeb8d16f
Signed-off-by: Julius Werner 
---
 .../devicetree/bindings/usb/samsung-usbphy.txt | 10 ++
 drivers/usb/phy/phy-samsung-usb.c  | 17 ++
 drivers/usb/phy/phy-samsung-usb.h  |  7 ++--
 drivers/usb/phy/phy-samsung-usb2.c | 38 ++
 drivers/usb/phy/phy-samsung-usb3.c | 12 +++
 5 files changed, 55 insertions(+), 29 deletions(-)

diff --git a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
index 33fd354..82e2e16 100644
--- a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
+++ b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
@@ -31,6 +31,12 @@ Optional properties:
 - ranges: allows valid translation between child's address space and parent's
  address space.
 
+- samsung,hsic-reset-gpio: an active low GPIO pin that resets a device
+   connected to the HSIC port. Useful for things like
+   an on-board SMSC3503 hub.
+- pinctrl-0: Pin control group containing the HSIC reset GPIO pin.
+- pinctrl-names: Should contain only one value - "default".
+
 - The child node 'usbphy-sys' to the node 'usbphy' is for the system controller
   interface for usb-phy. It should provide the following information required by
   usb-phy controller to control phy.
@@ -56,6 +62,10 @@ Example:
clocks = < 2>, < 305>;
clock-names = "xusbxti", "otg";
 
+   samsung,hsic-reset-gpio = < 4 1>;
+   pinctrl-names = "default";
+   pinctrl-0 = <_reset>;
+
usbphy-sys {
/* USB device and host PHY_CONTROL registers */
reg = <0x10020704 0x8>;
diff --git a/drivers/usb/phy/phy-samsung-usb.c b/drivers/usb/phy/phy-samsung-usb.c
index ac025ca..23f1d70 100644
--- a/drivers/usb/phy/phy-samsung-usb.c
+++ b/drivers/usb/phy/phy-samsung-usb.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "phy-samsung-usb.h"
@@ -58,6 +59,22 @@ int samsung_usbphy_parse_dt(struct samsung_usbphy *sphy)
if (sphy->sysreg == NULL)
dev_warn(sphy->dev, "Can't get usb-phy sysreg cfg register\n");
 
+   /*
+* Some boards have a separate active-low reset GPIO for their HSIC USB
+* devices. If they don't, this will just stay at an invalid value and
+* the init code will ignore it.
+*/
+   sphy->hsic_reset_gpio = of_get_named_gpio(sphy->dev->of_node,
+   "samsung,hsic-reset-gpio", 0);
+   if (gpio_is_valid(sphy->hsic_reset_gpio)) {
+   if (devm_gpio_request_one(sphy->dev, sphy->hsic_reset_gpio,
+   GPIOF_OUT_INIT_LOW, "samsung_hsic_reset")) {
+   dev_err(sphy->dev, "can't request hsic reset gpio %d\n",
+   sphy->hsic_reset_gpio);
+   sphy->hsic_reset_gpio = -EINVAL;
+   }
+   }
+
of_node_put(usbphy_sys);
 
return 0;
diff --git a/drivers/usb/phy/phy-samsung-usb.h b/drivers/usb/phy/phy-samsung-usb.h
index 68771bf..0703878 100644
--- a/drivers/usb/phy/phy-samsung-usb.h
+++ b/drivers/usb/phy/phy-samsung-usb.h
@@ -16,6 +16,7 @@
  * GNU General Public License for more details.
  */
 
+#include 
 #include 
 
 /* Register definitions */
@@ -301,7 +302,8 @@ struct samsung_usbphy_drvdata {
  * @phy_type: Samsung SoCs specific phy types: #HOST
  * #DEVICE
  * @phy_usage: usage count for phy
- * @lock: lock for phy operations
+ * @mutex: mutex for phy operations (usb2phy must sleep, so no spinlock!)
+ * @hsic_reset_gpio: Active low GPIO that resets connected HSIC device
  */
 struct samsung_usbphy {
struct usb_phy  phy;
@@ -315,7 +317,8 @@ struct samsung_usbphy {
const struct samsung_usbphy_drvdata *drv_data;
enum samsung_usb_phy_type phy_type;
atomic_t phy_usage;
-   spinlock_t  lock;
+   struct mutex mutex;
+   int hsic_reset_gpio;
 };
 
 #define phy_to_sphy(x) container_of((x), struct samsung_usbphy, phy)
diff --git a/drivers/usb/phy/phy-samsung-usb2.c 

Re: [RFC PATCH 0/5] Support multiple pages allocation

2013-07-09 Thread Joonsoo Kim
On Thu, Jul 04, 2013 at 12:00:44PM +0200, Michal Hocko wrote:
> On Thu 04-07-13 13:24:50, Joonsoo Kim wrote:
> > On Thu, Jul 04, 2013 at 12:01:43AM +0800, Zhang Yanfei wrote:
> > > On 07/03/2013 11:51 PM, Zhang Yanfei wrote:
> > > > On 07/03/2013 11:28 PM, Michal Hocko wrote:
> > > >> On Wed 03-07-13 17:34:15, Joonsoo Kim wrote:
> > > >> [...]
> > > > >>> For one page allocation at once, this patchset makes allocator slower than
> > > > >>> before (-5%).
> > > >>
> > > >> Slowing down the most used path is a no-go. Where does this slow down
> > > >> come from?
> > > > 
> > > > I guess, it might be: for one page allocation at once, comparing to the original
> > > > code, this patch adds two parameters nr_pages and pages and will do extra checks
> > > > for the parameter nr_pages in the allocation path.
> > > > 
> > > 
> > > If so, adding a separate path for the multiple allocations seems better.
> > 
> > Hello, all.
> > 
> > I modify the code for optimizing one page allocation via likely macro.
> > I attach a new one at the end of this mail.
> > 
> > In this case, performance degradation for one page allocation at once is -2.5%.
> > I guess, remained overhead comes from two added parameters.
> > Is it unreasonable cost to support this new feature?
> 
> Which benchmark you are using for this testing?

I use my own module, which does allocations repeatedly.

> 
> > I think that readahead path is one of the most used path, so this penalty looks
> > endurable. And after supporting this feature, we can find more use cases.
> 
> What about page faults? I would oppose that page faults are _much_ more
> frequent than read ahead so you really cannot slow them down.

You mean page faults for anon?
Yes. I also think that it is much more frequent than read ahead.
Before further discussion, I will try to add a separate path
for the multiple allocations.

Thanks.
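
For illustration only, here is a minimal user-space sketch of the two options
discussed in this thread: a dispatcher that marks the single-page case with
likely(), and a dedicated multi-page path that single-page callers never
enter. It assumes GCC or Clang for __builtin_expect. The names alloc_one(),
alloc_many(), alloc_pages_sketch() and SKETCH_PAGE_SIZE are hypothetical and
are not taken from the patchset; malloc() merely stands in for the page
allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same shape as the kernel macros; relies on a GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

#define SKETCH_PAGE_SIZE 4096u	/* assumption: illustrative page size */

/* Hypothetical fast path: one page, no nr_pages/pages parameters to check. */
static void *alloc_one(void)
{
	return malloc(SKETCH_PAGE_SIZE);
}

/*
 * Hypothetical separate path for multiple pages: fills pages[0..nr_pages-1]
 * and returns how many pages were actually obtained.
 */
static size_t alloc_many(size_t nr_pages, void **pages)
{
	size_t i;

	for (i = 0; i < nr_pages; i++) {
		pages[i] = malloc(SKETCH_PAGE_SIZE);
		if (unlikely(!pages[i]))
			break;
	}
	return i;
}

/*
 * Combined entry point: the single-page case is hinted as the likely one,
 * but it still pays for the extra parameters and the comparison.
 */
static size_t alloc_pages_sketch(size_t nr_pages, void **pages)
{
	assert(pages != NULL);

	if (likely(nr_pages == 1)) {
		pages[0] = alloc_one();
		return pages[0] ? 1 : 0;
	}
	return alloc_many(nr_pages, pages);
}

/* Release whatever one of the allocators above handed out. */
static void free_pages_sketch(size_t nr_pages, void **pages)
{
	size_t i;

	for (i = 0; i < nr_pages; i++)
		free(pages[i]);
}
```

With a fully separate path, single-page callers would call alloc_one()
directly and only batch users such as readahead would call alloc_many(), so
the hot path carries no extra arguments at all.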

> 
> [...]
> -- 
> Michal Hocko
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org


Re: WARNING: at drivers/iommu/dmar.c:484 warn_invalid_dmar with Intel Motherboard

2013-07-09 Thread Guenter Roeck
On Tue, Jul 09, 2013 at 05:05:11PM -0700, Chris Wright wrote:
> * Guenter Roeck (li...@roeck-us.net) wrote:
> > On Tue, Jul 09, 2013 at 04:22:52PM -0700, Chris Wright wrote:
> > > * Guenter Roeck (li...@roeck-us.net) wrote:
> > > > On Tue, Jul 09, 2013 at 03:05:39PM -0600, Bjorn Helgaas wrote:
> > > > > [+cc Joerg, David, iommu list]
> > > > > 
> > > > > On Tue, Jul 9, 2013 at 2:24 PM, Guenter Roeck wrote:
> > > > > > I started seeing this problem after updating the BIOS trying fix another issue,
> > > > > > though I may have missed it earlier.
> > > > > >
> > > > > > I understand this is a BIOS bug. Would be great if someone can pass this on
> > > > > > to Intel BIOS engineers.
> > > > > 
> > > > > Maybe.  It'd be nice if Linux handled it better, though.
> > > > > 
> > > > If anyone has an idea how to do that, I'll be happy to write a patch.
> > > 
> > > I'm not sure there's much you can do.  The BIOS is saying there's a DMAR
> > > unit, and then saying the registers are at addr 0x0.  The kernel is
> > > simply warning you about the invalid DMAR table entry.
> > > 
> > > One thing I've seen is the BIOS zeroing the base register address when
> > > VT-d is disabled in BIOS.  So, Guenter, a "fix" may be simply enabling
> > > VT-d in the BIOS.
> >
> > Ah, yes, I think I may have that disabled. I'll check it tonight.
> > 
> > Does that really warrant a traceback, or would a warning message be more
> > appropriate (possibly telling the user to enable VT-d) ?
> 
> Bottom line, the BIOS is providing what we're seeing as invalid tables.
> Whether it's a BIOS attempt to disable VT-d is hard to glean from invalid
> tables, and not all BIOSes give an interface to enable/disable VT-d.
> 
> It is a warning message, BTW.  Guess I'd be inclined to leave as it is.
> 
I meant warning as in pr_warn or dev_warn, not WARNING as in traceback.
Keep in mind that a casual user doesn't expect to see a traceback and will tend
to get alarmed. Several bugs have been filed against this "issue" in various
distributions, which is not surprising given the alarmist message.
What is the point of that ?

Guenter


Re: [PATCH] perf tools: Fixup for removing -f option in perf record

2013-07-09 Thread Namhyung Kim

Hi David,

On 2013-07-09 at 11:21 PM, David Ahern wrote:
> On 7/9/13 1:41 AM, Namhyung Kim wrote:
>> Hi Arnaldo,
>>
>> You may want to merge this patch too. :)
>
> He did. See 77d03596 and the note:
>
> [ combined patch removing the -f usage in various sub-commands, such
> as 'perf sched', etc, by Namhyung Kim ]

Oops, I didn't notice this.

Thanks,
Namhyung



Re: linux-next: reminder

2013-07-09 Thread Stephen Rothwell
Hi all,

On Tue, 2 Jul 2013 17:09:57 +1000 Stephen Rothwell wrote:
>
> We are in the merge window, so please do not add anything to your
> linux-next included branches that is not destined for v3.11 until after
> v3.11-rc1 is released.

No, really :-(

i.e. maintainers should be saying "No" to any "features" that were not
ready before the merge window opened (even when the maintainer and the
developer are the same person) ...
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: WARNING: at drivers/iommu/dmar.c:484 warn_invalid_dmar with Intel Motherboard

2013-07-09 Thread Chris Wright
* Guenter Roeck (li...@roeck-us.net) wrote:
> On Tue, Jul 09, 2013 at 04:22:52PM -0700, Chris Wright wrote:
> > * Guenter Roeck (li...@roeck-us.net) wrote:
> > > On Tue, Jul 09, 2013 at 03:05:39PM -0600, Bjorn Helgaas wrote:
> > > > [+cc Joerg, David, iommu list]
> > > > 
> > > > On Tue, Jul 9, 2013 at 2:24 PM, Guenter Roeck wrote:
> > > > > I started seeing this problem after updating the BIOS trying fix another issue,
> > > > > though I may have missed it earlier.
> > > > >
> > > > > I understand this is a BIOS bug. Would be great if someone can pass this on
> > > > > to Intel BIOS engineers.
> > > > 
> > > > Maybe.  It'd be nice if Linux handled it better, though.
> > > > 
> > > If anyone has an idea how to do that, I'll be happy to write a patch.
> > 
> > I'm not sure there's much you can do.  The BIOS is saying there's a DMAR
> > unit, and then saying the registers are at addr 0x0.  The kernel is
> > simply warning you about the invalid DMAR table entry.
> > 
> > One thing I've seen is the BIOS zeroing the base register address when
> > VT-d is disabled in BIOS.  So, Guenter, a "fix" may be simply enabling
> > VT-d in the BIOS.
>
> Ah, yes, I think I may have that disabled. I'll check it tonight.
> 
> Does that really warrant a traceback, or would a warning message be more
> appropriate (possibly telling the user to enable VT-d) ?

Bottom line, the BIOS is providing what we're seeing as invalid tables.
Whether it's a BIOS attempt to disable VT-d is hard to glean from invalid
tables, and not all BIOSes give an interface to enable/disable VT-d.

It is a warning message, BTW.  Guess I'd be inclined to leave as it is.

thanks,
-chris


Re: [PATCH] ACPI / scan: Always call acpi_bus_scan() for bus check notifications

2013-07-09 Thread Rafael J. Wysocki
On Tuesday, July 09, 2013 01:32:42 PM Toshi Kani wrote:
> On Mon, 2013-07-08 at 02:10 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > 
> > An ACPI_NOTIFY_BUS_CHECK notification means that we should scan the
> > entire namespace starting from the given handle even if the device
> > represented by that handle is present (other devices below it may
> > just have been added).
> > 
> > For this reason, modify acpi_scan_bus_device_check() to always run
> > acpi_bus_scan() if the notification being handled is of type
> > ACPI_NOTIFY_BUS_CHECK.
> > 
> > Signed-off-by: Rafael J. Wysocki 
> > Cc: 3.10+ 
> 
> Acked-by: Toshi Kani 
> 
> But, I think we need the additional patch below.

Yes, I think you're right.

Thanks,
Rafael



> =
> From: Toshi Kani 
> Subject: [PATCH] ACPI: Do not call attach() if device is attached
> 
> attach() of ACPI scan handlers does not expect to be called multiple
> times on a same device.  Also, the attached handler may not be changed
> without calling its detach().  Change acpi_scan_attach_handler() not
> to call attach() when the given device is already attached.
> 
> Signed-off-by: Toshi Kani 
> ---
>  drivers/acpi/scan.c |3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 20757e0..2b9e867 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -1885,6 +1885,9 @@ static int acpi_scan_attach_handler(struct acpi_device *device)
>   struct acpi_hardware_id *hwid;
>   int ret = 0;
>  
> + if (device->handler)
> + return 1;
> +
>   list_for_each_entry(hwid, &device->pnp.ids, list) {
>   const struct acpi_device_id *devid;
>   struct acpi_scan_handler *handler;
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
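
As a rough illustration of the guard in Toshi's quoted patch, the sketch
below models "do not call attach() again if a handler is already attached"
in plain user-space C. The types scan_handler_sketch and device_sketch and
the function attach_handler_sketch() are hypothetical stand-ins, not the
real ACPI structures, and the counter simply replaces the real attach()
callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scan handler; counts attach() invocations. */
struct scan_handler_sketch {
	int attach_calls;
};

/* Hypothetical stand-in for a device that may have one handler attached. */
struct device_sketch {
	struct scan_handler_sketch *handler;
};

/*
 * Attach 'candidate' to 'dev' unless a handler is already attached.
 * Returns 1 when the device ends up with a handler, mirroring the
 * "return 1" that the quoted patch adds for the already-attached case.
 */
static int attach_handler_sketch(struct device_sketch *dev,
				 struct scan_handler_sketch *candidate)
{
	assert(dev != NULL && candidate != NULL);

	if (dev->handler)
		return 1;	/* already attached: skip attach() entirely */

	candidate->attach_calls++;	/* stands in for handler->attach() */
	dev->handler = candidate;
	return 1;
}
```

Calling attach_handler_sketch() twice on the same device, as a repeated bus
check would, runs the attach step once and never swaps the handler without
a detach.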


Re: WARNING: at drivers/iommu/dmar.c:484 warn_invalid_dmar with Intel Motherboard

2013-07-09 Thread Chris Wright
* Guenter Roeck (li...@roeck-us.net) wrote:
> On Tue, Jul 09, 2013 at 03:05:39PM -0600, Bjorn Helgaas wrote:
> > [+cc Joerg, David, iommu list]
> > 
> > On Tue, Jul 9, 2013 at 2:24 PM, Guenter Roeck wrote:
> > > I started seeing this problem after updating the BIOS trying fix another issue,
> > > though I may have missed it earlier.
> > >
> > > I understand this is a BIOS bug. Would be great if someone can pass this on
> > > to Intel BIOS engineers.
> > 
> > Maybe.  It'd be nice if Linux handled it better, though.
> > 
> If anyone has an idea how to do that, I'll be happy to write a patch.

I'm not sure there's much you can do.  The BIOS is saying there's a DMAR
unit, and then saying the registers are at addr 0x0.  The kernel is
simply warning you about the invalid DMAR table entry.

One thing I've seen is the BIOS zeroing the base register address when
VT-d is disabled in BIOS.  So, Guenter, a "fix" may be simply enabling
VT-d in the BIOS.

thanks,
-chris


Re: [RFC][PATCH 6/8] ACPI / hotplug / PCI: Drop acpiphp_handle_to_bridge()

2013-07-09 Thread Rafael J. Wysocki
On Tuesday, July 09, 2013 12:37:26 PM Mika Westerberg wrote:
> On Tue, Jul 09, 2013 at 02:20:31AM +0200, Rafael J. Wysocki wrote:
> > @@ -953,37 +937,49 @@ static void acpiphp_sanitize_bus(struct
> >   * ACPI event handlers
> >   */
> >  
> > -static acpi_status
> > -check_sub_bridges(acpi_handle handle, u32 lvl, void *context, void **rv)
> > +static acpi_status check_sub_bridges(acpi_handle handle, u32 lvl, void *data,
> > +void **rv)
> >  {
> > -   struct acpiphp_bridge *bridge;
> > -   char objname[64];
> > -   struct acpi_buffer buffer = { .length = sizeof(objname),
> > - .pointer = objname };
> > +   struct acpiphp_context *context = acpiphp_get_context(handle);
> > +
> > +   if (!context)
> > +   return AE_OK;
> >  
> > -   bridge = acpiphp_handle_to_bridge(handle);
> > -   if (bridge) {
> > +   if (context->bridge) {
> > +   struct acpiphp_bridge *bridge = context->bridge;
> > +   char objname[64];
> > +   struct acpi_buffer buffer = { .length = sizeof(objname),
> > + .pointer = objname };
> > +
> > +   get_bridge(bridge);
> > acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
> > -   dbg("%s: re-enumerating slots under %s\n",
> > -   __func__, objname);
> > +   dbg("%s: re-enumerating slots under %s\n", __func__, objname);
> 
> Although not related to this patch directly, how about using
> acpi_handle_debug() or similar here?

Well, we don't have acpi_handle_debug() and I remember there was a reason why,
but I can't recall what the reason was at the moment. :-)

> > acpiphp_check_bridge(bridge);
> > put_bridge(bridge);
> > }
> > +   acpiphp_put_context(context);
> > return AE_OK ;
> >  }
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


[Update][RFC][PATCH 4/8] ACPI / hotplug / PCI: Hotplug context objects for bridges and functions

2013-07-09 Thread Rafael J. Wysocki
On Tuesday, July 09, 2013 12:23:53 PM Mika Westerberg wrote:
> On Tue, Jul 09, 2013 at 02:18:12AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > 
> > When either a new hotplug brigde or a new hotplug function is added
> ^^
> typo
> 

Yup, thanks!

> > by the ACPI-based PCI hotplug (acpiphp) code, attach a context object
> > to its ACPI handle to store hotplug-related information in it.  To
> > start with, put the handle's bridge and function pointers into that
> > object.  Count references to the context objects and drop them when
> > they are not needed any more.
> > 
> > First of all, this makes it possible to find out if the given bridge
> > has been registered as a function already in a much more
> > straightforward way and acpiphp_bridge_handle_to_function() can be
> > dropped (Yay!).
> >
> > This also will allow some more simplifications to be made going
> > forward.
> > 
> > Signed-off-by: Rafael J. Wysocki 
> > ---

[...]

> >  /* callback routine to register each ACPI PCI slot object */
> > -static acpi_status
> > -register_slot(acpi_handle handle, u32 lvl, void *context, void **rv)
> > +static acpi_status register_slot(acpi_handle handle, u32 lvl, void *data,
> > +void **rv)
> >  {
> > -   struct acpiphp_bridge *bridge = (struct acpiphp_bridge *)context;
> > +   struct acpiphp_bridge *bridge = (struct acpiphp_bridge *)data;
> > +   struct acpiphp_context *context;
> > struct acpiphp_slot *slot;
> > struct acpiphp_func *newfunc;
> > acpi_handle tmp;
> > @@ -229,8 +293,20 @@ register_slot(acpi_handle handle, u32 lv
> > if (!newfunc)
> > return AE_NO_MEMORY;
> >  
> > +   context = acpiphp_get_context(handle);
> > +   if (!context) {
> > +   context = acpiphp_init_context(handle);
> 
> Since acpiphp_get_context() already does acpiphp_init_context() is the
> above really necessary?

Hmm, acpiphp_get_context() wasn't supposed to be doing that.

Well, I guess I just forgot to remove that part.  Updated patch is appended.

Thanks,
Rafael
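
To make the get/init split discussed above concrete, here is a small
user-space sketch, under stated assumptions, of a reference-counted context
where the "get" operation only looks up and takes a reference, and creation
is a separate "init" step. Everything here is hypothetical: ctx_sketch,
ctx_get(), ctx_init(), ctx_put() and the fixed-size table are stand-ins, a
plain counter replaces the kernel's kref, and an integer index replaces the
ACPI handle. It is not the acpiphp code from the patch.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_HANDLES 8	/* assumption: tiny table standing in for ACPI handles */

/* Hypothetical context object; 'refcount' plays the role of a kref. */
struct ctx_sketch {
	unsigned int refcount;
	int handle;
};

/* Stands in for "data attached to a handle". */
static struct ctx_sketch *ctx_table[MAX_HANDLES];

/* Look up only: returns NULL when no context exists, never creates one. */
static struct ctx_sketch *ctx_get(int handle)
{
	struct ctx_sketch *c;

	assert(handle >= 0 && handle < MAX_HANDLES);
	c = ctx_table[handle];
	if (c)
		c->refcount++;
	return c;
}

/* Create a context holding one reference; fails if one already exists. */
static struct ctx_sketch *ctx_init(int handle)
{
	struct ctx_sketch *c;

	assert(handle >= 0 && handle < MAX_HANDLES);
	if (ctx_table[handle])
		return NULL;

	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->handle = handle;
	c->refcount = 1;
	ctx_table[handle] = c;
	return c;
}

/* Drop one reference and free the context when the last one goes away. */
static void ctx_put(struct ctx_sketch *c)
{
	assert(c != NULL && c->refcount > 0);
	if (--c->refcount == 0) {
		ctx_table[c->handle] = NULL;
		free(c);
	}
}

/* The caller-side pattern from the review: get first, init only on miss. */
static struct ctx_sketch *ctx_get_or_init(int handle)
{
	struct ctx_sketch *c = ctx_get(handle);

	return c ? c : ctx_init(handle);
}
```

Keeping creation out of ctx_get() is what makes the explicit "if (!context)
init" branch in the caller meaningful; if "get" also created the object,
that branch would be dead code, which is the point Mika raised.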

---
From: Rafael J. Wysocki 
Subject: ACPI / hotplug / PCI: Hotplug context objects for bridges and functions

When either a new hotplug bridge or a new hotplug function is added
by the ACPI-based PCI hotplug (acpiphp) code, attach a context object
to its ACPI handle to store hotplug-related information in it.  To
start with, put the handle's bridge and function pointers into that
object.  Count references to the context objects and drop them when
they are not needed any more.

First of all, this makes it possible to find out if the given bridge
has been registered as a function already in a much more
straightforward way and acpiphp_bridge_handle_to_function() can be
dropped (Yay!).

This also will allow some more simplifications to be made going
forward.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/pci/hotplug/acpiphp.h  |   10 ++
 drivers/pci/hotplug/acpiphp_glue.c |  152 ++---
 2 files changed, 119 insertions(+), 43 deletions(-)

Index: linux-pm/drivers/pci/hotplug/acpiphp.h
===
--- linux-pm.orig/drivers/pci/hotplug/acpiphp.h
+++ linux-pm/drivers/pci/hotplug/acpiphp.h
@@ -49,6 +49,7 @@
 #define info(format, arg...) printk(KERN_INFO "%s: " format, MY_NAME , ## arg)
 #define warn(format, arg...) printk(KERN_WARNING "%s: " format, MY_NAME , ## arg)
 
+struct acpiphp_context;
 struct acpiphp_bridge;
 struct acpiphp_slot;
 
@@ -77,6 +78,7 @@ struct acpiphp_bridge {
struct kref ref;
acpi_handle handle;
 
+   struct acpiphp_context *context;
/* Ejectable PCI-to-PCI bridge (PCI bridge and PCI function) */
struct acpiphp_func *func;
 
@@ -119,6 +121,7 @@ struct acpiphp_slot {
  * typically 8 objects per slot (i.e. for each PCI function)
  */
 struct acpiphp_func {
+   struct acpiphp_context *context;
struct acpiphp_slot *slot;  /* parent */
 
struct list_head sibling;
@@ -129,6 +132,13 @@ struct acpiphp_func {
u32 flags;  /* see below */
 };
 
+struct acpiphp_context {
+   struct kref kref;
+   acpi_handle handle;
+   struct acpiphp_func *func;
+   struct acpiphp_bridge *bridge;
+};
+
 /*
  * struct acpiphp_attention_info - device specific attention registration
  *
Index: linux-pm/drivers/pci/hotplug/acpiphp_glue.c
===
--- linux-pm.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-pm/drivers/pci/hotplug/acpiphp_glue.c
@@ -79,6 +79,59 @@ is_pci_dock_device(acpi_handle handle, u
}
 }
 
+static void acpiphp_context_handler(acpi_handle handle, void *context)
+{
+   /* Intentionally empty. */
+}
+
+static struct acpiphp_context *acpiphp_init_context(acpi_handle handle)
+{
+   struct acpiphp_context *context;
+   acpi_status status;
+
+   context = 

Re: WARNING: at drivers/iommu/dmar.c:484 warn_invalid_dmar with Intel Motherboard

2013-07-09 Thread Guenter Roeck
On Tue, Jul 09, 2013 at 04:22:52PM -0700, Chris Wright wrote:
> * Guenter Roeck (li...@roeck-us.net) wrote:
> > On Tue, Jul 09, 2013 at 03:05:39PM -0600, Bjorn Helgaas wrote:
> > > [+cc Joerg, David, iommu list]
> > > 
> > > On Tue, Jul 9, 2013 at 2:24 PM, Guenter Roeck wrote:
> > > > I started seeing this problem after updating the BIOS trying fix another issue,
> > > > though I may have missed it earlier.
> > > >
> > > > I understand this is a BIOS bug. Would be great if someone can pass this on
> > > > to Intel BIOS engineers.
> > > 
> > > Maybe.  It'd be nice if Linux handled it better, though.
> > > 
> > If anyone has an idea how to do that, I'll be happy to write a patch.
> 
> I'm not sure there's much you can do.  The BIOS is saying there's a DMAR
> unit, and then saying the registers are at addr 0x0.  The kernel is
> simply warning you about the invalid DMAR table entry.
> 
> One thing I've seen is the BIOS zeroing the base register address when
> VT-d is disabled in BIOS.  So, Guenter, a "fix" may be simply enabling
> VT-d in the BIOS.
> 
Ah, yes, I think I may have that disabled. I'll check it tonight.

Does that really warrant a traceback, or would a warning message be more
appropriate (possibly telling the user to enable VT-d) ?

Thanks,
Guenter


Re: [RFC] mm: Honor min_free_kbytes set by user

2013-07-09 Thread Jiri Kosina
On Thu, 4 Jul 2013, Michal Hocko wrote:

> On Thu 04-07-13 18:16:41, Michal Hocko wrote:
> > On Thu 04-07-13 09:10:39, Joe Perches wrote:
> > > On Thu, 2013-07-04 at 18:07 +0200, Michal Hocko wrote:
> > > > A warning is printed when the new value is ignored.
> > > 
> > > []
> > > 
> > > > +   printk(KERN_WARNING "min_free_kbytes is not updated to %d"
> > > > +   "because user defined value %d is preferred\n",
> > > > +   new_min_free_kbytes, user_min_free_kbytes);
> > > 
> > > Please use pr_warn and coalesce the format.
> > 
> > Sure can do that. mm/page_alloc.c doesn't seem to be unified in that
> > regards (44 printks and only 4 pr_) so I used printk.
> > 
> > > You'd've noticed a missing space between %d and because.
> > 
> > True
> > 
> 
> Checkpatch fixes
> ---
> >From 5f089c0b2a57ff6c08710ac9698d65aede06079f Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Thu, 4 Jul 2013 17:15:54 +0200
> Subject: [PATCH] mm: Honor min_free_kbytes set by user
> 
> min_free_kbytes is updated during memory hotplug (by init_per_zone_wmark_min)
> currently which is right thing to do in most cases but this could be
> unexpected if admin increased the value to prevent from allocation
> failures and the new min_free_kbytes would be decreased as a result of
> memory hotadd.
> 
> This patch saves the user defined value and allows updating
> min_free_kbytes only if it is higher than the saved one.
> 
> A warning is printed when the new value is ignored.
> 
> Signed-off-by: Michal Hocko 
> ---
>  mm/page_alloc.c | 24 +---
>  1 file changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 22c528e..9c011fc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -204,6 +204,7 @@ static char * const zone_names[MAX_NR_ZONES] = {
>  };
>  
>  int min_free_kbytes = 1024;
> +int user_min_free_kbytes;

Minor nit: any reason this can't be static?

>  
>  static unsigned long __meminitdata nr_kernel_pages;
>  static unsigned long __meminitdata nr_all_pages;
> @@ -5592,14 +5593,21 @@ static void __meminit setup_per_zone_inactive_ratio(void)
>  int __meminit init_per_zone_wmark_min(void)
>  {
>   unsigned long lowmem_kbytes;
> + int new_min_free_kbytes;
>  
>   lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
> -
> - min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
> - if (min_free_kbytes < 128)
> - min_free_kbytes = 128;
> - if (min_free_kbytes > 65536)
> - min_free_kbytes = 65536;
> + new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
> +
> + if (new_min_free_kbytes > user_min_free_kbytes) {
> + min_free_kbytes = new_min_free_kbytes;
> + if (min_free_kbytes < 128)
> + min_free_kbytes = 128;
> + if (min_free_kbytes > 65536)
> + min_free_kbytes = 65536;
> + } else {
> + pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
> + new_min_free_kbytes, user_min_free_kbytes);
> + }
>   setup_per_zone_wmarks();
>   refresh_zone_stat_thresholds();
>   setup_per_zone_lowmem_reserve();
> @@ -5617,8 +5625,10 @@ int min_free_kbytes_sysctl_handler(ctl_table *table, int write,
>   void __user *buffer, size_t *length, loff_t *ppos)
>  {
>   proc_dointvec(table, write, buffer, length, ppos);
> - if (write)
> + if (write) {
> + user_min_free_kbytes = min_free_kbytes;
>   setup_per_zone_wmarks();
> + }
>   return 0;
>  }
>  
> 

-- 
Jiri Kosina
SUSE Labs
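
For readers following the logic of the quoted patch, here is a user-space
sketch of it, assuming a simple integer square root in place of the kernel's
int_sqrt(). The function names recompute_min_free_kbytes() and admin_write()
are hypothetical; the variable names, the 128/65536 clamps and the message
text follow the quoted diff. The variables are static here, in line with the
nit above, which is only possible because the sketch is a single file.

```c
#include <assert.h>
#include <stdio.h>

static int min_free_kbytes = 1024;
static int user_min_free_kbytes;	/* 0 until the admin writes a value */

/* Floor of the square root; a slow stand-in for the kernel's int_sqrt(). */
static unsigned long long isqrt_ull(unsigned long long x)
{
	unsigned long long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/*
 * Mirrors the init_per_zone_wmark_min() hunk: recompute the value from the
 * amount of low memory, but keep the admin's setting when it is at least
 * as large. Returns 1 if min_free_kbytes was updated, 0 if it was kept.
 */
static int recompute_min_free_kbytes(unsigned long long lowmem_kbytes)
{
	int new_min_free_kbytes = (int)isqrt_ull(lowmem_kbytes * 16);

	if (new_min_free_kbytes > user_min_free_kbytes) {
		min_free_kbytes = new_min_free_kbytes;
		if (min_free_kbytes < 128)
			min_free_kbytes = 128;
		if (min_free_kbytes > 65536)
			min_free_kbytes = 65536;
		return 1;
	}

	fprintf(stderr,
		"min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
		new_min_free_kbytes, user_min_free_kbytes);
	return 0;
}

/* Mirrors the sysctl handler hunk: remember what the admin asked for. */
static void admin_write(int value)
{
	min_free_kbytes = value;
	user_min_free_kbytes = min_free_kbytes;
}
```

With 1 GiB of low memory (1048576 kB) the computed value is sqrt(16 *
1048576) = 4096 kB, so an admin setting of 8192 kB survives a recompute,
while a later recompute with 16 GiB (16384 kB computed) overrides it.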



Re: [RFC][PATCH 5/8] ACPI / hotplug / PCI: Unified notify handler for hotplug events

2013-07-09 Thread Rafael J. Wysocki
On Tuesday, July 09, 2013 12:30:45 PM Mika Westerberg wrote:
> On Tue, Jul 09, 2013 at 02:19:04AM +0200, Rafael J. Wysocki wrote:
> > Index: linux-pm/drivers/pci/hotplug/acpiphp.h
> > ===
> > --- linux-pm.orig/drivers/pci/hotplug/acpiphp.h
> > +++ linux-pm/drivers/pci/hotplug/acpiphp.h
> > @@ -137,6 +137,7 @@ struct acpiphp_context {
> > acpi_handle handle;
> > struct acpiphp_func *func;
> > struct acpiphp_bridge *bridge;
> > +   bool handler_for_func:1;
> 
> Hmm, should it be just plain:
> 
>   bool handler_for_func;
> 
> ? What's the reason using bitfields for bool?

If there are more of them, they can be stored together in one int (they are
unsigned int under the hood).

In this particular case it doesn't matter, and one of the subsequent patches will
remove that field anyway. :-)
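
A quick user-space sketch of the space argument, for illustration only: the
struct and function names below are hypothetical, and the exact layout of
bit-fields is implementation-defined, so the sizes mentioned afterwards are
what common GCC/Clang ABIs produce rather than a guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Four plain bools: each one occupies its own storage unit. */
struct flags_plain {
	bool a;
	bool b;
	bool c;
	bool d;
};

/* Four one-bit bool bit-fields: the compiler may pack them together. */
struct flags_packed {
	bool a:1;
	bool b:1;
	bool c:1;
	bool d:1;
};

static size_t plain_size(void)
{
	return sizeof(struct flags_plain);
}

static size_t packed_size(void)
{
	return sizeof(struct flags_packed);
}

/* Bit-field bools still behave like bools when set and read back. */
static bool packed_roundtrip(void)
{
	struct flags_packed f = { 0 };

	f.b = true;
	f.d = true;
	return !f.a && f.b && !f.c && f.d;
}
```

On typical x86-64 builds the plain struct takes 4 bytes and the packed one
takes 1, so the saving only shows up once there are several flags, which
matches the remark that a single flag makes no difference.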

> >  };
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH 2/8] KVM: PPC: reserve a capability and ioctl numbers for realmode VFIO

2013-07-09 Thread Alexey Kardashevskiy
On 07/10/2013 01:35 AM, Alexander Graf wrote:
> On 06/27/2013 07:02 AM, Alexey Kardashevskiy wrote:
>> Signed-off-by: Alexey Kardashevskiy
>> ---
>>   include/uapi/linux/kvm.h |2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 970b1f5..0865c01 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -667,6 +667,7 @@ struct kvm_ppc_smmu_info {
>>   #define KVM_CAP_PPC_RTAS 91
>>   #define KVM_CAP_IRQ_XICS 92
>>   #define KVM_CAP_SPAPR_MULTITCE 93
>> +#define KVM_CAP_SPAPR_TCE_IOMMU 94
>>
>>   #ifdef KVM_CAP_IRQ_ROUTING
>>
>> @@ -923,6 +924,7 @@ struct kvm_s390_ucas_mapping {
>>   /* Available with KVM_CAP_PPC_ALLOC_HTAB */
>>   #define KVM_PPC_ALLOCATE_HTAB  _IOWR(KVMIO, 0xa7, __u32)
>>   #define KVM_CREATE_SPAPR_TCE  _IOW(KVMIO,  0xa8, struct kvm_create_spapr_tce)
>> +#define KVM_CREATE_SPAPR_TCE_IOMMU _IOW(KVMIO,  0xaf, struct kvm_create_spapr_tce_iommu)
> 
> Please order them by number.

Oh. Again :( We have had this discussion with Scott Wood here already.
Where _exactly_ do you want me to put it? Many sections, not really
ordered. Thank you.



-- 
Alexey

