Re: [PATCH v4 3/5] clk: dt: binding for basic multiplexer clock

2013-08-29 Thread Tony Lindgren
* Tero Kristo  [130829 00:06]:
> On 08/29/2013 04:14 AM, Mike Turquette wrote:
> >
> >The mux-clock binding covers quite a few platforms that have similar
> >mux-clock programming requirements. If the DT binding is verbose enough
> >then the basic mux clock driver is sufficient to initialize all of the
> >mux clocks from DT: no new platform-specific clock driver with a bunch
> >of data is necessary.
> >
> >On the other hand if we rely on tables in C to define how mux-clock
> >parents are selected then every platform will have to write their own
> >clock driver just to define their clock data.
> >
> >Having drivers written for the sole purpose of listing out a bunch of
> >data sounds like something that DT was meant to solve, even if this
> >isn't at the board level and is at the SoC level.
> 
> +1. For my work this helps quite a bit at least.

Yes this is the way to do it. Please don't do drivers where the index
to some data table is passed in device tree. That's going to be a
nightmare in the long run.

The binding should describe a type of hardware like a dpll or a mux,
and then you just define as many instances of those as needed in the
.dts files.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
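
To make the point above concrete, here is a minimal sketch in plain C of a mux
that is described entirely by per-instance data (register, bit shift, field
width), which is the kind of information a verbose binding would carry. The
names mux_desc, mux_mask(), mux_get_parent() and mux_set_parent() are
hypothetical and are not the clk framework's API; the register is simulated
with an ordinary variable.

```c
#include <stdint.h>

/* Hypothetical per-instance description of a mux: everything needed to
 * program it is data, so no platform-specific table lookup is required. */
struct mux_desc {
	uint32_t *reg;      /* simulated register holding the select field */
	unsigned int shift; /* bit position of the select field */
	unsigned int width; /* width of the select field in bits */
};

/* Mask covering 'width' low-order bits. */
static uint32_t mux_mask(const struct mux_desc *m)
{
	return (m->width >= 32) ? 0xffffffffu : ((1u << m->width) - 1u);
}

/* Return the index of the currently selected parent. */
static unsigned int mux_get_parent(const struct mux_desc *m)
{
	return (*m->reg >> m->shift) & mux_mask(m);
}

/* Select parent 'index'; returns 0 on success, -1 if it does not fit. */
static int mux_set_parent(const struct mux_desc *m, unsigned int index)
{
	uint32_t mask = mux_mask(m);

	if (index > mask)
		return -1;
	*m->reg = (*m->reg & ~(mask << m->shift)) | ((uint32_t)index << m->shift);
	return 0;
}
```

Each additional mux is then just another mux_desc instance with different
values, which mirrors defining another node in the .dts file.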


Re: [PATCH v2] h8300/kernel/setup.c: add "linux/initrd.h" to pass compiling

2013-08-29 Thread Chen Gang
On 08/30/2013 12:32 PM, Guenter Roeck wrote:
> On 08/29/2013 08:59 PM, Chen Gang wrote:
>> The related error (allmodconfig for h8300):
>>
>>arch/h8300/kernel/setup.c: In function 'setup_arch':
>>arch/h8300/kernel/setup.c:103:3: error: 'initrd_start' undeclared
>> (first use in this function)
>>   initrd_start = memory_start;
>>   ^
>>arch/h8300/kernel/setup.c:103:3: note: each undeclared identifier
>> is reported only once for each function it appears in
>>arch/h8300/kernel/setup.c:104:3: error: 'initrd_end' undeclared
>> (first use in this function)
>>   initrd_end = memory_start += be32_to_cpu(((unsigned long *)
>> (memory_start))[2]);
>>   ^
>>
>> Signed-off-by: Chen Gang 
>> ---
>>   arch/h8300/kernel/setup.c |4 
>>   1 files changed, 4 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/h8300/kernel/setup.c b/arch/h8300/kernel/setup.c
>> index d0b1607..85639a1 100644
>> --- a/arch/h8300/kernel/setup.c
>> +++ b/arch/h8300/kernel/setup.c
>> @@ -47,6 +47,10 @@
>>   #include 
>>   #endif
>>
>> +#if defined(CONFIG_BLK_DEV_INITRD)
>> +#include <linux/initrd.h>
>> +#endif
>> +
> 
> Is the #ifdef/#endif really needed ? If not you should drop it.
> 

"linux/initrd.h" is needed for 'initrd_start' and 'initrd_end', which are
only used when BLK_DEV_INITRD is enabled.

'memory_start' is defined within this file, and there is only one place in
this file that needs "linux/initrd.h".

So if BLK_DEV_INITRD is disabled, "linux/initrd.h" is not needed either.


Please refer to the code in "arch/h8300/kernel/setup.c" below.

void __init setup_arch(char **cmdline_p)
{
        int bootmap_size;

        memory_start = (unsigned long) &_ramstart;

        /* allow for ROMFS on the end of the kernel */
        if (memcmp((void *)memory_start, "-rom1fs-", 8) == 0) {
#if defined(CONFIG_BLK_DEV_INITRD)
                initrd_start = memory_start;
                initrd_end = memory_start += be32_to_cpu(((unsigned long *) (memory_start))[2]);
#else
                memory_start += be32_to_cpu(((unsigned long *) memory_start)[2]);
#endif
        }



> Thanks,
> Guenter
> 
> 
> 

Thanks.
-- 
Chen Gang


[PATCH] x86: e820: fix memmap kernel boot parameter

2013-08-29 Thread Bob Liu
The kernel boot parameter memmap=nn[KMG]$ss[KMG] is used to mark a specific
region of memory as reserved. The region covers ss to ss+nn.

However, the parameter does not behave as expected. I tried it on two
machines.

Machine1: the boot command line in grub.cfg is "memmap=800M$0x60bfdfff", but
"cat /proc/cmdline" shows "memmap=800M/bin/bashx60bfdfff" after the system
has booted.

Machine2: the boot command line in grub.cfg is "memmap=0x77ff$0x88000", but
"cat /proc/cmdline" shows "memmap=0x77ffx88000".

I did not find the root cause; I think grub may treat "$0" as something
special.
Replacing '$' with '%' in the kernel boot parameter fixes this issue.

Signed-off-by: Bob Liu 
---
 Documentation/kernel-parameters.txt |6 +++---
 arch/x86/kernel/e820.c  |2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 7f9d4f5..a96c7b1 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1604,13 +1604,13 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
[KNL,ACPI] Mark specific memory as ACPI data.
Region of memory to be used, from ss to ss+nn.
 
-   memmap=nn[KMG]$ss[KMG]
+   memmap=nn[KMG]%ss[KMG]
[KNL,ACPI] Mark specific memory as reserved.
Region of memory to be used, from ss to ss+nn.
Example: Exclude memory from 0x1869-0x1869
-memmap=64K$0x1869
+memmap=64K%0x1869
 or
-memmap=0x1$0x1869
+memmap=0x1%0x1869
 
memory_corruption_check=0/1 [X86]
Some BIOSes seem to corrupt the first 64k of
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index d32abea..8483d45 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -869,7 +869,7 @@ static int __init parse_memmap_one(char *p)
} else if (*p == '#') {
start_at = memparse(p+1, );
e820_add_region(start_at, mem_size, E820_ACPI);
-   } else if (*p == '$') {
+   } else if (*p == '%') {
start_at = memparse(p+1, );
e820_add_region(start_at, mem_size, E820_RESERVED);
} else
-- 
1.7.10.4
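
For illustration, a minimal userspace sketch of how a memmap=nn[KMG]<sep>ss[KMG]
value could be split into size and start, with the separator passed in so that
both '$' and '%' can be tried. parse_size() and parse_reserved() are
hypothetical helpers written for this sketch; parse_size() only imitates the
K/M/G suffix handling of the kernel's memparse() and is not the kernel code.

```c
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix, similar in spirit to the
 * kernel's memparse(); *end is set to the first unconsumed character. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/* Parse "nn[KMG]<sep>ss[KMG]"; returns 0 and fills size/start on success,
 * -1 if a number is missing or the separator is not the expected one. */
static int parse_reserved(const char *arg, char sep,
			  unsigned long long *size, unsigned long long *start)
{
	char *p;

	*size = parse_size(arg, &p);
	if (p == arg || *p != sep)
		return -1;
	arg = p + 1;
	*start = parse_size(arg, &p);
	if (p == arg || *p != '\0')
		return -1;
	return 0;
}
```

With '%' as the separator, "800M%0x60bfdfff" yields a size of 800 MiB and a
start of 0x60bfdfff, while a string that has lost its separator, as seen on
Machine1, is rejected.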



Re: [PATCH v2] h8300/kernel/setup.c: add "linux/initrd.h" to pass compiling

2013-08-29 Thread Guenter Roeck

On 08/29/2013 08:59 PM, Chen Gang wrote:
> The related error (allmodconfig for h8300):
>
>    arch/h8300/kernel/setup.c: In function 'setup_arch':
>    arch/h8300/kernel/setup.c:103:3: error: 'initrd_start' undeclared (first use in this function)
>   initrd_start = memory_start;
>   ^
>    arch/h8300/kernel/setup.c:103:3: note: each undeclared identifier is reported only once for each function it appears in
>    arch/h8300/kernel/setup.c:104:3: error: 'initrd_end' undeclared (first use in this function)
>   initrd_end = memory_start += be32_to_cpu(((unsigned long *) (memory_start))[2]);
>   ^
>
> Signed-off-by: Chen Gang
> ---


Maybe an odd question, but is there a way to actually compile the h8300 target
in the first place ? The cross compiler on kernel.org doesn't work, nor does
the one available through Ubuntu.

Guenter



Re: [PATCH 1/2] extcon: extcon-dra7xx: Add extcon driver for USB ID detection

2013-08-29 Thread Guenter Roeck

On 08/29/2013 05:20 PM, Chanwoo Choi wrote:
> Hi Guenter,
>
>> I am currently working on adding device tree support to the extcon-gpio driver.
>>
>> Wonder if it would make sense to just use that driver. As far as I can see
>> the only missing part is support for multiple cables and cable naming through
>> device tree properties.
>>
>> Any thoughts ? The patches are not yet clean enough to submit upstream,
>> but I could send them as RFC if there is interest.
>
> I'm considering ways to use and update extcon-gpio driver.
> I have interest concerning your approach.
> After completing your work, send them as RFC as you mentioned.



It does what we need it to do, so I sent it out a minute ago.
I added possible bindings for multi-cable support as additional RFC.

Guenter



Re: [PATCH v2] h8300/kernel/setup.c: add "linux/initrd.h" to pass compiling

2013-08-29 Thread Guenter Roeck

On 08/29/2013 08:59 PM, Chen Gang wrote:
> The related error (allmodconfig for h8300):
>
>    arch/h8300/kernel/setup.c: In function 'setup_arch':
>    arch/h8300/kernel/setup.c:103:3: error: 'initrd_start' undeclared (first use in this function)
>   initrd_start = memory_start;
>   ^
>    arch/h8300/kernel/setup.c:103:3: note: each undeclared identifier is reported only once for each function it appears in
>    arch/h8300/kernel/setup.c:104:3: error: 'initrd_end' undeclared (first use in this function)
>   initrd_end = memory_start += be32_to_cpu(((unsigned long *) (memory_start))[2]);
>   ^
>
> Signed-off-by: Chen Gang
> ---
>   arch/h8300/kernel/setup.c |4 
>   1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/arch/h8300/kernel/setup.c b/arch/h8300/kernel/setup.c
> index d0b1607..85639a1 100644
> --- a/arch/h8300/kernel/setup.c
> +++ b/arch/h8300/kernel/setup.c
> @@ -47,6 +47,10 @@
>   #include 
>   #endif
>
> +#if defined(CONFIG_BLK_DEV_INITRD)
> +#include <linux/initrd.h>
> +#endif
> +


Is the #ifdef/#endif really needed ? If not you should drop it.

Thanks,
Guenter



linux-next: manual merge of the driver-core tree with the pm tree

2013-08-29 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the driver-core tree got a conflict in
drivers/base/core.c between commit 5e33bc4165f3 ("driver core / ACPI:
Avoid device hot remove locking issues") from the pm tree and commit
86df26870569 ("drivers:base:core: Moved sym export macros to respective
functions") from the driver-core tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/base/core.c
index ac419a1,c7b0925..000
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@@ -1489,21 -1450,18 +1475,6 @@@ int __init devices_init(void
return -ENOMEM;
  }
  
- EXPORT_SYMBOL_GPL(device_for_each_child);
- EXPORT_SYMBOL_GPL(device_find_child);
- 
- EXPORT_SYMBOL_GPL(device_initialize);
- EXPORT_SYMBOL_GPL(device_add);
- EXPORT_SYMBOL_GPL(device_register);
 -static DEFINE_MUTEX(device_hotplug_lock);
--
- EXPORT_SYMBOL_GPL(device_del);
- EXPORT_SYMBOL_GPL(device_unregister);
- EXPORT_SYMBOL_GPL(get_device);
- EXPORT_SYMBOL_GPL(put_device);
 -void lock_device_hotplug(void)
 -{
 -  mutex_lock(&device_hotplug_lock);
 -}
--
- EXPORT_SYMBOL_GPL(device_create_file);
- EXPORT_SYMBOL_GPL(device_remove_file);
 -void unlock_device_hotplug(void)
 -{
 -  mutex_unlock(&device_hotplug_lock);
 -}
--
  static int device_check_offline(struct device *dev, void *not_used)
  {
int ret;




[PATCH] gpio: null pointer dereference in error handling in gpiolib.c

2013-08-29 Thread Frank Rowand


Avoid calling desc_to_gpio() if desc->chip is NULL, as this will
cause a kernel panic.

In the code above these calls, there is a test for !chip, which
jumps to the 'fail' label if true. In this case, the code
panics, since desc_to_gpio() uses desc->chip to look up the
gpio number.

An RFC patch that explained the cause of one example of panic when
desc->chip is NULL and fixed that example
(http://lkml.indiana.edu/hypermail/linux/kernel/1308.3/01473.html)
was accepted.  This patch fixes the remaining locations which have
the same problem.

Signed-off-by: Frank Rowand 

---
 drivers/gpio/gpiolib.c |   33  24 +9 - 0 !
 1 file changed, 24 insertions(+), 9 deletions(-)

Index: b/drivers/gpio/gpiolib.c
===
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1676,9 +1676,14 @@ lose:
return status;
 fail:
spin_unlock_irqrestore(_lock, flags);
-   if (status)
-   pr_debug("%s: gpio-%d status %d\n", __func__,
-desc_to_gpio(desc), status);
+   if (status) {
+   if (desc->chip) {
+   pr_debug("%s: gpio-%d status %d\n", __func__,
+desc_to_gpio(desc), status);
+   } else {
+   pr_debug("%s: gpio-?? status %d\n", __func__, status);
+   }
+   }
return status;
 }

@@ -1745,9 +1750,14 @@ lose:
return status;
 fail:
spin_unlock_irqrestore(_lock, flags);
-   if (status)
-   pr_debug("%s: gpio-%d status %d\n", __func__,
-desc_to_gpio(desc), status);
+   if (status) {
+   if (desc->chip) {
+   pr_debug("%s: gpio-%d status %d\n", __func__,
+desc_to_gpio(desc), status);
+   } else {
+   pr_debug("%s: gpio-?? status %d\n", __func__, status);
+   }
+   }
return status;
 }

@@ -1795,9 +1805,14 @@ static int gpiod_set_debounce(struct gpi

 fail:
spin_unlock_irqrestore(_lock, flags);
-   if (status)
-   pr_debug("%s: gpio-%d status %d\n", __func__,
-desc_to_gpio(desc), status);
+   if (status) {
+   if (desc->chip) {
+   pr_debug("%s: gpio-%d status %d\n", __func__,
+desc_to_gpio(desc), status);
+   } else {
+   pr_debug("%s: gpio-?? status %d\n", __func__, status);
+   }
+   }

return status;
 }
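
The guard used in the three hunks above can be sketched in isolation as
follows. This is a simplified userspace model, not gpiolib code: struct chip,
struct desc, desc_to_number() and format_status() are hypothetical stand-ins,
and snprintf() takes the place of pr_debug().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the gpiolib structures. */
struct chip { int base; };
struct desc { struct chip *chip; int offset; };

/* Only valid when d->chip is non-NULL: it dereferences the chip. */
static int desc_to_number(const struct desc *d)
{
	return d->chip->base + d->offset;
}

/* Format the debug line without touching d->chip when it is NULL. */
static int format_status(char *buf, size_t len, const struct desc *d, int status)
{
	if (d->chip)
		return snprintf(buf, len, "gpio-%d status %d",
				desc_to_number(d), status);
	return snprintf(buf, len, "gpio-?? status %d", status);
}
```

The number lookup is only reached on the branch where the chip pointer has
been checked, so the error path itself can no longer dereference NULL.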


Re: [PATCH] Make sure to wake reaper

2013-08-29 Thread Eric W. Biederman
"Serge E. Hallyn"  writes:

> Quoting Eric W. Biederman (ebied...@xmission.com):
>> Serge Hallyn  writes:
>> 
>> > Since commit af4b8a83add95ef40716401395b44a1b579965f4 it's been
>> > possible to get into a situation where a pidns reaper is
>> > , reparented to host pid 1, but never reaped.  How to
>> > reproduce this is documented at
>> >
>> > https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526
>> > (and see
>> > https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526/comments/13)
>> > In short, run repeated starts of a container whose init is
>> >
>> > Process.exit(0);
>> >
>> > sysrq-t when such a task is playing zombie shows:
>> >
>> > [  131.132978] initx 88011fc14580 0  2084   2039 
>> > 0x
>> > [  131.132978]  880116e89ea8 0002 880116e89fd8 
>> > 00014580
>> > [  131.132978]  880116e89fd8 00014580 8801172a 
>> > 8801172a
>> > [  131.132978]  8801172a0630 88011729fff0 880116e14650 
>> > 88011729fff0
>> > [  131.132978] Call Trace:
>> > [  131.132978]  [] schedule+0x29/0x70
>> > [  131.132978]  [] do_exit+0x6e1/0xa40
>> > [  131.132978]  [] ? signal_wake_up_state+0x1e/0x30
>> > [  131.132978]  [] do_group_exit+0x3f/0xa0
>> > [  131.132978]  [] SyS_exit_group+0x14/0x20
>> > [  131.132978]  [] tracesys+0xe1/0xe6
>> >
>> > Further debugging showed that every time this happened, 
>> > zap_pid_ns_processes()
>> > started with nr_hashed being 3, while we were expecting it to drop to 2.
>> > Any time it didn't happen, nr_hashed was 1 or 2.  So the reaper was
>> > waiting for nr_hashed to become 2, but free_pid() only wakes the reaper
>> > if nr_hashed hits 1.  This patch makes free_pid() wake the reaper any
>> > time the reaper is PF_EXITING, to force it to re-test the
>> > pidns->nr_hashed = init_pids test.  Note that this is more like what
>> > __unhash_process() used to do before
>> > af4b8a83add95ef40716401395b44a1b579965f4.
>> >
>> > Signed-off-by: Serge Hallyn 
>> > Cc: "Eric W. Biederman" 
>> > ---
>> >  kernel/pid.c | 4 
>> >  1 file changed, 4 insertions(+)
>> >
>> > diff --git a/kernel/pid.c b/kernel/pid.c
>> > index 0db3e79..6b312c4 100644
>> > --- a/kernel/pid.c
>> > +++ b/kernel/pid.c
>> > @@ -274,6 +274,10 @@ void free_pid(struct pid *pid)
>> >case 0:
>> >schedule_work(>proc_work);
>> >break;
>> > +  default:
>> > +  if (ns->child_reaper->flags & PF_EXITING)
>> > +  wake_up_process(ns->child_reaper);
>> > +  break;
>> >}
>> >}
>> >spin_unlock_irqrestore(_lock, flags);
>> 
>> So I think the change that we actually want is just to send a wake-up
>> when we have two pids in the pid namespace as well as one pid.
>> 
>> - That can send one extraneous wake-up but that is relatively harmless.
>
> Would more than one extraneous wake-up be more harmful?

An extraneous wake-up is a waste of time but not a correctness issue.
Anything that sleeps needs to be able to handle extraneous wake-ups.
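
A minimal single-threaded model of that rule, assuming a sleeper that waits
for a counter to reach a target value: every wake-up re-tests the condition,
so a wake-up that arrives too early only costs one extra check. struct sleeper
and on_wakeup() are hypothetical names for this sketch and do not correspond
to kernel functions.

```c
#include <stdbool.h>

/* Hypothetical model of a sleeper that waits for a counter to reach a target. */
struct sleeper {
	int nr_hashed; /* current count, changed by other parties */
	int target;    /* value the sleeper is waiting for */
	int wakeups;   /* how many times the sleeper was woken */
};

/* Called each time the sleeper is woken: re-test the condition and report
 * whether it may proceed. If the condition is still false the sleeper
 * simply goes back to sleep. */
static bool on_wakeup(struct sleeper *s)
{
	s->wakeups++;
	return s->nr_hashed == s->target;
}
```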

>> - We can detect the condition race free.
>> - With only two pids remaining we are guaranteed that which ever task is
>>   the child_reaper will persist through zap_pid_ns_processes.
>
> My problem is I don't really understand the assumptions behind nr_hashed.
> I *thought* it was simply >1 if the init was threaded - but are threads
> in init limited to 2?  Or am I totally wrong about what the 2 means?

nr_hashed is fundamentally the number of pids in the pid hash table of a
pid namespace.  So if init has 2 thread ids nr_hashed is greater than one.

> If init *is* threaded, and the pid_ns->child_reaper exits but the other
> thread is still alive, then find_new_reaper should set pid_ns->child_reaper
> to the not-PF_EXITING task using
>
> 509 while_each_thread(father, thread) {
> 510 if (thread->flags & PF_EXITING)
> 511 continue;
> 512 if (unlikely(pid_ns->child_reaper == father))
> 513 pid_ns->child_reaper = thread;
> 514 return thread;
> 515 }
>
> right?

Yes.

> Which seems to suggest that checking for pid_ns->child_reaper->flags &
> PF_EXITING should always give us the right answer in free_pid().

I don't know that it is wrong, but we don't always have the task_lock
which protects PF_EXITING.  In particular when we are called from
change_pid.

Even more the task_lock protects pid_ns->child_reaper.

The thread_group_leader of any process may not be reaped until all of
the other threads are dead.  All of the other threads of a
multi-threaded process self reap when they exit.

Which means before we are reduced to nr_threads == 2 it is possible
that child_reaper will be a thread that will self reap and free its
data structures before we are done waking it up and/or testing
PF_EXITING.

>>   There are 3 cases.
>>   init-tgleader 

[PATCH] drivers: uio: Kconfig: add MMU dependancy for UIO

2013-08-29 Thread Chen Gang
The Userspace I/O drivers are only useful when user space is meaningful,
which requires the MMU to be enabled.

So make UIO depend on MMU; otherwise the build fails. The related error
(allmodconfig for H8300):

CC [M]  drivers/uio/uio.o
  drivers/uio/uio.c: In function 'uio_mmap_physical':
  drivers/uio/uio.c:650:2: error: implicit declaration of function 
'pgprot_noncached' [-Werror=implicit-function-declaration]
  drivers/uio/uio.c:650:20: error: incompatible types when assigning to type 
'pgprot_t' from type 'int'


Signed-off-by: Chen Gang 
---
 drivers/uio/Kconfig |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig
index a81d163..f91ec11 100644
--- a/drivers/uio/Kconfig
+++ b/drivers/uio/Kconfig
@@ -1,5 +1,6 @@
 menuconfig UIO
tristate "Userspace I/O drivers"
+   depends on MMU
help
  Enable this to allow the userspace driver core code to be
  built.  This code allows userspace programs easy access to
-- 
1.7.7.6


[PATCH V2 2/6] vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used()

2013-08-29 Thread Jason Wang
We tend to batch the used adding and signaling in vhost_zerocopy_callback(),
which in some cases may result in more than 100 used buffers being updated in
vhost_zerocopy_signal_used(). So switch to vhost_add_used_and_signal_n() to
avoid multiple calls to vhost_add_used_and_signal(). This means far fewer
used index updates and memory barriers.

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c |   13 -
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 280ee66..8a6dd0d 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -281,7 +281,7 @@ static void vhost_zerocopy_signal_used(struct vhost_net 
*net,
 {
struct vhost_net_virtqueue *nvq =
container_of(vq, struct vhost_net_virtqueue, vq);
-   int i;
+   int i, add;
int j = 0;
 
for (i = nvq->done_idx; i != nvq->upend_idx; i = (i + 1) % UIO_MAXIOV) {
@@ -289,14 +289,17 @@ static void vhost_zerocopy_signal_used(struct vhost_net 
*net,
vhost_net_tx_err(net);
if (VHOST_DMA_IS_DONE(vq->heads[i].len)) {
vq->heads[i].len = VHOST_DMA_CLEAR_LEN;
-   vhost_add_used_and_signal(vq->dev, vq,
- vq->heads[i].id, 0);
++j;
} else
break;
}
-   if (j)
-   nvq->done_idx = i;
+   while (j) {
+   add = min(UIO_MAXIOV - nvq->done_idx, j);
+   vhost_add_used_and_signal_n(vq->dev, vq,
+   >heads[nvq->done_idx], add);
+   nvq->done_idx = (nvq->done_idx + add) % UIO_MAXIOV;
+   j -= add;
+   }
 }
 
 static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
-- 
1.7.1
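
The while loop added above splits a run of completed entries into contiguous
chunks so that no chunk crosses the end of the ring. A minimal standalone
sketch of that arithmetic, assuming a small RING_SIZE in place of UIO_MAXIOV
and recording the chunks instead of signalling them; struct chunk and
split_batches() are hypothetical names:

```c
#include <stddef.h>

#define RING_SIZE 8 /* stand-in for UIO_MAXIOV; small so the wrap is easy to see */

struct chunk { int start; int count; };

/* Split 'pending' completed entries starting at *done_idx into contiguous
 * chunks that never cross the end of the ring, advancing *done_idx as we go.
 * Returns the number of chunks written to 'out' (at most 'max_out'). */
static size_t split_batches(int *done_idx, int pending,
			    struct chunk *out, size_t max_out)
{
	size_t n = 0;

	while (pending > 0 && n < max_out) {
		int room = RING_SIZE - *done_idx;
		int add = pending < room ? pending : room;

		out[n].start = *done_idx;
		out[n].count = add;
		n++;
		*done_idx = (*done_idx + add) % RING_SIZE;
		pending -= add;
	}
	return n;
}
```

Starting at index 6 with 5 pending entries in a ring of 8, this produces two
chunks, (6, 2) and (0, 3), and leaves the index at 3.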



[PATCH V2 4/6] vhost_net: determine whether or not to use zerocopy at one time

2013-08-29 Thread Jason Wang
Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if
upend_idx != done_idx we still set zcopy_used to true and roll back this choice
later. This can be avoided by deciding whether to use zerocopy once, checking
all conditions at the same time up front.

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c |   46 +++---
 1 files changed, 19 insertions(+), 27 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 8a6dd0d..ff60c2a 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -404,43 +404,35 @@ static void handle_tx(struct vhost_net *net)
   iov_length(nvq->hdr, s), hdr_size);
break;
}
-   zcopy_used = zcopy && (len >= VHOST_GOODCOPY_LEN ||
-  nvq->upend_idx != nvq->done_idx);
+
+   zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
+   && (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx
+   && vhost_net_tx_select_zcopy(net);
 
/* use msg_control to pass vhost zerocopy ubuf info to skb */
if (zcopy_used) {
+   struct ubuf_info *ubuf;
+   ubuf = nvq->ubuf_info + nvq->upend_idx;
+
vq->heads[nvq->upend_idx].id = head;
-   if (!vhost_net_tx_select_zcopy(net) ||
-   len < VHOST_GOODCOPY_LEN) {
-   /* copy don't need to wait for DMA done */
-   vq->heads[nvq->upend_idx].len =
-   VHOST_DMA_DONE_LEN;
-   msg.msg_control = NULL;
-   msg.msg_controllen = 0;
-   ubufs = NULL;
-   } else {
-   struct ubuf_info *ubuf;
-   ubuf = nvq->ubuf_info + nvq->upend_idx;
-
-   vq->heads[nvq->upend_idx].len =
-   VHOST_DMA_IN_PROGRESS;
-   ubuf->callback = vhost_zerocopy_callback;
-   ubuf->ctx = nvq->ubufs;
-   ubuf->desc = nvq->upend_idx;
-   msg.msg_control = ubuf;
-   msg.msg_controllen = sizeof(ubuf);
-   ubufs = nvq->ubufs;
-   kref_get(>kref);
-   }
+   vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
+   ubuf->callback = vhost_zerocopy_callback;
+   ubuf->ctx = nvq->ubufs;
+   ubuf->desc = nvq->upend_idx;
+   msg.msg_control = ubuf;
+   msg.msg_controllen = sizeof(ubuf);
+   ubufs = nvq->ubufs;
+   kref_get(>kref);
nvq->upend_idx = (nvq->upend_idx + 1) % UIO_MAXIOV;
-   } else
+   } else {
msg.msg_control = NULL;
+   ubufs = NULL;
+   }
/* TODO: Check specific error and bomb out unless ENOBUFS? */
err = sock->ops->sendmsg(NULL, sock, , len);
if (unlikely(err < 0)) {
if (zcopy_used) {
-   if (ubufs)
-   vhost_net_ubuf_put(ubufs);
+   vhost_net_ubuf_put(ubufs);
nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
% UIO_MAXIOV;
}
-- 
1.7.1
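
The single up-front decision can be sketched as one predicate. This is a
standalone model rather than the vhost code: RING_SIZE and GOODCOPY_LEN are
stand-in values for UIO_MAXIOV and VHOST_GOODCOPY_LEN, select_ok stands for
the result of vhost_net_tx_select_zcopy(), and ring_full() and use_zerocopy()
are hypothetical names.

```c
#include <stdbool.h>

#define RING_SIZE 8      /* stand-in for UIO_MAXIOV */
#define GOODCOPY_LEN 256 /* stand-in for VHOST_GOODCOPY_LEN */

/* True when advancing upend_idx by one would collide with done_idx,
 * i.e. there is no free slot left for another in-flight buffer. */
static bool ring_full(int upend_idx, int done_idx)
{
	return (upend_idx + 1) % RING_SIZE == done_idx;
}

/* Decide once, up front, whether to use zerocopy for this packet. */
static bool use_zerocopy(bool zcopy_enabled, int len, int upend_idx,
			 int done_idx, bool select_ok)
{
	return zcopy_enabled && len >= GOODCOPY_LEN &&
	       !ring_full(upend_idx, done_idx) && select_ok;
}
```

Because every condition is checked before any state is touched, there is no
half-prepared zerocopy state to undo when one of them fails.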



[PATCH V2 3/6] vhost: switch to use vhost_add_used_n()

2013-08-29 Thread Jason Wang
Let vhost_add_used() use vhost_add_used_n() to reduce code duplication.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c |   54 ++--
 1 files changed, 12 insertions(+), 42 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index e58cf00..124c433 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1332,48 +1332,9 @@ EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
  * want to notify the guest, using eventfd. */
 int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
 {
-   struct vring_used_elem __user *used;
+   struct vring_used_elem heads = { head, len };
 
-   /* The virtqueue contains a ring of used buffers.  Get a pointer to the
-* next entry in that used ring. */
-   used = >used->ring[vq->last_used_idx % vq->num];
-   if (__put_user(head, >id)) {
-   vq_err(vq, "Failed to write used id");
-   return -EFAULT;
-   }
-   if (__put_user(len, >len)) {
-   vq_err(vq, "Failed to write used len");
-   return -EFAULT;
-   }
-   /* Make sure buffer is written before we update index. */
-   smp_wmb();
-   if (__put_user(vq->last_used_idx + 1, >used->idx)) {
-   vq_err(vq, "Failed to increment used idx");
-   return -EFAULT;
-   }
-   if (unlikely(vq->log_used)) {
-   /* Make sure data is seen before log. */
-   smp_wmb();
-   /* Log used ring entry write. */
-   log_write(vq->log_base,
- vq->log_addr +
-  ((void __user *)used - (void __user *)vq->used),
- sizeof *used);
-   /* Log used index update. */
-   log_write(vq->log_base,
- vq->log_addr + offsetof(struct vring_used, idx),
- sizeof vq->used->idx);
-   if (vq->log_ctx)
-   eventfd_signal(vq->log_ctx, 1);
-   }
-   vq->last_used_idx++;
-   /* If the driver never bothers to signal in a very long while,
-* used index might wrap around. If that happens, invalidate
-* signalled_used index we stored. TODO: make sure driver
-* signals at least once in 2^16 and remove this. */
-   if (unlikely(vq->last_used_idx == vq->signalled_used))
-   vq->signalled_used_valid = false;
-   return 0;
+   return vhost_add_used_n(vq, &heads, 1);
 }
 EXPORT_SYMBOL_GPL(vhost_add_used);
 
@@ -1387,7 +1348,16 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
 
start = vq->last_used_idx % vq->num;
used = vq->used->ring + start;
-   if (__copy_to_user(used, heads, count * sizeof *used)) {
+   if (count == 1) {
+   if (__put_user(heads[0].id, &used->id)) {
+   vq_err(vq, "Failed to write used id");
+   return -EFAULT;
+   }
+   if (__put_user(heads[0].len, &used->len)) {
+   vq_err(vq, "Failed to write used len");
+   return -EFAULT;
+   }
+   } else if (__copy_to_user(used, heads, count * sizeof *used)) {
vq_err(vq, "Failed to write used");
return -EFAULT;
}
-- 
1.7.1
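
The shape of this refactoring, a single-entry helper expressed through the
batched one, can be sketched in userspace as follows. The ring is an ordinary
array and memcpy() replaces the user-space copy helpers; RING_NUM, struct
used_elem, struct used_ring, add_used_contig(), add_used_n() and add_used()
are hypothetical names for this sketch, not the vhost functions themselves.

```c
#include <string.h>

#define RING_NUM 4 /* stand-in for vq->num */

struct used_elem { unsigned int id; unsigned int len; };

struct used_ring {
	struct used_elem ring[RING_NUM];
	unsigned int last_used_idx; /* free-running index, reduced modulo RING_NUM */
};

/* Copy 'count' entries that are known not to cross the end of the ring. */
static void add_used_contig(struct used_ring *r, const struct used_elem *heads,
			    unsigned int count)
{
	unsigned int start = r->last_used_idx % RING_NUM;

	memcpy(&r->ring[start], heads, count * sizeof(*heads));
	r->last_used_idx += count;
}

/* Batched helper: split at the end of the ring if needed. */
static int add_used_n(struct used_ring *r, const struct used_elem *heads,
		      unsigned int count)
{
	unsigned int start, n;

	if (count > RING_NUM)
		return -1;
	start = r->last_used_idx % RING_NUM;
	n = RING_NUM - start;
	if (n < count) {
		add_used_contig(r, heads, n);
		heads += n;
		count -= n;
	}
	add_used_contig(r, heads, count);
	return 0;
}

/* Single-entry helper expressed in terms of the batched one. */
static int add_used(struct used_ring *r, unsigned int head, unsigned int len)
{
	struct used_elem e = { head, len };

	return add_used_n(r, &e, 1);
}
```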



[PATCH V2 5/6] vhost_net: poll vhost queue after marking DMA is done

2013-08-29 Thread Jason Wang
We used to poll the vhost queue before marking DMA as done. This is racy: if
the vhost thread is woken up before DMA is marked as done, the signal can be
missed. Fix this by always polling the vhost thread after marking DMA as done.

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index ff60c2a..d09c17c 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -308,6 +308,11 @@ static void vhost_zerocopy_callback(struct ubuf_info 
*ubuf, bool success)
struct vhost_virtqueue *vq = ubufs->vq;
int cnt = atomic_read(>kref.refcount);
 
+   /* set len to mark this desc buffers done DMA */
+   vq->heads[ubuf->desc].len = success ?
+   VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN;
+   vhost_net_ubuf_put(ubufs);
+
/*
 * Trigger polling thread if guest stopped submitting new buffers:
 * in this case, the refcount after decrement will eventually reach 1
@@ -318,10 +323,6 @@ static void vhost_zerocopy_callback(struct ubuf_info 
*ubuf, bool success)
 */
if (cnt <= 2 || !(cnt % 16))
vhost_poll_queue(>poll);
-   /* set len to mark this desc buffers done DMA */
-   vq->heads[ubuf->desc].len = success ?
-   VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN;
-   vhost_net_ubuf_put(ubufs);
 }
 
 /* Expects to be always run from workqueue - which acts as
-- 
1.7.1



Re: [PATCH] USB2NET : SR9700 : One Chip USB 1.1 USB2NET SR9700 Device Driver Support

2013-08-29 Thread Joe Perches
On Fri, 2013-08-30 at 01:06 +0200, Francois Romieu wrote:

just trivia:

> diff --git a/drivers/net/usb/sr9700.c b/drivers/net/usb/sr9700.c
[]
> @@ -288,19 +291,19 @@ static void sr9700_set_multicast(struct net_device *net)
[]
> +static int sr9700_set_mac_address(struct net_device *netdev, void *p)
[]
> - memcpy(net->dev_addr, addr->sa_data, net->addr_len);
> - sr_write_async(dev, PAR, 6, dev->net->dev_addr);
> + memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len);
> + sr_write_async(dev, PAR, 6, netdev->dev_addr);

s/6/ETH_ALEN/
 



[PATCH V2 0/6] vhost code cleanup and minor enhancement

2013-08-29 Thread Jason Wang
Hi all:

This series tries to unify and simplify the vhost code, especially for
zerocopy. With this series, a 5% - 10% improvement in per-cpu throughput was
seen during the netperf guest sending test.

Please review.

Changes from V1:
- Fix the zerocopy enabling check by changing the check of upend_idx != done_idx
  to (upend_idx + 1) % UIO_MAXIOV != done_idx.
- Switch to use put_user() in __vhost_add_used_n() if there's only one used
  entry.
- Keep the max pending check based on Michael's suggestion.

Jason Wang (6):
  vhost_net: make vhost_zerocopy_signal_used() returns void
  vhost_net: use vhost_add_used_and_signal_n() in
vhost_zerocopy_signal_used()
  vhost: switch to use vhost_add_used_n()
  vhost_net: determine whether or not to use zerocopy at one time
  vhost_net: poll vhost queue after marking DMA is done
  vhost_net: correctly limit the max pending buffers

 drivers/vhost/net.c   |   88 +
 drivers/vhost/vhost.c |   54 +++---
 2 files changed, 50 insertions(+), 92 deletions(-)



[PATCH V2 1/6] vhost_net: make vhost_zerocopy_signal_used() returns void

2013-08-29 Thread Jason Wang
None of its callers uses its return value, so let it return void.

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 969a859..280ee66 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -276,8 +276,8 @@ static void copy_iovec_hdr(const struct iovec *from, struct 
iovec *to,
  * of used idx. Once lower device DMA done contiguously, we will signal KVM
  * guest used idx.
  */
-static int vhost_zerocopy_signal_used(struct vhost_net *net,
- struct vhost_virtqueue *vq)
+static void vhost_zerocopy_signal_used(struct vhost_net *net,
+  struct vhost_virtqueue *vq)
 {
struct vhost_net_virtqueue *nvq =
container_of(vq, struct vhost_net_virtqueue, vq);
@@ -297,7 +297,6 @@ static int vhost_zerocopy_signal_used(struct vhost_net *net,
}
if (j)
nvq->done_idx = i;
-   return j;
 }
 
 static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
-- 
1.7.1



[PATCH V2 6/6] vhost_net: correctly limit the max pending buffers

2013-08-29 Thread Jason Wang
As Michael pointed out, we used to limit the max pending DMAs to get better
cache utilization. But this was not done correctly, since the check was only
done when no new buffers were submitted from the guest. A guest can easily
exceed the limit by continuously sending packets.

So this patch moves the check into the main loop. Tests show about a 5%-10%
improvement in per-cpu throughput for guest tx, but a 5% drop in per-cpu
transaction rate for a single-session TCP_RR.
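
For reference, the wrap-around accounting that this patch removes from the
"nothing new" branch can be sketched as standalone C. The helper names
num_pending() and over_pend_limit() are illustrative only, and both constants
are redefined locally under the assumption that they match the kernel's values
(UIO_MAXIOV 1024, VHOST_MAX_PEND 128).

```c
#include <assert.h>
#include <stdbool.h>

#define UIO_MAXIOV     1024 /* assumed to match the kernel definition */
#define VHOST_MAX_PEND 128  /* assumed to match drivers/vhost/net.c */

/* Number of outstanding DMAs, handling upend_idx wrap-around. */
static unsigned int num_pending(unsigned int upend_idx, unsigned int done_idx)
{
	assert(upend_idx < UIO_MAXIOV && done_idx < UIO_MAXIOV);
	return upend_idx >= done_idx ?
		upend_idx - done_idx :
		upend_idx + UIO_MAXIOV - done_idx;
}

/* True when more DMAs are pending than the limit allows. */
static bool over_pend_limit(unsigned int upend_idx, unsigned int done_idx)
{
	return num_pending(upend_idx, done_idx) > VHOST_MAX_PEND;
}
```

The patch replaces this computation with a single modulo comparison evaluated
on every loop iteration, so the limit is no longer skipped while the guest
keeps submitting buffers.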

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c |   15 ---
 1 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index d09c17c..592e1f2 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -363,6 +363,10 @@ static void handle_tx(struct vhost_net *net)
if (zcopy)
vhost_zerocopy_signal_used(net, vq);
 
+   if ((nvq->upend_idx + vq->num - VHOST_MAX_PEND) % UIO_MAXIOV ==
+   nvq->done_idx)
+   break;
+
head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
 ARRAY_SIZE(vq->iov),
 &out, &in,
@@ -372,17 +376,6 @@ static void handle_tx(struct vhost_net *net)
break;
/* Nothing new?  Wait for eventfd to tell us they refilled. */
if (head == vq->num) {
-   int num_pends;
-
-   /* If more outstanding DMAs, queue the work.
-* Handle upend_idx wrap around
-*/
-   num_pends = likely(nvq->upend_idx >= nvq->done_idx) ?
-   (nvq->upend_idx - nvq->done_idx) :
-   (nvq->upend_idx + UIO_MAXIOV -
-nvq->done_idx);
-   if (unlikely(num_pends > VHOST_MAX_PEND))
-   break;
if (unlikely(vhost_enable_notify(&net->dev, vq))) {
vhost_disable_notify(&net->dev, vq);
continue;
-- 
1.7.1



[PATCH 4/4] ASoC: rsnd: use regmap instead of original register mapping method

2013-08-29 Thread Kuninori Morimoto
The current Linux kernel supports regmap/regmap_field,
which is a good match for the Renesas Sound Gen1/Gen2 register mapping.
This patch uses regmap instead of the original method for register access.

Signed-off-by: Kuninori Morimoto 
---
 sound/soc/sh/rcar/core.c |   45 -
 sound/soc/sh/rcar/gen.c  |  233 ++
 2 files changed, 151 insertions(+), 127 deletions(-)

diff --git a/sound/soc/sh/rcar/core.c b/sound/soc/sh/rcar/core.c
index a357060..fc83f0f 100644
--- a/sound/soc/sh/rcar/core.c
+++ b/sound/soc/sh/rcar/core.c
@@ -106,51 +106,6 @@
(!(priv->info->func) ? -ENODEV :\
 priv->info->func(param))
 
-
-/*
- * basic function
- */
-u32 rsnd_read(struct rsnd_priv *priv,
- struct rsnd_mod *mod, enum rsnd_reg reg)
-{
-   void __iomem *base = rsnd_gen_reg_get(priv, mod, reg);
-
-   BUG_ON(!base);
-
-   return ioread32(base);
-}
-
-void rsnd_write(struct rsnd_priv *priv,
-   struct rsnd_mod *mod,
-   enum rsnd_reg reg, u32 data)
-{
-   void __iomem *base = rsnd_gen_reg_get(priv, mod, reg);
-   struct device *dev = rsnd_priv_to_dev(priv);
-
-   BUG_ON(!base);
-
-   dev_dbg(dev, "w %p : %08x\n", base, data);
-
-   iowrite32(data, base);
-}
-
-void rsnd_bset(struct rsnd_priv *priv, struct rsnd_mod *mod,
-  enum rsnd_reg reg, u32 mask, u32 data)
-{
-   void __iomem *base = rsnd_gen_reg_get(priv, mod, reg);
-   struct device *dev = rsnd_priv_to_dev(priv);
-   u32 val;
-
-   BUG_ON(!base);
-
-   val = ioread32(base);
-   val &= ~mask;
-   val |= data & mask;
-   iowrite32(val, base);
-
-   dev_dbg(dev, "s %p : %08x\n", base, val);
-}
-
 /*
  * rsnd_mod functions
  */
diff --git a/sound/soc/sh/rcar/gen.c b/sound/soc/sh/rcar/gen.c
index 331fc55..b245584 100644
--- a/sound/soc/sh/rcar/gen.c
+++ b/sound/soc/sh/rcar/gen.c
@@ -24,21 +24,112 @@ struct rsnd_gen_ops {
 struct rsnd_dai_stream *io);
 };
 
-struct rsnd_gen_reg_map {
-   int index;  /* -1 : not supported */
-   u32 offset_id;  /* offset of ssi0, ssi1, ssi2... */
-   u32 offset_adr; /* offset of SSICR, SSISR, ... */
-};
-
 struct rsnd_gen {
void __iomem *base[RSND_BASE_MAX];
 
-   struct rsnd_gen_reg_map reg_map[RSND_REG_MAX];
struct rsnd_gen_ops *ops;
+
+   struct regmap *regmap;
+   struct regmap_field *regs[RSND_REG_MAX];
 };
 
 #define rsnd_priv_to_gen(p)((struct rsnd_gen *)(p)->gen)
 
+#define RSND_REG_SET(gen, id, reg_id, offset, _id_offset, _id_size)\
+   [id] = {\
+   .reg = (unsigned int)gen->base[reg_id] + offset,\
+   .lsb = 0,   \
+   .msb = 31,  \
+   .id_size = _id_size,\
+   .id_offset = _id_offset,\
+   }
+
+#define RSND_SINGLE_REG(gen, reg, id, offset)  \
+   RSND_REG_SET(gen, RSND_REG_##id, RSND_GEN1_##reg, offset, 0, 0)
+
+#define RSND_MULTI_REG(gen, reg, id, offset, _id_offset)   \
+   RSND_REG_SET(gen, RSND_REG_##id, RSND_GEN1_##reg, offset, _id_offset, 9)
+
+/*
+ * basic function
+ */
+static int rsnd_regmap_write32(void *context, const void *_data, size_t count)
+{
+   struct rsnd_priv *priv = context;
+   struct device *dev = rsnd_priv_to_dev(priv);
+   u32 *data = (u32 *)_data;
+   u32 val = data[1];
+   void __iomem *reg = (void *)data[0];
+
+   iowrite32(val, reg);
+
+   dev_dbg(dev, "w %p : %08x\n", reg, val);
+
+   return 0;
+}
+
+static int rsnd_regmap_read32(void *context,
+ const void *_data, size_t reg_size,
+ void *_val, size_t val_size)
+{
+   struct rsnd_priv *priv = context;
+   struct device *dev = rsnd_priv_to_dev(priv);
+   u32 *data = (u32 *)_data;
+   u32 *val = (u32 *)_val;
+   void __iomem *reg = (void *)data[0];
+
+   *val = ioread32(reg);
+
+   dev_dbg(dev, "r %p : %08x\n", reg, *val);
+
+   return 0;
+}
+
+static struct regmap_bus rsnd_regmap_bus = {
+   .write  = rsnd_regmap_write32,
+   .read   = rsnd_regmap_read32,
+   .reg_format_endian_default  = REGMAP_ENDIAN_NATIVE,
+   .val_format_endian_default  = REGMAP_ENDIAN_NATIVE,
+};
+
+u32 rsnd_read(struct rsnd_priv *priv,
+ struct rsnd_mod *mod, enum rsnd_reg reg)
+{
+   struct rsnd_gen *gen = rsnd_priv_to_gen(priv);
+   u32 val;
+
+   if (regmap_fields_enable(gen->regs[reg]))
+   regmap_fields_read(gen->regs[reg], rsnd_mod_id(mod), &val);
+   else
+   regmap_field_read(gen->regs[reg], &val);
+
+   return val;
+}
+
+void 

[PATCH 3/4] ASoC: rsnd: gen: rsnd_gen_ops cares .probe and .remove

2013-08-29 Thread Kuninori Morimoto
Until now, rsnd_gen_ops did not cover the .probe and .remove
functions, which was not a sensible design.
This patch tidies that up.

Signed-off-by: Kuninori Morimoto 
---
 sound/soc/sh/rcar/gen.c |   41 -
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/sound/soc/sh/rcar/gen.c b/sound/soc/sh/rcar/gen.c
index babb203..331fc55 100644
--- a/sound/soc/sh/rcar/gen.c
+++ b/sound/soc/sh/rcar/gen.c
@@ -11,6 +11,11 @@
 #include "rsnd.h"
 
 struct rsnd_gen_ops {
+   int (*probe)(struct platform_device *pdev,
+struct rcar_snd_info *info,
+struct rsnd_priv *priv);
+   void (*remove)(struct platform_device *pdev,
+ struct rsnd_priv *priv);
int (*path_init)(struct rsnd_priv *priv,
 struct rsnd_dai *rdai,
 struct rsnd_dai_stream *io);
@@ -98,11 +103,6 @@ static int rsnd_gen1_path_exit(struct rsnd_priv *priv,
return ret;
 }
 
-static struct rsnd_gen_ops rsnd_gen1_ops = {
-   .path_init  = rsnd_gen1_path_init,
-   .path_exit  = rsnd_gen1_path_exit,
-};
-
 #define RSND_GEN1_REG_MAP(g, s, i, oi, oa) \
do {\
(g)->reg_map[RSND_REG_##i].index  = RSND_GEN1_##s;  \
@@ -163,7 +163,6 @@ static int rsnd_gen1_probe(struct platform_device *pdev,
IS_ERR(gen->base[RSND_GEN1_SSI]))
return -ENODEV;
 
-   gen->ops = &rsnd_gen1_ops;
rsnd_gen1_reg_map_init(gen);
 
dev_dbg(dev, "Gen1 device probed\n");
@@ -183,6 +182,13 @@ static void rsnd_gen1_remove(struct platform_device *pdev,
 {
 }
 
+static struct rsnd_gen_ops rsnd_gen1_ops = {
+   .probe  = rsnd_gen1_probe,
+   .remove = rsnd_gen1_remove,
+   .path_init  = rsnd_gen1_path_init,
+   .path_exit  = rsnd_gen1_path_exit,
+};
+
 /*
  * Gen
  */
@@ -251,6 +257,14 @@ int rsnd_gen_probe(struct platform_device *pdev,
return -ENOMEM;
}
 
+   if (rsnd_is_gen1(priv))
+   gen->ops = &rsnd_gen1_ops;
+
+   if (!gen->ops) {
+   dev_err(dev, "unknown generation R-Car sound device\n");
+   return -ENODEV;
+   }
+
priv->gen = gen;
 
/*
@@ -261,20 +275,13 @@ int rsnd_gen_probe(struct platform_device *pdev,
for (i = 0; i < RSND_REG_MAX; i++)
gen->reg_map[i].index = -1;
 
-   /*
-*  init each module
-*/
-   if (rsnd_is_gen1(priv))
-   return rsnd_gen1_probe(pdev, info, priv);
-
-   dev_err(dev, "unknown generation R-Car sound device\n");
-
-   return -ENODEV;
+   return gen->ops->probe(pdev, info, priv);
 }
 
 void rsnd_gen_remove(struct platform_device *pdev,
 struct rsnd_priv *priv)
 {
-   if (rsnd_is_gen1(priv))
-   rsnd_gen1_remove(pdev, priv);
+   struct rsnd_gen *gen = rsnd_priv_to_gen(priv);
+
+   gen->ops->remove(pdev, priv);
 }
-- 
1.7.9.5



[PATCH 2/4] regmap: Add regmap_fields APIs

2013-08-29 Thread Kuninori Morimoto
The current Linux kernel supports the regmap_field method,
which is a very useful feature.
It needs one regmap_field per register access.

OTOH, there are multi-port devices on the market which
have many identical registers.
The only difference between the register accesses of each port is
the address offset.

The current API needs many regmap_field instances for such a device,
which is not good.
This patch adds a new regmap_fields API which can handle
multi-port/offset access via regmap.
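
The per-port addressing described above can be sketched in plain C, outside
the regmap core. The struct port_field and the helper port_field_addr() are
hypothetical names used only for this illustration. They mirror the id_size
and id_offset members that the patch adds to struct regmap_field, together
with the bounds check used by the new accessors.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the fields this patch adds to struct regmap_field. */
struct port_field {
	unsigned int reg;       /* register address for port 0 */
	unsigned int id_size;   /* number of ports sharing this field */
	unsigned int id_offset; /* address stride between two ports */
};

/* Compute the register address for a given port ID, or -EINVAL if out of range. */
static int port_field_addr(const struct port_field *f, unsigned int id,
			   unsigned int *addr)
{
	assert(f && addr);

	if (id >= f->id_size)
		return -EINVAL;

	*addr = f->reg + f->id_offset * id;
	return 0;
}
```

With reg = 0x40, id_offset = 0x40 and id_size = 2, port 0 resolves to 0x40 and
port 1 to 0x80, while port 2 is rejected.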

Signed-off-by: Kuninori Morimoto 
---
 drivers/base/regmap/internal.h |3 ++
 drivers/base/regmap/regmap.c   |   97 
 include/linux/regmap.h |   12 +
 3 files changed, 112 insertions(+)

diff --git a/drivers/base/regmap/internal.h b/drivers/base/regmap/internal.h
index 29c8316..ba22cc7 100644
--- a/drivers/base/regmap/internal.h
+++ b/drivers/base/regmap/internal.h
@@ -182,6 +182,9 @@ struct regmap_field {
/* lsb */
unsigned int shift;
unsigned int reg;
+
+   unsigned int id_size;
+   unsigned int id_offset;
 };
 
 #ifdef CONFIG_DEBUG_FS
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 7ae90d8..677e559 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -815,6 +815,8 @@ static void regmap_field_init(struct regmap_field *rm_field,
rm_field->reg = reg_field.reg;
rm_field->shift = reg_field.lsb;
rm_field->mask = ((BIT(field_bits) - 1) << reg_field.lsb);
+   rm_field->id_size = reg_field.id_size;
+   rm_field->id_offset = reg_field.id_offset;
 }
 
 /**
@@ -1380,6 +1382,68 @@ int regmap_field_update_bits(struct regmap_field *field, unsigned int mask, unsi
 }
 EXPORT_SYMBOL_GPL(regmap_field_write);
 
+/**
+ * regmap_fields_write(): Write a value to a single register field with port ID
+ *
+ * @field: Register field to write to
+ * @id: port ID
+ * @val: Value to be written
+ *
+ * A value of zero will be returned on success, a negative errno will
+ * be returned in error cases.
+ */
+int regmap_fields_write(struct regmap_field *field, unsigned int id,
+   unsigned int val)
+{
+   if (id >= field->id_size)
+   return -EINVAL;
+
+   return regmap_update_bits(field->regmap,
+ field->reg + (field->id_offset * id),
+ field->mask, val << field->shift);
+}
+EXPORT_SYMBOL_GPL(regmap_field_write);
+
+/**
+ * regmap_fields_update_bits():Perform a read/modify/write cycle
+ *  on the register field
+ *
+ * @field: Register field to write to
+ * @id: port ID
+ * @mask: Bitmask to change
+ * @val: Value to be written
+ *
+ * A value of zero will be returned on success, a negative errno will
+ * be returned in error cases.
+ */
+int regmap_fields_update_bits(struct regmap_field *field,  unsigned int id,
+ unsigned int mask, unsigned int val)
+{
+   if (id >= field->id_size)
+   return -EINVAL;
+
+   mask = (mask << field->shift) & field->mask;
+
+   return regmap_update_bits(field->regmap,
+ field->reg + (field->id_offset * id),
+ mask, val << field->shift);
+}
+EXPORT_SYMBOL_GPL(regmap_field_write);
+
+/**
+ * regmap_fields_enable(): query fields access
+ *
+ * @field: Query Register field
+ *
+ * A non-zero will be returned when fields access enable,
+ * a zero will be returned in single field.
+ */
+int regmap_fields_enable(struct regmap_field *field)
+{
+   return field->id_size && field->id_offset;
+}
+EXPORT_SYMBOL_GPL(regmap_fields_enable);
+
 /*
  * regmap_bulk_write(): Write multiple registers to the device
  *
@@ -1688,6 +1752,39 @@ int regmap_field_read(struct regmap_field *field, unsigned int *val)
 EXPORT_SYMBOL_GPL(regmap_field_read);
 
 /**
+ * regmap_fields_read(): Read a value to a single register field with port ID
+ *
+ * @field: Register field to read from
+ * @id: port ID
+ * @val: Pointer to store read value
+ *
+ * A value of zero will be returned on success, a negative errno will
+ * be returned in error cases.
+ */
+int regmap_fields_read(struct regmap_field *field, unsigned int id,
+  unsigned int *val)
+{
+   int ret;
+   unsigned int reg_val;
+
+   if (id >= field->id_size)
+   return -EINVAL;
+
+   ret = regmap_read(field->regmap,
+ field->reg + (field->id_offset * id),
+ &reg_val);
+   if (ret != 0)
+   return ret;
+
+   reg_val &= field->mask;
+   reg_val >>= field->shift;
+   *val = reg_val;
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(regmap_field_read);
+
+/**
  * regmap_bulk_read(): Read multiple registers from the device
  *
  * @map: Register map to write to
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index 1425c91..d329f50 100644
--- a/include/linux/regmap.h
+++ 

[PATCH 1/4] regmap: add regmap_field_update_bits()

2013-08-29 Thread Kuninori Morimoto
regmap_field currently supports read and write functions.
This patch adds a new update_bits function for it.
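
As a rough illustration of the read/modify/write cycle, the following
standalone C sketch applies the same mask and shift arithmetic to a plain
32-bit value instead of a regmap. The struct field_desc and the function
field_update_bits() are hypothetical names, not kernel APIs. The assumption is
that the underlying update keeps all bits outside the effective mask
unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field description: mask is already shifted into position. */
struct field_desc {
	unsigned int shift; /* position of the field's least significant bit */
	uint32_t mask;      /* bits of the register that belong to the field */
};

/* Return reg_val with the selected bits of the field replaced by val. */
static uint32_t field_update_bits(uint32_t reg_val, const struct field_desc *f,
				  uint32_t mask, uint32_t val)
{
	uint32_t m;

	assert(f && f->shift < 32);

	/* Restrict the caller's mask to the bits that belong to the field. */
	m = (mask << f->shift) & f->mask;

	return (reg_val & ~m) | ((val << f->shift) & m);
}
```

For a field occupying bits 4 to 7, updating the low two field bits of 0xAB to
the value 1 yields 0x9B; the other bits are left alone.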

Signed-off-by: Kuninori Morimoto 
---
 drivers/base/regmap/regmap.c |   20 
 include/linux/regmap.h   |2 ++
 2 files changed, 22 insertions(+)

diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index e0d0c7d..7ae90d8 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -1360,6 +1360,26 @@ int regmap_field_write(struct regmap_field *field, unsigned int val)
 }
 EXPORT_SYMBOL_GPL(regmap_field_write);
 
+/**
+ * regmap_field_update_bits(): Perform a read/modify/write cycle
+ *  on the register field
+ *
+ * @field: Register field to write to
+ * @mask: Bitmask to change
+ * @val: Value to be written
+ *
+ * A value of zero will be returned on success, a negative errno will
+ * be returned in error cases.
+ */
+int regmap_field_update_bits(struct regmap_field *field, unsigned int mask, unsigned int val)
+{
+   mask = (mask << field->shift) & field->mask;
+
+   return regmap_update_bits(field->regmap, field->reg,
+ mask, val << field->shift);
+}
+EXPORT_SYMBOL_GPL(regmap_field_write);
+
 /*
  * regmap_bulk_write(): Write multiple registers to the device
  *
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index 75981d0..1425c91 100644
--- a/include/linux/regmap.h
+++ b/include/linux/regmap.h
@@ -446,6 +446,8 @@ void devm_regmap_field_free(struct device *dev, struct regmap_field *field);
 
 int regmap_field_read(struct regmap_field *field, unsigned int *val);
 int regmap_field_write(struct regmap_field *field, unsigned int val);
+int regmap_field_update_bits(struct regmap_field *field,
+unsigned int mask, unsigned int val);
 
 /**
  * Description of an IRQ for the generic regmap irq_chip.
-- 
1.7.9.5



[PATCH 0/4] add regmap_filelds and use it on Renesas Sound driver

2013-08-29 Thread Kuninori Morimoto

Hi Mark

These patches add a new regmap_fields API (note the plural) to the kernel
and use it in the Renesas sound driver instead of the original method.
It can handle multi-port register offsets via regmap.

0x + 0x40   -- port 0 --
    regX
    regY
    regZ
0x + 0x80   -- port 1 --
    regX
    regY
    regZ

In this case, the current API needs 2 (= ports) x 3 (= regX/Y/Z) regmap_field
instances, but the new API can handle all ports via 3 regmap_fields with a
port ID.

I'm not sure whether regmap_fields is a good name or not.
Please let me know if you have a better naming idea.

These are based on the branches below:
 regmap/for-next + asoc/for-next

Kuninori Morimoto (4):
  regmap: add regmap_field_update_bits()
  regmap: Add regmap_fields APIs
  ASoC: rsnd: gen: rsnd_gen_ops cares .probe and .remove
  ASoC: rsnd: use regmap instead of original register mapping method

 drivers/base/regmap/internal.h |3 +
 drivers/base/regmap/regmap.c   |  117 +
 include/linux/regmap.h |   14 +++
 sound/soc/sh/rcar/core.c   |   45 ---
 sound/soc/sh/rcar/gen.c|  270 +---
 5 files changed, 307 insertions(+), 142 deletions(-)


Best regards
---
Kuninori Morimoto


[PATCH 2/6] extcon-gpio: If the gpio driver/chip supports debounce, use it

2013-08-29 Thread Guenter Roeck
Signed-off-by: Guenter Roeck 
---
 drivers/extcon/extcon-gpio.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/extcon/extcon-gpio.c b/drivers/extcon/extcon-gpio.c
index 77d35a7..973600e 100644
--- a/drivers/extcon/extcon-gpio.c
+++ b/drivers/extcon/extcon-gpio.c
@@ -111,6 +111,11 @@ static int gpio_extcon_probe(struct platform_device *pdev)
if (ret < 0)
goto err;
 
+   /* Use gpio debounce if available. If so, don't debounce in software. */
+   if (pdata->debounce &&
+   !gpio_set_debounce(extcon_data->gpio, pdata->debounce * 1000))
+   extcon_data->debounce_jiffies = 0;
+
INIT_DELAYED_WORK(&extcon_data->work, gpio_extcon_work);
 
extcon_data->irq = gpio_to_irq(extcon_data->gpio);
-- 
1.7.9.7



[PATCH] h8300: include: asm: Kbuild: add default "serial.h"

2013-08-29 Thread Chen Gang
Normally, an architecture needs to support serial, so include the default
one; otherwise compilation fails:

The related error (allmodconfig for h8300):

  In file included from drivers/staging/speakup/speakup_acntpc.c:33:0:
  drivers/staging/speakup/serialio.h:7:24: fatal error: asm/serial.h: No such file or directory


Signed-off-by: Chen Gang 
---
 arch/h8300/include/asm/Kbuild |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/h8300/include/asm/Kbuild b/arch/h8300/include/asm/Kbuild
index 8ada3cf..df289fb 100644
--- a/arch/h8300/include/asm/Kbuild
+++ b/arch/h8300/include/asm/Kbuild
@@ -4,5 +4,6 @@ generic-y += exec.h
 generic-y += linkage.h
 generic-y += mmu.h
 generic-y += module.h
+generic-y += serial.h
 generic-y += trace_clock.h
 generic-y += xor.h
-- 
1.7.7.6


[PATCH 3/6] extcon-gpio: Add support for active-low presence detect pins

2013-08-29 Thread Guenter Roeck
Signed-off-by: Guenter Roeck 
---
 drivers/extcon/extcon-gpio.c   |4 +++-
 include/linux/extcon/extcon-gpio.h |1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/extcon/extcon-gpio.c b/drivers/extcon/extcon-gpio.c
index 973600e..d4e3c89 100644
--- a/drivers/extcon/extcon-gpio.c
+++ b/drivers/extcon/extcon-gpio.c
@@ -34,6 +34,7 @@
 struct gpio_extcon_data {
struct extcon_dev edev;
unsigned gpio;
+   bool gpio_active_low;
const char *state_on;
const char *state_off;
int irq;
@@ -48,7 +49,7 @@ static void gpio_extcon_work(struct work_struct *work)
container_of(to_delayed_work(work), struct gpio_extcon_data,
 work);
 
-   state = gpio_get_value(data->gpio);
+   state = gpio_get_value(data->gpio) ^ data->gpio_active_low;
extcon_set_state(&data->edev, state);
 }
 
@@ -96,6 +97,7 @@ static int gpio_extcon_probe(struct platform_device *pdev)
 
extcon_data->edev.name = pdata->name;
extcon_data->gpio = pdata->gpio;
+   extcon_data->gpio_active_low = pdata->gpio_active_low;
extcon_data->state_on = pdata->state_on;
extcon_data->state_off = pdata->state_off;
if (pdata->state_on && pdata->state_off)
diff --git a/include/linux/extcon/extcon-gpio.h b/include/linux/extcon/extcon-gpio.h
index 2d8307f..1613899 100644
--- a/include/linux/extcon/extcon-gpio.h
+++ b/include/linux/extcon/extcon-gpio.h
@@ -41,6 +41,7 @@
 struct gpio_extcon_platform_data {
const char *name;
unsigned gpio;
+   bool gpio_active_low;
unsigned long debounce;
unsigned long irq_flags;
 
-- 
1.7.9.7



[RFC PATCH 4/6] extcon-gpio: Add devicetree support

2013-08-29 Thread Guenter Roeck
Signed-off-by: Guenter Roeck 
---
 drivers/extcon/extcon-gpio.c |   59 --
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/drivers/extcon/extcon-gpio.c b/drivers/extcon/extcon-gpio.c
index d4e3c89..16951fe 100644
--- a/drivers/extcon/extcon-gpio.c
+++ b/drivers/extcon/extcon-gpio.c
@@ -30,6 +30,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 struct gpio_extcon_data {
struct extcon_dev edev;
@@ -77,14 +79,66 @@ static ssize_t extcon_gpio_print_state(struct extcon_dev *edev, char *buf)
return -EINVAL;
 }
 
+#ifdef CONFIG_OF_GPIO
+
+static struct gpio_extcon_platform_data *
+extcon_gpio_config_of(struct device *dev)
+{
+   struct gpio_extcon_platform_data *pdata;
+   struct device_node *np = dev->of_node;
+   enum of_gpio_flags flags;
+   int gpio, ret;
+   u32 debounce;
+
+   gpio = of_get_named_gpio_flags(np, "presence-detect-gpios", 0, &flags);
+   if (gpio < 0)
+   return ERR_PTR(gpio);
+
+   pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
+   if (!pdata)
+   return ERR_PTR(-ENOMEM);
+
+   pdata->gpio = gpio;
+   pdata->gpio_active_low = flags & OF_GPIO_ACTIVE_LOW;
+   pdata->irq_flags = IRQ_TYPE_EDGE_BOTH;
+
+   if (!of_property_read_u32(np, "debounce-interval", &debounce))
+   pdata->debounce = debounce;
+
+   ret = of_property_read_string(np, "name", &pdata->name);
+   if (ret < 0)
+   return ERR_PTR(ret);
+
+   of_property_read_string(np, "state-on", &pdata->state_on);
+   of_property_read_string(np, "state-off", &pdata->state_off);
+
+   return pdata;
+}
+
+static const struct of_device_id of_gpio_extcon_match[] = {
+   { .compatible = "gpio-connector", },
+   {},
+};
+#else /* CONFIG_OF_GPIO */
+static struct gpio_extcon_platform_data *
+extcon_gpio_config_of(struct device *pdev)
+{
+   return ERR_PTR(-ENODEV);
+}
+#endif /* CONFIG_OF_GPIO */
+
 static int gpio_extcon_probe(struct platform_device *pdev)
 {
struct gpio_extcon_platform_data *pdata = pdev->dev.platform_data;
struct gpio_extcon_data *extcon_data;
int ret;
 
-   if (!pdata)
-   return -EBUSY;
+   if (!pdata) {
+   pdata = extcon_gpio_config_of(&pdev->dev);
+   if (IS_ERR(pdata))
+   return PTR_ERR(pdata);
+   }
+
if (!pdata->irq_flags) {
dev_err(&pdev->dev, "IRQ flag is not specified.\n");
return -EINVAL;
@@ -161,6 +215,7 @@ static struct platform_driver gpio_extcon_driver = {
.driver = {
.name   = "extcon-gpio",
.owner  = THIS_MODULE,
+   .of_match_table = of_match_ptr(of_gpio_extcon_match),
},
 };
 
-- 
1.7.9.7



[RFC PATCH 6/6] extcon-gpio: Describe possible properties to support multi-type cables

2013-08-29 Thread Guenter Roeck
This is purely a possible description and an RFC; there is no code (yet).

Signed-off-by: Guenter Roeck 
---
 .../devicetree/bindings/extcon/extcon-gpio |   26 
 1 file changed, 26 insertions(+)

diff --git a/Documentation/devicetree/bindings/extcon/extcon-gpio b/Documentation/devicetree/bindings/extcon/extcon-gpio
index 091ddc6..5836ac2 100644
--- a/Documentation/devicetree/bindings/extcon/extcon-gpio
+++ b/Documentation/devicetree/bindings/extcon/extcon-gpio
@@ -21,3 +21,29 @@ Example node:
state-on = "connected";
state-on = "disconnected";
};
+
+---
+TBD: Add support for multiple connectors
+
+An example node with multiple connectors might look as follows.
+
+   some-connector {
+   #size-cells = <1>;
+   compatible = "gpio-connector";
+   presence-detect-gpios = < 7 1>;
+   id-gpios = < 8 0>;
+   debounce-interval = <1>;
+   state-on = "connected";
+   state-on = "disconnected";
+
+   USB {
+   reg = <0>;
+   };
+   USB-Host {
+   reg = <1>;
+   };
+   };
+
+This describes a cable with a (low-active) presence detect pin and an ID pin.
+If the value returned by the ID pin is 0, the connected cable type is "USB".
+If the value is 1, the connected cable type is "USB-Host".
-- 
1.7.9.7



[RFC PATCH 5/6] extcon-gpio: Describe devicetree bindings

2013-08-29 Thread Guenter Roeck
Signed-off-by: Guenter Roeck 
---
 .../devicetree/bindings/extcon/extcon-gpio |   23 
 1 file changed, 23 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/extcon/extcon-gpio

diff --git a/Documentation/devicetree/bindings/extcon/extcon-gpio b/Documentation/devicetree/bindings/extcon/extcon-gpio
new file mode 100644
index 000..091ddc6
--- /dev/null
+++ b/Documentation/devicetree/bindings/extcon/extcon-gpio
@@ -0,0 +1,23 @@
+Device-Tree bindings for extcon/extcon-gpio driver
+
+Required properties:
+   - compatible = "gpio-connector";
+   - presence-detect-gpios - presence detect gpio pin
+
+Optional properties:
+   - debounce-interval - debounce interval in milli-seconds
+   - state-on - on (connected) state
+   - state-off - off (disconnected) state
+ Depending on the type of connector or cable, states may
+ for example be reported as "connected"/"disconnected"
+ or "inserted"/"removed".
+
+Example node:
+
+   some-connector {
+   compatible = "gpio-connector";
+   presence-detect-gpios = < 7 1>;
+   debounce-interval = <1>;
+   state-on = "connected";
+   state-on = "disconnected";
+   };
-- 
1.7.9.7



Re: [Query] CPUFreq: Does these machines have separate clock domains for CPUs?

2013-08-29 Thread Mikael Starvik
CRIS machines are also UP (uniprocessor).

On 29 Aug 2013, at 12:16, "Viresh Kumar" wrote:

> Hi,
> 
> I have been doing some CPUFreq cleanup work and
> wanted to know if the below mentioned machines have separate
> clock domains for their CPUs or all share the same domain?
> 
> So, that we can use some generic routines for these drivers which
> would eventually do:
> 
> cpumask_setall(policy->cpus);
> 
> And I wanted to make sure that this doesn't break them.. :)
> 
> ..
> 
> The drivers are:
> 
> drivers/cpufreq/at32ap-cpufreq.c
> drivers/cpufreq/blackfin-cpufreq.c
> drivers/cpufreq/cris-artpec3-cpufreq.c
> drivers/cpufreq/cris-etraxfs-cpufreq.c
> drivers/cpufreq/davinci-cpufreq.c
> drivers/cpufreq/e_powersaver.c
> drivers/cpufreq/elanfreq.c
> drivers/cpufreq/longhaul.c
> drivers/cpufreq/loongson2_cpufreq.c
> drivers/cpufreq/pmac32-cpufreq.c
> drivers/cpufreq/powernow-k6.c
> drivers/cpufreq/powernow-k7.c
> drivers/cpufreq/pxa2xx-cpufreq.c
> drivers/cpufreq/pxa3xx-cpufreq.c
> drivers/cpufreq/sc520_freq.c
> drivers/cpufreq/sh-cpufreq.c
> drivers/cpufreq/sparc-us2e-cpufreq.c
> drivers/cpufreq/sparc-us3-cpufreq.c
> 
> --
> Viresh


[PATCH 1/6] extcon-gpio: Do not unnecessarily initialize variables

2013-08-29 Thread Guenter Roeck
Signed-off-by: Guenter Roeck 
---
 drivers/extcon/extcon-gpio.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/extcon/extcon-gpio.c b/drivers/extcon/extcon-gpio.c
index 02bec32..77d35a7 100644
--- a/drivers/extcon/extcon-gpio.c
+++ b/drivers/extcon/extcon-gpio.c
@@ -80,7 +80,7 @@ static int gpio_extcon_probe(struct platform_device *pdev)
 {
struct gpio_extcon_platform_data *pdata = pdev->dev.platform_data;
struct gpio_extcon_data *extcon_data;
-   int ret = 0;
+   int ret;
 
if (!pdata)
return -EBUSY;
-- 
1.7.9.7

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/6] extcon-gpio: Add devicetree support

2013-08-29 Thread Guenter Roeck
The first three patches of this series are either cleanup or add functionality
unrelated to devicetree support and should hopefully be acceptable as-is or with
minor modifications.

Patch 1 of this series is a tiny cleanup patch.
Patch 2 adds support for hardware debounce. If the gpio chip supports debounce,
use it instead of software debounce.
Patch 3 adds support for low-active presence detect signals.
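
The behaviour added by patches 2 and 3 can be sketched in standalone C as
follows. Both helpers, presence_state() and sw_debounce_ms(), are hypothetical
names for this illustration; hw_ret stands in for the return value of the
gpio chip's debounce setup call, where 0 is assumed to mean that hardware
debounce was accepted.

```c
#include <assert.h>
#include <stdbool.h>

/* Logical "cable present" state from the raw pin level and its polarity. */
static bool presence_state(int raw_level, bool active_low)
{
	assert(raw_level == 0 || raw_level == 1);
	return (raw_level != 0) ^ active_low;
}

/*
 * Software debounce interval to use: 0 when a debounce interval was
 * requested and the hardware accepted it (hw_ret == 0), otherwise the
 * requested interval is kept for software debouncing.
 */
static unsigned long sw_debounce_ms(unsigned long debounce_ms, int hw_ret)
{
	if (debounce_ms && hw_ret == 0)
		return 0;
	return debounce_ms;
}
```

An active-low pin that reads 0 therefore reports "present", and a failing
hardware debounce request leaves the software debounce interval untouched.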

Patch 4 and 5 add devicetree support. Patch 4 is the actual code and patch 5
describes devicetree bindings. Both could possibly be merged into a single
patch.

Patch 6 adds possible properties for connectors supporting multiple cable
types to the devicetree bindings document. This is purely for discussion
and would be implemented in a separate patch. While I don't need this code
for our application, I would be happy to provide it if the community sees
value in it.


Re: [PATCH] Kconfig.debug: Add FRAME_POINTER anti-dependency for ARC

2013-08-29 Thread Vineet Gupta
On 08/29/2013 08:48 PM, Dave Hansen wrote:
>
> I assume you're sending this my way since getmaintainer.pl has me tagged
> I moved a bunch of code in there. :)

Indeed :-)

> The Kconfig.debug stuff has no real maintainer.  It would probably be OK
> if you just stick this in your architecture's next git pull request.

Will do, thx.

> Although, it would be nice if someone, at some point, actually
> abstracted that out so that the individual architectures could take care
> of this without editing the main files:
>
> # Kconfig.debug:
>
> config ARCH_FRAME_POINTER_UNAVAILABLE
>   def_bool y

def_bool n

> ...
> select FRAME_POINTER if !ARCH_FRAME_POINTER_UNAVAILABLE
>
> # arch/.../Kconfig
>
> select ARCH_FRAME_POINTER_UNAVAILABLE

Sure, I was thinking the same. I'll send out a patchset soon, although next week
is gonna be tight due to upcoming merge window.

-Vineet


[PATCH V2 2/6] powerpc, perf: Enable conditional branch filter for POWER8

2013-08-29 Thread Anshuman Khandual
Enables conditional branch filter support for POWER8
utilizing MMCRA register based filter and also invalidates
a BHRB branch filter combination involving conditional
branches.

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/perf/power8-pmu.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 2ee4a70..6e28587 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -580,11 +580,21 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
return -1;
 
+   /* Invalid branch filter combination - HW does not support */
+   if ((branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) &&
+   (branch_sample_type & PERF_SAMPLE_BRANCH_COND))
+   return -1;
+
if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
return pmu_bhrb_filter;
}
 
+   if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
+   pmu_bhrb_filter |= POWER8_MMCRA_IFM3;
+   return pmu_bhrb_filter;
+   }
+
/* Every thing else is unsupported */
return -1;
 }
-- 
1.7.11.7
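
The decision order visible in the hunk above can be sketched as a standalone function. This is a minimal sketch, not the kernel code: the SKETCH_* flag values and the IFM encodings are illustrative assumptions, and only the checks shown in the diff are modelled.

```c
#include <stdint.h>

/* Stand-in flag bits; names follow the perf ABI but values are illustrative. */
#define SKETCH_BRANCH_ANY_CALL	(1ull << 4)
#define SKETCH_BRANCH_IND_CALL	(1ull << 6)
#define SKETCH_BRANCH_COND	(1ull << 10)

/* Hypothetical MMCRA filter encodings, not taken from the POWER8 manual. */
#define SKETCH_IFM1		(1ull << 30)
#define SKETCH_IFM3		(3ull << 30)

/* Plays the role of "return -1" in a function returning u64. */
#define SKETCH_FILTER_INVALID	UINT64_MAX

uint64_t sketch_bhrb_filter_map(uint64_t branch_sample_type)
{
	/* Indirect calls are rejected before the hunk shown in the patch. */
	if (branch_sample_type & SKETCH_BRANCH_IND_CALL)
		return SKETCH_FILTER_INVALID;

	/* The combination any_call + cond cannot be filtered in hardware. */
	if ((branch_sample_type & SKETCH_BRANCH_ANY_CALL) &&
	    (branch_sample_type & SKETCH_BRANCH_COND))
		return SKETCH_FILTER_INVALID;

	if (branch_sample_type & SKETCH_BRANCH_ANY_CALL)
		return SKETCH_IFM1;

	if (branch_sample_type & SKETCH_BRANCH_COND)
		return SKETCH_IFM3;

	/* Everything else is unsupported. */
	return SKETCH_FILTER_INVALID;
}
```

The combination check has to come before the single-filter checks, otherwise an any_call + cond request would be accepted as plain any_call.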



[PATCH V2 3/6] perf, tool: Conditional branch filter 'cond' added to perf record

2013-08-29 Thread Anshuman Khandual
Adding perf record support for new branch stack filter criteria
PERF_SAMPLE_BRANCH_COND.

Signed-off-by: Anshuman Khandual 
---
 tools/perf/builtin-record.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index ecca62e..802d11d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -625,6 +625,7 @@ static const struct branch_mode branch_modes[] = {
BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+   BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
BRANCH_END
 };
 
-- 
1.7.11.7
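
As a rough illustration of how a name table like branch_modes[] turns a comma-separated filter string into a bit mask, here is a standalone sketch. The struct, the table and sketch_parse_branch_filter() are hypothetical stand-ins written for this note; they are not the parser in builtin-record.c, and the bit values are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_branch_mode {
	const char *name;
	uint64_t mode;
};

/* Illustrative table; the last entry plays the role of BRANCH_END. */
static const struct sketch_branch_mode sketch_modes[] = {
	{ "any_call", 1ull << 4 },
	{ "any_ret",  1ull << 5 },
	{ "ind_call", 1ull << 6 },
	{ "cond",     1ull << 10 },
	{ NULL, 0 }
};

/* Parse "a,b,c" into an OR of mode bits.
 * Returns 0 on success, -1 if any name is unknown or empty. */
int sketch_parse_branch_filter(const char *str, uint64_t *mask)
{
	uint64_t result = 0;

	while (*str) {
		size_t len = strcspn(str, ",");
		const struct sketch_branch_mode *m;

		for (m = sketch_modes; m->name; m++) {
			if (strlen(m->name) == len &&
			    strncmp(m->name, str, len) == 0)
				break;
		}
		if (!m->name)
			return -1;

		result |= m->mode;
		str += len;
		if (*str == ',')
			str++;
	}

	*mask = result;
	return 0;
}
```

Adding a new filter name is then a one-line table change, which is all the patch above needs to do.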



[PATCH V2 4/6] x86, perf: Add conditional branch filtering support

2013-08-29 Thread Anshuman Khandual
This patch adds conditional branch filtering support,
enabling it for PERF_SAMPLE_BRANCH_COND in perf branch
stack sampling framework by utilizing an available
software filter X86_BR_JCC.

Signed-off-by: Anshuman Khandual 
Reviewed-by: Stephane Eranian 
---
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index d5be06a..9723773 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -371,6 +371,9 @@ static void intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
if (br_type & PERF_SAMPLE_BRANCH_NO_TX)
mask |= X86_BR_NO_TX;
 
+   if (br_type & PERF_SAMPLE_BRANCH_COND)
+   mask |= X86_BR_JCC;
+
/*
 * stash actual user request into reg, it may
 * be used by fixup code for some CPU
@@ -665,6 +668,7 @@ static const int nhm_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
 * NHM/WSM erratum: must include IND_JMP to capture IND_CALL
 */
[PERF_SAMPLE_BRANCH_IND_CALL] = LBR_IND_CALL | LBR_IND_JMP,
+   [PERF_SAMPLE_BRANCH_COND] = LBR_JCC,
 };
 
 static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
@@ -676,6 +680,7 @@ static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
[PERF_SAMPLE_BRANCH_ANY_CALL]   = LBR_REL_CALL | LBR_IND_CALL
| LBR_FAR,
[PERF_SAMPLE_BRANCH_IND_CALL]   = LBR_IND_CALL,
+   [PERF_SAMPLE_BRANCH_COND]   = LBR_JCC,
 };
 
 /* core */
-- 
1.7.11.7



[PATCH V2 6/6] powerpc, perf: Enable SW filtering in branch stack sampling framework

2013-08-29 Thread Anshuman Khandual
This patch enables SW based post processing of BHRB captured branches
to be able to meet more user defined branch filtration criteria in perf
branch stack sampling framework. These changes increase the number of
filters and their valid combinations on powerpc64 platform with BHRB
support. A summary of the code changes is described below.

(1) struct cpu_hw_events

Introduced two new variables and modified one to track various filters.

a) bhrb_hw_filter   Tracks PMU based HW branch filter flags.
Computed from PMU dependent call back.
b) bhrb_sw_filter   Tracks SW based instruction filter flags
Computed from PPC64 generic SW filter.
c) filter_mask  Tracks overall filter flags for PPC64

(2) Creating HW event with BHRB request

Kernel would try to figure out supported HW filters through a PMU call
back ppmu->bhrb_filter_map(). Here it would only invalidate unsupported
HW filter combinations. In future we could process one element from the
combination in HW and one in SW. Meanwhile cpuhw->filter_mask would be
tracking the overall supported branch filter requests on the PMU.

Kernel would also process the user request against available SW filters
for PPC64. Then we would process filter_mask to verify whether all the
user requested branch filters have been taken care of either in HW or in
SW.

(3) BHRB SW filter processing

During the BHRB data capture inside the PMU interrupt context, each
of the captured "perf_branch_entry.from" would be checked for compliance
with applicable SW branch filters. If the entry does not conform to the
filter requirements, it would be discarded from the final perf branch
stack buffer.

(4) Instruction classification for proposed SW filters

Here is the list of instruction categories which have been classified
under the proposed SW filters.

(a) PERF_SAMPLE_BRANCH_ANY_RETURN

(i) [Un]conditional branch to LR without setting the LR
(1) blr
(2) bclr
(3) btlr
(4) bflr
(5) bdnzlr
(6) bdnztlr
(7) bdnzflr
(8) bdzlr
(9) bdztlr
(10) bdzflr
(11) bltlr
(12) blelr
(13) beqlr
(14) bgelr
(15) bgtlr
(16) bnllr
(17) bnelr
(18) bnglr
(19) bsolr
(20) bnslr
(21) biclr
(22) bnilr
(23) bunlr
(24) bnulr

(b) PERF_SAMPLE_BRANCH_IND_CALL

(i) [Un]conditional branch to CTR with setting the link
(1) bctrl
(2) bcctrl
(3) btctrl
(4) bfctrl
(5) bltctrl
(6) blectrl
(7) beqctrl
(8) bgectrl
(9) bgtctrl
(10) bnlctrl
(11) bnectrl
(12) bngctrl
(13) bsoctrl
(14) bnsctrl
(15) bicctrl
(16) bnictrl
(17) bunctrl
(18) bnuctrl

(ii) [Un]conditional branch to LR setting the link
(0) bclrl
(1) blrl
(2) btlrl
(3) bflrl
(4) bdnzlrl
(5) bdnztlrl
(6) bdnzflrl
(7) bdzlrl
(8) bdztlrl
(9) bdzflrl
(10) bltlrl
(11) blelrl
(12) beqlrl
(13) bgelrl
(14) bgtlrl
(15) bnllrl
(16) bnelrl
(17) bnglrl
(18) bsolrl
(19) bnslrl
(20) biclrl
(21) bnilrl
(22) bunlrl
(23) bnulrl

(iii) [Un]conditional branch to TAR setting the link
(1) btarl
(2) bctarl

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/include/asm/perf_event_server.h |   2 +-
 arch/powerpc/perf/core-book3s.c  | 200 +--
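
The classification in section (4) boils down to decoding a few fields of the 32-bit instruction word. The sketch below is a hedged illustration rather than the code from this patch: the helper names are hypothetical, and it assumes the Power ISA XL-form layout (primary opcode 19, extended opcode 16 for bclr, 528 for bcctr, 560 for bctar, LK in the least significant bit).

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_OP_XL	19u	/* primary opcode of bclr/bcctr/bctar */
#define SKETCH_XO_BCLR	16u	/* branch conditional to LR */
#define SKETCH_XO_BCCTR	528u	/* branch conditional to CTR */
#define SKETCH_XO_BCTAR	560u	/* branch conditional to TAR */

static uint32_t sketch_primary(uint32_t insn)
{
	return insn >> 26;
}

static uint32_t sketch_xo(uint32_t insn)
{
	return (insn >> 1) & 0x3ffu;
}

static bool sketch_lk(uint32_t insn)
{
	return (insn & 1u) != 0;
}

/* PERF_SAMPLE_BRANCH_ANY_RETURN: branch to LR without setting the link. */
bool sketch_is_any_return(uint32_t insn)
{
	return sketch_primary(insn) == SKETCH_OP_XL &&
	       sketch_xo(insn) == SKETCH_XO_BCLR && !sketch_lk(insn);
}

/* PERF_SAMPLE_BRANCH_IND_CALL: branch to CTR, LR or TAR setting the link. */
bool sketch_is_ind_call(uint32_t insn)
{
	uint32_t xo = sketch_xo(insn);

	if (sketch_primary(insn) != SKETCH_OP_XL || !sketch_lk(insn))
		return false;

	return xo == SKETCH_XO_BCCTR || xo == SKETCH_XO_BCLR ||
	       xo == SKETCH_XO_BCTAR;
}
```

All the extended mnemonics listed above (beqlr, bdnzlrl, bnectrl and so on) are encodings of these base instructions with different BO/BI fields, so checking the opcode fields and the LK bit covers every entry in each list.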
 

[PATCH V2 1/6] perf: New conditional branch filter criteria in branch stack sampling

2013-08-29 Thread Anshuman Khandual
POWER8 PMU based BHRB supports filtering for conditional branches.
This patch introduces new branch filter PERF_SAMPLE_BRANCH_COND which
will extend the existing perf ABI. Other architectures can provide
this functionality with either HW filtering support (if present) or
with SW filtering of instructions.

Signed-off-by: Anshuman Khandual 
Reviewed-by: Stephane Eranian 
---
 include/uapi/linux/perf_event.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0b1df41..5da52b6 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -160,8 +160,9 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
PERF_SAMPLE_BRANCH_IN_TX= 1U << 8, /* in transaction */
PERF_SAMPLE_BRANCH_NO_TX= 1U << 9, /* not in transaction */
+   PERF_SAMPLE_BRANCH_COND = 1U << 10, /* conditional branches */
 
-   PERF_SAMPLE_BRANCH_MAX  = 1U << 10, /* non-ABI */
+   PERF_SAMPLE_BRANCH_MAX  = 1U << 11, /* non-ABI */
 };
 
 #define PERF_SAMPLE_BRANCH_PLM_ALL \
-- 
1.7.11.7



[PATCH V2 5/6] perf, documentation: Description for conditional branch filter

2013-08-29 Thread Anshuman Khandual
Adding documentation support for conditional branch filter.

Signed-off-by: Anshuman Khandual 
Reviewed-by: Stephane Eranian 
---
 tools/perf/Documentation/perf-record.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index e297b74..59ca8d0 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -163,12 +163,13 @@ following filters are defined:
 - any_call: any function call or system call
 - any_ret: any function return or system call return
 - ind_call: any indirect branch
+- cond: conditional branches
 - u:  only when the branch target is at the user level
 - k: only when the branch target is in the kernel
 - hv: only when the target is at the hypervisor level
 
 +
-The option requires at least one branch type among any, any_call, any_ret, ind_call.
+The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
 The privilege levels may be omitted, in which case, the privilege levels of the associated
 event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege
 levels are subject to permissions.  When sampling on multiple events, branch stack sampling
-- 
1.7.11.7



[PATCH V2 0/6] perf: New conditional branch filter

2013-08-29 Thread Anshuman Khandual
This patchset is a re-spin of the original branch stack sampling
patchset, which introduced the new PERF_SAMPLE_BRANCH_COND filter. This patchset
also enables SW based branch filtering support for PPC64 platforms which have
branch stack sampling support. With this new enablement, the branch filter
support for PPC64 platforms has been extended to include all the combinations
discussed below with a sample test application program.


(1) perf record -e branch-misses:u -b ./cprog
# Overhead  Command  Source Shared Object  Source Symbol      Target Shared Object  Target Symbol
# ........  .......  ....................  .................  ....................  .................
#
     4.42%    cprog  cprog                 [k] sw_4_2         cprog                 [k] lr_addr
     4.41%    cprog  cprog                 [k] symbol2        cprog                 [k] hw_1_2
     4.41%    cprog  cprog                 [k] ctr_addr       cprog                 [k] sw_4_1
     4.41%    cprog  cprog                 [k] lr_addr        cprog                 [k] sw_4_2
     4.41%    cprog  cprog                 [k] sw_4_2         cprog                 [k] callme
     4.41%    cprog  cprog                 [k] symbol1        cprog                 [k] hw_1_1
     4.41%    cprog  cprog                 [k] success_3_1_3  cprog                 [k] sw_3_1
     2.43%    cprog  cprog                 [k] sw_4_1         cprog                 [k] ctr_addr
     2.43%    cprog  cprog                 [k] hw_1_2         cprog                 [k] symbol2
     2.43%    cprog  cprog                 [k] callme         cprog                 [k] hw_1_2
     2.43%    cprog  cprog                 [k] address1       cprog                 [k] back1
     2.43%    cprog  cprog                 [k] back1          cprog                 [k] callme
     2.43%    cprog  cprog                 [k] hw_2_1         cprog                 [k] address1
     2.43%    cprog  cprog                 [k] sw_3_1_1       cprog                 [k] sw_3_1
     2.43%    cprog  cprog                 [k] sw_3_1_2       cprog                 [k] sw_3_1
     2.43%    cprog  cprog                 [k] sw_3_1_3       cprog                 [k] sw_3_1
     2.43%    cprog  cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_1
     2.43%    cprog  cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_2
     2.43%    cprog  cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_3
     2.43%    cprog  cprog                 [k] callme         cprog                 [k] sw_3_1
     2.43%    cprog  cprog                 [k] callme         cprog                 [k] sw_4_2
     2.43%    cprog  cprog                 [k] hw_1_1         cprog                 [k] symbol1
     2.43%    cprog  cprog                 [k] callme         cprog                 [k] hw_1_1
     2.42%    cprog  cprog                 [k] sw_3_1         cprog                 [k] callme
     1.99%    cprog  cprog                 [k] success_3_1_1  cprog                 [k] sw_3_1
     1.99%    cprog  cprog                 [k] sw_3_1         cprog                 [k] success_3_1_1
     1.99%    cprog  cprog                 [k] address2       cprog                 [k] back2
     1.99%    cprog  cprog                 [k] hw_2_2         cprog                 [k] address2
     1.99%    cprog  cprog                 [k] back2          cprog                 [k] callme
     1.99%    cprog  cprog                 [k] callme         cprog                 [k] main
     1.99%    cprog  cprog                 [k] sw_3_1         cprog                 [k] success_3_1_3
     1.99%    cprog  cprog                 [k] hw_1_1         cprog                 [k] callme
     1.99%    cprog  cprog                 [k] sw_3_2         cprog                 [k] callme
     1.99%    cprog  cprog                 [k] callme         cprog                 [k] sw_3_2
     1.99%    cprog  cprog                 [k] success_3_1_2  cprog                 [k] sw_3_1
     1.99%    cprog  cprog                 [k] sw_3_1         cprog                 [k] success_3_1_2
     1.99%    cprog  cprog                 [k] hw_1_2         cprog                 [k] callme
     1.99%    cprog  cprog                 [k] sw_4_1         cprog                 [k] callme
     0.02%    cprog  [unknown]             [k] 0xf7ba2328

[PATCH 2/2] w1: ds1wm: use dev_get_platdata()

2013-08-29 Thread Jingoo Han
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change
to make the code simpler and enhance the readability.

Signed-off-by: Jingoo Han 
---
 drivers/w1/masters/ds1wm.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/w1/masters/ds1wm.c b/drivers/w1/masters/ds1wm.c
index 96cab6ac..a606dba 100644
--- a/drivers/w1/masters/ds1wm.c
+++ b/drivers/w1/masters/ds1wm.c
@@ -255,17 +255,17 @@ static int ds1wm_find_divisor(int gclk)
 static void ds1wm_up(struct ds1wm_data *ds1wm_data)
 {
int divisor;
-   struct ds1wm_driver_data *plat = ds1wm_data->pdev->dev.platform_data;
+   struct device *dev = &ds1wm_data->pdev->dev;
+   struct ds1wm_driver_data *plat = dev_get_platdata(dev);
 
if (ds1wm_data->cell->enable)
ds1wm_data->cell->enable(ds1wm_data->pdev);
 
divisor = ds1wm_find_divisor(plat->clock_rate);
-   dev_dbg(&ds1wm_data->pdev->dev,
-   "found divisor 0x%x for clock %d\n", divisor, plat->clock_rate);
+   dev_dbg(dev, "found divisor 0x%x for clock %d\n",
+   divisor, plat->clock_rate);
if (divisor == 0) {
-   dev_err(&ds1wm_data->pdev->dev,
-   "no suitable divisor for %dHz clock\n",
+   dev_err(dev, "no suitable divisor for %dHz clock\n",
plat->clock_rate);
return;
}
@@ -481,7 +481,7 @@ static int ds1wm_probe(struct platform_device *pdev)
ds1wm_data->cell = mfd_get_cell(pdev);
if (!ds1wm_data->cell)
return -ENODEV;
-   plat = pdev->dev.platform_data;
+   plat = dev_get_platdata(&pdev->dev);
if (!plat)
return -ENODEV;
 
-- 
1.7.10.4
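
For context, dev_get_platdata() is a thin accessor that returns dev->platform_data. The userspace sketch below mimics that shape so the before/after of this series can be tried outside the kernel; the sketch_* types and the probe helper are hypothetical stand-ins, not kernel definitions.

```c
#include <stddef.h>

/* Minimal stand-ins for struct device and struct platform_device. */
struct sketch_device {
	void *platform_data;
};

struct sketch_platform_device {
	struct sketch_device dev;
};

/* Same shape as the kernel helper: hide the field behind an accessor. */
static inline void *sketch_dev_get_platdata(const struct sketch_device *dev)
{
	return dev->platform_data;
}

/* Hypothetical platform data, loosely modelled on ds1wm_driver_data. */
struct sketch_ds1wm_pdata {
	int clock_rate;
};

/* Returns the configured clock rate, or -1 when no platform data is set. */
int sketch_probe_clock_rate(struct sketch_platform_device *pdev)
{
	struct sketch_ds1wm_pdata *plat = sketch_dev_get_platdata(&pdev->dev);

	if (!plat)
		return -1;
	return plat->clock_rate;
}
```

The behaviour is identical to reading the field directly, which is why the commit message calls the change cosmetic.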




[PATCH 1/2] w1: w1-gpio: use dev_get_platdata()

2013-08-29 Thread Jingoo Han
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change
to make the code simpler and enhance the readability.

Signed-off-by: Jingoo Han 
---
 drivers/w1/masters/w1-gpio.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/w1/masters/w1-gpio.c b/drivers/w1/masters/w1-gpio.c
index f54ece2..101a366 100644
--- a/drivers/w1/masters/w1-gpio.c
+++ b/drivers/w1/masters/w1-gpio.c
@@ -56,7 +56,7 @@ MODULE_DEVICE_TABLE(of, w1_gpio_dt_ids);
 
 static int w1_gpio_probe_dt(struct platform_device *pdev)
 {
-   struct w1_gpio_platform_data *pdata = pdev->dev.platform_data;
+   struct w1_gpio_platform_data *pdata = dev_get_platdata(&pdev->dev);
struct device_node *np = pdev->dev.of_node;
 
pdata = devm_kzalloc(&pdev->dev, sizeof(*pdata), GFP_KERNEL);
@@ -87,7 +87,7 @@ static int w1_gpio_probe(struct platform_device *pdev)
}
}
 
-   pdata = pdev->dev.platform_data;
+   pdata = dev_get_platdata(&pdev->dev);
 
if (!pdata) {
dev_err(&pdev->dev, "No configuration data\n");
@@ -157,7 +157,7 @@ static int w1_gpio_probe(struct platform_device *pdev)
 static int w1_gpio_remove(struct platform_device *pdev)
 {
struct w1_bus_master *master = platform_get_drvdata(pdev);
-   struct w1_gpio_platform_data *pdata = pdev->dev.platform_data;
+   struct w1_gpio_platform_data *pdata = dev_get_platdata(&pdev->dev);
 
if (pdata->enable_external_pullup)
pdata->enable_external_pullup(0);
@@ -176,7 +176,7 @@ static int w1_gpio_remove(struct platform_device *pdev)
 
 static int w1_gpio_suspend(struct platform_device *pdev, pm_message_t state)
 {
-   struct w1_gpio_platform_data *pdata = pdev->dev.platform_data;
+   struct w1_gpio_platform_data *pdata = dev_get_platdata(&pdev->dev);
 
if (pdata->enable_external_pullup)
pdata->enable_external_pullup(0);
@@ -186,7 +186,7 @@ static int w1_gpio_suspend(struct platform_device *pdev, pm_message_t state)
 
 static int w1_gpio_resume(struct platform_device *pdev)
 {
-   struct w1_gpio_platform_data *pdata = pdev->dev.platform_data;
+   struct w1_gpio_platform_data *pdata = dev_get_platdata(&pdev->dev);
 
if (pdata->enable_external_pullup)
pdata->enable_external_pullup(1);
-- 
1.7.10.4




Re: [PATCH net-next] drivers:net: Convert dma_alloc_coherent(...__GFP_ZERO) to dma_zalloc_coherent

2013-08-29 Thread Joe Perches
On Thu, 2013-08-29 at 22:09 -0400, David Miller wrote:
> Applied, thanks a lot Joe.

Too bad I didn't know there was a dma_zalloc_coherent
the first time...

cheers, Joe



[PATCH 3/3] drivers: uio_pruss: use dev_get_platdata()

2013-08-29 Thread Jingoo Han
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change
to make the code simpler and enhance the readability.

Signed-off-by: Jingoo Han 
---
 drivers/uio/uio_pruss.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/uio/uio_pruss.c b/drivers/uio/uio_pruss.c
index df75346..f519da9 100644
--- a/drivers/uio/uio_pruss.c
+++ b/drivers/uio/uio_pruss.c
@@ -121,7 +121,7 @@ static int pruss_probe(struct platform_device *dev)
struct uio_pruss_dev *gdev;
struct resource *regs_prussio;
int ret = -ENODEV, cnt = 0, len;
-   struct uio_pruss_pdata *pdata = dev->dev.platform_data;
+   struct uio_pruss_pdata *pdata = dev_get_platdata(&dev->dev);
 
gdev = kzalloc(sizeof(struct uio_pruss_dev), GFP_KERNEL);
if (!gdev)
-- 
1.7.10.4




[PATCH 2/3] drivers: uio_pdrv_genirq: use dev_get_platdata()

2013-08-29 Thread Jingoo Han
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change
to make the code simpler and enhance the readability.

Signed-off-by: Jingoo Han 
---
 drivers/uio/uio_pdrv_genirq.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/uio/uio_pdrv_genirq.c b/drivers/uio/uio_pdrv_genirq.c
index 4eb8eaf..90ff17a 100644
--- a/drivers/uio/uio_pdrv_genirq.c
+++ b/drivers/uio/uio_pdrv_genirq.c
@@ -104,7 +104,7 @@ static int uio_pdrv_genirq_irqcontrol(struct uio_info *dev_info, s32 irq_on)
 
 static int uio_pdrv_genirq_probe(struct platform_device *pdev)
 {
-   struct uio_info *uioinfo = pdev->dev.platform_data;
+   struct uio_info *uioinfo = dev_get_platdata(&pdev->dev);
struct uio_pdrv_genirq_platdata *priv;
struct uio_mem *uiomem;
int ret = -EINVAL;
-- 
1.7.10.4




[PATCH 1/3] drivers: uio_dmem_genirq: use dev_get_platdata()

2013-08-29 Thread Jingoo Han
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change
to make the code simpler and enhance the readability.

Signed-off-by: Jingoo Han 
---
 drivers/uio/uio_dmem_genirq.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/uio/uio_dmem_genirq.c b/drivers/uio/uio_dmem_genirq.c
index 125d0e5..1270f3b 100644
--- a/drivers/uio/uio_dmem_genirq.c
+++ b/drivers/uio/uio_dmem_genirq.c
@@ -146,7 +146,7 @@ static int uio_dmem_genirq_irqcontrol(struct uio_info *dev_info, s32 irq_on)
 
 static int uio_dmem_genirq_probe(struct platform_device *pdev)
 {
-   struct uio_dmem_genirq_pdata *pdata = pdev->dev.platform_data;
+   struct uio_dmem_genirq_pdata *pdata = dev_get_platdata(&pdev->dev);
struct uio_info *uioinfo = &pdata->uioinfo;
struct uio_dmem_genirq_platdata *priv;
struct uio_mem *uiomem;
-- 
1.7.10.4




[PATCH] drivers: parport: Kconfig: exclude h8300 for PARPORT_PC

2013-08-29 Thread Chen Gang
h8300 does not support PARPORT_PC.

The related error (with allmodconfig for h8300):

CC [M]  drivers/parport/parport_pc.o
  drivers/parport/parport_pc.c:67:25: fatal error: asm/parport.h: No such file or directory


Signed-off-by: Chen Gang 
---
 drivers/parport/Kconfig |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/parport/Kconfig b/drivers/parport/Kconfig
index dc82ef0..70694ce 100644
--- a/drivers/parport/Kconfig
+++ b/drivers/parport/Kconfig
@@ -37,7 +37,7 @@ config PARPORT_PC
tristate "PC-style hardware"
depends on (!SPARC64 || PCI) && !SPARC32 && !M32R && !FRV && !S390 && \
(!M68K || ISA) && !MN10300 && !AVR32 && !BLACKFIN && \
-   !XTENSA && !CRIS
+   !XTENSA && !CRIS && !H8300
 
---help---
  You should say Y here if you have a PC-style parallel port. All
-- 
1.7.7.6


Re: [PATCH] f2fs: optimize gc for better performance

2013-08-29 Thread Jin Xu

Hi Jaegeuk,

On 08/29/2013 07:56 PM, Jaegeuk Kim wrote:

Hi,

2013-08-29 (목), 08:48 +0800, Jin Xu:

From: Jin Xu 

This patch improves the foreground gc efficiency by optimizing the
victim selection policy. With this optimization, the random re-write
performance could increase up to 20%.


[...]


In this patch, it does not search a constant number of dirty segments
anymore, instead it calculates the number based on the total segments,
dirty segments and a threshold. Following is the pseudo formula.
                ,-- nr_dirty_segments, if total_segments < threshold
(# of search) = |
                `-- (nr_dirty_segments * threshold) / total_segments, otherwise
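
The pseudo formula, together with the clamps in the quoted diff, can be checked with a small standalone function. This is a sketch under assumptions, not the f2fs code: the function name is hypothetical, the two constants reuse the values proposed in the patch, the comparison follows the diff (<=) rather than the formula (<), and the multiplication is widened to 64 bits to avoid overflow.

```c
#include <stdint.h>

#define SKETCH_MIN_VICTIM_SEARCH_GREEDY		20u
#define SKETCH_FULL_VICTIM_SEARCH_THRESH	4096u

/* Number of dirty segments to examine before picking a victim. */
unsigned int sketch_max_search(unsigned int total_segs, unsigned int nr_dirty)
{
	uint64_t max_search;

	if (total_segs <= SKETCH_FULL_VICTIM_SEARCH_THRESH) {
		/* Small volume: search all the dirty segments. */
		max_search = nr_dirty;
	} else {
		/* Scale the search with the share of dirty segments. */
		max_search = (uint64_t)nr_dirty *
			     SKETCH_FULL_VICTIM_SEARCH_THRESH / total_segs;
		if (max_search < SKETCH_MIN_VICTIM_SEARCH_GREEDY)
			max_search = SKETCH_MIN_VICTIM_SEARCH_GREEDY;
	}

	/* Never more than the total number of dirty segments. */
	if (max_search > nr_dirty)
		max_search = nr_dirty;

	return (unsigned int)max_search;
}
```

For the case raised later in this thread (4000 dirty segments out of 409600), this gives 4000 * 4096 / 409600 = 40 segments searched instead of the whole bitmap.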


Nice catch, but,
I don't understand why we search the # of segments in proportion to the
# of dirty segments.
How about the case where threshold = 10 and total_segments = 10?
Or threshold = 100 and total_segments = 100?
For this, we need to define additional MIN/MAX thresholds and another
handling codes as your proposal.



Firstly, calculating the # of search this way could constrain the
searching overhead to a certain level even when there are too many
total segments.
Secondly, when there are more dirty segments, the same # of garbage
blocks might be more scattered than when there are fewer dirty segments.
So we search the # of the segments in proportion to the # of dirty
segments.

[...]


It seems that we can obtain the performance gain just by setting the
MAX_VICTIM_SEARCH to 4096, for example.
So, how about just adding an ending criteria like below?



I agree that we could get the performance improvement by simply
enlarging the MAX_VICTIM_SEARCH to 4096, but I am a little concerned
about the scalability, because it might always search the whole
bitmap in some cases, for example, when dirty segments is 4000 and
total segments is 409600.


[snip]

[...]


if (p->max_search > MAX_VICTIM_SEARCH)
p->max_search = MAX_VICTIM_SEARCH;



The optimization does not apply to SSR mode, and there is a reason.
As noticed in the test, when SSR selected the segments that have the most
garbage blocks, then when gc is needed, all the dirty segments might
have very few garbage blocks, thus the gc overhead is high. This might
lead to performance degradation. So the patch does not change the
victim selection policy for SSR.

What do you think now?


#define MAX_VICTIM_SEARCH 4096 /* covers 8GB */


p->offset = sbi->last_victim[p->gc_mode];
@@ -243,6 +245,8 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
struct victim_sel_policy p;
unsigned int secno, max_cost;
int nsearched = 0;
+   unsigned int max_search = MAX_VICTIM_SEARCH;
+   unsigned int nr_dirty;

p.alloc_mode = alloc_mode;
select_policy(sbi, gc_type, type, &p);
@@ -258,6 +262,27 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
goto got_it;
}

+   nr_dirty = dirty_i->nr_dirty[p.dirty_type];
+   if (p.gc_mode == GC_GREEDY && p.alloc_mode != SSR) {
+   if (TOTAL_SEGS(sbi) <= FULL_VICTIM_SEARCH_THRESH)
+   max_search = nr_dirty; /* search all the dirty segs */
+   else {
+   /*
+* With more dirty segments, garbage blocks are likely
+* more scattered, thus search harder for better
+* victim.
+*/
+   max_search = div_u64 ((nr_dirty *
+   FULL_VICTIM_SEARCH_THRESH), TOTAL_SEGS(sbi));
+   if (max_search < MIN_VICTIM_SEARCH_GREEDY)
+   max_search = MIN_VICTIM_SEARCH_GREEDY;
+   }
+   }
+
+   /* no more than the total dirty segments */
+   if (max_search > nr_dirty)
+   max_search = nr_dirty;
+
while (1) {
unsigned long cost;
unsigned int segno;
@@ -290,7 +315,7 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (cost == max_cost)
continue;

-   if (nsearched++ >= MAX_VICTIM_SEARCH) {
+   if (nsearched++ >= max_search) {


if (nsearched++ >= p.max_search) {


sbi->last_victim[p.gc_mode] = segno;
break;
}
diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
index 2c6a6bd..2f525aa 100644
--- a/fs/f2fs/gc.h
+++ b/fs/f2fs/gc.h
@@ -20,7 +20,9 @@
  #define LIMIT_FREE_BLOCK  40 /* percentage over invalid + free space */

  /* Search max. number of dirty segments to select a victim segment */
-#define MAX_VICTIM_SEARCH  20
+#define MAX_VICTIM_SEARCH  20
+#define MIN_VICTIM_SEARCH_GREEDY   20
+#define FULL_VICTIM_SEARCH_THRESH  4096

  struct f2fs_gc_kthread {
struct task_struct *f2fs_gc_task;
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h

Re: [PATCH 5/5] mm/cgroup: use N_MEMORY instead of N_HIGH_MEMORY

2013-08-29 Thread Jianguo Wu
On 2013/8/30 11:44, Jianguo Wu wrote:

> Since commit 8219fc48a (mm: node_states: introduce N_MEMORY),
> we have N_MEMORY, which stands for the nodes that have any memory,
> while N_HIGH_MEMORY stands for the nodes that have normal or high memory.
> 
> The code here needs to handle the nodes which have memory,
> so we should use N_MEMORY instead.
> 
> Signed-off-by: Xishi Qiu 

Sorry, it should be "Signed-off-by: Jianguo Wu "

> ---
>  mm/page_cgroup.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 6d757e3..f6f7603 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -116,7 +116,7 @@ static void *__meminit alloc_page_cgroup(size_t size, int nid)
>   return addr;
>   }
>  
> - if (node_state(nid, N_HIGH_MEMORY))
> + if (node_state(nid, N_MEMORY))
>   addr = vzalloc_node(size, nid);
>   else
>   addr = vzalloc(size);





[PATCH v2] h8300/kernel/setup.c: add "linux/initrd.h" to pass compiling

2013-08-29 Thread Chen Gang
The related error (allmodconfig for h8300):

  arch/h8300/kernel/setup.c: In function 'setup_arch':
  arch/h8300/kernel/setup.c:103:3: error: 'initrd_start' undeclared (first use in this function)
 initrd_start = memory_start;
 ^
  arch/h8300/kernel/setup.c:103:3: note: each undeclared identifier is reported only once for each function it appears in
  arch/h8300/kernel/setup.c:104:3: error: 'initrd_end' undeclared (first use in this function)
 initrd_end = memory_start += be32_to_cpu(((unsigned long *) (memory_start))[2]);
 ^

Signed-off-by: Chen Gang 
---
 arch/h8300/kernel/setup.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/h8300/kernel/setup.c b/arch/h8300/kernel/setup.c
index d0b1607..85639a1 100644
--- a/arch/h8300/kernel/setup.c
+++ b/arch/h8300/kernel/setup.c
@@ -47,6 +47,10 @@
 #include 
 #endif
 
+#if defined(CONFIG_BLK_DEV_INITRD)
+#include <linux/initrd.h>
+#endif
+
 #define STUBSIZE 0xc000
 
 unsigned long rom_length;
-- 
1.7.7.6


Re: [PATCH 1/1] [media] uvcvideo: quirk PROBE_DEF for Dell SP2008WFP monitor.

2013-08-29 Thread Greg KH
On Fri, Aug 30, 2013 at 02:41:17AM +0200, Laurent Pinchart wrote:
> Hi Joseph,
> 
> Thank you for the patch.
> 
> On Thursday 29 August 2013 11:17:41 Joseph Salisbury wrote:
> > BugLink: http://bugs.launchpad.net/bugs/1217957
> > 
> > Add quirk for Dell SP2008WFP monitor: 05a9:2641
> > 
> > Signed-off-by: Joseph Salisbury 
> > Tested-by: Christopher Townsend 
> > Cc: Laurent Pinchart 
> > Cc: Mauro Carvalho Chehab 
> > Cc: linux-me...@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: sta...@vger.kernel.org
> 
> Acked-by: Laurent Pinchart 
> 
> I've applied it to my tree. Given that we're too close to the v3.12 merge 
> window I will push it for v3.13.

A quirk has to wait that long?  That's not ok, they should go in much
sooner than that...

thanks,

greg k-h


perf_swevent_add warn_on.

2013-08-29 Thread Dave Jones
WARNING: CPU: 2 PID: 7498 at kernel/events/core.c:5485 perf_swevent_add+0x18d/0x1a0()
Modules linked in: snd_seq_dummy 8021q garp stp fuse tun rfcomm hidp nfnetlink 
ipt_ULOG nfc caif_socket caif af_802154 phonet af_rxrpc bnep bluetooth can_bcm 
rfkill can_raw can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt 
scsi_transport_iscsi rds af_key rose x25 atm netrom appletalk ipx p8023 psnap 
p8022 llc ax25 xfs libcrc32c snd_hda_codec_realtek snd_hda_intel snd_hda_codec 
snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd e1000e 
usb_debug ptp soundcore pps_core pcspkr
CPU: 2 PID: 7498 Comm: trinity-child1 Not tainted 3.11.0-rc7+ #32 
 81a228d4 88022ea55a70 816fa71e 
 88022ea55aa8 81052ebd 88022d147000 88024d9cf3e0
 0003 0001 09527bd7 88022ea55ab8
Call Trace:
 [] dump_stack+0x54/0x74
 [] warn_slowpath_common+0x7d/0xa0
 [] warn_slowpath_null+0x1a/0x20
 [] perf_swevent_add+0x18d/0x1a0
 [] event_sched_in.isra.75+0x87/0x1c0
 [] group_sched_in+0x6a/0x1c0
 [] ctx_sched_in+0x101/0x290
 [] perf_event_sched_in+0x60/0x90
 [] perf_event_context_sched_in+0x7b/0xc0
 [] __perf_event_task_sched_in+0x477/0x490
 [] ? arch_vtime_task_switch+0x94/0xa0
 [] finish_task_switch+0xf0/0x130
 [] __schedule+0x34d/0x970
 [] schedule+0x29/0x70
 [] do_cpu_nanosleep+0x154/0x250
 [] ? do_cpu_nanosleep+0x10b/0x250
 [] posix_cpu_nsleep+0x69/0x120
 [] process_cpu_nsleep+0x13/0x20
 [] SyS_clock_nanosleep+0x9a/0x110
 [] tracesys+0xdd/0xe2
---[ end trace 809ae0fc0cbb6fcc ]---


5484 head = find_swevent_head(swhash, event);
5485 if (WARN_ON_ONCE(!head))
5486 return -EINVAL;
5487 



RE: [PATCH v6 2/3] mmc: dw_mmc: Honor requests to set the clock to 0 (turn off clock)

2013-08-29 Thread Seungwon Jeon
On Fri, August 30, 2013, Doug Anderson wrote:
> Seungwon,
> 
> On Thu, Aug 29, 2013 at 12:04 AM, Seungwon Jeon  wrote:
> >> I'd really still rather honor the MMC subsystem's request.  It
> >> shouldn't _hurt_ to turn the clock off when the subsystem requests it,
> > Even though turning off by clock programming doesn't hurt,
> > it is costly behavior when considering low power mode of host's own support.
> 
> It is costly?  We are talking about these two commands, right?
> 
>   mci_writel(host, CLKENA, 0);
>   mci_send_cmd(slot,
>   SDMMC_CMD_UPD_CLK | SDMMC_CMD_PRV_DAT_WAIT, 0);
I mean that because the host supports automatic clock gating, we don't need
to program the clock to '0'. If MMC_CLKGATE is enabled, the clock will be
reprogrammed frequently, switching between '0' and the working clock. The
result is actually the same.
Of course, if the host didn't support this feature, we would have to handle
that manually to save power.

> 
> Do you have a reason to believe that these are more costly than all of
> the rest of the code that's executed when the user defines
> CONFIG_MMC_CLKGATE?  You're still proposing doing all of the updates
> of the clock when slot->clock is non-zero, right?  ...so at best
No, the original condition should remain:
the required clock must differ from the current clock.

> skipping this code will be 33% faster since the re-enable code
> disables and then reenables the clock.  If it's the
> "SDMMC_CMD_PRV_DAT_WAIT" that you're worried about then skipping this
> code will only be 25% faster since there are already three calls with
> SDMMC_CMD_PRV_DAT_WAIT in the enable code.
> 
> 
> > Just now, how about focusing on the problem clock isn't updated properly 
> > after suspend/resume?
> 
> I tried to do that in the original patches, but you pointed out
> (correctly) that we should do the correct fix rather than a hackier
> fix.  IMHO the most correct fix is to honor the MMC core's request to
> turn the clock off.  Partially honoring the MMC core (as you suggest)
> is certainly less hacky that my original proposal but I still think
> turning the clock off is better.
> 
> 
> >> right?  One reason to honor the mmc core is that it will make things
> >> cleaner if/when we support a voltage change operation.  The MMC core
> >> has the logic for the voltage change, and part of that involves
> >> turning off the clock.  We'll already need a bunch of special case
> >> code in dw_mmc for voltage change, but it would be nice to avoid one
> >> extra bit.
> > Turning off clock during voltage switching would be another procedure.
> > I guess it could be discussed later.
> 
> Agreed that we're not trying to get voltage switching done here, but
> forward thinking is nice.  If there's no reason _not_ to turn the
> clock off and it will help us later, let's do it. Also, we've already
> agreed that MMC_CLKGATE isn't so useful for dw_mmc, so trying to do
> something awkward to make MMC_CLKGATE slightly faster doesn't seem
> worth it.
> 
> 
> > I want to fix some minor change to prevent frequent message that Jaehoon 
> > pointed.
> 
> As far as I can tell, the frequent messages and whether or not to
> actually turn the clock off are unrelated.  I will send up a patch
> that fixes the frequent messages by caching the last value printed and
> only printing if it changed.  I have verified that this works and that
> the system still functions OK (can boot to prompt) with
> CONFIG_MMC_CLKGATE.
> 
> 
> Note: re-reading over some of the previous messages, it sounds like
> you're proposing using the patch from your email directly, AKA:
> 
> http://article.gmane.org/gmane.linux.kernel/1542482
> 
> Did you test that patch?  Did it work for you?  It doesn't actually
> compile cleanly for me (you removed the "force_clkinit" param in the
> function but not the callers).  That's easy to fix, but implies that
> this patch was just a proposal and not a tested solution.
> 
> ...but aside from not compiling cleanly, I don't think it will work
> for the same reasons that the original code didn't work.  Specifically
> it doesn't address the core problem that we need to update
> host->current_speed when the clock is 0.  Otherwise we won't re-init
> and we run into the original problem, right?  To be certain I took
> your patch and applied it, then fixed the callers of
> dw_mci_setup_bus() not to pass a second parameter.  I did a
> suspend/resume with no card in and then plugged a card in.  It didn't
> work.

Some of the changes I proposed mix the existing code with your new patch.
They are not complete code to replace your patch; sorry if that confused you.
The important point I intended is that if the required clock (slot->clock) is
'0', we should not try to update the clock, given automatic clock gating.
Please check my first comment on v6.

Thanks,
Seungwon Jeon
> 
> 
> As I said above, new patch coming shortly.  As always: feel free to
> point out any glaring mistakes I made in 
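
A minimal sketch of the "cache the last value printed and only print if it changed" idea described in the quoted text above. This is not the dw_mmc code; struct clk_log_sketch and clk_log_if_changed() are hypothetical names used only for illustration, and printf() stands in for the driver's logging call.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-slot state: remembers the last clock value that was logged. */
struct clk_log_sketch {
	bool have_last;		/* false until the first message is printed */
	unsigned int last_hz;	/* last value that was actually printed */
};

/*
 * Print the bus speed only when it differs from the last printed value.
 * Returns true if a message was emitted, false if it was suppressed.
 */
static bool clk_log_if_changed(struct clk_log_sketch *log, unsigned int hz)
{
	if (log->have_last && log->last_hz == hz)
		return false;

	log->have_last = true;
	log->last_hz = hz;
	printf("bus speed: %u Hz\n", hz);
	return true;
}
```

With a scheme like this, repeated gating between 0 and the working clock still logs each transition, while repeated requests for the same value stay silent.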

Re: [PATCH] h8300/kernel/setup.c: add "linux/initrd.h" to pass compiling

2013-08-29 Thread Chen Gang
On 08/26/2013 06:31 PM, Chen Gang wrote:
> Need add "linux/initrd.h" to pass compiling.
> 
> The related error (allmodconfig for h8300):
> 
>   arch/h8300/kernel/setup.c: In function 'setup_arch':
>   arch/h8300/kernel/setup.c:103:3: error: 'initrd_start' undeclared (first 
> use in this function)
>  initrd_start = memory_start;
>  ^
>   arch/h8300/kernel/setup.c:103:3: note: each undeclared identifier is 
> reported only once for each function it appears in
>   arch/h8300/kernel/setup.c:104:3: error: 'initrd_end' undeclared (first use 
> in this function)
>  initrd_end = memory_start += be32_to_cpu(((unsigned long *) 
> (memory_start))[2]);
>  ^
> 
> Signed-off-by: Chen Gang 
> ---
>  arch/h8300/kernel/setup.c |3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/h8300/kernel/setup.c b/arch/h8300/kernel/setup.c
> index d0b1607..684e734 100644
> --- a/arch/h8300/kernel/setup.c
> +++ b/arch/h8300/kernel/setup.c
> @@ -47,6 +47,9 @@
>  #include 
>  #endif
>  
> +#if defined(CONFIG_BLK_DEV_INITRD)
> +#include <linux/initrd.h>
> +#endif
>  #define STUBSIZE 0xc000
>  

Oh, it needs an empty line after "#endif" to match the current coding style.

I will send patch v2 for it.

Thanks.

>  unsigned long rom_length;
> 


-- 
Chen Gang


Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Linus Torvalds
On Thu, Aug 29, 2013 at 8:12 PM, Waiman Long  wrote:
> On 08/29/2013 07:42 PM, Linus Torvalds wrote:
>>
>> Waiman? Mind looking at this and testing? Linus
>
> Sure, I will try out the patch tomorrow morning and see how it works out for
> my test case.

Ok, thanks, please use this slightly updated patch attached here.

It improves on the previous version in actually handling the
"unlazy_walk()" case with native lockref handling, which means that
one other not entirely odd case (symlink traversal) avoids the d_lock
contention.

It also refactored the __d_rcu_to_refcount() to be more readable, and
adds a big comment about what the heck is going on. The old code was
clever, but I suspect not very many people could possibly understand
what it actually did. Plus it used nested spinlocks because it wanted
to avoid checking the sequence count twice. Which is stupid, since
nesting locks is how you get really bad contention, and the sequence
count check is really cheap anyway. Plus the nesting *really* didn't
work with the whole lockref model.

With this, my stupid thread-lookup thing doesn't show any spinlock
contention even for the "look up symlink" case.

It also avoids the unnecessary aligned u64 for when we don't actually
use cmpxchg at all.

It's still one single patch, since I was working on lots of small
cleanups. I think it's pretty close to done now (assuming your testing
shows it performs fine - the powerpc numbers are promising, though),
so I'll split it up into proper chunks rather than random commit
points. But I'm done for today at least.

NOTE NOTE NOTE! My test coverage really has been pretty pitiful. You
may hit cases I didn't test. I think it should be *stable*, but maybe
there's some other d_lock case that your tuned waiting hid, and that
my "fastpath only for unlocked case" version ends up having problems
with.

 Linus


patch.diff
Description: Binary data
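
A rough userspace sketch of the "fastpath only for unlocked case" idea mentioned above: bump a reference count with a single compare-and-exchange as long as the lock part of the word is observed unlocked, and report failure otherwise so the caller can fall back to taking the lock. The layout (lock flag in bit 0, count in the high 32 bits) and all names here are assumptions made for illustration; this is not the lockref code from the attached patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical combined word: bit 0 = "locked" flag, high 32 bits = count. */
struct lockref_sketch {
	_Atomic uint64_t lock_count;
};

#define SKETCH_LOCKED	((uint64_t)1)
#define SKETCH_ONE_REF	((uint64_t)1 << 32)

/*
 * Try to take a reference without touching the lock.
 * Returns false if the lock was seen held; the caller would then
 * fall back to the locked slow path (not shown).
 */
static bool lockref_sketch_get_fast(struct lockref_sketch *ref)
{
	uint64_t old = atomic_load(&ref->lock_count);

	while (!(old & SKETCH_LOCKED)) {
		uint64_t updated = old + SKETCH_ONE_REF;

		if (atomic_compare_exchange_weak(&ref->lock_count, &old, updated))
			return true;
		/* The failed compare-exchange reloaded 'old'; re-check and retry. */
	}
	return false;
}

/* Read the current count out of the combined word. */
static uint32_t lockref_sketch_count(struct lockref_sketch *ref)
{
	return (uint32_t)(atomic_load(&ref->lock_count) >> 32);
}
```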


[PATCH 5/5] mm/cgroup: use N_MEMORY instead of N_HIGH_MEMORY

2013-08-29 Thread Jianguo Wu
Since commit 8219fc48a (mm: node_states: introduce N_MEMORY), N_MEMORY
stands for the nodes that have any memory, and N_HIGH_MEMORY stands for
the nodes that have normal or high memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Xishi Qiu 
---
 mm/page_cgroup.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index 6d757e3..f6f7603 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -116,7 +116,7 @@ static void *__meminit alloc_page_cgroup(size_t size, int nid)
 		return addr;
 	}
 
-	if (node_state(nid, N_HIGH_MEMORY))
+	if (node_state(nid, N_MEMORY))
 		addr = vzalloc_node(size, nid);
 	else
 		addr = vzalloc(size);
-- 
1.7.1
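
To illustrate the distinction the changelog draws, here is a small userspace sketch in which node states are modelled as bitmasks. All names (sk_node_state() and friends) are hypothetical stand-ins, not the kernel's nodemask API, and the example node layout is an assumption: node 1 is taken to have only memory that is neither normal nor high, so it has "any memory" without being in the N_HIGH_MEMORY set.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for two of the kernel's node states. */
enum sk_node_states { SK_N_HIGH_MEMORY, SK_N_MEMORY, SK_NR_NODE_STATES };

/* One bitmask per state; bit 'nid' set means node 'nid' is in that state. */
static unsigned int sk_node_masks[SK_NR_NODE_STATES];

static void sk_node_set_state(int nid, enum sk_node_states state)
{
	sk_node_masks[state] |= 1u << nid;
}

static bool sk_node_state(int nid, enum sk_node_states state)
{
	return (sk_node_masks[state] >> nid) & 1u;
}

/*
 * Assumed layout for the example:
 *   node 0: normal memory       -> N_HIGH_MEMORY and N_MEMORY
 *   node 1: other memory only   -> N_MEMORY only
 *   node 2: no memory           -> neither
 */
static void sk_setup_example(void)
{
	sk_node_set_state(0, SK_N_HIGH_MEMORY);
	sk_node_set_state(0, SK_N_MEMORY);
	sk_node_set_state(1, SK_N_MEMORY);
}
```

In this model a test against SK_N_HIGH_MEMORY would treat node 1 as memoryless, which is the kind of case the switch to N_MEMORY is meant to cover.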




[PATCH 4/5] mm/ia64: use N_MEMORY instead of N_HIGH_MEMORY

2013-08-29 Thread Jianguo Wu
Since commit 8219fc48a (mm: node_states: introduce N_MEMORY), N_MEMORY
stands for the nodes that have any memory, and N_HIGH_MEMORY stands for
the nodes that have normal or high memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Jianguo Wu 
---
 arch/ia64/kernel/uncached.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/ia64/kernel/uncached.c b/arch/ia64/kernel/uncached.c
index a96bcf8..d2e5545 100644
--- a/arch/ia64/kernel/uncached.c
+++ b/arch/ia64/kernel/uncached.c
@@ -196,7 +196,7 @@ unsigned long uncached_alloc_page(int starting_nid, int n_pages)
 	nid = starting_nid;
 
 	do {
-		if (!node_state(nid, N_HIGH_MEMORY))
+		if (!node_state(nid, N_MEMORY))
 			continue;
 		uc_pool = &uncached_pools[nid];
 		if (uc_pool->pool == NULL)
-- 
1.7.1




Re: + mm-sparse-introduce-alloc_usemap_and_memmap-fix-2.patch added to -mm tree

2013-08-29 Thread Yinghai Lu
On Thu, Aug 29, 2013 at 4:35 PM, Wanpeng Li  wrote:

> On Thu, Aug 29, 2013 at 01:44:18PM -0700, Yinghai Lu wrote:
>>>  mm/sparse.c |   41 +++--
>>>  1 file changed, 15 insertions(+), 26 deletions(-)
>>>
>>> diff -puN mm/sparse.c~mm-sparse-introduce-alloc_usemap_and_memmap-fix-2 
>>> mm/sparse.c
>>> --- a/mm/sparse.c~mm-sparse-introduce-alloc_usemap_and_memmap-fix-2
>>> +++ a/mm/sparse.c
>>> @@ -339,13 +339,14 @@ static void __init check_usemap_section_
>>>  }
>>>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>>>
>>> -static void __init sparse_early_usemaps_alloc_node(unsigned 
>>> long**usemap_map,
>>> +static void __init sparse_early_usemaps_alloc_node(void **data,
>>>  unsigned long pnum_begin,
>>>  unsigned long pnum_end,
>>>  unsigned long usemap_count, int nodeid)
>>>  {
>>> void *usemap;
>>> unsigned long pnum;
>>> +   unsigned long **usemap_map = (unsigned long **)data;
>>
>>Can you check if (void *data) will work?
>>
>>void ** looks strange.
>
> The original patch you give me has (void *data), however, there is compile 
> warning.
>
> mm/sparse.c: In function 'sparse_init':
> mm/sparse.c:552:8: warning: passing argument 1 of 'alloc_usemap_and_memmap' from incompatible pointer type [enabled by default]
> mm/sparse.c:469:20: note: expected 'void (*)(void **, long unsigned int, long unsigned int, long unsigned int, int)' but argument is of type 'void (*)(void *, long unsigned int, long unsigned int, long unsigned int, int)'
>
> The void ** fix it. ;-)

should change both to void *

Yinghai
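
A minimal userspace sketch of the "change both to void *" suggestion: the generic helper and the callback both take an opaque void *data, and the callback converts it back to the concrete type it expects. The names (alloc_fn_sketch, for_range_sketch, usemap_alloc_sketch) are hypothetical and the bodies are placeholders, not the mm/sparse.c code.

```c
#include <stddef.h>

/* Hypothetical callback type: opaque data pointer plus a section range. */
typedef void (*alloc_fn_sketch)(void *data, unsigned long begin,
				unsigned long end);

/* Generic helper: passes 'data' through untouched, so no cast is needed here. */
static void for_range_sketch(alloc_fn_sketch fn, void *data,
			     unsigned long begin, unsigned long end)
{
	fn(data, begin, end);
}

/* Placeholder backing storage so the example has something to point at. */
static unsigned long sketch_storage[4];

/* Callback: recovers the real type from the opaque pointer. */
static void usemap_alloc_sketch(void *data, unsigned long begin,
				unsigned long end)
{
	unsigned long **usemap_map = data;	/* void * converts implicitly in C */
	unsigned long i;

	for (i = begin; i < end; i++)
		usemap_map[i] = &sketch_storage[i];
}
```

Because the prototype of the callback and the parameter type of the helper now agree, the "incompatible pointer type" warning quoted above would not arise in this arrangement.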


[PATCH 3/5] mm/vmemmap: use N_MEMORY instead of N_HIGH_MEMORY

2013-08-29 Thread Jianguo Wu
Since commit 8219fc48a (mm: node_states: introduce N_MEMORY), N_MEMORY
stands for the nodes that have any memory, and N_HIGH_MEMORY stands for
the nodes that have normal or high memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Jianguo Wu 
---
 mm/sparse-vmemmap.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 27eeab3..ca8f46b 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -52,7 +52,7 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node)
 	if (slab_is_available()) {
 		struct page *page;
 
-		if (node_state(node, N_HIGH_MEMORY))
+		if (node_state(node, N_MEMORY))
 			page = alloc_pages_node(
 				node, GFP_KERNEL | __GFP_ZERO | __GFP_REPEAT,
 				get_order(size));
-- 
1.7.1




[PATCH 2/5] mm/sparse: use N_MEMORY instead of N_HIGH_MEMORY

2013-08-29 Thread Jianguo Wu
Since commit 8219fc48a (mm: node_states: introduce N_MEMORY), N_MEMORY
stands for the nodes that have any memory, and N_HIGH_MEMORY stands for
the nodes that have normal or high memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Jianguo Wu 
---
 mm/sparse.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index 308d503..8519d6a 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -64,7 +64,7 @@ static struct mem_section noinline __init_refok *sparse_index_alloc(int nid)
 				   sizeof(struct mem_section);
 
 	if (slab_is_available()) {
-		if (node_state(nid, N_HIGH_MEMORY))
+		if (node_state(nid, N_MEMORY))
 			section = kzalloc_node(array_size, GFP_KERNEL, nid);
 		else
 			section = kzalloc(array_size, GFP_KERNEL);
-- 
1.7.1




[PATCH 1/5] mm/vmalloc: use N_MEMORY instead of N_HIGH_MEMORY

2013-08-29 Thread Jianguo Wu
Since commit 8219fc48a (mm: node_states: introduce N_MEMORY), N_MEMORY
stands for the nodes that have any memory, and N_HIGH_MEMORY stands for
the nodes that have normal or high memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Jianguo Wu 
---
 mm/vmalloc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 13a5495..1152947 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2573,7 +2573,7 @@ static void show_numa_info(struct seq_file *m, struct vm_struct *v)
 		for (nr = 0; nr < v->nr_pages; nr++)
 			counters[page_to_nid(v->pages[nr])]++;
 
-		for_each_node_state(nr, N_HIGH_MEMORY)
+		for_each_node_state(nr, N_MEMORY)
 			if (counters[nr])
 				seq_printf(m, " N%u=%u", nr, counters[nr]);
 	}
-- 
1.7.1




BUG: soft lockup - CPU#25 stuck for 23s! [memcg_process_s:5859]

2013-08-29 Thread Zhouping Liu
Hello All,

I hit the following errors when running memcg_stress_test.sh, which comes
from the LTP test suite, on v3.11-rc7+:

 snip -
[ 2163.674483] BUG: soft lockup - CPU#25 stuck for 23s! [memcg_process_s:5859]
[ 2163.674489] Modules linked in: xt_CHECKSUM tun bridge stp llc ebtable_nat 
nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat 
nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter ip_tables sg xfs libcrc32c netxen_nic 
amd64_edac_mod hpilo hpwdt edac_mce_amd sp5100_tco shpchp pcspkr edac_core 
serio_raw microcode i2c_piix4 acpi_power_meter k10temp acpi_cpufreq mperf 
radeon sd_mod i2c_algo_bit crc_t10dif drm_kms_helper ttm ata_generic drm 
pata_acpi ahci libahci pata_atiixp libata hpsa i2c_core dm_mirror 
dm_region_hash dm_log dm_mod
[ 2163.674531] CPU: 25 PID: 5859 Comm: memcg_process_s Not tainted 3.11.0-rc7+ 
#1
[ 2163.674531] Hardware name: HP ProLiant DL585 G7, BIOS A16 12/31/2011
[ 2163.674532] task: 884831c99fe0 ti: 88483358a000 task.ti: 
88483358a000
[ 2163.674533] RIP: 0010:[]  [] 
smp_call_function_many+0x25e/0x2c0
[ 2163.674536] RSP: :88483358ba18  EFLAGS: 0202
[ 2163.674537] RAX: 0008 RBX: 0282 RCX: 880237c98cb0
[ 2163.674538] RDX: 0008 RSI: 0030 RDI: 
[ 2163.674539] RBP: 88483358ba68 R08: 882cd089fe00 R09: 882cebc17540
[ 2163.674540] R10: ea0120ceb200 R11: 812fc0d9 R12: 81107cf0
[ 2163.674540] R13: 88483358b9b8 R14: 0206 R15: 884831391378
[ 2163.674542] FS:  7ff5779ec740() GS:882cebc0() 
knlGS:
[ 2163.674542] CS:  0010 DS:  ES:  CR0: 8005003b
[ 2163.674543] CR2: 7f763b3dc469 CR3: 0017cd6a6000 CR4: 06e0
[ 2163.674544] Stack:
[ 2163.674544]  0001 00015200 8114b0b0 
88483bcd5200
[ 2163.674556]  0202 81d6ca80 8114b0b0 

[ 2163.674564]  0019 0001 88483358ba98 
810cd87a
[ 2163.674571] Call Trace:
[ 2163.674573]  [] ? drain_pages+0xb0/0xb0
[ 2163.674576]  [] ? drain_pages+0xb0/0xb0
[ 2163.674580]  [] on_each_cpu_mask+0x2a/0x60
[ 2163.674583]  [] drain_all_pages+0xb5/0xc0
[ 2163.674587]  [] __alloc_pages_nodemask+0x70e/0xa00
[ 2163.674591]  [] alloc_pages_current+0xa9/0x170
[ 2163.674595]  [] __page_cache_alloc+0x87/0xb0
[ 2163.674598]  [] filemap_fault+0x185/0x400
[ 2163.674602]  [] __do_fault+0x71/0x4f0
[ 2163.674605]  [] ? load_balance+0x109/0x7e0
[ 2163.674608]  [] handle_pte_fault+0x93/0xa40
[ 2163.674612]  [] handle_mm_fault+0x291/0x660
[ 2163.674615]  [] __do_page_fault+0x146/0x510
[ 2163.674619]  [] ? do_nanosleep+0x92/0x130
[ 2163.674623]  [] ? hrtimer_nanosleep+0xad/0x170
[ 2163.674626]  [] ? hrtimer_get_res+0x50/0x50
[ 2163.674629]  [] do_page_fault+0xe/0x10
[ 2163.674633]  [] page_fault+0x28/0x30
[ 2163.674635] Code: 48 94 00 89 c2 39 f0 0f 8d 2d fe ff ff 48 98 49 8b 4d 00 
48 03 0c c5 40 62 a0 81 f6 41 20 01 74 cc 0f 1f 40 00 f3 90 f6 41 20 01 <75> f8 
48 63 35 91 48 94 00 eb b7 0f b6 4d b4 48 8b 75 c0 4c 89 
[ 2163.710494] BUG: soft lockup - CPU#26 stuck for 23s! [memcg_process_s:5915]
[ 2163.710499] Modules linked in: xt_CHECKSUM tun bridge stp llc ebtable_nat 
nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat 
nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter ip_tables sg xfs libcrc32c netxen_nic 
amd64_edac_mod hpilo hpwdt edac_mce_amd sp5100_tco shpchp pcspkr edac_core 
serio_raw microcode i2c_piix4 acpi_power_meter k10temp acpi_cpufreq mperf 
radeon sd_mod i2c_algo_bit crc_t10dif drm_kms_helper ttm ata_generic drm 
pata_acpi ahci libahci pata_atiixp libata hpsa i2c_core dm_mirror 
dm_region_hash dm_log dm_mod
[ 2163.710543] CPU: 26 PID: 5915 Comm: memcg_process_s Not tainted 3.11.0-rc7+ 
#1
[ 2163.710543] Hardware name: HP ProLiant DL585 G7, BIOS A16 12/31/2011
[ 2163.710544] task: 884a2c42dfa0 ti: 884a2e7f task.ti: 
884a2e7f
[ 2163.710545] RIP: 0010:[]  [] 
smp_call_function_many+0x25e/0x2c0
[ 2163.710548] RSP: :884a2e7f1960  EFLAGS: 0202
[ 2163.710549] RAX: 0008 RBX: 88470bd15210 RCX: 880237c98cd8
[ 2163.710550] RDX: 0008 RSI: 0030 RDI: 
[ 2163.710551] RBP: 884a2e7f19b0 R08: 8846f888fe00 R09: 88470bc17540
[ 2163.710552] R10: ea011bdd9e00 R11: 812fc0d9 R12: 884a2e7f1918
[ 2163.710552] R13: 81107cf0 R14: 884a2e7f1900 R15: 

Re: [PATCH] hwmon: (htu21) Add Measurement Specialties HTU21D support

2013-08-29 Thread Guenter Roeck

On 08/29/2013 04:51 AM, Markezana, William wrote:

From: William Markezana 

hwmon: (htu21) Add Measurement Specialties HTU21D support
Signed-off-by: William Markezana 
---


Applied [ with minor formatting changes ] to -next.

Thanks,
Guenter


diff --git a/Documentation/hwmon/htu21 b/Documentation/hwmon/htu21
new file mode 100644
index 000..8082879
--- /dev/null
+++ b/Documentation/hwmon/htu21
@@ -0,0 +1,46 @@
+Kernel driver htu21
+===
+
+Supported chips:
+  * Measurement Specialties HTU21D
+Prefix: 'htu21'
+Addresses scanned: none
+Datasheet: Publicly available at the Measurement Specialties website
+ http://www.meas-spec.com/downloads/HTU21D.pdf
+
+
+Author:
+  William Markezana 
+
+Description
+---
+
+The HTU21D is a humidity and temperature sensor in a DFN package of
+only 3 x 3 mm footprint and 0.9 mm height.
+
+The devices communicate with the I2C protocol. All sensors are set to the
+same I2C address 0x40, so an entry with I2C_BOARD_INFO("htu21", 0x40) can
+be used in the board setup code.
+
+This driver does not auto-detect devices. You will have to instantiate the
+devices explicitly. Please see Documentation/i2c/instantiating-devices
+for details."
+
+sysfs-Interface
+---
+
+temp1_input - temperature input
+humidity1_input - humidity input
+
+Notes
+-
+
+The driver uses the default resolution settings of 12 bit for humidity and 14
+bit for temperature, which results in typical measurement times of 11 ms for
+humidity and 44 ms for temperature. To keep self heating below 0.1 degree
+Celsius, the device should not be active for more than 10% of the time. For
+this reason, the driver performs no more than two measurements per second and
+reports cached information if polled more frequently.
+
+Different resolutions, the on-chip heater, using the CRC checksum and reading
+the serial number are not supported yet.
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index e989f7f..55973cd 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -511,6 +511,16 @@ config SENSORS_HIH6130
   This driver can also be built as a module.  If so, the module
   will be called hih6130.

+config SENSORS_HTU21
+   tristate "Measurement Specialties HTU21D humidity/temperature sensors"
+   depends on I2C
+   help
+ If you say yes here you get support for the Measurement Specialties
+ HTU21D humidity and temperature sensors.
+
+ This driver can also be built as a module.  If so, the module
+ will be called htu21.
+
  config SENSORS_CORETEMP
 tristate "Intel Core/Core2/Atom temperature sensor"
 depends on X86
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index 4f0fb52..ec7cde0 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -65,6 +65,7 @@ obj-$(CONFIG_SENSORS_GL518SM) += gl518sm.o
  obj-$(CONFIG_SENSORS_GL520SM)  += gl520sm.o
  obj-$(CONFIG_SENSORS_GPIO_FAN) += gpio-fan.o
  obj-$(CONFIG_SENSORS_HIH6130)  += hih6130.o
+obj-$(CONFIG_SENSORS_HTU21)+= htu21.o
  obj-$(CONFIG_SENSORS_ULTRA45)  += ultra45_env.o
  obj-$(CONFIG_SENSORS_I5K_AMB)  += i5k_amb.o
  obj-$(CONFIG_SENSORS_IBMAEM)   += ibmaem.o
diff --git a/drivers/hwmon/htu21.c b/drivers/hwmon/htu21.c
new file mode 100644
index 000..743bf32
--- /dev/null
+++ b/drivers/hwmon/htu21.c
@@ -0,0 +1,201 @@
+/*
+ * Measurement Specialties HTU21D humidity and temperature sensor driver
+ *
+ * Copyright (C) 2013 William Markezana 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* HTU21 Commands */
+#define HTU21_T_MEASUREMENT_HM 0xE3
+#define HTU21_RH_MEASUREMENT_HM0xE5
+
+struct htu21 {
+   struct device *hwmon_dev;
+   struct mutex lock;
+   bool valid;
+   unsigned long last_update;
+   int temperature;
+   int humidity;
+};
+
+static inline int htu21_temp_ticks_to_millicelsius(int ticks)
+{
+   ticks &= ~0x0003; /* clear status bits */
+   /*
+* Formula T = -46.85 + 175.72 * ST / 2^16 from datasheet p14,
+* optimized for integer fixed point (3 digits) arithmetic
+*/
+   return ((21965 * ticks) >> 13) - 46850;
+}
+
+static inline int htu21_rh_ticks_to_per_cent_mille(int ticks)
+{
+   ticks &= ~0x0003; /* clear status bits */
+   /*
+* Formula RH = -6 + 125 * SRH / 2^16 from datasheet p14,
+* optimized for integer 
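
As a worked check of the temperature helper quoted above: the datasheet formula T = -46.85 + 175.72 * ST / 2^16, expressed in millidegrees, is 175720 * ST / 65536 - 46850, and dividing numerator and denominator by 8 gives 21965 * ST / 8192, i.e. (21965 * ST) >> 13. The sketch below repeats that arithmetic in a standalone function; the name temp_ticks_to_millicelsius_sketch() is hypothetical, and the intermediate product stays within a 32-bit int for 16-bit tick values (21965 * 65535 is about 1.44e9).

```c
/*
 * Standalone copy of the fixed-point reduction used by the quoted
 * htu21_temp_ticks_to_millicelsius() helper:
 *   175720 / 65536 == 21965 / 8192 == 21965 / 2^13
 */
static int temp_ticks_to_millicelsius_sketch(int ticks)
{
	ticks &= ~0x0003;	/* drop the two status bits */
	return ((21965 * ticks) >> 13) - 46850;
}
```

For example, a raw reading of 0x6000 (24576) is 3 * 8192, so the scaled term is 3 * 21965 = 65895 and the result is 65895 - 46850 = 19045, that is 19.045 degrees Celsius.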

Re: Memory synchronization vs. interrupt handlers

2013-08-29 Thread H. Peter Anvin
On 08/29/2013 04:51 PM, Paul E. McKenney wrote:
> On Wed, Aug 28, 2013 at 01:28:08PM -0700, H. Peter Anvin wrote:
>> On 08/28/2013 12:16 PM, Alan Stern wrote:
>>> Russell, Peter, and Ingo:
>>>
>>> Can you folks enlighten us regarding this issue for some common 
>>> architectures?
>>
>> On x86, IRET is a serializing instruction; it guarantees hard
>> serialization of absolutely everything.
> 
> So a second interrupt from this same device could not appear to happen
> before the IRET, no matter what device and/or I/O bus?  Or is IRET
> defined to synchronize all the way out to the whatever device is
> generating the next interrupt?

The second interrupt from this same device can occur as soon as the EOI
cycle is done, which happens before the IRET.  The EOI cycle is an I/O
operation and since integer operations to memory are strongly ordered
that implies all other effects are globally visible.

In addition, there is usually synchronization that happens due to
reading an interrupt status register or something else.

>> I would expect architectures that have weak memory ordering to put
>> appropriate barriers in the IRQ entry/exit code.
> 
> Adding a few on CC.  Also restating the question as I understand it:
> 
>   Suppose that a given device generates an interrupt on CPU 0,
>   but that before CPU 0's interrupt handler completes, this device
>   wants to generate a second interrupt on CPU 1.  This can happen
>   as soon as CPU 0's handler does an EOI or equivalent.
> 
>   Can CPU 1's interrupt handler assume that all the in-memory effects
>   of CPU 0's interrupt handler will be visible, even if neither
>   interrupt handler uses locking or memory barriers?
> 

On x86 it certainly can.

-hpa




[PATCH RESEND 3/4] usb: phy-tegra-usb: use platform_{get,set}_drvdata()

2013-08-29 Thread Libo Chen

Use the wrapper functions for getting and setting the driver data using
platform_device instead of using dev_{get,set}_drvdata() with &pdev->dev,
so we can directly pass a struct platform_device.

Signed-off-by: Libo Chen 
---
 drivers/usb/phy/phy-tegra-usb.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

rebase on usb-next tree

diff --git a/drivers/usb/phy/phy-tegra-usb.c b/drivers/usb/phy/phy-tegra-usb.c
index 3bfb3d1..e9cb1cb 100644
--- a/drivers/usb/phy/phy-tegra-usb.c
+++ b/drivers/usb/phy/phy-tegra-usb.c
@@ -1064,7 +1064,7 @@ static int tegra_usb_phy_probe(struct platform_device *pdev)
 	tegra_phy->u_phy.shutdown = tegra_usb_phy_close;
 	tegra_phy->u_phy.set_suspend = tegra_usb_phy_suspend;

-	dev_set_drvdata(&pdev->dev, tegra_phy);
+	platform_set_drvdata(pdev, tegra_phy);

 	err = usb_add_phy_dev(&tegra_phy->u_phy);
 	if (err < 0) {
-- 
1.7.1




[PATCH RESEND 2/4] usb: r8a66597-hcd: use platform_{get,set}_drvdata()

2013-08-29 Thread Libo Chen

Use the wrapper functions for getting and setting the driver data using
platform_device instead of using dev_{get,set}_drvdata() with &pdev->dev,
so we can directly pass a struct platform_device.

Signed-off-by: Libo Chen 
---
 drivers/usb/host/r8a66597-hcd.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

rebase on usb-next tree

diff --git a/drivers/usb/host/r8a66597-hcd.c b/drivers/usb/host/r8a66597-hcd.c
index a9eef68..2ad004a 100644
--- a/drivers/usb/host/r8a66597-hcd.c
+++ b/drivers/usb/host/r8a66597-hcd.c
@@ -2393,7 +2393,7 @@ static const struct dev_pm_ops r8a66597_dev_pm_ops = {

 static int r8a66597_remove(struct platform_device *pdev)
 {
-	struct r8a66597 *r8a66597 = dev_get_drvdata(&pdev->dev);
+	struct r8a66597 *r8a66597 = platform_get_drvdata(pdev);
 	struct usb_hcd  *hcd = r8a66597_to_hcd(r8a66597);

 	del_timer_sync(&r8a66597->rh_timer);
@@ -2466,7 +2466,7 @@ static int r8a66597_probe(struct platform_device *pdev)
 	}
 	r8a66597 = hcd_to_r8a66597(hcd);
 	memset(r8a66597, 0, sizeof(struct r8a66597));
-	dev_set_drvdata(&pdev->dev, r8a66597);
+	platform_set_drvdata(pdev, r8a66597);
 	r8a66597->pdata = dev_get_platdata(&pdev->dev);
 	r8a66597->irq_sense_low = irq_trigger == IRQF_TRIGGER_LOW;

-- 
1.7.1
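
Background on the conversion above, as a minimal userspace sketch. The
struct layouts are stripped-down stand-ins written for illustration (they
are assumptions, not the kernel definitions); only the wrapper relationship
mirrors the kernel, where platform_{get,set}_drvdata() forward to
dev_{get,set}_drvdata() on the embedded struct device.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (illustration only). */
struct device {
	void *driver_data;
};

struct platform_device {
	const char *name;
	struct device dev;
};

static inline void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

static inline void dev_set_drvdata(struct device *dev, void *data)
{
	dev->driver_data = data;
}

/* The platform wrappers only hide the &pdev->dev step. */
static inline void *platform_get_drvdata(const struct platform_device *pdev)
{
	return dev_get_drvdata(&pdev->dev);
}

static inline void platform_set_drvdata(struct platform_device *pdev, void *data)
{
	dev_set_drvdata(&pdev->dev, data);
}
```

Both spellings store and read the same pointer, so the patch is a
readability change rather than a functional one.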




Re: [PATCH 6/6] vhost_net: remove the max pending check

2013-08-29 Thread Jason Wang
On 08/25/2013 07:53 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 23, 2013 at 04:55:49PM +0800, Jason Wang wrote:
>> On 08/20/2013 10:48 AM, Jason Wang wrote:
>>> On 08/16/2013 06:02 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
>>> We used to limit the max pending DMAs to prevent guest from pinning too 
>>> many
>>> pages. But this could be removed since:
>>>
>>> - We have the sk_wmem_alloc check in both tun/macvtap to do the same 
>>> work
>>> - This max pending check were almost useless since it was one done when 
>>> there's
>>>   no new buffers coming from guest. Guest can easily exceeds the 
>>> limitation.
>>> - We've already check upend_idx != done_idx and switch to non zerocopy 
>>> then. So
>>>   even if all vq->heads were used, we can still does the packet 
>>> transmission.
> We can but performance will suffer.
>>> The check were in fact only done when no new buffers submitted from
>>> guest. So if guest keep sending, the check won't be done.
>>>
>>> If we really want to do this, we should do it unconditionally. Anyway, I
>>> will do test to see the result.
>> There's a bug in PATCH 5/6, the check:
>>
>> nvq->upend_idx != nvq->done_idx
>>
>> makes the zerocopy always been disabled since we initialize both
>> upend_idx and done_idx to zero. So I change it to:
>>
>> (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx.
> But what I would really like to try is limit ubuf_info to VHOST_MAX_PEND.
> I think this has a chance to improve performance since
> we'll be using less cache.
> Of course this means we must fix the code to really never submit
> more than VHOST_MAX_PEND requests.
>
> Want to try?

The result is, I see about 5%-10% improvement for per cpu throughput on
guest tx. But about 5% degradation on per cpu transaction rate on TCP_RR.
>> With this change on top, I didn't see performance difference w/ and w/o
>> this patch.
> Did you try small message sizes btw (like 1K)? Or just netperf
> default of 64K?
>

5%-10% improvement in per-cpu throughput on guest rx, but some
regression (about 5%) on guest tx. So we'd better keep the check and
make it work properly.

Will post V2 for your reviewing.
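
The index arithmetic in the fix quoted above can be checked in isolation.
This is a standalone sketch, not vhost code: struct ring and both helper
names are hypothetical, and UIO_MAXIOV is defined locally as 1024 to match
the kernel constant.

```c
#include <stdbool.h>

#define UIO_MAXIOV 1024 /* local copy of the kernel constant */

/* Hypothetical stand-in for the two indices kept per virtqueue. */
struct ring {
	unsigned int upend_idx; /* next slot to hand out */
	unsigned int done_idx;  /* oldest slot not yet completed */
};

/* Check from PATCH 5/6: false on a freshly initialized (empty) ring. */
static bool can_zerocopy_old(const struct ring *r)
{
	return r->upend_idx != r->done_idx;
}

/* Revised check: false only when advancing upend_idx would hit done_idx. */
static bool can_zerocopy_new(const struct ring *r)
{
	return (r->upend_idx + 1) % UIO_MAXIOV != r->done_idx;
}
```

With both indices at zero the old check refuses zerocopy, which is the bug
described in the quoted text; the revised check refuses only when the ring
is full, leaving one slot unused so that full and empty stay distinguishable.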


Re: [PATCH RFC v2 1/2] qspinlock: Introducing a 4-byte queue spinlock implementation

2013-08-29 Thread Waiman Long

On 08/29/2013 01:03 PM, Alexander Fyodorov wrote:

29.08.2013, 19:25, "Waiman Long":

What I have been thinking is to set a flag in an architecture specific
header file to tell if the architecture needs a memory barrier. The
generic code will then either do a smp_mb() or barrier() depending on
the presence or absence of the flag. I would prefer to do more in the
generic code, if possible.

If you use flag then you'll have to check it manually. It is better to add new 
smp_mb variant, I suggest calling it smp_mb_before_store(), and define it to 
barrier() on x86.


I am sorry that I was not clear in my previous mail. I mean a flag/macro 
for compile time checking rather than doing runtime checking.


Regards,
Longman
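
A compile-time selection along these lines might look like the sketch
below. ARCH_NEEDS_STORE_BARRIER is a made-up macro name, the userspace
definitions of barrier() and smp_mb() (GCC/Clang builtins) are stand-ins
for the kernel primitives, and smp_mb_before_store() is the name suggested
earlier in this thread, not an existing kernel interface.

```c
/* Userspace stand-ins for the kernel primitives (GCC/Clang builtins). */
#define barrier()	__asm__ __volatile__("" ::: "memory")
#define smp_mb()	__sync_synchronize()

/*
 * Hypothetical: an architecture header would define
 * ARCH_NEEDS_STORE_BARRIER if a compiler barrier is not enough.
 */
#ifdef ARCH_NEEDS_STORE_BARRIER
#define smp_mb_before_store()	smp_mb()
#else
#define smp_mb_before_store()	barrier()
#endif

struct demo_lock {
	int locked;
};

/* Generic code calls one name; the preprocessor picks the barrier. */
static void demo_unlock(struct demo_lock *l)
{
	smp_mb_before_store();
	l->locked = 0;
}
```

Either way the decision is made by the preprocessor, so the generic code
carries no runtime check.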


Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Waiman Long

On 08/29/2013 07:42 PM, Linus Torvalds wrote:
Waiman? Mind looking at this and testing? Linus 


Sure, I will try out the patch tomorrow morning and see how it works out 
for my test case.


Regards,
Longman


linux-next: manual merge of the sound tree with the drm tree

2013-08-29 Thread Stephen Rothwell
Hi Takashi,

Today's linux-next merge of the sound tree got a conflict in
sound/pci/hda/hda_intel.c between commit 246efa4a072f ("snd/hda: add
runtime suspend/resume on optimus support (v4)") from the drm tree and
commit 7d4f606c50ff ("ALSA: hda - WAKEEN feature enabling for runtime
pm") from the sound tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc sound/pci/hda/hda_intel.c
index bf5e58e,c6c9829..000
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@@ -2975,12 -2971,10 +2975,16 @@@ static int azx_runtime_suspend(struct d
struct snd_card *card = dev_get_drvdata(dev);
struct azx *chip = card->private_data;
  
 +  if (chip->disabled)
 +  return 0;
 +
 +  if (!(chip->driver_caps & AZX_DCAPS_PM_RUNTIME))
 +  return 0;
 +
+   /* enable controller wake up event */
+   azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) |
+ STATESTS_INT_MASK);
+ 
azx_stop_chip(chip);
azx_enter_link_reset(chip);
azx_clear_irq_pending(chip);
@@@ -2993,17 -2987,31 +2997,37 @@@ static int azx_runtime_resume(struct de
  {
struct snd_card *card = dev_get_drvdata(dev);
struct azx *chip = card->private_data;
+   struct hda_bus *bus;
+   struct hda_codec *codec;
+   int status;
  
 +  if (chip->disabled)
 +  return 0;
 +
 +  if (!(chip->driver_caps & AZX_DCAPS_PM_RUNTIME))
 +  return 0;
 +
if (chip->driver_caps & AZX_DCAPS_I915_POWERWELL)
hda_display_power(true);
+ 
+   /* Read STATESTS before controller reset */
+   status = azx_readw(chip, STATESTS);
+ 
azx_init_pci(chip);
azx_init_chip(chip, 1);
+ 
+   bus = chip->bus;
+   if (status && bus) {
+   list_for_each_entry(codec, &bus->codec_list, list)
+   if (status & (1 << codec->addr))
+   queue_delayed_work(codec->bus->workq,
+  &codec->jackpoll_work, codec->jackpoll_interval);
+   }
+ 
+   /* disable controller Wake Up event*/
+   azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) &
+   ~STATESTS_INT_MASK);
+ 
return 0;
  }
  




Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Benjamin Herrenschmidt
On Thu, 2013-08-29 at 19:35 -0700, Linus Torvalds wrote:
> That said, on power, you have that "ACCESS_ONCE()" implicit in the
> *type*, not in the code, so an "arch_spinlock_t" is fundamentally
> volatile in itself. It's one of the reasons I despise "volatile":
> things like volatility are _not_ attributes of a variable or a type,
> but of the code in question. Something can be volatile in one context,
> but not in another (one context might be locked, for example).

Right, we can probably change that to use ACCESS_ONCE... volatile tends
to never quite do what you expect anyway.

Cheers,
Ben.




Re: [PATCH v2 0/4] Unify CPU hotplug lock interface

2013-08-29 Thread Yasuaki Ishimatsu
(2013/08/30 9:22), Toshi Kani wrote:
> lock_device_hotplug() was recently introduced to serialize CPU & Memory
> online/offline and hotplug operations, along with sysfs online interface
> restructure (commit 4f3549d7).  With this new locking scheme,
> cpu_hotplug_driver_lock() is redundant and is no longer necessary.
> 
> This patchset makes sure that lock_device_hotplug() covers all CPU online/
> offline interfaces, and then removes cpu_hotplug_driver_lock().
> 
> v2:
>   - Rebased to the pm tree, bleeding-edge.
>   - Changed patch 2/4 to use lock_device_hotplug_sysfs().
> 
> ---
> Toshi Kani (4):
>hotplug, x86: Fix online state in cpu0 debug interface
>hotplug, x86: Add hotplug lock to missing places
>hotplug, x86: Disable ARCH_CPU_PROBE_RELEASE on x86
>hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock()
> 
> ---
The patch-set looks good to me.

Acked-by: Yasuaki Ishimatsu 

Thanks,
Yasuaki Ishimatsu


>   arch/powerpc/kernel/smp.c  | 12 --
>   arch/powerpc/platforms/pseries/dlpar.c | 40 
> +-
>   arch/x86/Kconfig   |  4 
>   arch/x86/kernel/smpboot.c  | 21 --
>   arch/x86/kernel/topology.c | 11 ++
>   drivers/base/cpu.c | 34 +++--
>   include/linux/cpu.h| 13 ---
>   7 files changed, 45 insertions(+), 90 deletions(-)
> 




Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Benjamin Herrenschmidt
On Thu, 2013-08-29 at 19:31 -0700, Linus Torvalds wrote:

> Also, on x86, there are no advantages to cmpxchg over a spinlock -
> they are both exactly one equally serializing instruction. If
> anything, cmpxchg is worse due to having a cache read before the
> write, and a few cycles slower anyway. So I actually expect the x86
> code to slow down a tiny bit for the single-threaded case, although
> that should be hopefully unmeasurable.
> 
> On POWER, you may have much less serialization for the cmpxchg. That
> may sadly be something we'll need to fix - the serialization between
> getting a lockref and checking sequence counts etc may need some extra
> work.

> So it may be that you are seeing unrealistically good numbers, and
> that we will need to add a memory barrier or two. On x86, due to the
> locked instruction semantics, that just isn't an issue.

Dunno, our cmpxchg has both acquire and release barriers. It basically
does release, xchg, then acquire. So it is equivalent to an unlock
followed by a lock.

> > The numbers move around about 10% from run to run.
> 
> Please note that the whole "dentry hash chains may be better" for one
> run vs another, and that's something that will _persist_ between
> subsequent runs, so you may see "only 10% variability", but there may
> be a bigger picture variability that you're not noticing because you
> had to reboot in between.
> 
> To be really comparable, you should really run the stupid benchmark
> after fairly equal boot up sequences. If the machine had been up for
> several days for one set of numbers, and freshly rebooted for the
> other, it can be a very unfair comparison.
> 
> (I long ago had a nice "L1 dentry cache" patch that helped with the
> fact that the dentry chains *can* get long especially if you have tons
> of memory, and that helped with this kind of variability a lot - and
> improved performance too. It was slightly racy, though, which is why
> it never got merged).
> 
> > powerpc patch below. I'm using arch_spin_is_locked() to implement
> > arch_spin_value_unlocked().
> 
> Your "slock" is of type "volatile unsigned int slock", so it may well
> cause those temporaries to be written to memory.
> 
> It probably doesn't matter, but you may want to check that the result
> of "make lib/lockref.s" looks ok.
> 
>  Linus




Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Linus Torvalds
On Thu, Aug 29, 2013 at 7:30 PM, Benjamin Herrenschmidt
 wrote:
>
> Or we can keep both completely separate like Linus does on x86.

I did it that way mainly to minimize the patch.

I agree with you that it probably makes sense to layer them the other
way around from what Michael's patch did, iow implement
arch_spin_is_locked() in terms of arch_spin_value_unlocked().

That said, on power, you have that "ACCESS_ONCE()" implicit in the
*type*, not in the code, so an "arch_spinlock_t" is fundamentally
volatile in itself. It's one of the reasons I despise "volatile":
things like volatility are _not_ attributes of a variable or a type,
but of the code in question. Something can be volatile in one context,
but not in another (one context might be locked, for example).

  Linus


Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Benjamin Herrenschmidt
On Fri, 2013-08-30 at 12:06 +1000, Michael Neuling wrote:

> powerpc patch below. I'm using arch_spin_is_locked() to implement
> arch_spin_value_unlocked().

>  
> +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> +{
> + return !arch_spin_is_locked(&lock);
> +}
> +

Arguably, it should be done the other way around :-) 

arch_spin_value_unlocked semantics is to basically operate on an already
read copy of the value, while arch_spin_is_locked() has ACCESS_ONCE
semantics on *top* of that.

Or we can keep both completely separate like Linus does on x86.

Cheers,
Ben.
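
The layering suggested above could look roughly like the following
userspace sketch. The struct and the demo_ names are illustrative
stand-ins, not the powerpc definitions, and ACCESS_ONCE is re-created
locally using the GCC/Clang __typeof__ extension.

```c
/* Local re-creation of the kernel macro (needs GCC/Clang __typeof__). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Illustrative lock type: 0 means unlocked. */
typedef struct {
	unsigned int slock;
} demo_spinlock_t;

/* Operates on a copy that the caller has already read. */
static inline int demo_spin_value_unlocked(demo_spinlock_t lock)
{
	return lock.slock == 0;
}

/* Adds the single forced read on top of the by-value helper. */
static inline int demo_spin_is_locked(demo_spinlock_t *lock)
{
	demo_spinlock_t snapshot;

	snapshot.slock = ACCESS_ONCE(lock->slock);
	return !demo_spin_value_unlocked(snapshot);
}
```

Here the by-value helper carries no volatile semantics of its own; the
forced read lives only in the pointer-taking variant.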




Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Linus Torvalds
On Thu, Aug 29, 2013 at 7:06 PM, Michael Neuling  wrote:
>
> Running on a POWER7 here with 32 threads (8 cores x 4 threads) I'm
> getting some good improvements:

That's *much* better than I get. But I literally just have a single
socket with two cores (and HT, so four threads) in my test machine, so
I really have a hard time getting any real contention. And the main
advantage of the patch should be when you actually have CPU's spinning
on that dentry d_lock.

Also, on x86, there are no advantages to cmpxchg over a spinlock -
they are both exactly one equally serializing instruction. If
anything, cmpxchg is worse due to having a cache read before the
write, and a few cycles slower anyway. So I actually expect the x86
code to slow down a tiny bit for the single-threaded case, although
that should be hopefully unmeasurable.

On POWER, you may have much less serialization for the cmpxchg. That
may sadly be something we'll need to fix - the serialization between
getting a lockref and checking sequence counts etc may need some extra
work.

So it may be that you are seeing unrealistically good numbers, and
that we will need to add a memory barrier or two. On x86, due to the
locked instruction semantics, that just isn't an issue.

> The numbers move around about 10% from run to run.

Please note that the whole "dentry hash chains may be better" for one
run vs another, and that's something that will _persist_ between
subsequent runs, so you may see "only 10% variability", but there may
be a bigger picture variability that you're not noticing because you
had to reboot in between.

To be really comparable, you should really run the stupid benchmark
after fairly equal boot up sequences. If the machine had been up for
several days for one set of numbers, and freshly rebooted for the
other, it can be a very unfair comparison.

(I long ago had a nice "L1 dentry cache" patch that helped with the
fact that the dentry chains *can* get long especially if you have tons
of memory, and that helped with this kind of variability a lot - and
improved performance too. It was slightly racy, though, which is why
it never got merged).

> powerpc patch below. I'm using arch_spin_is_locked() to implement
> arch_spin_value_unlocked().

Your "slock" is of type "volatile unsigned int slock", so it may well
cause those temporaries to be written to memory.

It probably doesn't matter, but you may want to check that the result
of "make lib/lockref.s" looks ok.

 Linus
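
For context, the cmpxchg fast path under discussion can be sketched in
portable C11. This is not lib/lockref.c: the 64-bit layout (lock word in
the low 32 bits, count in the high 32 bits), the demo_ names, and the use
of <stdatomic.h> are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packing: low 32 bits = lock word, high 32 bits = count. */
struct demo_lockref {
	_Atomic uint64_t lock_count;
};

static inline uint32_t demo_lock_word(uint64_t v) { return (uint32_t)v; }
static inline uint32_t demo_count(uint64_t v) { return (uint32_t)(v >> 32); }

/*
 * Try to bump the count without taking the lock. Returns false if the
 * lock word is non-zero, in which case a caller would fall back to
 * taking the spinlock.
 */
static bool demo_lockref_get_fast(struct demo_lockref *ref)
{
	uint64_t old = atomic_load(&ref->lock_count);

	while (demo_lock_word(old) == 0) {
		uint64_t next = old + (UINT64_C(1) << 32);

		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&ref->lock_count, &old, next))
			return true;
	}
	return false;
}
```

The sketch uses the default sequentially consistent ordering of
<stdatomic.h>; how much ordering the kernel fast path needs on each
architecture is the open question raised in this message.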


Re: [PATCH net-next] drivers:net: Convert dma_alloc_coherent(...__GFP_ZERO) to dma_zalloc_coherent

2013-08-29 Thread David Miller
From: Joe Perches 
Date: Mon, 26 Aug 2013 22:45:23 -0700

> __GFP_ZERO is an uncommon flag and perhaps is better
> not used.  static inline dma_zalloc_coherent exists
> so convert the uses of dma_alloc_coherent with __GFP_ZERO
> to the more common kernel style with zalloc.
> 
> Remove memset from the static inline dma_zalloc_coherent
> and add just one use of __GFP_ZERO instead.
> 
> Trivially reduces the size of the existing uses of
> dma_zalloc_coherent.
> 
> Realign arguments as appropriate.
> 
> Signed-off-by: Joe Perches 

Applied, thanks a lot Joe.


[PATCH 4/4] mm/arch: use NUMA_NODE

2013-08-29 Thread Jianguo Wu
Use more appropriate NUMA_NO_NODE instead of -1 in some archs' module_alloc()

Signed-off-by: Jianguo Wu 
---
 arch/arm/kernel/module.c|2 +-
 arch/arm64/kernel/module.c  |2 +-
 arch/mips/kernel/module.c   |2 +-
 arch/parisc/kernel/module.c |2 +-
 arch/s390/kernel/module.c   |2 +-
 arch/sparc/kernel/module.c  |2 +-
 arch/x86/kernel/module.c|2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 85c3fb6..8f4cff3 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -40,7 +40,7 @@
 void *module_alloc(unsigned long size)
 {
return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-   GFP_KERNEL, PAGE_KERNEL_EXEC, -1,
+   GFP_KERNEL, PAGE_KERNEL_EXEC, NUMA_NO_NODE,
__builtin_return_address(0));
 }
 #endif
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index ca0e3d5..8f898bd 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -29,7 +29,7 @@
 void *module_alloc(unsigned long size)
 {
return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-   GFP_KERNEL, PAGE_KERNEL_EXEC, -1,
+   GFP_KERNEL, PAGE_KERNEL_EXEC, NUMA_NO_NODE,
__builtin_return_address(0));
 }
 
diff --git a/arch/mips/kernel/module.c b/arch/mips/kernel/module.c
index 977a623..b507e07 100644
--- a/arch/mips/kernel/module.c
+++ b/arch/mips/kernel/module.c
@@ -46,7 +46,7 @@ static DEFINE_SPINLOCK(dbe_lock);
 void *module_alloc(unsigned long size)
 {
return __vmalloc_node_range(size, 1, MODULE_START, MODULE_END,
-   GFP_KERNEL, PAGE_KERNEL, -1,
+   GFP_KERNEL, PAGE_KERNEL, NUMA_NO_NODE,
__builtin_return_address(0));
 }
 #endif
diff --git a/arch/parisc/kernel/module.c b/arch/parisc/kernel/module.c
index 2a625fb..50dfafc 100644
--- a/arch/parisc/kernel/module.c
+++ b/arch/parisc/kernel/module.c
@@ -219,7 +219,7 @@ void *module_alloc(unsigned long size)
 * init_data correctly */
return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
GFP_KERNEL | __GFP_HIGHMEM,
-   PAGE_KERNEL_RWX, -1,
+   PAGE_KERNEL_RWX, NUMA_NO_NODE,
__builtin_return_address(0));
 }
 
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c
index 7845e15..b89b591 100644
--- a/arch/s390/kernel/module.c
+++ b/arch/s390/kernel/module.c
@@ -50,7 +50,7 @@ void *module_alloc(unsigned long size)
if (PAGE_ALIGN(size) > MODULES_LEN)
return NULL;
return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-   GFP_KERNEL, PAGE_KERNEL, -1,
+   GFP_KERNEL, PAGE_KERNEL, NUMA_NO_NODE,
__builtin_return_address(0));
 }
 #endif
diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
index 4435488..97655e0 100644
--- a/arch/sparc/kernel/module.c
+++ b/arch/sparc/kernel/module.c
@@ -29,7 +29,7 @@ static void *module_map(unsigned long size)
if (PAGE_ALIGN(size) > MODULES_LEN)
return NULL;
return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
-   GFP_KERNEL, PAGE_KERNEL, -1,
+   GFP_KERNEL, PAGE_KERNEL, NUMA_NO_NODE,
__builtin_return_address(0));
 }
 #else
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 216a4d7..18be189 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -49,7 +49,7 @@ void *module_alloc(unsigned long size)
return NULL;
return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL_EXEC,
-   -1, __builtin_return_address(0));
+   NUMA_NO_NODE, __builtin_return_address(0));
 }
 
 #ifdef CONFIG_X86_32
-- 
1.7.1




Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Michael Neuling
> Anyway, I'm attaching my completely mindless test program. It has
> hacky things like "unsigned long count[MAXTHREADS][32]" which are
> purely to just spread out the counts so that they aren't in the same
> cacheline etc.
> 
> Also note that the performance numbers it spits out depend a lot on
> things like how long the dcache hash chains etc are, so they are not
> really reliable. Running the test-program right after reboot when the
> dentries haven't been populated can result in much higher numbers -
> without that having anything to do with contention or locking at all.

Running on a POWER7 here with 32 threads (8 cores x 4 threads) I'm
getting some good improvements:

  Without patch:
# ./t
Total loops: 3730618

  With patch:
# ./t
Total loops: 16826271

The numbers move around about 10% from run to run.  I didn't change your
program at all, so it's still running with MAXTHREADS 16.

powerpc patch below. I'm using arch_spin_is_locked() to implement
arch_spin_value_unlocked().

Mikey

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9cf59816d..4a3f86b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -139,6 +139,7 @@ config PPC
select OLD_SIGSUSPEND
select OLD_SIGACTION if PPC32
select HAVE_DEBUG_STACKOVERFLOW
+   select ARCH_USE_CMPXCHG_LOCKREF
 
 config EARLY_PRINTK
bool
diff --git a/arch/powerpc/include/asm/spinlock.h 
b/arch/powerpc/include/asm/spinlock.h
index 5b23f91..65c25272 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -156,6 +156,11 @@ extern void arch_spin_unlock_wait(arch_spinlock_t *lock);
do { while (arch_spin_is_locked(lock)) cpu_relax(); } while (0)
 #endif
 
+static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
+{
+   return !arch_spin_is_locked(&lock);
+}
+
 /*
  * Read-write spinlocks, allowing multiple readers
  * but only one writer.


Re: [PATCH 5/8] rcu: eliminate deadlock for rcu read site

2013-08-29 Thread Paul E. McKenney
On Mon, Aug 26, 2013 at 10:39:32AM +0800, Lai Jiangshan wrote:
> On 08/26/2013 01:43 AM, Paul E. McKenney wrote:
> > On Sun, Aug 25, 2013 at 11:19:37PM +0800, Lai Jiangshan wrote:
> >> Hi, Steven
> >>
> >> Any comments about this patch?
> > 
> > For whatever it is worth, it ran without incident for two hours worth
> > of rcutorture on my P5 test (boosting but no CPU hotplug).
> > 
> > Lai, do you have a specific test for this patch?  
> 
> Also rcutorture.
> (A special module is added to ensure all paths of my code are covered.)

OK, good!  Could you please send along your rcutorture changes as well?

Also, it would be good to have Steven Rostedt's Acked-by or Reviewed-by.

> > Your deadlock
> > scenario looks plausible, but is apparently not occurring in the
> > mainline kernel.
> 
> Yes, you can leave this possible bug until the real problem happens
> or just disallow overlapping.
> I can write some debug code for it which allow us find out
> the problems earlier.
> 
> I guess this is an useful usage pattern of rcu:
> 
> again:
>   rcu_read_lock();
>   obj = read_dereference(ptr);
>   spin_lock_XX(obj->lock);
>   if (obj is invalid) {
>   spin_unlock_XX(obj->lock);
>   rcu_read_unlock();
>   goto again;
>   }
>   rcu_read_unlock();
>   # use obj
>   spin_unlock_XX(obj->lock);
> 
> If we encourage this pattern, we should fix all the related problems.

Given that I have had to ask people to move away from this pattern,
it would be good to allow it to work.  The transformation to currently
permitted usage is as follows, for whatever it is worth:

again:
disable_XX();
rcu_read_lock();
obj = read_dereference(ptr);
spin_lock(obj->lock);
if (obj is invalid) {
spin_unlock_XX(obj->lock);
rcu_read_unlock();
goto again;
}
rcu_read_unlock();
# use obj
spin_unlock_XX(obj->lock);

In mainline, this prevents preemption within the RCU read-side critical
section, avoiding the problem.

That said, if we allow your original pattern, that would be even better!

Thanx, Paul

> Thanks,
> Lai
> 
> > 
> > Thanx, Paul
> > 
> >> Thanks,
> >> Lai
> >>
> >>
> >> On Fri, Aug 23, 2013 at 2:26 PM, Lai Jiangshan  
> >> wrote:
> >>
> >>> [PATCH] rcu/rt_mutex: eliminate a kind of deadlock for rcu read site
> >>>
> >>> Current rtmutex's lock->wait_lock doesn't disables softirq nor irq, it 
> >>> will
> >>> cause rcu read site deadlock when rcu overlaps with any
> >>> softirq-context/irq-context lock.
> >>>
> >>> @L is a spinlock of softirq or irq context.
> >>>
> >>> CPU1cpu2(rcu boost)
> >>> rcu_read_lock() rt_mutext_lock()
> >>>   raw_spin_lock(lock->wait_lock)
> >>> spin_lock_XX(L)>>> irq>
> >>> rcu_read_unlock() do_softirq()
> >>>   rcu_read_unlock_special()
> >>> rt_mutext_unlock()
> >>>   raw_spin_lock(lock->wait_lock)spin_lock_XX(L)  **DEADLOCK**
> >>>
> >>> This patch fixes this kind of deadlock by removing rt_mutext_unlock() from
> >>> rcu_read_unlock(), new rt_mutex_rcu_deboost_unlock() is called instead.
> >>> Thus rtmutex's lock->wait_lock will not be called from rcu_read_unlock().
> >>>
> >>> This patch does not eliminate all kinds of rcu-read-site deadlock,
> >>> if @L is a scheduler lock, it will be deadlock, we should apply Paul's 
> >>> rule
> >>> in this case.(avoid overlapping or preempt_disable()).
> >>>
> >>> rt_mutex_rcu_deboost_unlock() requires the @waiter is queued, so we
> >>> can't directly call rt_mutex_lock() in the rcu_boost thread,
> >>> we split rt_mutex_lock() into two steps just like pi-futex.
> >>> This result a internal state in rcu_boost thread and cause
> >>> rcu_boost thread a bit more complicated.
> >>>
> >>> Thanks
> >>> Lai
> >>>
> >>> diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> >>> index 5cd0f09..8830874 100644
> >>> --- a/include/linux/init_task.h
> >>> +++ b/include/linux/init_task.h
> >>> @@ -102,7 +102,7 @@ extern struct group_info init_groups;
> >>>
> >>>  #ifdef CONFIG_RCU_BOOST
> >>>  #define INIT_TASK_RCU_BOOST()  \
> >>> -   .rcu_boost_mutex = NULL,
> >>> +   .rcu_boost_waiter = NULL,
> >>>  #else
> >>>  #define INIT_TASK_RCU_BOOST()
> >>>  #endif
> >>> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >>> index e9995eb..1eca99f 100644
> >>> --- a/include/linux/sched.h
> >>> +++ b/include/linux/sched.h
> >>> @@ -1078,7 +1078,7 @@ struct task_struct {
> >>> struct rcu_node *rcu_blocked_node;
> >>>  #endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
> >>>  #ifdef CONFIG_RCU_BOOST
> >>> -   struct rt_mutex *rcu_boost_mutex;
> >>> +   struct rt_mutex_waiter *rcu_boost_waiter;

Re: [PATCH] mmc/dw_mmc: Add support for ARC

2013-08-29 Thread Chris Ball
Hi,

On Thu, Aug 29 2013, Seungwon Jeon wrote:
> On Thu, August 29, 2013, Mischa Jonker wrote:
>> Adapt Kconfig to include ARC in supported architectures
>> 
>> Signed-off-by: Mischa Jonker 
> Acked-by: Seungwon Jeon 

Thanks, pushed to mmc-next for 3.12.

- Chris.
-- 
Chris Ball  


Re: [PATCH] Avoid useless inodes and dentries reclamation

2013-08-29 Thread Dave Chinner
On Thu, Aug 29, 2013 at 11:36:10AM -0700, Dave Hansen wrote:
> The new shrinker infrastructure in mmotm looks like it will make this
> problem worse.
> 
> old code:
> shrink_slab()
>   for_each_shrinker {
>   do_shrinker_shrink(); // one per batch
>   prune_super()
>   grab_super_passive()
>   }
> }

I think you've simplified it down too far. The current code does:

for_each_shrinker {
max_pass = do_shrinker_shrink(0);
// ^^ does grab_super_passive()

while(total_scan >= batch_size) {
do_shrinker_shrink(0)
// ^^ does grab_super_passive()
do_shrinker_shrink(batch_size)
// ^^ does grab_super_passive()
}
}

> Which means we've got at _most_ one grab_super_passive() per batch.

No, there are two: one count, one scan per batch.

> The new code is something like this:
>
> shrink_slab()
> {
>   list_for_each_entry(shrinker, &shrinker_list, list) {
> for_each_node_mask(... shrinkctl->nodes_to_scan) {
>   shrink_slab_node()
>   }
>   }
> }

Right, but what you are missing here is that the nodemask passed in
to shrink_slab() only has a single node bit set during reclaim -
the bit that matches the zone being reclaimed from.

drop_slab(), OTOH, does:

nodes_setall(shrink.nodes_to_scan);

before calling shrink_slab in a loop because it's trying to free
*everything*, and that's why the shrink_slab() code handles that
case.

> shrink_slab_node()
> {
> max_pass = shrinker->count_objects(shrinker, shrinkctl);
>   // ^^ does grab_super_passive()
>   ...
>   while (total_scan >= batch_size) {
>   ret = shrinker->scan_objects(shrinker, shrinkctl);
>   // ^^ does grab_super_passive()
>   }
> }
> 
> We've got an extra grab_super_passive()s in the case where we are
> actually doing a scan, plus we've got the extra for_each_node_mask()
> loop.  That means even more lock acquisitions in the multi-node NUMA
> case, which is exactly where we want to get rid of global lock acquisitions.

I disagree.  With direct memory reclaim, we have an identical number
of calls to shrink_slab() occurring, and each target a single node.
Hence there is typically a 1:1 call ratio for
shrink_slab:shrink_slab_node. And because shrink_slab_node() has one
less callout per batch iteration, there is an overall reduction in
the number of grab_super_passive calls from the shrinker. Worst case
is no change, best case is a 50% reduction in the number of calls.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
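
The arithmetic behind the "worst case no change, best case 50%" estimate
can be written out. The helper names are hypothetical, and the per-batch
costs are read directly off the two pseudocode listings above, assuming
one node per shrink_slab() call: the old loop makes one initial count call
plus a count and a scan per batch, the new one makes one count call plus
a scan per batch.

```c
/* grab_super_passive() calls per shrinker, old do_shrinker_shrink() loop. */
static unsigned long old_calls(unsigned long batches)
{
	return 1 + 2 * batches; /* initial count, then count + scan per batch */
}

/* Same, for the count_objects()/scan_objects() split. */
static unsigned long new_calls(unsigned long batches)
{
	return 1 + batches; /* one count, then one scan per batch */
}
```

With zero batches both make a single call; as the batch count grows, the
ratio new_calls/old_calls tends to 1/2.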


Re: [PATCH] include/asm-generic/gpio.h: remove the call for __gpio_get_value() and __gpio_set_value() when GPIOLIB disabled

2013-08-29 Thread Chen Gang
On 08/29/2013 08:08 PM, Linus Walleij wrote:
> On Mon, Aug 26, 2013 at 12:08 PM, Chen Gang  wrote:
> 
>> When GPIOLIB disabled, __gpio_get_value() and __gpio_set_value() will
>> not implement, so need remove them, or compiling fails.
>>
>> e.g. (allmodconfig for h8300)
>>
>> CC  arch/h8300/kernel/h8300_ksyms.o
>>   In file included from arch/h8300/include/generated/asm/gpio.h:1:0,
>>from arch/h8300/kernel/h8300_ksyms.c:17:
>>   include/asm-generic/gpio.h: In function 'gpio_get_value_cansleep':
>>   include/asm-generic/gpio.h:270:2: error: implicit declaration of function 
>> '__gpio_get_value' [-Werror=implicit-function-declaration]
>> return __gpio_get_value(gpio);
>> ^
>>
>> For __gpio_get_value(), according to its implementation, it is enough
>> to use "return 0" instead of, and for __gpio_set_value(), just remove
>> directly.
>>
>>
>> Signed-off-by: Chen Gang 
> 
> NAK, this is not how you do it. This fallback path is for GENERIC_GPIO
> without GPIOLIB. Including it from a file indicates that you *are*
> using GENERIC_GPIO when GPIOLIB is not activated. It can not be
> used to stub out gpiolib.
> 

Hmm... what you said above sounds reasonable to me (at least, it is
acceptable).


> This is a better alternative: let h8300 select
> ARCH_HAVE_CUSTOM_GPIO_H
> 
> This is quite common:
> $ git grep ARCH_HAVE_CUSTOM_GPIO_H
> arch/arm/Kconfig:   select ARCH_HAVE_CUSTOM_GPIO_H
> arch/avr32/Kconfig: select ARCH_HAVE_CUSTOM_GPIO_H
> arch/blackfin/Kconfig:  select ARCH_HAVE_CUSTOM_GPIO_H
> arch/m68k/Kconfig.cpu:  select ARCH_HAVE_CUSTOM_GPIO_H
> arch/mips/Kconfig:  select ARCH_HAVE_CUSTOM_GPIO_H
> arch/sh/Kconfig:select ARCH_HAVE_CUSTOM_GPIO_H
> arch/unicore32/Kconfig: select ARCH_HAVE_CUSTOM_GPIO_H
> 
> Then put your stubs for __gpio_get_value() etc in
> arch/h8300/include/asm/gpio.h
> 
> This way the h8300 will have a stub implementation if GPIOLIB
> is not selected.
> 
> Be sure to put a comment about this in that file.
> 

That sounds a standard way for it, thanks.
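
A minimal sketch of what such stubs might look like, assuming the
signatures match the generic __gpio_get_value()/__gpio_set_value()
helpers and that returning 0 from the getter is acceptable, as the
original patch description suggested. This is illustrative only, not a
tested arch/h8300/include/asm/gpio.h:

```c
/*
 * Stub implementations for builds where GPIOLIB is not selected.
 * There is no GPIO hardware behind these, so reads report 0 and
 * writes are ignored.
 */
static inline int __gpio_get_value(unsigned gpio)
{
	(void)gpio;
	return 0;
}

static inline void __gpio_set_value(unsigned gpio, int value)
{
	(void)gpio;
	(void)value;
}
```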

Hmm... but it seems h8300 may not need to include "gpio.h" at all. I am
discussing this with Geert in another thread (what he said seems
correct, and I am now verifying it).

Thanks.


> Note: I'm still a bit rookie as GPIO maintainer so if Grant or
> Russell tells me I'm telling you wrong things: listen to them.
> 

Thank you for your modesty and honesty.

Suggestions or additions from other members are welcome as well.


Thanks.

> Yours,
> Linus Walleij
> 
> 


-- 
Chen Gang


Re: RFC: Bug in error handling in gpiolib.c

2013-08-29 Thread Alexandre Courbot
On Fri, Aug 30, 2013 at 2:45 AM, Linus Walleij  wrote:
> On Tue, Aug 27, 2013 at 9:17 PM, Tim Bird  wrote:
>
>> Hi all,
>>
>> There appears to be a bug in the error handling in
> >> drivers/gpio/gpiolib.c.  In certain error cases
>> desc_to_gpio() is called to get the gpio number
>> for an error message, but this may happen on code
>> paths where desc->chip is NULL.  This causes a panic
>> on my system in gpiod_request(), as follows:
> (...)
>> Here's my patch:
>> Subject: [PATCH] gpio: avoid panic on NULL desc->chip in gpiod_request
>
> Patch applied. Unless we come up with something better,
> there is some parallel discussion on how to handle NULL
> descriptors.
>
> Alexandre: OK to apply this?

For this particular case I think the following would be preferable:

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index ff0fd65..d900bf1 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1398,7 +1398,7 @@ static int gpiod_request(struct gpio_desc *desc, const char *label)
 	int			status = -EPROBE_DEFER;
 	unsigned long		flags;
 
-	if (!desc) {
+	if (!desc || !desc->chip) {
 		pr_warn("%s: invalid GPIO\n", __func__);
 		return -EINVAL;
 	}
@@ -1406,8 +1406,6 @@ static int gpiod_request(struct gpio_desc *desc, const char *label)
 	spin_lock_irqsave(&gpio_lock, flags);
 
 	chip = desc->chip;
-	if (chip == NULL)
-		goto done;
 
 	if (!try_module_get(chip->owner))
 		goto done;

Since we are going to fail because the chip is missing anyway, we can
as well do it from the start. A descriptor without a chip is invalid
anyway.
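
The same early-exit pattern in standalone form (a sketch only; struct
chip_model, struct desc_model and request_model() are hypothetical
stand-ins, not gpiolib types, and the -ENODEV case is invented for the
example):

```c
#include <errno.h>
#include <stddef.h>

struct chip_model {
	int available;
};

struct desc_model {
	struct chip_model *chip;
};

static int request_model(struct desc_model *desc)
{
	/* || short-circuits, so desc->chip is only read when desc is non-NULL. */
	if (!desc || !desc->chip)
		return -EINVAL;

	/* From here on desc->chip can be dereferenced without further checks. */
	return desc->chip->available ? 0 : -ENODEV;
}
```

Rejecting both cases up front means no later code path can reach an
error message that dereferences a missing chip.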

But this (and the other thread) stresses the fact that error handling
in gpiolib needs some more love. I'm convinced it can be simplified -
will try to look at it sometime soon.

Alex.


Re: [PATCH v4 5/5] clk: dt: binding for basic gate clock

2013-08-29 Thread Haojian Zhuang
On 22 August 2013 13:53, Mike Turquette  wrote:
> Device Tree binding for the basic clock gate, plus the setup function to
> register the clock.  Based on the existing fixed-clock binding.
>
> A different approach to this was proposed in 2012[1] and a similar
> binding was proposed more recently[2] if anyone wants some extra
> reading.
>
> [1] http://article.gmane.org/gmane.linux.documentation/5679
> [2] 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2012-December/137878.html
>
> Tero Kristo contributed helpful bug fixes to this patch.
>
> Signed-off-by: Mike Turquette 
> Tested-by: Heiko Stuebner 
> Reviewed-by: Heiko Stuebner 
> ---
> Changes since v3:
> * replaced underscores with dashes in DT property names
> * bail from of clock setup function early if of_iomap fails
> * removed unnecessary explicit cast
>
>  .../devicetree/bindings/clock/gate-clock.txt   | 36 +
>  drivers/clk/clk-gate.c | 47 
> ++
>  include/linux/clk-provider.h   |  2 +
>  3 files changed, 85 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/clock/gate-clock.txt
>
> diff --git a/Documentation/devicetree/bindings/clock/gate-clock.txt 
> b/Documentation/devicetree/bindings/clock/gate-clock.txt
> new file mode 100644
> index 000..1c0d4d5
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/clock/gate-clock.txt
> @@ -0,0 +1,36 @@
> +Binding for simple gate clock.
> +
> +This binding uses the common clock binding[1].  It assumes a
> +register-mapped clock gate controlled by a single bit that has only one
> +input clock or parent.  By default setting the specified bit gates the
> +clock signal and clearing the bit ungates it.
> +
> +The binding must provide the register to control the gate and the bit
> +shift for the corresponding gate control bit. Some clocks set the bit to
> +gate the clock signal, and clear it to ungate the clock signal. The
> +optional "set-bit-to-disable" specifies this behavior.
> +
> +[1] Documentation/devicetree/bindings/clock/clock-bindings.txt
> +
> +Required properties:
> +- compatible : shall be "gate-clock".
> +- #clock-cells : from common clock binding; shall be set to 0.
> +- clocks : link to phandle of parent clock
> +- reg : base address for register controlling adjustable gate
> +- bit-shift : bit shift for programming the clock gate
> +
> +Optional properties:
> +- clock-output-names : from common clock binding.
> +- set-bit-to-disable : inverts default gate programming. Setting the bit
> +  gates the clock and clearing the bit ungates the clock.
> +- hiword-mask : lower half of the register controls the gate, upper half
> +  of the register indicates bits that were updated in the lower half
> +
> +Examples:
> +   clock_foo: clock_foo@4a008100 {
> +   compatible = "gate-clock";
> +   #clock-cells = <0>;
> +   clocks = <&clock_bar>;
> +   reg = <0x4a008100 0x4>
> +   bit-shift = <3>

There was some debate about my Hi3620 clock binding patch set.

I defined each clock as its own clock node, and some of them share one
register. Stephen objected to this, since it means multiple clock nodes
share one register.

Now the same thing exists in Mike's patch, which could also support
this behavior. And it's very common for one register to be shared among
multiple clocks in every SoC. Which one should I follow?
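
To make the shared-register case concrete, here is a small userspace
model (hypothetical names throughout; a plain array stands in for the
mapped register bank, and it assumes that setting the bit enables the
clock). The bank is set up once and each gate only carries a register
index and a bit shift:

```c
#include <stdint.h>

/* Hypothetical: one register bank, mapped once and shared by its gates. */
struct clk_bank_model {
	volatile uint32_t *regs;
};

struct gate_model {
	struct clk_bank_model *bank;
	unsigned int reg_index;	/* which 32-bit register in the bank */
	unsigned int bit_shift;	/* which bit controls this gate */
};

/* Read-modify-write so that other gates in the same register are kept. */
static void gate_model_enable(const struct gate_model *g)
{
	g->bank->regs[g->reg_index] |= UINT32_C(1) << g->bit_shift;
}

static void gate_model_disable(const struct gate_model *g)
{
	g->bank->regs[g->reg_index] &= ~(UINT32_C(1) << g->bit_shift);
}

static int gate_model_is_enabled(const struct gate_model *g)
{
	return (g->bank->regs[g->reg_index] >> g->bit_shift) & 1;
}
```

A real driver would also need a lock around the read-modify-write, since
the gates share one register.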


> +   };
> diff --git a/drivers/clk/clk-gate.c b/drivers/clk/clk-gate.c
> index 2b28a00..63641c2 100644
> --- a/drivers/clk/clk-gate.c
> +++ b/drivers/clk/clk-gate.c
> @@ -15,6 +15,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>
>  /**
>   * DOC: basic gatable clock which can gate and ungate it's ouput
> @@ -162,3 +164,48 @@ struct clk *clk_register_gate(struct device *dev, const 
> char *name,
> return clk;
>  }
>  EXPORT_SYMBOL_GPL(clk_register_gate);
> +
> +#ifdef CONFIG_OF
> +/**
> + * of_gate_clk_setup() - Setup function for simple gate rate clock
> + */
> +void of_gate_clk_setup(struct device_node *node)
> +{
> +   struct clk *clk;
> +   const char *clk_name = node->name;
> +   void __iomem *reg;
> +   const char *parent_name;
> +   u8 clk_gate_flags = 0;
> +   u32 bit_idx = 0;
> +
> +   of_property_read_string(node, "clock-output-names", &clk_name);
> +
> +   parent_name = of_clk_get_parent_name(node, 0);
> +
> +   reg = of_iomap(node, 0);

I suggest not using of_iomap for each clock node.

If each clock is one node, there will be hundreds of clock nodes in the
device tree, and hundreds of separate mappings are unnecessary overhead.

Maybe we can resolve it this way.

DTS file:
clock register bank {
reg = <{start} {size}>;
#address-cells = <1>;
#size-cells = <0>; /* each clock only
exists in one register */

clock node {
compatible = 

Re: [PATCH] Avoid useless inodes and dentries reclamation

2013-08-29 Thread Dave Chinner
On Thu, Aug 29, 2013 at 11:07:56AM -0700, Tim Chen wrote:
> 
> > > Signed-off-by: Tim Chen 
> > > ---
> > >  fs/super.c | 8 
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/fs/super.c b/fs/super.c
> > > index 68307c0..70fa26c 100644
> > > --- a/fs/super.c
> > > +++ b/fs/super.c
> > > @@ -53,6 +53,7 @@ static char *sb_writers_name[SB_FREEZE_LEVELS] = {
> > >   * shrinker path and that leads to deadlock on the shrinker_rwsem. Hence 
> > > we
> > >   * take a passive reference to the superblock to avoid this from 
> > > occurring.
> > >   */
> > > +#define SB_CACHE_LOW 5
> > >  static int prune_super(struct shrinker *shrink, struct shrink_control 
> > > *sc)
> > >  {
> > >   struct super_block *sb;
> > > @@ -68,6 +69,13 @@ static int prune_super(struct shrinker *shrink, struct 
> > > shrink_control *sc)
> > >   if (sc->nr_to_scan && !(sc->gfp_mask & __GFP_FS))
> > >   return -1;
> > >  
> > > + /*
> > > +  * Don't prune if we have few cached objects to reclaim to
> > > +  * avoid useless sb_lock contention
> > > +  */
> > > + if ((sb->s_nr_dentry_unused + sb->s_nr_inodes_unused) <= SB_CACHE_LOW)
> > > + return -1;
> > 
> > Those counters no longer exist in the current mmotm tree and the
> > shrinker infrastructure is somewhat different, so this patch isn't
> > the right way to solve this problem.
> 
> These changes in mmotm tree do complicate solutions for this
> scalability issue.
> 
> > 
> > Given that superblock LRUs and shrinkers in mmotm are node aware,
> > there may even be more pressure on the sblock in such a workload.  I
> > think the right way to deal with this is to give the shrinker itself
> > a "minimum call count" so that we can avoid even attempting to
> > > shrink caches that do not have enough entries in them to be worthwhile
> > shrinking.
> 
> By "minimum call count", do you mean tracking the number of free
> entries per node in the shrinker, and invoking shrinker 
> only when the number of free entries
> exceed "minimum call count"?

The new shrinker infrastructure has a ->count_objects() callout to
specifically return the number of objects in the cache.
shrink_slab_node() can check that return value against the "minimum
call count" and determine whether it needs to call ->scan_objects()
at all.
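
Roughly, as a standalone model (not the mmotm code; struct cache_model,
the min_count parameter and the helper names are hypothetical):

```c
/* Hypothetical cache: tracks how many objects it holds and how often
 * the (expensive) scan path was entered. */
struct cache_model {
	long nr_objects;
	long scan_calls;
};

/* Cheap callout: just report the object count. */
static long cache_count(struct cache_model *c)
{
	return c->nr_objects;
}

/* Expensive callout: free up to nr_to_scan objects. */
static long cache_scan(struct cache_model *c, long nr_to_scan)
{
	long freed = nr_to_scan < c->nr_objects ? nr_to_scan : c->nr_objects;

	c->scan_calls++;
	c->nr_objects -= freed;
	return freed;
}

/* Only call the scan path when the count exceeds the threshold. */
static long shrink_model(struct cache_model *c, long min_count, long nr_to_scan)
{
	if (cache_count(c) <= min_count)
		return 0;
	return cache_scan(c, nr_to_scan);
}
```

With min_count set to 0, an empty cache never reaches the scan path.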

Actually, the shrinker already behaves like this with the batch_size
variable - the shrinker has to be asking for more items to be
scanned than the batch size. That means it is the counting callouts
that are causing the problem, not the scanning callouts.

With the new code in the mmotm tree, for counting purposes we
probably don't need to grab a passive superblock reference at all -
the superblock and LRUs are guaranteed to be valid if the shrinker
is currently running, but we don't really care if the superblock is
being shut down and the values that come back are invalid because the
->scan_objects() callout will fail to grab the superblock to do
anything with the calculated values.

Indeed, I've made no attempt to optimise the code in the mmotm tree
- I've been concerned with correctness. The fact that without any
optimisation it significantly lessens contention in my testing has
been sufficient to move forward with.

> There is some cost in
> list_lru_count_node to get the free entries, as we need
> to acquire the node's lru lock.

Right, but we really don't need the node's lru lock to get the count
- reading the count is racy from an external perspective, anyway, so
we can do lockless counting here.  See this patch I proposed a
while back, for example:

https://lkml.org/lkml/2013/7/31/7
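
As a generic illustration of reading a counter without taking a lock
(this is a userspace sketch using C11 atomics with hypothetical names,
and is not taken from the patch linked above):

```c
#include <stdatomic.h>

/* Hypothetical per-node LRU bookkeeping: only the item count is modelled. */
struct lru_node_model {
	atomic_long nr_items;
};

static void lru_model_add(struct lru_node_model *n)
{
	atomic_fetch_add_explicit(&n->nr_items, 1, memory_order_relaxed);
}

/*
 * Lockless read. The value may already be stale by the time the caller
 * uses it, which is acceptable when it only feeds a heuristic.
 */
static long lru_model_count(struct lru_node_model *n)
{
	return atomic_load_explicit(&n->nr_items, memory_order_relaxed);
}
```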

> > That said, the memcg guys have been saying that even small numbers
> > of items per cache can be meaningful in terms of memory reclaim
> > (e.g. when there are lots of memcgs) then such a threshold might
> > only be appropriate for caches that are not memcg controlled. 
> 
> I've done some experiment with the CACHE thresholds.  Even setting
> the threshold at 0 (i.e. there're no free entries) remove almost all 
> the needless contentions.  That should make the memcg guys happy by
> not holding any extra free entries.

Ok, that's good to know, and it further indicates that it is the
->count_objects() callout that is the issue, not the
scanning/freeing.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


[PATCH 3/4] x86/srat: use NUMA_NO_NODE

2013-08-29 Thread Jianguo Wu
setup_node() returns NUMA_NO_NODE or a valid node id (>= 0), so use the more
appropriate "if (node == NUMA_NO_NODE)" instead of "if (node < 0)"

Signed-off-by: Jianguo Wu 
---
 arch/x86/mm/srat.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index cdd0da9..97fea4e 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -76,7 +76,7 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
return;
}
node = setup_node(pxm);
-   if (node < 0) {
+   if (node == NUMA_NO_NODE) {
printk(KERN_ERR "SRAT: Too many proximity domains %x\n", pxm);
bad_srat();
return;
@@ -112,7 +112,7 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
if (acpi_srat_revision >= 2)
pxm |= *((unsigned int*)pa->proximity_domain_hi) << 8;
node = setup_node(pxm);
-   if (node < 0) {
+   if (node == NUMA_NO_NODE) {
printk(KERN_ERR "SRAT: Too many proximity domains %x\n", pxm);
bad_srat();
return;
@@ -164,7 +164,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
pxm &= 0xff;
 
node = setup_node(pxm);
-   if (node < 0) {
+   if (node == NUMA_NO_NODE) {
printk(KERN_ERR "SRAT: Too many proximity domains.\n");
goto out_err_bad_srat;
}
-- 
1.7.1
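
For illustration, a standalone sketch of why the two tests are
equivalent here. NUMA_NO_NODE is -1 in the kernel; setup_node_model()
and MAX_NODES_MODEL are hypothetical stand-ins for the real setup_node()
and its limits:

```c
#define NUMA_NO_NODE	(-1)
#define MAX_NODES_MODEL	4	/* hypothetical node limit for this sketch */

/* Returns a valid node id (>= 0) or NUMA_NO_NODE, never another negative. */
static int setup_node_model(int pxm)
{
	if (pxm < 0 || pxm >= MAX_NODES_MODEL)
		return NUMA_NO_NODE;
	return pxm;
}

/* Named check: states the intent directly instead of relying on "< 0". */
static int node_is_valid(int node)
{
	return node != NUMA_NO_NODE;
}
```

Because the only negative value ever returned is NUMA_NO_NODE, comparing
against it behaves the same as "node < 0" while documenting the intent.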




[GIT PULL] ARM: SoC fixes for 3.11

2013-08-29 Thread Olof Johansson
Hi Linus,

The following changes since commit 30ca2226bea6f0db519dc53381b893cd66cb5b66:

  ARM: tegra: always enable USB VBUS regulators (2013-08-21 21:36:19 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git tags/fixes-for-linus

for you to fetch changes up to f8ab658b5da3fd11893cad085e0e21b67987c10b:

  arm: prima2: drop nr_irqs in mach as we moved to linear irqdomain (2013-08-29 09:48:36 -0700)


ARM: SoC fixes for 3.11

Two straggling fixes that I had missed when they were posted a couple of
weeks ago. They address problems with interrupts (which were broken
completely) on the CSR SiRF platforms.


Barry Song (2):
  irqchip: sirf: move from legacy mode to linear irqdomain
  arm: prima2: drop nr_irqs in mach as we moved to linear irqdomain

 arch/arm/mach-prima2/common.c |  2 --
 drivers/irqchip/irq-sirfsoc.c | 18 ++
 2 files changed, 10 insertions(+), 10 deletions(-)


[PATCH 2/4] mm/acpi: use NUMA_NO_NODE

2013-08-29 Thread Jianguo Wu
Use more appropriate NUMA_NO_NODE instead of -1

Signed-off-by: Jianguo Wu 
---
 drivers/acpi/acpi_memhotplug.c |2 +-
 drivers/acpi/numa.c|4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 999adb5..c00a3a7 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -281,7 +281,7 @@ static void acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
if (!info->enabled)
continue;
 
-   if (nid < 0)
+   if (nid == NUMA_NO_NODE)
nid = memory_add_physaddr_to_nid(info->start_addr);
 
acpi_unbind_memory_blocks(info, handle);
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 33e609f..09f79a2 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -73,7 +73,7 @@ int acpi_map_pxm_to_node(int pxm)
 {
int node = pxm_to_node_map[pxm];
 
-   if (node < 0) {
+   if (node == NUMA_NO_NODE) {
if (nodes_weight(nodes_found_map) >= MAX_NUMNODES)
return NUMA_NO_NODE;
node = first_unset_node(nodes_found_map);
@@ -334,7 +334,7 @@ int acpi_get_pxm(acpi_handle h)
 
 int acpi_get_node(acpi_handle *handle)
 {
-   int pxm, node = -1;
+   int pxm, node = NUMA_NO_NODE;
 
pxm = acpi_get_pxm(handle);
if (pxm >= 0 && pxm < MAX_PXM_DOMAINS)
-- 
1.7.1




Re: Unusually high system CPU usage with recent kernels

2013-08-29 Thread Paul E. McKenney
On Tue, Aug 27, 2013 at 10:05:42PM +0200, Tibor Billes wrote:
> > From: Paul E. McKenney Sent: 08/26/13 06:28 AM
> > On Sun, Aug 25, 2013 at 09:50:21PM +0200, Tibor Billes wrote:
> > > From: Paul E. McKenney Sent: 08/24/13 11:03 PM
> > > > On Sat, Aug 24, 2013 at 09:59:45PM +0200, Tibor Billes wrote:
> > > > > From: Paul E. McKenney Sent: 08/22/13 12:09 AM
> > > > > > On Wed, Aug 21, 2013 at 11:05:51PM +0200, Tibor Billes wrote:
> > > > > > > > From: Paul E. McKenney Sent: 08/21/13 09:12 PM
> > > > > > > > On Wed, Aug 21, 2013 at 08:14:46PM +0200, Tibor Billes wrote:
> > > > > > > > > > From: Paul E. McKenney Sent: 08/20/13 11:43 PM
> > > > > > > > > > On Tue, Aug 20, 2013 at 10:52:26PM +0200, Tibor Billes 
> > > > > > > > > > wrote:
> > > > > > > > > > > > From: Paul E. McKenney Sent: 08/20/13 04:53 PM
> > > > > > > > > > > > On Tue, Aug 20, 2013 at 08:01:28AM +0200, Tibor Billes 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I was using the 3.9.7 stable release and tried to 
> > > > > > > > > > > > > upgrade to the 3.10.x series.
> > > > > > > > > > > > > The 3.10.x series was showing unusually high (>75%) 
> > > > > > > > > > > > > system CPU usage in some
> > > > > > > > > > > > > situations, making things really slow. The latest 
> > > > > > > > > > > > > stable I tried is 3.10.7.
> > > > > > > > > > > > > I also tried 3.11-rc5, they both show this behaviour. 
> > > > > > > > > > > > > This behaviour doesn't
> > > > > > > > > > > > > show up when the system is idling, only when doing 
> > > > > > > > > > > > > some CPU intensive work,
> > > > > > > > > > > > > like compiling with multiple threads. Compiling with 
> > > > > > > > > > > > > only one thread seems not
> > > > > > > > > > > > > to trigger this behaviour.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > To be more precise I did a `perf record -a` while 
> > > > > > > > > > > > > compiling a large C++ program
> > > > > > > > > > > > > with scons using 4 threads, the result is appended at 
> > > > > > > > > > > > > the end of this email.
> > > > > > > > > > > > 
> > > > > > > > > > > > New one on me! You are running a mainstream system 
> > > > > > > > > > > > (x86_64), so I am
> > > > > > > > > > > > surprised no one else noticed.
> > > > > > > > > > > > 
> > > > > > > > > > > > Could you please send along your .config file?
> > > > > > > > > > > 
> > > > > > > > > > > Here it is
> > > > > > > > > > 
> > > > > > > > > > Interesting. I don't see RCU stuff all that high on the 
> > > > > > > > > > list, but
> > > > > > > > > > the items I do see lead me to suspect RCU_FAST_NO_HZ, which 
> > > > > > > > > > has some
> > > > > > > > > > relevance to the otherwise inexplicable group of commits 
> > > > > > > > > > you located
> > > > > > > > > > with your bisection. Could you please rerun with 
> > > > > > > > > > CONFIG_RCU_FAST_NO_HZ=n?
> > > > > > > > > > 
> > > > > > > > > > If that helps, there are some things I could try.
> > > > > > > > > 
> > > > > > > > > It did help. I didn't notice anything unusual when running 
> > > > > > > > > with CONFIG_RCU_FAST_NO_HZ=n.
> > > > > > > > 
> > > > > > > > Interesting. Thank you for trying this -- and we at least have a
> > > > > > > > short-term workaround for this problem. I will put a patch 
> > > > > > > > together
> > > > > > > > for further investigation.
> > > > > > > 
> > > > > > > I don't specifically need this config option so I'm fine without 
> > > > > > > it in
> > > > > > > the long term, but I guess it's not supposed to behave like that.
> > > > > > 
> > > > > > > OK, good, we have a long-term workaround for your specific case,
> > > > > > even better. ;-)
> > > > > > 
> > > > > > But yes, there are situations where RCU_FAST_NO_HZ needs to work
> > > > > > a bit better. I hope you will bear with me with a bit more
> > > > > > testing...
> > > > > >
> > > > > > > > In the meantime, could you please tell me how you were measuring
> > > > > > > > performance for your kernel builds? Wall-clock time required to 
> > > > > > > > complete
> > > > > > > > one build? Number of builds completed per unit time? Something 
> > > > > > > > else?
> > > > > > > 
> > > > > > > Actually, I wasn't all this sophisticated. I have a system monitor
> > > > > > > applet on my top panel (using MATE, Linux Mint), four little 
> > > > > > > graphs,
> > > > > > > one of which shows CPU usage. Different colors indicate different 
> > > > > > > kind
> > > > > > > of CPU usage. Blue shows user space usage, red shows system 
> > > > > > > usage, and
> > > > > > > two more for nice and iowait. During a normal compile it's almost
> > > > > > > completely filled with blue user space CPU usage, only the top few
> > > > > > > pixels show some iowait and system usage. With 
> > > > > > > CONFIG_RCU_FAST_NO_HZ
> > > > > > > set, about 3/4 of the graph was red system CPU usage, the rest was
> > > > > > > blue. So I just looked for a pile of red on my graphs 

[PATCH 1/4] mm/vmalloc: use NUMA_NO_NODE

2013-08-29 Thread Jianguo Wu
Use more appropriate "if (node == NUMA_NO_NODE)" instead of "if (node < 0)"

Signed-off-by: Jianguo Wu 
---
 mm/vmalloc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 13a5495..f5483f8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1582,7 +1582,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
struct page *page;
gfp_t tmp_mask = gfp_mask | __GFP_NOWARN;
 
-   if (node < 0)
+   if (node == NUMA_NO_NODE)
page = alloc_page(tmp_mask);
else
page = alloc_pages_node(node, tmp_mask, order);
-- 
1.7.1




Re: [GIT PULL] Expand keyring capacity and provide support for libkrb5

2013-08-29 Thread James Morris
On Thu, 29 Aug 2013, David Howells wrote:

> James Morris  wrote:
> 
> > > Could you pull these patches into the security tree?
> > 
> > 944 files changed, 17114 insertions(+), 9157 deletions(-)
> 
> Ummm...  Where did that come from?  That doesn't look like what's in my
> tree...

Is your tree based on mine?


-- 
James Morris



Re: Bcache sleeps forever on random writes

2013-08-29 Thread kernel neophyte
Another one:

[ 3243.199791] kworker/u64:0   D 81813a60 0  1780  2 0x
[ 3243.199806] Workqueue: bch_btree_io btree_node_write_work
[ 3243.199810]  882e15ed9778 0046 882e15ed9738
882f88757058
[ 3243.199816]  882f884cb320 882e15ed9fd8 882e15ed9fd8
882e15ed9fd8
[ 3243.199821]  882fa6ae8000 882f884cb320 882f88757000
882c9b560d98
[ 3243.199827] Call Trace:
[ 3243.199840]  [] schedule+0x29/0x70
[ 3243.199845]  [] schedule_preempt_disabled+0xe/0x10
[ 3243.199851]  [] __mutex_lock_slowpath+0x112/0x1b0
[ 3243.199857]  [] ? ata_scsiop_mode_sense+0x380/0x380
[ 3243.199862]  [] mutex_lock+0x2a/0x50
[ 3243.199867]  [] bch_mca_shrink+0x1b5/0x2f0
[ 3243.199874]  [] ? prune_super+0x162/0x1b0
[ 3243.199884]  [] shrink_slab+0x154/0x300
[ 3243.199891]  [] ? resched_task+0x68/0x70
[ 3243.199897]  [] ? check_preempt_curr+0x75/0xa0
[ 3243.199903]  [] ? fragmentation_index+0x19/0x70
[ 3243.199910]  [] do_try_to_free_pages+0x20f/0x4b0
[ 3243.199915]  [] try_to_free_pages+0xe4/0x1a0
[ 3243.199925]  [] __alloc_pages_nodemask+0x60c/0x9b0
[ 3243.199934]  [] alloc_pages_current+0xba/0x170
[ 3243.199940]  [] __get_free_pages+0xe/0x40
[ 3243.199946]  [] __btree_sort+0x48/0x230
[ 3243.199951]  [] ? __bch_btree_iter_init+0x7c/0xc0
[ 3243.199957]  [] bch_btree_sort_partial+0x101/0x120
[ 3243.199962]  [] ? __btree_node_write_done+0x100/0x100
[ 3243.199967]  [] bch_btree_sort_lazy+0x68/0x90
[ 3243.199971]  [] bch_btree_node_write+0x36a/0x4a0
[ 3243.199979]  [] ? idle_balance+0xeb/0x150
[ 3243.199986]  [] ? pwq_activate_delayed_work+0x4c/0xb0
[ 3243.11]  [] btree_node_write_work+0x57/0x80
[ 3243.15]  [] process_one_work+0x174/0x490
[ 3243.20]  [] worker_thread+0x11b/0x370
[ 3243.26]  [] ? manage_workers.isra.23+0x2d0/0x2d0
[ 3243.200011]  [] kthread+0xc0/0xd0
[ 3243.200016]  [] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200024]  [] ret_from_fork+0x7c/0xb0
[ 3243.200028]  [] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200034] INFO: task bcache_allocato:1868 blocked for more than
120 seconds.
[ 3243.200039] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3243.200043] bcache_allocato D 0001 0  1868  2 0x
[ 3243.200048]  882f89f7fd88 0046 882f89f7fda8
810808ad
[ 3243.200053]  882fa3328000 882f89f7ffd8 882f89f7ffd8
882f89f7ffd8
[ 3243.200058]  882f89e13320 882fa3328000 882f865eb320
882c9b560d98
[ 3243.200064] Call Trace:
[ 3243.200069]  [] ? dequeue_task_fair+0x2cd/0x530
[ 3243.200075]  [] schedule+0x29/0x70
[ 3243.200080]  [] schedule_preempt_disabled+0xe/0x10
[ 3243.200085]  [] __mutex_lock_slowpath+0x112/0x1b0
[ 3243.200090]  [] mutex_lock+0x2a/0x50
[ 3243.200095]  [] bch_allocator_thread+0x10f/0xe20
[ 3243.200100]  [] ? bch_bucket_add_unused+0xe0/0xe0
[ 3243.200104]  [] kthread+0xc0/0xd0
[ 3243.200108]  [] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200114]  [] ret_from_fork+0x7c/0xb0
[ 3243.200118]  [] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200123] INFO: task kworker/3:2:2548 blocked for more than 120 seconds.
[ 3243.200128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3243.200132] kworker/3:2 D 81813d40 0  2548  2 0x
[ 3243.200142] Workqueue: bcache bch_data_insert_keys
[ 3243.200144]  882fa310b3d8 0046 882fa310b3f8
90108000
[ 3243.200150]  882f865eb320 882fa310bfd8 882fa310bfd8
882fa310bfd8
[ 3243.200155]  882fa66f9990 882f865eb320 882f865eb320
882c9b560d98
[ 3243.200160] Call Trace:
[ 3243.200166]  [] schedule+0x29/0x70
[ 3243.200171]  [] schedule_preempt_disabled+0xe/0x10
[ 3243.200175]  [] __mutex_lock_slowpath+0x112/0x1b0
[ 3243.200180]  [] mutex_lock+0x2a/0x50
[ 3243.200185]  [] bch_mca_shrink+0x1b5/0x2f0
[ 3243.200190]  [] ? prune_super+0x162/0x1b0
[ 3243.200195]  [] shrink_slab+0x154/0x300
[ 3243.200200]  [] ? resched_task+0x68/0x70
[ 3243.200205]  [] ? check_preempt_curr+0x75/0xa0
[ 3243.200210]  [] ? fragmentation_index+0x19/0x70
[ 3243.200215]  [] do_try_to_free_pages+0x20f/0x4b0
[ 3243.200221]  [] try_to_free_pages+0xe4/0x1a0
[ 3243.200226]  [] ? zone_statistics+0x99/0xc0
[ 3243.200232]  [] __alloc_pages_nodemask+0x60c/0x9b0
[ 3243.200239]  [] alloc_pages_current+0xba/0x170
[ 3243.200244]  [] __get_free_pages+0xe/0x40
[ 3243.200249]  [] mca_data_alloc+0x73/0x1d0
[ 3243.200253]  [] mca_alloc+0x277/0x470
[ 3243.200258]  [] bch_btree_node_alloc+0x8c/0x1c0
[ 3243.200263]  [] ? __bch_bset_search+0x1d1/0x480
[ 3243.200270]  [] btree_node_alloc_replacement+0x2d/0x60
[ 3243.200275]  [] btree_split+0x7b/0x5c0
[ 3243.200282]  [] ? dequeue_entity+0x1a8/0x5c0
[ 3243.200287]  [] ? bch_keylist_pop_front+0x47/0x50
[ 3243.200293]  [] bch_btree_insert_node+0xb2/0x2f0
[ 3243.200297]  [] btree_insert_fn+0x28/0x50
[ 3243.200302]  [] bch_btree_map_nodes_recurse+0x6c/0x170
[ 3243.200307]  [] ? bch_btree_insert_node+0x2f0/0x2f0
[ 

Re: [REVIEW][PATCH 2/5] userns: Allow PR_CAPBSET_DROP in a user namespace.

2013-08-29 Thread Serge E. Hallyn
Quoting Eric W. Biederman (ebied...@xmission.com):
> 
> As the capabilities and capability bounding set are per user namespace
> properties it is safe to allow changing them with just CAP_SETPCAP
> permission in the user namespace.
> 
> Signed-off-by: "Eric W. Biederman" 
> Tested-by: Richard Weinberger 

Acked-by: Serge Hallyn 

> ---
>  security/commoncap.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/security/commoncap.c b/security/commoncap.c
> index c44b6fe..9fccf71 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -824,7 +824,7 @@ int cap_task_setnice(struct task_struct *p, int nice)
>   */
>  static long cap_prctl_drop(struct cred *new, unsigned long cap)
>  {
> - if (!capable(CAP_SETPCAP))
> + if (!ns_capable(current_user_ns(), CAP_SETPCAP))
>   return -EPERM;
>   if (!cap_valid(cap))
>   return -EINVAL;
> -- 
> 1.7.5.4


Re: [REVIEW][PATCH 5/5] userns: Kill nsown_capable it makes the wrong thing easy

2013-08-29 Thread Serge E. Hallyn
Quoting Eric W. Biederman (ebied...@xmission.com):
> 
> nsown_capable is a special case of ns_capable essentially for just CAP_SETUID
> and CAP_SETGID.  For the existing users it doesn't noticeably simplify things
> and from the suggested patches I have seen it encourages people to do the
> wrong thing.  So remove nsown_capable.
> 
> Signed-off-by: "Eric W. Biederman" 

Yeah I've had the same thought before.  You rarely want nsown_capable, and
it wants to be mis-used.

Acked-by: Serge Hallyn 

> ---
>  fs/namespace.c |4 ++--
>  fs/open.c  |2 +-
>  include/linux/capability.h |1 -
>  ipc/namespace.c|2 +-
>  kernel/capability.c|   12 
>  kernel/groups.c|2 +-
>  kernel/pid_namespace.c |2 +-
>  kernel/sys.c   |   20 ++--
>  kernel/uid16.c |2 +-
>  kernel/utsname.c   |2 +-
>  net/core/net_namespace.c   |2 +-
>  net/core/scm.c |4 ++--
>  12 files changed, 21 insertions(+), 34 deletions(-)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 877e427..dc519a1 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2929,8 +2929,8 @@ static int mntns_install(struct nsproxy *nsproxy, void *ns)
>   struct path root;
>  
>   if (!ns_capable(mnt_ns->user_ns, CAP_SYS_ADMIN) ||
> - !nsown_capable(CAP_SYS_CHROOT) ||
> - !nsown_capable(CAP_SYS_ADMIN))
> + !ns_capable(current_user_ns(), CAP_SYS_CHROOT) ||
> + !ns_capable(current_user_ns(), CAP_SYS_ADMIN))
>   return -EPERM;
>  
>   if (fs->users != 1)
> diff --git a/fs/open.c b/fs/open.c
> index 9156cb0..2a57580 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -443,7 +443,7 @@ retry:
>   goto dput_and_out;
>  
>   error = -EPERM;
> - if (!nsown_capable(CAP_SYS_CHROOT))
> + if (!ns_capable(current_user_ns(), CAP_SYS_CHROOT))
>   goto dput_and_out;
>   error = security_path_chroot();
>   if (error)
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index d9a4f7f..a6ee1f9 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -210,7 +210,6 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
> struct user_namespace *ns, int cap);
>  extern bool capable(int cap);
>  extern bool ns_capable(struct user_namespace *ns, int cap);
> -extern bool nsown_capable(int cap);
>  extern bool inode_capable(const struct inode *inode, int cap);
>  extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
>  
> diff --git a/ipc/namespace.c b/ipc/namespace.c
> index 7ee61bf..4be6581 100644
> --- a/ipc/namespace.c
> +++ b/ipc/namespace.c
> @@ -171,7 +171,7 @@ static int ipcns_install(struct nsproxy *nsproxy, void *new)
>  {
>   struct ipc_namespace *ns = new;
>   if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN) ||
> - !nsown_capable(CAP_SYS_ADMIN))
> + !ns_capable(current_user_ns(), CAP_SYS_ADMIN))
>   return -EPERM;
>  
>   /* Ditch state from the old ipc namespace */
> diff --git a/kernel/capability.c b/kernel/capability.c
> index f6c2ce5..6fc1c8a 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -433,18 +433,6 @@ bool capable(int cap)
>  EXPORT_SYMBOL(capable);
>  
>  /**
> - * nsown_capable - Check superior capability to one's own user_ns
> - * @cap: The capability in question
> - *
> - * Return true if the current task has the given superior capability
> - * targeted at its own user namespace.
> - */
> -bool nsown_capable(int cap)
> -{
> - return ns_capable(current_user_ns(), cap);
> -}
> -
> -/**
>   * inode_capable - Check superior capability over inode
>   * @inode: The inode in question
>   * @cap: The capability in question
> diff --git a/kernel/groups.c b/kernel/groups.c
> index 6b2588d..90cf1c3 100644
> --- a/kernel/groups.c
> +++ b/kernel/groups.c
> @@ -233,7 +233,7 @@ SYSCALL_DEFINE2(setgroups, int, gidsetsize, gid_t __user *, grouplist)
>   struct group_info *group_info;
>   int retval;
>  
> - if (!nsown_capable(CAP_SETGID))
> + if (!ns_capable(current_user_ns(), CAP_SETGID))
>   return -EPERM;
>   if ((unsigned)gidsetsize > NGROUPS_MAX)
>   return -EINVAL;
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index 6917e8e..ee1f6bb 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -329,7 +329,7 @@ static int pidns_install(struct nsproxy *nsproxy, void *ns)
>   struct pid_namespace *ancestor, *new = ns;
>  
>   if (!ns_capable(new->user_ns, CAP_SYS_ADMIN) ||
> - !nsown_capable(CAP_SYS_ADMIN))
> + !ns_capable(current_user_ns(), CAP_SYS_ADMIN))
>   return -EPERM;
>  
>   /*
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 771129b..c18ecca 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c

Re: [-next] openvswitch BUILD_BUG_ON failed

2013-08-29 Thread Jesse Gross
On Thu, Aug 29, 2013 at 3:10 PM, David Miller  wrote:
> From: Jesse Gross 
> Date: Thu, 29 Aug 2013 14:42:22 -0700
>
>> On Thu, Aug 29, 2013 at 2:21 PM, Geert Uytterhoeven
>>  wrote:
>>> However, I have some doubts about other alignment "enforcements":
>>>
>>> "__aligned(__alignof__(long))" makes the whole struct aligned to the
>>> alignment rule for "long":
>>>1. This is only 2 bytes on m68k, i.e. != sizeof(long).
>>>2. This is 4 bytes on many 32-bit platforms, which may be less than the
>>>   default alignment for "__be64" (cfr. some members of struct
>>>   ovs_key_ipv4_tunnel), so this may make those 64-bit members unaligned.
>>
>> Do any of those 32-bit architectures actually care about alignment of
>> 64 bit values? On 32-bit x86, a long is 32 bits but the alignment
>> requirement of __be64 is also 32 bit.
>
> All except x86-32 do, it is in fact the odd man out with respect to this
> issue.

Thanks, good to know.

Andy, do you want to modify your patch to just drop the alignment
specification as Geert suggested (but definitely keep the new build
assert that you added)? It's probably better to just send the patch to
netdev (against net-next) as well since you'll likely get better
comments there and we can fix this faster if you cut out the
middleman.
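
For illustration, a sketch of the "drop the alignment attribute, keep the
build assert" approach. struct demo_tunnel_key is a hypothetical layout
made up for this example, not the real struct ovs_key_ipv4_tunnel, and
_Static_assert stands in for the kernel's BUILD_BUG_ON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical key layout (not the real OVS struct).  No explicit
 * alignment attribute: the compiler applies the natural alignment of
 * the widest member on each architecture.
 */
struct demo_tunnel_key {
	uint64_t tun_id;	/* stands in for a __be64 member */
	uint32_t ipv4_src;
	uint32_t ipv4_dst;
	uint16_t tun_flags;
	uint8_t  ipv4_tos;
	uint8_t  ipv4_ttl;
};

/* Build-time check in the spirit of BUILD_BUG_ON(): code that compares
 * the key one long at a time needs the size to be a multiple of
 * sizeof(long). */
_Static_assert(sizeof(struct demo_tunnel_key) % sizeof(long) == 0,
	       "key size must be a multiple of sizeof(long)");

size_t demo_key_align(void)  { return _Alignof(struct demo_tunnel_key); }
size_t demo_long_align(void) { return _Alignof(long); }

/* Runtime sanity check: the 64-bit member sits on its natural boundary. */
void demo_check_layout(void)
{
	assert(offsetof(struct demo_tunnel_key, tun_id) % _Alignof(uint64_t) == 0);
}
```

The size assertion is what catches a layout problem at compile time; the
alignment itself is left to the compiler.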


Re: [PATCH] Make sure to wake reaper

2013-08-29 Thread Serge E. Hallyn
Quoting Eric W. Biederman (ebied...@xmission.com):
> Serge Hallyn  writes:
> 
> > Since commit af4b8a83add95ef40716401395b44a1b579965f4 it's been
> > possible to get into a situation where a pidns reaper is
> > <defunct>, reparented to host pid 1, but never reaped.  How to
> > reproduce this is documented at
> >
> > https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526
> > (and see
> > https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526/comments/13)
> > In short, run repeated starts of a container whose init is
> >
> > Process.exit(0);
> >
> > sysrq-t when such a task is playing zombie shows:
> >
> > [  131.132978] initx 88011fc14580 0  2084   2039 0x
> > [  131.132978]  880116e89ea8 0002 880116e89fd8 00014580
> > [  131.132978]  880116e89fd8 00014580 8801172a 8801172a
> > [  131.132978]  8801172a0630 88011729fff0 880116e14650 88011729fff0
> > [  131.132978] Call Trace:
> > [  131.132978]  [] schedule+0x29/0x70
> > [  131.132978]  [] do_exit+0x6e1/0xa40
> > [  131.132978]  [] ? signal_wake_up_state+0x1e/0x30
> > [  131.132978]  [] do_group_exit+0x3f/0xa0
> > [  131.132978]  [] SyS_exit_group+0x14/0x20
> > [  131.132978]  [] tracesys+0xe1/0xe6
> >
> > Further debugging showed that every time this happened, zap_pid_ns_processes()
> > started with nr_hashed being 3, while we were expecting it to drop to 2.
> > Any time it didn't happen, nr_hashed was 1 or 2.  So the reaper was
> > waiting for nr_hashed to become 2, but free_pid() only wakes the reaper
> > if nr_hashed hits 1.  This patch makes free_pid() wake the reaper any
> > time the reaper is PF_EXITING, to force it to re-test the
> > pidns->nr_hashed = init_pids test.  Note that this is more like what
> > __unhash_process() used to do before
> > af4b8a83add95ef40716401395b44a1b579965f4.
> >
> > Signed-off-by: Serge Hallyn 
> > Cc: "Eric W. Biederman" 
> > ---
> >  kernel/pid.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/pid.c b/kernel/pid.c
> > index 0db3e79..6b312c4 100644
> > --- a/kernel/pid.c
> > +++ b/kernel/pid.c
> > @@ -274,6 +274,10 @@ void free_pid(struct pid *pid)
> > case 0:
> > schedule_work(&ns->proc_work);
> > break;
> > +   default:
> > +   if (ns->child_reaper->flags & PF_EXITING)
> > +   wake_up_process(ns->child_reaper);
> > +   break;
> > }
> > }
> > spin_unlock_irqrestore(&pidmap_lock, flags);
> 
> So I think the change that we actually want is just to send a wake-up
> when we have two pids in the pid namespace as well as one pid.
> 
> - That can send one extraneous wake-up but that is relatively harmless.

Would more than one extraneous wake-up be more harmful?

> - We can detect the condition race free.
> - With only two pids remaining we are guaranteed that whichever task is
>   the child_reaper will persist through zap_pid_ns_processes.

My problem is I don't really understand the assumptions behind nr_hashed.
I *thought* it was simply >1 if the init was threaded - but are threads
in init limited to 2?  Or am I totally wrong about what the 2 means?

If init *is* threaded, and the pid_ns->child_reaper exits but the other
thread is still alive, then find_new_reaper should set pid_ns->child_reaper
to the not-PF_EXITING task using

	while_each_thread(father, thread) {
		if (thread->flags & PF_EXITING)
			continue;
		if (unlikely(pid_ns->child_reaper == father))
			pid_ns->child_reaper = thread;
		return thread;
	}

right?

Which seems to suggest that checking for pid_ns->child_reaper->flags &
PF_EXITING should always give us the right answer in free_pid().

>   There are 3 cases.
>   init-tgleader other -- Single threaded init so of course we won't free the task
>   init-tgleader-dead init-thread -- The last living init thread will call zap_pid_ns_processes.

right,

>   init-tgleader init-thread -- An init with two living threads child_reaper must be the init thread group leader
> 
> Which means at the cost of an extra wake-up we are guaranteed not to
> have races.
> 
> Serge does that look good to you?

I may just need to spend a few hours going back over the old commits
and related email threads pertaining to multi-threaded inits.  I now
regret not having paid enough attention at the time :)

> diff --git a/kernel/pid.c b/kernel/pid.c
> index 17755ae..ab75add 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -265,6 +265,7 @@ void free_pid(struct pid *pid)
> struct pid_namespace *ns = upid->ns;
> hlist_del_rcu(&upid->pid_chain);
> switch(--ns->nr_hashed) {
> +   case 2:
> case 1:
> /* When all 

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-08-29 Thread Linus Torvalds
On Thu, Aug 29, 2013 at 5:26 PM, Benjamin Herrenschmidt
 wrote:
>
> I assume you mean unsigned int ? :-)

Oops, yes.

> What's wrong with the existing arch_spin_is_locked() ?

It takes a memory location. And we very much want to test the value we
loaded into a register.

And yes, gcc can do the right thing. But at least on x86,
arch_spin_is_locked() actually uses ACCESS_ONCE() to load the value
from the memory location, and I actually think that is the right thing
to do (or at least not incorrect). So the end result is that
arch_spin_value_unlocked() is actually fairly fundamentally different
from arch_spin_is_locked().
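
To make the distinction concrete, here is a simplified userspace sketch.
struct demo_ticket_lock and the demo_* functions are invented for this
example and only mimic the shape of a ticket lock (free when head equals
tail); they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified ticket lock: unlocked when head == tail. */
struct demo_ticket_lock {
	uint16_t head;
	uint16_t tail;
};

/* Takes a memory location and forces a fresh load through a volatile
 * access, similar in spirit to ACCESS_ONCE(). */
int demo_spin_is_locked(const struct demo_ticket_lock *lock)
{
	const volatile struct demo_ticket_lock *l = lock;
	uint16_t head = l->head;
	uint16_t tail = l->tail;

	return head != tail;
}

/* Takes the lock *value*: it tests a copy the caller has already
 * loaded, so no memory access is implied here. */
int demo_spin_value_unlocked(struct demo_ticket_lock lock)
{
	return lock.head == lock.tail;
}

/* Typical use: load once, then reason about the loaded value. */
int demo_try_fast_path(const struct demo_ticket_lock *lock)
{
	struct demo_ticket_lock snapshot;

	assert(lock != NULL);
	snapshot = *lock;	/* single load into a local */
	return demo_spin_value_unlocked(snapshot);
}
```

The by-value variant is what a lockless update wants: it checks the same
snapshot that a later compare-and-exchange would use as the expected old
value.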

So I could have re-used arch_spin_is_locked() after having changed the
semantics of it, but I really didn't want to possibly change totally
unrelated users for this particular feature.

> BTW. Do you have your test case at hand ?

My test-case is a joke. It's explicitly *trying* to get as much
contention as possible on a dentry, by just starting up a lot of
threads that look up one single pathname (the same one for everybody).
It defaults to using /tmp for this, but you can specify the filename.

Note that directories, regular files and symlinks have fundamentally
different dentry lookup behavior:

 - directories tend to have an elevated reference count (because they
have children). This was my primary test-case, because while I suspect
that there are crazy loads (and AIM7 may be one of them) that open the
same _regular_ file all concurrently, I don't think it's a "normal"
load. But opening the same directory concurrently as part of pathname
lookup is certainly normal.

 - regular files tend to have a dentry count of zero unless they are
actively open, and the patch I sent out will take the dentry spinlock
for them when doing the final RCU finishing touches if that's the
case. So this one *will* still use the per-dentry spinlock rather than
the lockless refcount increments, but as outlined above I don't think
that should be a scalability issue unless you're crazy.

 - symlink traversal causes us to drop out of RCU lookup mode, and thus
causes various slow-paths to happen. Some of that we can improve on,
but I suspect it will cause the lockless refcount paths to take a hit
too.

Anyway, I'm attaching my completely mindless test program. It has
hacky things like "unsigned long count[MAXTHREADS][32]" which are
purely to just spread out the counts so that they aren't in the same
cacheline etc.

Also note that the performance numbers it spits out depend a lot on
things like how long the dcache hash chains etc are, so they are not
really reliable. Running the test-program right after reboot when the
dentries haven't been populated can result in much higher numbers -
without that having anything to do with contention or locking at all.

  Linus
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/stat.h>

#define MAXTHREADS 16

static volatile int start = 0;
static char *file = "/tmp";
static unsigned long count[MAXTHREADS][32];

void *start_routine(void *arg)
{
	const char *filename;
	struct stat st;
	unsigned long *counter = arg;

	pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
	while (!start)
		/* nothing */;
	filename = file;
	for (;;) {
		stat(filename, &st);
		++*counter;
	}
}

int main(int argc, char **argv)
{
	pthread_t threads[MAXTHREADS];
	unsigned long n;
	int i;

	if (argv[1])
		file = argv[1];
	for (i = 0; i < MAXTHREADS; i++)
		pthread_create(threads+i, NULL, start_routine, count[i]);
	start = 1;
	sleep(10);
	for (i = 0; i < MAXTHREADS; i++)
		pthread_cancel(threads[i]);
	for (i = 0; i < MAXTHREADS; i++)
		pthread_join(threads[i], NULL);
	n = 0;
	for (i = 0; i < MAXTHREADS; i++)
		n += count[i][0];
	printf("Total loops: %lu\n", n);
	return 0;
}


Re: [PATCH 1/1] [media] uvcvideo: quirk PROBE_DEF for Dell SP2008WFP monitor.

2013-08-29 Thread Laurent Pinchart
Hi Joseph,

Thank you for the patch.

On Thursday 29 August 2013 11:17:41 Joseph Salisbury wrote:
> BugLink: http://bugs.launchpad.net/bugs/1217957
> 
> Add quirk for Dell SP2008WFP monitor: 05a9:2641
> 
> Signed-off-by: Joseph Salisbury 
> Tested-by: Christopher Townsend 
> Cc: Laurent Pinchart 
> Cc: Mauro Carvalho Chehab 
> Cc: linux-me...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: sta...@vger.kernel.org

Acked-by: Laurent Pinchart 

I've applied it to my tree. Given that we're too close to the v3.12 merge 
window I will push it for v3.13.

> ---
>  drivers/media/usb/uvc/uvc_driver.c |9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
> index ed123f4..8c1826c 100644
> --- a/drivers/media/usb/uvc/uvc_driver.c
> +++ b/drivers/media/usb/uvc/uvc_driver.c
> @@ -2174,6 +2174,15 @@ static struct usb_device_id uvc_ids[] = {
> .bInterfaceSubClass   = 1,
> .bInterfaceProtocol   = 0,
> .driver_info  = UVC_QUIRK_PROBE_DEF },
> + /* Dell SP2008WFP Monitor */
> + { .match_flags  = USB_DEVICE_ID_MATCH_DEVICE
> + | USB_DEVICE_ID_MATCH_INT_INFO,
> +   .idVendor = 0x05a9,
> +   .idProduct= 0x2641,
> +   .bInterfaceClass  = USB_CLASS_VIDEO,
> +   .bInterfaceSubClass   = 1,
> +   .bInterfaceProtocol   = 0,
> +   .driver_info  = UVC_QUIRK_PROBE_DEF },
>   /* Dell Alienware X51 */
>   { .match_flags  = USB_DEVICE_ID_MATCH_DEVICE
>   | USB_DEVICE_ID_MATCH_INT_INFO,
-- 
Regards,

Laurent Pinchart



Re: [PATCH] Make sure to wake reaper

2013-08-29 Thread Eric W. Biederman
Serge Hallyn  writes:

> Since commit af4b8a83add95ef40716401395b44a1b579965f4 it's been
> possible to get into a situation where a pidns reaper is
> <defunct>, reparented to host pid 1, but never reaped.  How to
> reproduce this is documented at
>
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526
> (and see
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526/comments/13)
> In short, run repeated starts of a container whose init is
>
> Process.exit(0);
>
> sysrq-t when such a task is playing zombie shows:
>
> [  131.132978] initx 88011fc14580 0  2084   2039 0x
> [  131.132978]  880116e89ea8 0002 880116e89fd8 00014580
> [  131.132978]  880116e89fd8 00014580 8801172a 8801172a
> [  131.132978]  8801172a0630 88011729fff0 880116e14650 88011729fff0
> [  131.132978] Call Trace:
> [  131.132978]  [] schedule+0x29/0x70
> [  131.132978]  [] do_exit+0x6e1/0xa40
> [  131.132978]  [] ? signal_wake_up_state+0x1e/0x30
> [  131.132978]  [] do_group_exit+0x3f/0xa0
> [  131.132978]  [] SyS_exit_group+0x14/0x20
> [  131.132978]  [] tracesys+0xe1/0xe6
>
> Further debugging showed that every time this happened, zap_pid_ns_processes()
> started with nr_hashed being 3, while we were expecting it to drop to 2.
> Any time it didn't happen, nr_hashed was 1 or 2.  So the reaper was
> waiting for nr_hashed to become 2, but free_pid() only wakes the reaper
> if nr_hashed hits 1.  This patch makes free_pid() wake the reaper any
> time the reaper is PF_EXITING, to force it to re-test the
> pidns->nr_hashed = init_pids test.  Note that this is more like what
> __unhash_process() used to do before
> af4b8a83add95ef40716401395b44a1b579965f4.
>
> Signed-off-by: Serge Hallyn 
> Cc: "Eric W. Biederman" 
> ---
>  kernel/pid.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 0db3e79..6b312c4 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -274,6 +274,10 @@ void free_pid(struct pid *pid)
>   case 0:
>   schedule_work(&ns->proc_work);
>   break;
> + default:
> + if (ns->child_reaper->flags & PF_EXITING)
> + wake_up_process(ns->child_reaper);
> + break;
>   }
>   }
>   spin_unlock_irqrestore(&pidmap_lock, flags);

So I think the change that we actually want is just to send a wake-up
when we have two pids in the pid namespace as well as one pid.

- That can send one extraneous wake-up but that is relatively harmless.
- We can detect the condition race free.
- With only two pids remaining we are guaranteed that whichever task is
  the child_reaper will persist through zap_pid_ns_processes.

  There are 3 cases.
  init-tgleader other -- Single threaded init so of course we won't free the task
  init-tgleader-dead init-thread -- The last living init thread will call zap_pid_ns_processes.
  init-tgleader init-thread -- An init with two living threads child_reaper must be the init thread group leader

Which means at the cost of an extra wake-up we are guaranteed not to
have races.
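
As a rough illustration of the effect, here is a toy userspace model of
the switch in free_pid() with the extra case. struct toy_pidns and
toy_free_pid() are invented for this sketch; the counters merely stand in
for wake_up_process() and schedule_work(), and no locking is modelled.

```c
#include <assert.h>

/* Toy bookkeeping for a pid namespace; not the kernel's struct. */
struct toy_pidns {
	int nr_hashed;		/* pids still hashed in the namespace */
	int reaper_wakeups;	/* times the reaper would have been woken */
	int cleanup_queued;	/* set once the last pid is gone */
};

/* Same shape as the switch in free_pid(), with "case 2:" added. */
void toy_free_pid(struct toy_pidns *ns)
{
	assert(ns->nr_hashed > 0);
	switch (--ns->nr_hashed) {
	case 2:	/* proposed: also wake when two pids remain */
	case 1:
		ns->reaper_wakeups++;	/* stands in for wake_up_process() */
		break;
	case 0:
		ns->cleanup_queued = 1;	/* stands in for schedule_work() */
		break;
	}
}
```

Starting from nr_hashed == 3, the first toy_free_pid() call already
records a wake-up, so a reaper waiting for the count to reach 2 gets to
re-test its condition. The price is one possibly redundant wake-up when
the count later drops to 1.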

Serge does that look good to you?

Eric



diff --git a/kernel/pid.c b/kernel/pid.c
index 17755ae..ab75add 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -265,6 +265,7 @@ void free_pid(struct pid *pid)
struct pid_namespace *ns = upid->ns;
hlist_del_rcu(&upid->pid_chain);
switch(--ns->nr_hashed) {
+   case 2:
case 1:
/* When all that is left in the pid namespace
 * is the reaper wake up the reaper.  The reaper
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   4   5   6   7   8   9   10   >