subject:"Re\: 2.6.23\-rc6\-mm1"

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-10-17 Thread Pavel Machek

On Fri 2007-09-21 10:06:15, Thomas Gleixner wrote:
> On Fri, 2007-09-21 at 14:51 +1000, Paul Mackerras wrote:
> > Linus Torvalds writes:
> > 
> > > It would indeed be nice if we could just take CPU's down early (while 
> > > everything is working), and run the whole suspend code with just one CPU, 
> > > rather than having to worry about the ordering between CPU and device 
> > > takedown.
> > 
> > That is certainly what we want to do on powerpc.
> 
> I would have expected that we do it exactly this way and it took me by
> surprise, that we do not.

Well, we used to do that, but acpi spec forbids that, and it means
userspace sees plugs/unplugs.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-10-17 Thread Pavel Machek

On Fri 2007-09-21 10:06:15, Thomas Gleixner wrote:
 On Fri, 2007-09-21 at 14:51 +1000, Paul Mackerras wrote:
  Linus Torvalds writes:
  
   It would indeed be nice if we could just take CPU's down early (while 
   everything is working), and run the whole suspend code with just one CPU, 
   rather than having to worry about the ordering between CPU and device 
   takedown.
  
  That is certainly what we want to do on powerpc.
 
 I would have expected that we do it exactly this way and it took me by
 surprise, that we do not.

Well, we used to do that, but acpi spec forbids that, and it means
userspace sees plugs/unplugs.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 powerpc - kgdb is broken [Resending PATCH]

2007-09-28 Thread Kamalesh Babulal

Thomas Gleixner wrote:
> On Fri, 2007-09-28 at 16:07 +0530, Kamalesh Babulal wrote:
>   
>> The kgdb is also broken with 2.6.23-rc8-mm2 on the powerpc .
>> The below patch disables the kgdb from getting compiled over
>> powerpc platform.
>>
>> Signed-off-by : Kamalesh Babulal <[EMAIL PROTECTED]>
>> ---
>>
>> --- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 06:33:37.0 +0530
>> +++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 23:48:33.0 +0530
>> @@ -14,7 +14,7 @@ config KGDB
>> bool "KGDB: kernel debugging with remote gdb"
>> select WANT_EXTRA_DEBUG_INFORMATION
>> select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
>> -   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && 
>> !SUPERH64) || IA64 || PPC)
>> +   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && 
>> !SUPERH64) || IA64 || !PPC)
>> 
>
> This enables the KGDB config for _ALL_ platforms except powerpc. 
>
> Just remove PPC completely.
>
>   tglx
>   
Thanks thomas for the suggestion, resending the patch.

Signed-off-by : Kamalesh Babulal <[EMAIL PROTECTED]>
---

--- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 17:46:55.0 +0530
+++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 17:51:43.0 +0530
@@ -14,7 +14,7 @@ config KGDB
bool "KGDB: kernel debugging with remote gdb"
select WANT_EXTRA_DEBUG_INFORMATION
select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
-   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && !SUPERH64) 
|| IA64 || PPC)
+   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && !SUPERH64) 
|| IA64)
help
  If you say Y here, it will be possible to remotely debug the
  kernel using gdb.  Documentation of kernel debugger is available

-- 

Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 powerpc - kgdb is broken

2007-09-28 Thread Thomas Gleixner

On Fri, 2007-09-28 at 16:07 +0530, Kamalesh Babulal wrote:
> The kgdb is also broken with 2.6.23-rc8-mm2 on the powerpc .
> The below patch disables the kgdb from getting compiled over
> powerpc platform.
> 
> Signed-off-by : Kamalesh Babulal <[EMAIL PROTECTED]>
> ---
> 
> --- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 06:33:37.0 +0530
> +++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 23:48:33.0 +0530
> @@ -14,7 +14,7 @@ config KGDB
> bool "KGDB: kernel debugging with remote gdb"
> select WANT_EXTRA_DEBUG_INFORMATION
> select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
> -   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && 
> !SUPERH64) || IA64 || PPC)
> +   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && 
> !SUPERH64) || IA64 || !PPC)

This enables the KGDB config for _ALL_ platforms except powerpc. 

Just remove PPC completely.

tglx


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 powerpc - kgdb is broken

2007-09-28 Thread Kamalesh Babulal

Christoph Hellwig wrote:
> On Thu, Sep 20, 2007 at 03:23:19PM +1000, Paul Mackerras wrote:
>   
>> I could remove all the kgdb support from arch/powerpc as a first step,
>> if that would make it easier to pull in the new stuff...
>> 
>
> Given that the existing powerpc kgdb bits didn't seem to work at all when
> I tried them that seems like a perfectly fine thing to do.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>   
Hi Paul,

The kgdb is also broken with 2.6.23-rc8-mm2 on the powerpc .
The below patch disables the kgdb from getting compiled over
powerpc platform.

Signed-off-by : Kamalesh Babulal <[EMAIL PROTECTED]>
---

--- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 06:33:37.0 +0530
+++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 23:48:33.0 +0530
@@ -14,7 +14,7 @@ config KGDB
bool "KGDB: kernel debugging with remote gdb"
select WANT_EXTRA_DEBUG_INFORMATION
select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
-   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && !SUPERH64) 
|| IA64 || PPC)
+   depends on DEBUG_KERNEL && (ARM || X86 || MIPS || (SUPERH && !SUPERH64) 
|| IA64 || !PPC)
help
  If you say Y here, it will be possible to remotely debug the
  kernel using gdb.  Documentation of kernel debugger is available

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 powerpc - kgdb is broken

2007-09-28 Thread Thomas Gleixner

On Fri, 2007-09-28 at 16:07 +0530, Kamalesh Babulal wrote:
 The kgdb is also broken with 2.6.23-rc8-mm2 on the powerpc .
 The below patch disables the kgdb from getting compiled over
 powerpc platform.
 
 Signed-off-by : Kamalesh Babulal [EMAIL PROTECTED]
 ---
 
 --- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 06:33:37.0 +0530
 +++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 23:48:33.0 +0530
 @@ -14,7 +14,7 @@ config KGDB
 bool KGDB: kernel debugging with remote gdb
 select WANT_EXTRA_DEBUG_INFORMATION
 select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
 -   depends on DEBUG_KERNEL  (ARM || X86 || MIPS || (SUPERH  
 !SUPERH64) || IA64 || PPC)
 +   depends on DEBUG_KERNEL  (ARM || X86 || MIPS || (SUPERH  
 !SUPERH64) || IA64 || !PPC)

This enables the KGDB config for _ALL_ platforms except powerpc. 

Just remove PPC completely.

tglx


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 powerpc - kgdb is broken

2007-09-28 Thread Kamalesh Babulal

Christoph Hellwig wrote:
 On Thu, Sep 20, 2007 at 03:23:19PM +1000, Paul Mackerras wrote:
   
 I could remove all the kgdb support from arch/powerpc as a first step,
 if that would make it easier to pull in the new stuff...
 

 Given that the existing powerpc kgdb bits didn't seem to work at all when
 I tried them that seems like a perfectly fine thing to do.

 -
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
   
Hi Paul,

The kgdb is also broken with 2.6.23-rc8-mm2 on the powerpc .
The below patch disables the kgdb from getting compiled over
powerpc platform.

Signed-off-by : Kamalesh Babulal [EMAIL PROTECTED]
---

--- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 06:33:37.0 +0530
+++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 23:48:33.0 +0530
@@ -14,7 +14,7 @@ config KGDB
bool KGDB: kernel debugging with remote gdb
select WANT_EXTRA_DEBUG_INFORMATION
select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
-   depends on DEBUG_KERNEL  (ARM || X86 || MIPS || (SUPERH  !SUPERH64) 
|| IA64 || PPC)
+   depends on DEBUG_KERNEL  (ARM || X86 || MIPS || (SUPERH  !SUPERH64) 
|| IA64 || !PPC)
help
  If you say Y here, it will be possible to remotely debug the
  kernel using gdb.  Documentation of kernel debugger is available

-- 
Thanks  Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 powerpc - kgdb is broken [Resending PATCH]

2007-09-28 Thread Kamalesh Babulal

Thomas Gleixner wrote:
 On Fri, 2007-09-28 at 16:07 +0530, Kamalesh Babulal wrote:
   
 The kgdb is also broken with 2.6.23-rc8-mm2 on the powerpc .
 The below patch disables the kgdb from getting compiled over
 powerpc platform.

 Signed-off-by : Kamalesh Babulal [EMAIL PROTECTED]
 ---

 --- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 06:33:37.0 +0530
 +++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 23:48:33.0 +0530
 @@ -14,7 +14,7 @@ config KGDB
 bool KGDB: kernel debugging with remote gdb
 select WANT_EXTRA_DEBUG_INFORMATION
 select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
 -   depends on DEBUG_KERNEL  (ARM || X86 || MIPS || (SUPERH  
 !SUPERH64) || IA64 || PPC)
 +   depends on DEBUG_KERNEL  (ARM || X86 || MIPS || (SUPERH  
 !SUPERH64) || IA64 || !PPC)
 

 This enables the KGDB config for _ALL_ platforms except powerpc. 

 Just remove PPC completely.

   tglx
   
Thanks thomas for the suggestion, resending the patch.

Signed-off-by : Kamalesh Babulal [EMAIL PROTECTED]
---

--- linux-2.6.23-rc8/lib/Kconfig.kgdb   2007-09-28 17:46:55.0 +0530
+++ linux-2.6.23-rc8/lib/~Kconfig.kgdb  2007-09-28 17:51:43.0 +0530
@@ -14,7 +14,7 @@ config KGDB
bool KGDB: kernel debugging with remote gdb
select WANT_EXTRA_DEBUG_INFORMATION
select KGDB_ARCH_HAS_SHADOW_INFO if X86_64
-   depends on DEBUG_KERNEL  (ARM || X86 || MIPS || (SUPERH  !SUPERH64) 
|| IA64 || PPC)
+   depends on DEBUG_KERNEL  (ARM || X86 || MIPS || (SUPERH  !SUPERH64) 
|| IA64)
help
  If you say Y here, it will be possible to remotely debug the
  kernel using gdb.  Documentation of kernel debugger is available

-- 

Thanks  Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-26 Thread Jarek Poplawski

On Tue, Sep 25, 2007 at 01:47:05PM +0200, Jarek Poplawski wrote:
> On Mon, Sep 24, 2007 at 11:50:23AM +0200, Nadia Derbey wrote:
> > Jarek Poplawski wrote:
> ...
> > >2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
> > >safe (memory barriers): it's not atomic, so locking is needed, but
> > >e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
> > >freeque() calls ipc_rcu_putref() with ipc_ids mutex only.
> > >
> > >3. Probably similar problem is possible with msr_d.r_msg which is
> > >read in do_msgrcv() under rcu_read_lock() only.
> > >
> > 
> > In think here they have avoided refcoutning by using r_msg:
> > r_msg is initialzed to -EAGAIN before releasing the msq lock. if 
> > freequeue() is called it sets r_msg to EIDRM (see expunge_all(-EIDRM)).
> > Setting r_msg is always done under the msq lock (expunge_all() / 
> > pipelined_Sned()).
> >  Since rcu_read_lock is called right after schedule, they are sure the 
> > msq pointer is still valid when they re-lock it once a msg is present in 
> > the receive queue.
> > 
> > Please tell me if I'm not clear ;-)
> 
> All clear!
> 
> Except... this r_msg is still unclear to me. Since it's read without
> msq lock I doubt this first check after schedule() is of any value. A
> comment: "Lockless receive, part 2" tells about some safety against a
> race with pipeline_send() and expunge_all() when in wake_up_process(),
> but how can we be sure this r_msg is not just to be changed and this
> wake_up_process() is running while "while" check still sees
> ERR_PTR(-EAGAIN) instead of NULL?

OOPS!

At last I've found enough time to look at this more exactly plus the
right comment in sem.c, and it looks like it's all right and really
thought up, with all variants foreseen. So, I withdraw this problem
nr 3 too.

Best regards,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-26 Thread Jarek Poplawski

On Tue, Sep 25, 2007 at 01:47:05PM +0200, Jarek Poplawski wrote:
 On Mon, Sep 24, 2007 at 11:50:23AM +0200, Nadia Derbey wrote:
  Jarek Poplawski wrote:
 ...
  2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
  safe (memory barriers): it's not atomic, so locking is needed, but
  e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
  freeque() calls ipc_rcu_putref() with ipc_ids mutex only.
  
  3. Probably similar problem is possible with msr_d.r_msg which is
  read in do_msgrcv() under rcu_read_lock() only.
  
  
  In think here they have avoided refcoutning by using r_msg:
  r_msg is initialzed to -EAGAIN before releasing the msq lock. if 
  freequeue() is called it sets r_msg to EIDRM (see expunge_all(-EIDRM)).
  Setting r_msg is always done under the msq lock (expunge_all() / 
  pipelined_Sned()).
   Since rcu_read_lock is called right after schedule, they are sure the 
  msq pointer is still valid when they re-lock it once a msg is present in 
  the receive queue.
  
  Please tell me if I'm not clear ;-)
 
 All clear!
 
 Except... this r_msg is still unclear to me. Since it's read without
 msq lock I doubt this first check after schedule() is of any value. A
 comment: Lockless receive, part 2 tells about some safety against a
 race with pipeline_send() and expunge_all() when in wake_up_process(),
 but how can we be sure this r_msg is not just to be changed and this
 wake_up_process() is running while while check still sees
 ERR_PTR(-EAGAIN) instead of NULL?

OOPS!

At last I've found enough time to look at this more exactly plus the
right comment in sem.c, and it looks like it's all right and really
thought up, with all variants foreseen. So, I withdraw this problem
nr 3 too.

Best regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failure on ppc64 drivers/ata/pata_scc.c

2007-09-25 Thread Jeff Garzik


Satyam Sharma wrote:

Hi,


On Thu, 20 Sep 2007, Alan Cox wrote:

On Thu, 20 Sep 2007 14:13:15 +0100
[EMAIL PROTECTED] (Mel Gorman) wrote:


PPC64 building allmodconfig fails to compile drivers/ata/pata_scc.c . It
doesn't show up on other arches because this driver is specific to the
architecture.

drivers/ata/pata_scc.c: In function `scc_bmdma_status'

Its not been updated to match the libata core changes. Try something like
this. Whoever is maintaining it should also remove the prereset cable handling
code and use the proper cable detect method.


It appears you forgot to fix scc_std_softreset() and one of its callsites
in scc_bdma_stop(). A complete patch is attempted below -- please review.


[PATCH -mm] pata_scc: Keep up with libata core API changes

Little fixlets, that the build started erroring / warning about:

drivers/ata/pata_scc.c: In function 'scc_bmdma_status':
drivers/ata/pata_scc.c:734: error: structure has no member named 'active_tag'
drivers/ata/pata_scc.c: In function 'scc_pata_prereset':
drivers/ata/pata_scc.c:866: warning: passing arg 1 of 'ata_std_prereset' from 
incompatible pointer type
drivers/ata/pata_scc.c: In function 'scc_error_handler':
drivers/ata/pata_scc.c:908: warning: passing arg 2 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
drivers/ata/pata_scc.c:908: warning: passing arg 3 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
drivers/ata/pata_scc.c:908: warning: passing arg 5 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
make[2]: *** [drivers/ata/pata_scc.o] Error 1

Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>
Cc: Alan Cox <[EMAIL PROTECTED]>
Cc: Mel Gorman <[EMAIL PROTECTED]>

---

Andrew, this includes (supercedes) the previous two ones from Mel / Alan.

 drivers/ata/pata_scc.c |   21 -
 1 file changed, 12 insertions(+), 9 deletions(-)


applied


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- powerpc link failure

2007-09-25 Thread Andy Whitcroft

On Wed, Sep 19, 2007 at 07:44:03PM +0200, Sam Ravnborg wrote:
> On Wed, Sep 19, 2007 at 10:28:48AM +0100, Andy Whitcroft wrote:
> > I am seeing this strange link error from a PowerMac G5 (powerpc):
> > 
> >   [...]
> > KSYM.tmp_kallsyms2.S
> > AS  .tmp_kallsyms2.o
> > LD  vmlinux.o
> >   ld: dynreloc miscount for fs/built-in.o, section .opd
> >   ld: can not edit opd Bad value
> >   make: *** [vmlinux.o] Error 1
> 
> We have had this issue before.
> Try to see:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=045e72acf16054c4ed2760e9a8edb19a08053af1
> 
> Here it was caused by a weak declaration that was superflous.
> 
> I expect the workaround to be equally simple this time - or I hope so.

Yep, will investigate this as suggested.  As this problem seems to
persist thru 2.6.23-rc8-mm1 I will report it again there and my
findings.

-apw
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-25 Thread Jarek Poplawski

On Mon, Sep 24, 2007 at 11:50:23AM +0200, Nadia Derbey wrote:
> Jarek Poplawski wrote:
...
> >2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
> >safe (memory barriers): it's not atomic, so locking is needed, but
> >e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
> >freeque() calls ipc_rcu_putref() with ipc_ids mutex only.
> >
> >3. Probably similar problem is possible with msr_d.r_msg which is
> >read in do_msgrcv() under rcu_read_lock() only.
> >
> 
> In think here they have avoided refcoutning by using r_msg:
> r_msg is initialzed to -EAGAIN before releasing the msq lock. if 
> freequeue() is called it sets r_msg to EIDRM (see expunge_all(-EIDRM)).
> Setting r_msg is always done under the msq lock (expunge_all() / 
> pipelined_Sned()).
>  Since rcu_read_lock is called right after schedule, they are sure the 
> msq pointer is still valid when they re-lock it once a msg is present in 
> the receive queue.
> 
> Please tell me if I'm not clear ;-)

All clear!

Except... this r_msg is still unclear to me. Since it's read without
msq lock I doubt this first check after schedule() is of any value. A
comment: "Lockless receive, part 2" tells about some safety against a
race with pipeline_send() and expunge_all() when in wake_up_process(),
but how can we be sure this r_msg is not just to be changed and this
wake_up_process() is running while "while" check still sees
ERR_PTR(-EAGAIN) instead of NULL?

Thanks & sorry for delay,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-25 Thread Jarek Poplawski

On Mon, Sep 24, 2007 at 11:50:23AM +0200, Nadia Derbey wrote:
 Jarek Poplawski wrote:
...
 2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
 safe (memory barriers): it's not atomic, so locking is needed, but
 e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
 freeque() calls ipc_rcu_putref() with ipc_ids mutex only.
 
 3. Probably similar problem is possible with msr_d.r_msg which is
 read in do_msgrcv() under rcu_read_lock() only.
 
 
 In think here they have avoided refcoutning by using r_msg:
 r_msg is initialzed to -EAGAIN before releasing the msq lock. if 
 freequeue() is called it sets r_msg to EIDRM (see expunge_all(-EIDRM)).
 Setting r_msg is always done under the msq lock (expunge_all() / 
 pipelined_Sned()).
  Since rcu_read_lock is called right after schedule, they are sure the 
 msq pointer is still valid when they re-lock it once a msg is present in 
 the receive queue.
 
 Please tell me if I'm not clear ;-)

All clear!

Except... this r_msg is still unclear to me. Since it's read without
msq lock I doubt this first check after schedule() is of any value. A
comment: Lockless receive, part 2 tells about some safety against a
race with pipeline_send() and expunge_all() when in wake_up_process(),
but how can we be sure this r_msg is not just to be changed and this
wake_up_process() is running while while check still sees
ERR_PTR(-EAGAIN) instead of NULL?

Thanks  sorry for delay,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- powerpc link failure

2007-09-25 Thread Andy Whitcroft

On Wed, Sep 19, 2007 at 07:44:03PM +0200, Sam Ravnborg wrote:
 On Wed, Sep 19, 2007 at 10:28:48AM +0100, Andy Whitcroft wrote:
  I am seeing this strange link error from a PowerMac G5 (powerpc):
  
[...]
  KSYM.tmp_kallsyms2.S
  AS  .tmp_kallsyms2.o
  LD  vmlinux.o
ld: dynreloc miscount for fs/built-in.o, section .opd
ld: can not edit opd Bad value
make: *** [vmlinux.o] Error 1
 
 We have had this issue before.
 Try to see:
 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=045e72acf16054c4ed2760e9a8edb19a08053af1
 
 Here it was caused by a weak declaration that was superflous.
 
 I expect the workaround to be equally simple this time - or I hope so.

Yep, will investigate this as suggested.  As this problem seems to
persist thru 2.6.23-rc8-mm1 I will report it again there and my
findings.

-apw
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failure on ppc64 drivers/ata/pata_scc.c

2007-09-25 Thread Jeff Garzik


Satyam Sharma wrote:

Hi,


On Thu, 20 Sep 2007, Alan Cox wrote:

On Thu, 20 Sep 2007 14:13:15 +0100
[EMAIL PROTECTED] (Mel Gorman) wrote:


PPC64 building allmodconfig fails to compile drivers/ata/pata_scc.c . It
doesn't show up on other arches because this driver is specific to the
architecture.

drivers/ata/pata_scc.c: In function `scc_bmdma_status'

Its not been updated to match the libata core changes. Try something like
this. Whoever is maintaining it should also remove the prereset cable handling
code and use the proper cable detect method.


It appears you forgot to fix scc_std_softreset() and one of its callsites
in scc_bdma_stop(). A complete patch is attempted below -- please review.


[PATCH -mm] pata_scc: Keep up with libata core API changes

Little fixlets, that the build started erroring / warning about:

drivers/ata/pata_scc.c: In function 'scc_bmdma_status':
drivers/ata/pata_scc.c:734: error: structure has no member named 'active_tag'
drivers/ata/pata_scc.c: In function 'scc_pata_prereset':
drivers/ata/pata_scc.c:866: warning: passing arg 1 of 'ata_std_prereset' from 
incompatible pointer type
drivers/ata/pata_scc.c: In function 'scc_error_handler':
drivers/ata/pata_scc.c:908: warning: passing arg 2 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
drivers/ata/pata_scc.c:908: warning: passing arg 3 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
drivers/ata/pata_scc.c:908: warning: passing arg 5 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
make[2]: *** [drivers/ata/pata_scc.o] Error 1

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Cc: Alan Cox [EMAIL PROTECTED]
Cc: Mel Gorman [EMAIL PROTECTED]

---

Andrew, this includes (supercedes) the previous two ones from Mel / Alan.

 drivers/ata/pata_scc.c |   21 -
 1 file changed, 12 insertions(+), 9 deletions(-)


applied


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1

2007-09-24 Thread Greg KH

On Sat, Sep 22, 2007 at 02:51:54PM +0530, Satyam Sharma wrote:
> Hi Greg,
> 
> 
> On Tue, 18 Sep 2007, Greg KH wrote:
> > 
> > On Tue, Sep 18, 2007 at 03:04:48PM +0530, Satyam Sharma wrote:
> > > 
> > > But wait ... isn't that a statically-allocated kobject, which were
> > > supposed to be "naughty" in the first place?
> > 
> > Yes it is, if you want to dynamically create it, please do.
> 
> Sorry for being late to reply, but do you still want such a patch (i.e.
> convert static to dynamic allocation)?

Yes, I'll gladly take such patches.

> I read elsewhere on this thread that you'd merge Kamalesh's patch and
> fix it up to also use kobject_name() yourself. But it's a small/trivial
> driver, so I think just converting it to dynamic allocation right now
> itself (when we've noticed it already) is probably better (?)

Sure that would be great to have.

> [ BTW I don't see the fix in your git trees or quilt queue. So I'll
>   make a patch on top of 2.6.23-rc6-mm1 itself. ]

I'm behind in updating my patch queue, sorry, other things came up :(

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-24 Thread Mel Gorman

On (22/09/07 12:24), Satyam Sharma didst pronounce:
> 
> 
> On Thu, 20 Sep 2007, Satyam Sharma wrote:
> > 
> > BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
> > IIRC I got build failures in:
> 
> > drivers/net/spider_net.c
> 
> 
> [PATCH -mm] spider_net: Misc build fixes after recent netdev stats changes
> 
> Unbreak the following:
> 
> drivers/net/spider_net.c: In function 'spider_net_release_tx_chain':
> drivers/net/spider_net.c:818: error: 'dev' undeclared (first use in this 
> function)
> drivers/net/spider_net.c:818: error: (Each undeclared identifier is reported 
> only once
> drivers/net/spider_net.c:818: error: for each function it appears in.)
> drivers/net/spider_net.c: In function 'spider_net_xmit':
> drivers/net/spider_net.c:922: error: 'dev' undeclared (first use in this 
> function)
> drivers/net/spider_net.c: In function 'spider_net_pass_skb_up':
> drivers/net/spider_net.c:1018: error: 'dev' undeclared (first use in this 
> function)
> drivers/net/spider_net.c: In function 'spider_net_decode_one_descr':
> drivers/net/spider_net.c:1215: error: 'dev' undeclared (first use in this 
> function)
> make[2]: *** [drivers/net/spider_net.o] Error 1
> 
> Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>

I've confirmed that this patch fixes the build error in question.

Acked-by: Mel Gorman <[EMAIL PROTECTED]>

> 
> ---
> 
>  drivers/net/spider_net.c |   24 +++-
>  1 file changed, 11 insertions(+), 13 deletions(-)
> 
> diff -ruNp a/drivers/net/spider_net.c b/drivers/net/spider_net.c
> --- a/drivers/net/spider_net.c2007-09-22 06:26:39.0 +0530
> +++ b/drivers/net/spider_net.c2007-09-22 12:12:23.0 +0530
> @@ -795,6 +795,7 @@ spider_net_set_low_watermark(struct spid
>  static int
>  spider_net_release_tx_chain(struct spider_net_card *card, int brutal)
>  {
> + struct net_device *dev = card->netdev;
>   struct spider_net_descr_chain *chain = >tx_chain;
>   struct spider_net_descr *descr;
>   struct spider_net_hw_descr *hwdescr;
> @@ -919,7 +920,7 @@ spider_net_xmit(struct sk_buff *skb, str
>   spider_net_release_tx_chain(card, 0);
>  
>   if (spider_net_prepare_tx_descr(card, skb) != 0) {
> - dev->stats.tx_dropped++;
> + netdev->stats.tx_dropped++;
>   netif_stop_queue(netdev);
>   return NETDEV_TX_BUSY;
>   }
> @@ -979,16 +980,12 @@ static void
>  spider_net_pass_skb_up(struct spider_net_descr *descr,
>  struct spider_net_card *card)
>  {
> - struct spider_net_hw_descr *hwdescr= descr->hwdescr;
> - struct sk_buff *skb;
> - struct net_device *netdev;
> - u32 data_status, data_error;
> -
> - data_status = hwdescr->data_status;
> - data_error = hwdescr->data_error;
> - netdev = card->netdev;
> + struct spider_net_hw_descr *hwdescr = descr->hwdescr;
> + struct sk_buff *skb = descr->skb;
> + struct net_device *netdev = card->netdev;
> + u32 data_status = hwdescr->data_status;
> + u32 data_error = hwdescr->data_error;
>  
> - skb = descr->skb;
>   skb_put(skb, hwdescr->valid_size);
>  
>   /* the card seems to add 2 bytes of junk in front
> @@ -1015,8 +1012,8 @@ spider_net_pass_skb_up(struct spider_net
>   }
>  
>   /* update netdevice statistics */
> - dev->stats.rx_packets++;
> - dev->stats.rx_bytes += skb->len;
> + netdev->stats.rx_packets++;
> + netdev->stats.rx_bytes += skb->len;
>  
>   /* pass skb up to stack */
>   netif_receive_skb(skb);
> @@ -1184,6 +1181,7 @@ static int spider_net_resync_tail_ptr(st
>  static int
>  spider_net_decode_one_descr(struct spider_net_card *card)
>  {
> + struct net_device *dev = card->netdev;
>   struct spider_net_descr_chain *chain = >rx_chain;
>   struct spider_net_descr *descr = chain->tail;
>   struct spider_net_hw_descr *hwdescr = descr->hwdescr;
> @@ -1210,7 +1208,7 @@ spider_net_decode_one_descr(struct spide
>(status == SPIDER_NET_DESCR_PROTECTION_ERROR) ||
>(status == SPIDER_NET_DESCR_FORCE_END) ) {
>   if (netif_msg_rx_err(card))
> - dev_err(>netdev->dev,
> + dev_err(>dev,
>  "dropping RX descriptor with state %d\n", 
> status);
>   dev->stats.rx_dropped++;
>   goto bad_desc;
> 

-- 
-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1

2007-09-24 Thread Mel Gorman

On (22/09/07 14:11), Satyam Sharma didst pronounce:
> 
> > -static volatile int kgdb_hwbreak_sstep[NR_CPUS];
> > +volatile int kgdb_hwbreak_sstep[NR_CPUS];
> 
> That looks fishy to me. Why is it volatile-qualified?

It turned out that it was unnecessary. A follow-up patch removed the volatile
and kept the static.

> And does that
> actually want to be a per-cpu var?
> 

-- 
-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failure on ppc64 drivers/ata/pata_scc.c

2007-09-24 Thread Mel Gorman

On (22/09/07 08:20), Satyam Sharma didst pronounce:
> Hi,
> 
> 
> On Thu, 20 Sep 2007, Alan Cox wrote:
> > 
> > On Thu, 20 Sep 2007 14:13:15 +0100
> > [EMAIL PROTECTED] (Mel Gorman) wrote:
> > 
> > > PPC64 building allmodconfig fails to compile drivers/ata/pata_scc.c . It
> > > doesn't show up on other arches because this driver is specific to the
> > > architecture.
> > > 
> > > drivers/ata/pata_scc.c: In function `scc_bmdma_status'
> > 
> > Its not been updated to match the libata core changes. Try something like
> > this. Whoever is maintaining it should also remove the prereset cable 
> > handling
> > code and use the proper cable detect method.
> 
> It appears you forgot to fix scc_std_softreset() and one of its callsites
> in scc_bdma_stop(). A complete patch is attempted below -- please review.
> 

I can confirm it builds without warnings or errors, so thanks. I'm not in
the position to review it for correctness.

> 
> [PATCH -mm] pata_scc: Keep up with libata core API changes
> 
> Little fixlets, that the build started erroring / warning about:
> 
> drivers/ata/pata_scc.c: In function 'scc_bmdma_status':
> drivers/ata/pata_scc.c:734: error: structure has no member named 'active_tag'
> drivers/ata/pata_scc.c: In function 'scc_pata_prereset':
> drivers/ata/pata_scc.c:866: warning: passing arg 1 of 'ata_std_prereset' from 
> incompatible pointer type
> drivers/ata/pata_scc.c: In function 'scc_error_handler':
> drivers/ata/pata_scc.c:908: warning: passing arg 2 of 'ata_bmdma_drive_eh' 
> from incompatible pointer type
> drivers/ata/pata_scc.c:908: warning: passing arg 3 of 'ata_bmdma_drive_eh' 
> from incompatible pointer type
> drivers/ata/pata_scc.c:908: warning: passing arg 5 of 'ata_bmdma_drive_eh' 
> from incompatible pointer type
> make[2]: *** [drivers/ata/pata_scc.o] Error 1
> 
> Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>
> Cc: Alan Cox <[EMAIL PROTECTED]>
> Cc: Mel Gorman <[EMAIL PROTECTED]>
> 
> ---
> 
> Andrew, this includes (supercedes) the previous two ones from Mel / Alan.
> 
>  drivers/ata/pata_scc.c |   21 -
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff -ruNp a/drivers/ata/pata_scc.c b/drivers/ata/pata_scc.c
> --- a/drivers/ata/pata_scc.c  2007-09-22 06:26:37.0 +0530
> +++ b/drivers/ata/pata_scc.c  2007-09-22 08:04:42.0 +0530
> @@ -594,16 +594,17 @@ static unsigned int scc_bus_softreset(st
>   *   Note: Original code is ata_std_softreset().
>   */
>  
> -static int scc_std_softreset (struct ata_port *ap, unsigned int *classes,
> -  unsigned long deadline)
> +static int scc_std_softreset(struct ata_link *link, unsigned int *classes,
> + unsigned long deadline)
>  {
> + struct ata_port *ap = link->ap;
>   unsigned int slave_possible = ap->flags & ATA_FLAG_SLAVE_POSS;
>   unsigned int devmask = 0, err_mask;
>   u8 err;
>  
>   DPRINTK("ENTER\n");
>  
> - if (ata_link_offline(>link)) {
> + if (ata_link_offline(link)) {
>   classes[0] = ATA_DEV_NONE;
>   goto out;
>   }
> @@ -692,7 +693,7 @@ static void scc_bmdma_stop (struct ata_q
>   printk(KERN_WARNING "%s: Internal Bus Error\n", 
> DRV_NAME);
>   out_be32(bmid_base + SCC_DMA_INTST, INTSTS_BMSINT);
>   /* TBD: SW reset */
> - scc_std_softreset(ap, , deadline);
> + scc_std_softreset(>link, , deadline);
>   continue;
>   }
>  
> @@ -731,7 +732,7 @@ static u8 scc_bmdma_status (struct ata_p
>   void __iomem *mmio = ap->ioaddr.bmdma_addr;
>   u8 host_stat = in_be32(mmio + SCC_DMA_STATUS);
>   u32 int_status = in_be32(mmio + SCC_DMA_INTST);
> - struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->active_tag);
> + struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->link.active_tag);
>   static int retry = 0;
>  
>   /* return if IOS_SS is cleared */
> @@ -860,10 +861,10 @@ static void scc_bmdma_freeze (struct ata
>   *   @deadline: deadline jiffies for the operation
>   */
>  
> -static int scc_pata_prereset(struct ata_port *ap, unsigned long deadline)
> +static int scc_pata_prereset(struct ata_link *link, unsigned long deadline)
>  {
> - ap->cbl = ATA_CBL_PATA80;
> - return ata_std_prereset(ap, deadline);
> + link->ap->cbl = ATA_CBL_PATA80;
> + return ata_std_prereset(link, deadline);
>  }
>  
>  /**
> @@ -874,8 +875,10 @@ static int scc_pata_prereset(struct ata_
>   *   Note: Original code is ata_std_postreset().
>   */
>  
> -static void scc_std_postreset (struct ata_port *ap, unsigned int *classes)
> +static void scc_std_postreset(struct ata_link *link, unsigned int *classes)
>  {
> + struct ata_port *ap = link->ap;
> +
>   DPRINTK("ENTER\n");
>  
>   /* is double-select really necessary? */
> 

-- 
-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-24 Thread Nadia Derbey


Jarek Poplawski wrote:

On Thu, Sep 20, 2007 at 03:08:42PM +0200, Nadia Derbey wrote:


Nadia Derbey wrote:


Jarek Poplawski wrote:



On Thu, Sep 20, 2007 at 08:24:58AM +0200, Nadia Derbey wrote:


...

Actually, ipc_lock() is called most of the time without the 
ipc_ids.mutex held and without refcounting (maybe you didn't look for 
the msg_lock() sem_lock() and shm_lock() too).

So I think disabling preemption is needed, isn't it?



so, these rcu_read_locks() don't
work here at all. So, probably I miss something again, but IMHO,
these rcu_read_locks/unlocks could be removed here or in
ipc_lock_by_ptr() and it should be enough to use them directly, where
really needed, e.g., in msg.c do_msgrcv().



I have to check for the ipc_lock_by_ptr(): may be you're right!



So, here is the ipc_lock_by_ptr() status:
1) do_msgsnd(), semctl_main(GETALL), semctl_main(SETALL) and find_undo() 
call it inside a refcounting.

 ==> no rcu read section needed.

2) *_exit_ns(), ipc_findkey() and sysvipc_find_ipc() call it under the 
ipc_ids mutex lock.

 ==> no rcu read section needed.

3) do_msgrcv() is the only path where ipc_lock_by_ptr() is not called 
under refcounting

 ==> rcu read section + some more checks needed once the spnlock is
 taken.

So I completely agree with you: we might remove the rcu_read_lock() from 
the ipc_lock_by_ptr() and explicitley  call it when needed (actually, it 
is already explicitly called in do_msgrcv()).



Yes, IMHO, it should be at least more readable when we can see where
this RCU is really needed.

But, after 3-rd look, I have a few more doubts (btw., 3 looks are
still not enough for me with this code, so I cerainly can miss many
things here, and, alas, I manged to see util and msg code only):



Jarek,

I'm realizing I did'nt give you an answer to issues # 1 and 3. Sorry for 
that!



1. ipc_lock() and ipc_lock_check() are used without ipc_ids.mutex,
but it's probably wrong: they call idr_find() with ipc_ids pointer
which needs this mutex, just like in similar code in: ipc_findkey(),
ipc_get_maxid() or sysvipc_find_ipc().


I think you're completely right: the rcu_read_lock() is not enough in 
this case.
I have to solve this issue, but keeping the original way the ipc 
developers have done it: I think they didn't want to take the mutex lock 
for every single operation. E.g. sending a message to a given message 
queue shouldn't avoid creating new message queues.

I'll come up with a solution.



2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
safe (memory barriers): it's not atomic, so locking is needed, but
e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
freeque() calls ipc_rcu_putref() with ipc_ids mutex only.

3. Probably similar problem is possible with msr_d.r_msg which is
read in do_msgrcv() under rcu_read_lock() only.



In think here they have avoided refcoutning by using r_msg:
r_msg is initialzed to -EAGAIN before releasing the msq lock. if 
freequeue() is called it sets r_msg to EIDRM (see expunge_all(-EIDRM)).
Setting r_msg is always done under the msq lock (expunge_all() / 
pipelined_Sned()).
 Since rcu_read_lock is called right after schedule, they are sure the 
msq pointer is still valid when they re-lock it once a msg is present in 
the receive queue.


Please tell me if I'm not clear ;-)

Regards,
Nadia

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-24 Thread Nadia Derbey


Jarek Poplawski wrote:

On Fri, Sep 21, 2007 at 01:03:47PM +0200, Jarek Poplawski wrote:
...


I hope not! But, then it would be probably another logical trick:
ipc_rcu_getref/putref() seems to prevent kfreeing of a structure, so
if it's used in do_msgsnd() there should be a risk something can do
this kfree at this moment, and it seems freeque() is the only one,
which both: can do this and cares for this refcount. Then, e.g., if
any of them does ipc_rcu_getref() a bit later and sees old (cached)
value - kfree can be skipped forever. [...]



After rethinking, this scenario seems to be wrong or very unprobable
(I'm not sure of all ways "if (--container...)" could be compiled),
so there should be no such risk - double kfree/vfree is more probable,
so no danger. More likely is such refcount abuse: ipc_rcu_getref() in
do_msgsnd() done a bit after ipc_rcu_putref() in freeque() (msq
pointer acquired by do_msgsend() before freeque() started); then,
after schedule(), do_msgsnd() can work with kfreed msq_queue structure
(at least considering classic RCU).



If msgsnd() acquires the pointer first, it does it under lock + 
rcu_getref(). ==> refcount = 2
When schedule() is called if freeque() takes the pointer it will call 
msg_rmid() that sets the deleted field in the msg queue. When the lock 
is released by freeque(), we have either 1) or 2):

1) freeque()'s putref called 1st ==> refocunt = 1
   Then msgsnd()'s lock_by_ptr() is called ==> rcu lock
   Then msgsnd()'s putref is called ==> refcount = 0
   But this is done under RCU lock, so should be no problem
   Then the deleted field is checked ==> return
2) msgsnd()'s lock_by_ptr() is called ==> rcu lock
   Then we don't mind in which order are done the other operations
   since we under rcu_lock: the structure won't disappear till we test
   the deleted field.

Regards,
Nadia

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-24 Thread Fengguang Wu

On Mon, Sep 24, 2007 at 09:35:23AM +0200, Peter Zijlstra wrote:
> On Mon, 24 Sep 2007 11:01:10 +0800 Fengguang Wu <[EMAIL PROTECTED]>
> wrote:
> 
> > > That is an interesting idea how about this:
> > 
> > It looks like a workaround, but it does solve the most important problem.
> > And it is a good logic by itself.  So I'd vote for it.
> > 
> > The fundamental problem is that the per-bdi-writeback-completion based
> > estimation is not accurate under light loads. The problem remains for
> > a light-load sda when there is a heavy-load sdb. 
> 
> Well, sure, in that case sda would get to write out a lot of small
> things. But in that case it would be fair wrt the other writers.

Hmm, I cannot agree it to be fair - but pretty acceptable ;-)
Your patch already brings great improvements in the multi-bdi case.

> > One more workaround
> > could be to grant bdi(s) a minimal bdi_thresh. 
> 
> Ah, no, that is no good. For if there were a lot of BDIs this might
> happen:
>   nr_bdis * min_thresh > dirty_limit.

Sure it is in the extreme case. However the limit could be ensured
if we really want(which I'm really not sure;-) it:

if (nr_reclaimable + nr_writeback < dirty_thresh &&
bdi_nr_reclaimable + bdi_nr_writeback <= bdi_min_thresh)
break;

> > Or better to adjust the estimation logic?
> 
> Not sure what we can do here. The current thing is simple, fast and fair.

Agreed.

> > > + /*
> > > +  * break out early when:
> > > +  *  - we're below the bdi limit
> > > +  *  - we're below half the total limit
> > > +  *
> > > +  * we let the numbers exceed the strict bdi limit if the total
> > > +  * numbers are too low, this avoids (excessive) small writeouts.
> > > +  */
> > > + if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh ||
> > > + nr_reclaimable + nr_writeback < dirty_thresh / 2)
> > >   break;
> > 
> > This may be slightly better:
> > 
> > if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
> > break;
> > /*
> >  * Throttle it only when the background writeback cannot 
> > catchup.
> >  */
> > if (nr_reclaimable + nr_writeback <
> > (background_thresh + dirty_thresh) / 2)
> > break;
> 
> Ah, indeed. Good idea.

Thank you :-)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-24 Thread Jarek Poplawski

On Mon, Sep 24, 2007 at 08:54:07AM +0200, Jarek Poplawski wrote:
> After rethinking, this scenario seems to be wrong or very unprobable
> (I'm not sure of all ways "if (--container...)" could be compiled),
> so there should be no such risk - double kfree/vfree is more probable,
> so no danger. More likely is such refcount abuse: ipc_rcu_getref() in
> do_msgsnd() done a bit after ipc_rcu_putref() in freeque() (msq
> pointer acquired by do_msgsend() before freeque() started); then,
> after schedule(), do_msgsnd() can work with kfreed msq_queue structure
> (at least considering classic RCU).

I see this scenario is even more impossible, so you were right,
it's all right at this point.

Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-24 Thread Peter Zijlstra

On Mon, 24 Sep 2007 11:01:10 +0800 Fengguang Wu <[EMAIL PROTECTED]>
wrote:

> > That is an interesting idea how about this:
> 
> It looks like a workaround, but it does solve the most important problem.
> And it is a good logic by itself.  So I'd vote for it.
> 
> The fundamental problem is that the per-bdi-writeback-completion based
> estimation is not accurate under light loads. The problem remains for
> a light-load sda when there is a heavy-load sdb. 

Well, sure, in that case sda would get to write out a lot of small
things. But in that case it would be fair wrt the other writers.

> One more workaround
> could be to grant bdi(s) a minimal bdi_thresh. 

Ah, no, that is no good. For if there were a lot of BDIs this might
happen:
  nr_bdis * min_thresh > dirty_limit.

> Or better to adjust the estimation logic?

Not sure what we can do here. The current thing is simple, fast and fair.

> > +   /*
> > +* break out early when:
> > +*  - we're below the bdi limit
> > +*  - we're below half the total limit
> > +*
> > +* we let the numbers exceed the strict bdi limit if the total
> > +* numbers are too low, this avoids (excessive) small writeouts.
> > +*/
> > +   if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh ||
> > +   nr_reclaimable + nr_writeback < dirty_thresh / 2)
> > break;
> 
> This may be slightly better:
> 
>   if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
> break;
> /*
>  * Throttle it only when the background writeback cannot 
> catchup.
>  */
> if (nr_reclaimable + nr_writeback <
> (background_thresh + dirty_thresh) / 2)
>   break;

Ah, indeed. Good idea.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-24 Thread Jarek Poplawski

On Fri, Sep 21, 2007 at 01:03:47PM +0200, Jarek Poplawski wrote:
...
> I hope not! But, then it would be probably another logical trick:
> ipc_rcu_getref/putref() seems to prevent kfreeing of a structure, so
> if it's used in do_msgsnd() there should be a risk something can do
> this kfree at this moment, and it seems freeque() is the only one,
> which both: can do this and cares for this refcount. Then, e.g., if
> any of them does ipc_rcu_getref() a bit later and sees old (cached)
> value - kfree can be skipped forever. [...]

After rethinking, this scenario seems to be wrong or very unprobable
(I'm not sure of all ways "if (--container...)" could be compiled),
so there should be no such risk - double kfree/vfree is more probable,
so no danger. More likely is such refcount abuse: ipc_rcu_getref() in
do_msgsnd() done a bit after ipc_rcu_putref() in freeque() (msq
pointer acquired by do_msgsend() before freeque() started); then,
after schedule(), do_msgsnd() can work with kfreed msq_queue structure
(at least considering classic RCU).

Regards,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-24 Thread Jarek Poplawski

On Fri, Sep 21, 2007 at 01:03:47PM +0200, Jarek Poplawski wrote:
...
 I hope not! But, then it would be probably another logical trick:
 ipc_rcu_getref/putref() seems to prevent kfreeing of a structure, so
 if it's used in do_msgsnd() there should be a risk something can do
 this kfree at this moment, and it seems freeque() is the only one,
 which both: can do this and cares for this refcount. Then, e.g., if
 any of them does ipc_rcu_getref() a bit later and sees old (cached)
 value - kfree can be skipped forever. [...]

After rethinking, this scenario seems to be wrong or very unprobable
(I'm not sure of all ways if (--container...) could be compiled),
so there should be no such risk - double kfree/vfree is more probable,
so no danger. More likely is such refcount abuse: ipc_rcu_getref() in
do_msgsnd() done a bit after ipc_rcu_putref() in freeque() (msq
pointer acquired by do_msgsend() before freeque() started); then,
after schedule(), do_msgsnd() can work with kfreed msq_queue structure
(at least considering classic RCU).

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-24 Thread Peter Zijlstra

On Mon, 24 Sep 2007 11:01:10 +0800 Fengguang Wu [EMAIL PROTECTED]
wrote:

  That is an interesting idea how about this:
 
 It looks like a workaround, but it does solve the most important problem.
 And it is a good logic by itself.  So I'd vote for it.
 
 The fundamental problem is that the per-bdi-writeback-completion based
 estimation is not accurate under light loads. The problem remains for
 a light-load sda when there is a heavy-load sdb. 

Well, sure, in that case sda would get to write out a lot of small
things. But in that case it would be fair wrt the other writers.

 One more workaround
 could be to grant bdi(s) a minimal bdi_thresh. 

Ah, no, that is no good. For if there were a lot of BDIs this might
happen:
  nr_bdis * min_thresh  dirty_limit.

 Or better to adjust the estimation logic?

Not sure what we can do here. The current thing is simple, fast and fair.

  +   /*
  +* break out early when:
  +*  - we're below the bdi limit
  +*  - we're below half the total limit
  +*
  +* we let the numbers exceed the strict bdi limit if the total
  +* numbers are too low, this avoids (excessive) small writeouts.
  +*/
  +   if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh ||
  +   nr_reclaimable + nr_writeback  dirty_thresh / 2)
  break;
 
 This may be slightly better:
 
   if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh)
 break;
 /*
  * Throttle it only when the background writeback cannot 
 catchup.
  */
 if (nr_reclaimable + nr_writeback 
 (background_thresh + dirty_thresh) / 2)
   break;

Ah, indeed. Good idea.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-24 Thread Jarek Poplawski

On Mon, Sep 24, 2007 at 08:54:07AM +0200, Jarek Poplawski wrote:
 After rethinking, this scenario seems to be wrong or very unprobable
 (I'm not sure of all ways if (--container...) could be compiled),
 so there should be no such risk - double kfree/vfree is more probable,
 so no danger. More likely is such refcount abuse: ipc_rcu_getref() in
 do_msgsnd() done a bit after ipc_rcu_putref() in freeque() (msq
 pointer acquired by do_msgsend() before freeque() started); then,
 after schedule(), do_msgsnd() can work with kfreed msq_queue structure
 (at least considering classic RCU).

I see this scenario is even more impossible, so you were right,
it's all right at this point.

Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-24 Thread Fengguang Wu

On Mon, Sep 24, 2007 at 09:35:23AM +0200, Peter Zijlstra wrote:
 On Mon, 24 Sep 2007 11:01:10 +0800 Fengguang Wu [EMAIL PROTECTED]
 wrote:
 
   That is an interesting idea how about this:
  
  It looks like a workaround, but it does solve the most important problem.
  And it is a good logic by itself.  So I'd vote for it.
  
  The fundamental problem is that the per-bdi-writeback-completion based
  estimation is not accurate under light loads. The problem remains for
  a light-load sda when there is a heavy-load sdb. 
 
 Well, sure, in that case sda would get to write out a lot of small
 things. But in that case it would be fair wrt the other writers.

Hmm, I cannot agree it to be fair - but pretty acceptable ;-)
Your patch already brings great improvements in the multi-bdi case.

  One more workaround
  could be to grant bdi(s) a minimal bdi_thresh. 
 
 Ah, no, that is no good. For if there were a lot of BDIs this might
 happen:
   nr_bdis * min_thresh  dirty_limit.

Sure it is in the extreme case. However the limit could be ensured
if we really want(which I'm really not sure;-) it:

if (nr_reclaimable + nr_writeback  dirty_thresh 
bdi_nr_reclaimable + bdi_nr_writeback = bdi_min_thresh)
break;

  Or better to adjust the estimation logic?
 
 Not sure what we can do here. The current thing is simple, fast and fair.

Agreed.

   + /*
   +  * break out early when:
   +  *  - we're below the bdi limit
   +  *  - we're below half the total limit
   +  *
   +  * we let the numbers exceed the strict bdi limit if the total
   +  * numbers are too low, this avoids (excessive) small writeouts.
   +  */
   + if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh ||
   + nr_reclaimable + nr_writeback  dirty_thresh / 2)
 break;
  
  This may be slightly better:
  
  if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh)
  break;
  /*
   * Throttle it only when the background writeback cannot 
  catchup.
   */
  if (nr_reclaimable + nr_writeback 
  (background_thresh + dirty_thresh) / 2)
  break;
 
 Ah, indeed. Good idea.

Thank you :-)

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-24 Thread Nadia Derbey


Jarek Poplawski wrote:

On Fri, Sep 21, 2007 at 01:03:47PM +0200, Jarek Poplawski wrote:
...


I hope not! But, then it would be probably another logical trick:
ipc_rcu_getref/putref() seems to prevent kfreeing of a structure, so
if it's used in do_msgsnd() there should be a risk something can do
this kfree at this moment, and it seems freeque() is the only one,
which both: can do this and cares for this refcount. Then, e.g., if
any of them does ipc_rcu_getref() a bit later and sees old (cached)
value - kfree can be skipped forever. [...]



After rethinking, this scenario seems to be wrong or very unprobable
(I'm not sure of all ways if (--container...) could be compiled),
so there should be no such risk - double kfree/vfree is more probable,
so no danger. More likely is such refcount abuse: ipc_rcu_getref() in
do_msgsnd() done a bit after ipc_rcu_putref() in freeque() (msq
pointer acquired by do_msgsend() before freeque() started); then,
after schedule(), do_msgsnd() can work with kfreed msq_queue structure
(at least considering classic RCU).



If msgsnd() acquires the pointer first, it does it under lock + 
rcu_getref(). == refcount = 2
When schedule() is called if freeque() takes the pointer it will call 
msg_rmid() that sets the deleted field in the msg queue. When the lock 
is released by freeque(), we have either 1) or 2):

1) freeque()'s putref called 1st == refocunt = 1
   Then msgsnd()'s lock_by_ptr() is called == rcu lock
   Then msgsnd()'s putref is called == refcount = 0
   But this is done under RCU lock, so should be no problem
   Then the deleted field is checked == return
2) msgsnd()'s lock_by_ptr() is called == rcu lock
   Then we don't mind in which order are done the other operations
   since we under rcu_lock: the structure won't disappear till we test
   the deleted field.

Regards,
Nadia

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-24 Thread Nadia Derbey


Jarek Poplawski wrote:

On Thu, Sep 20, 2007 at 03:08:42PM +0200, Nadia Derbey wrote:


Nadia Derbey wrote:


Jarek Poplawski wrote:



On Thu, Sep 20, 2007 at 08:24:58AM +0200, Nadia Derbey wrote:


...

Actually, ipc_lock() is called most of the time without the 
ipc_ids.mutex held and without refcounting (maybe you didn't look for 
the msg_lock() sem_lock() and shm_lock() too).

So I think disabling preemption is needed, isn't it?



so, these rcu_read_locks() don't
work here at all. So, probably I miss something again, but IMHO,
these rcu_read_locks/unlocks could be removed here or in
ipc_lock_by_ptr() and it should be enough to use them directly, where
really needed, e.g., in msg.c do_msgrcv().



I have to check for the ipc_lock_by_ptr(): may be you're right!



So, here is the ipc_lock_by_ptr() status:
1) do_msgsnd(), semctl_main(GETALL), semctl_main(SETALL) and find_undo() 
call it inside a refcounting.

 == no rcu read section needed.

2) *_exit_ns(), ipc_findkey() and sysvipc_find_ipc() call it under the 
ipc_ids mutex lock.

 == no rcu read section needed.

3) do_msgrcv() is the only path where ipc_lock_by_ptr() is not called 
under refcounting

 == rcu read section + some more checks needed once the spnlock is
 taken.

So I completely agree with you: we might remove the rcu_read_lock() from 
the ipc_lock_by_ptr() and explicitley  call it when needed (actually, it 
is already explicitly called in do_msgrcv()).



Yes, IMHO, it should be at least more readable when we can see where
this RCU is really needed.

But, after 3-rd look, I have a few more doubts (btw., 3 looks are
still not enough for me with this code, so I cerainly can miss many
things here, and, alas, I manged to see util and msg code only):



Jarek,

I'm realizing I did'nt give you an answer to issues # 1 and 3. Sorry for 
that!



1. ipc_lock() and ipc_lock_check() are used without ipc_ids.mutex,
but it's probably wrong: they call idr_find() with ipc_ids pointer
which needs this mutex, just like in similar code in: ipc_findkey(),
ipc_get_maxid() or sysvipc_find_ipc().


I think you're completely right: the rcu_read_lock() is not enough in 
this case.
I have to solve this issue, but keeping the original way the ipc 
developers have done it: I think they didn't want to take the mutex lock 
for every single operation. E.g. sending a message to a given message 
queue shouldn't avoid creating new message queues.

I'll come up with a solution.



2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
safe (memory barriers): it's not atomic, so locking is needed, but
e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
freeque() calls ipc_rcu_putref() with ipc_ids mutex only.

3. Probably similar problem is possible with msr_d.r_msg which is
read in do_msgrcv() under rcu_read_lock() only.



In think here they have avoided refcoutning by using r_msg:
r_msg is initialzed to -EAGAIN before releasing the msq lock. if 
freequeue() is called it sets r_msg to EIDRM (see expunge_all(-EIDRM)).
Setting r_msg is always done under the msq lock (expunge_all() / 
pipelined_Sned()).
 Since rcu_read_lock is called right after schedule, they are sure the 
msq pointer is still valid when they re-lock it once a msg is present in 
the receive queue.


Please tell me if I'm not clear ;-)

Regards,
Nadia

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failure on ppc64 drivers/ata/pata_scc.c

2007-09-24 Thread Mel Gorman

On (22/09/07 08:20), Satyam Sharma didst pronounce:
 Hi,
 
 
 On Thu, 20 Sep 2007, Alan Cox wrote:
  
  On Thu, 20 Sep 2007 14:13:15 +0100
  [EMAIL PROTECTED] (Mel Gorman) wrote:
  
   PPC64 building allmodconfig fails to compile drivers/ata/pata_scc.c . It
   doesn't show up on other arches because this driver is specific to the
   architecture.
   
   drivers/ata/pata_scc.c: In function `scc_bmdma_status'
  
  Its not been updated to match the libata core changes. Try something like
  this. Whoever is maintaining it should also remove the prereset cable 
  handling
  code and use the proper cable detect method.
 
 It appears you forgot to fix scc_std_softreset() and one of its callsites
 in scc_bdma_stop(). A complete patch is attempted below -- please review.
 

I can confirm it builds without warnings or errors, so thanks. I'm not in
the position to review it for correctness.

 
 [PATCH -mm] pata_scc: Keep up with libata core API changes
 
 Little fixlets, that the build started erroring / warning about:
 
 drivers/ata/pata_scc.c: In function 'scc_bmdma_status':
 drivers/ata/pata_scc.c:734: error: structure has no member named 'active_tag'
 drivers/ata/pata_scc.c: In function 'scc_pata_prereset':
 drivers/ata/pata_scc.c:866: warning: passing arg 1 of 'ata_std_prereset' from 
 incompatible pointer type
 drivers/ata/pata_scc.c: In function 'scc_error_handler':
 drivers/ata/pata_scc.c:908: warning: passing arg 2 of 'ata_bmdma_drive_eh' 
 from incompatible pointer type
 drivers/ata/pata_scc.c:908: warning: passing arg 3 of 'ata_bmdma_drive_eh' 
 from incompatible pointer type
 drivers/ata/pata_scc.c:908: warning: passing arg 5 of 'ata_bmdma_drive_eh' 
 from incompatible pointer type
 make[2]: *** [drivers/ata/pata_scc.o] Error 1
 
 Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
 Cc: Alan Cox [EMAIL PROTECTED]
 Cc: Mel Gorman [EMAIL PROTECTED]
 
 ---
 
 Andrew, this includes (supercedes) the previous two ones from Mel / Alan.
 
  drivers/ata/pata_scc.c |   21 -
  1 file changed, 12 insertions(+), 9 deletions(-)
 
 diff -ruNp a/drivers/ata/pata_scc.c b/drivers/ata/pata_scc.c
 --- a/drivers/ata/pata_scc.c  2007-09-22 06:26:37.0 +0530
 +++ b/drivers/ata/pata_scc.c  2007-09-22 08:04:42.0 +0530
 @@ -594,16 +594,17 @@ static unsigned int scc_bus_softreset(st
   *   Note: Original code is ata_std_softreset().
   */
  
 -static int scc_std_softreset (struct ata_port *ap, unsigned int *classes,
 -  unsigned long deadline)
 +static int scc_std_softreset(struct ata_link *link, unsigned int *classes,
 + unsigned long deadline)
  {
 + struct ata_port *ap = link-ap;
   unsigned int slave_possible = ap-flags  ATA_FLAG_SLAVE_POSS;
   unsigned int devmask = 0, err_mask;
   u8 err;
  
   DPRINTK(ENTER\n);
  
 - if (ata_link_offline(ap-link)) {
 + if (ata_link_offline(link)) {
   classes[0] = ATA_DEV_NONE;
   goto out;
   }
 @@ -692,7 +693,7 @@ static void scc_bmdma_stop (struct ata_q
   printk(KERN_WARNING %s: Internal Bus Error\n, 
 DRV_NAME);
   out_be32(bmid_base + SCC_DMA_INTST, INTSTS_BMSINT);
   /* TBD: SW reset */
 - scc_std_softreset(ap, classes, deadline);
 + scc_std_softreset(ap-link, classes, deadline);
   continue;
   }
  
 @@ -731,7 +732,7 @@ static u8 scc_bmdma_status (struct ata_p
   void __iomem *mmio = ap-ioaddr.bmdma_addr;
   u8 host_stat = in_be32(mmio + SCC_DMA_STATUS);
   u32 int_status = in_be32(mmio + SCC_DMA_INTST);
 - struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap-active_tag);
 + struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap-link.active_tag);
   static int retry = 0;
  
   /* return if IOS_SS is cleared */
 @@ -860,10 +861,10 @@ static void scc_bmdma_freeze (struct ata
   *   @deadline: deadline jiffies for the operation
   */
  
 -static int scc_pata_prereset(struct ata_port *ap, unsigned long deadline)
 +static int scc_pata_prereset(struct ata_link *link, unsigned long deadline)
  {
 - ap-cbl = ATA_CBL_PATA80;
 - return ata_std_prereset(ap, deadline);
 + link-ap-cbl = ATA_CBL_PATA80;
 + return ata_std_prereset(link, deadline);
  }
  
  /**
 @@ -874,8 +875,10 @@ static int scc_pata_prereset(struct ata_
   *   Note: Original code is ata_std_postreset().
   */
  
 -static void scc_std_postreset (struct ata_port *ap, unsigned int *classes)
 +static void scc_std_postreset(struct ata_link *link, unsigned int *classes)
  {
 + struct ata_port *ap = link-ap;
 +
   DPRINTK(ENTER\n);
  
   /* is double-select really necessary? */
 

-- 
-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-24 Thread Mel Gorman

On (22/09/07 12:24), Satyam Sharma didst pronounce:
 
 
 On Thu, 20 Sep 2007, Satyam Sharma wrote:
  
  BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
  IIRC I got build failures in:
 
  drivers/net/spider_net.c
 
 
 [PATCH -mm] spider_net: Misc build fixes after recent netdev stats changes
 
 Unbreak the following:
 
 drivers/net/spider_net.c: In function 'spider_net_release_tx_chain':
 drivers/net/spider_net.c:818: error: 'dev' undeclared (first use in this 
 function)
 drivers/net/spider_net.c:818: error: (Each undeclared identifier is reported 
 only once
 drivers/net/spider_net.c:818: error: for each function it appears in.)
 drivers/net/spider_net.c: In function 'spider_net_xmit':
 drivers/net/spider_net.c:922: error: 'dev' undeclared (first use in this 
 function)
 drivers/net/spider_net.c: In function 'spider_net_pass_skb_up':
 drivers/net/spider_net.c:1018: error: 'dev' undeclared (first use in this 
 function)
 drivers/net/spider_net.c: In function 'spider_net_decode_one_descr':
 drivers/net/spider_net.c:1215: error: 'dev' undeclared (first use in this 
 function)
 make[2]: *** [drivers/net/spider_net.o] Error 1
 
 Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

I've confirmed that this patch fixes the build error in question.

Acked-by: Mel Gorman [EMAIL PROTECTED]

 
 ---
 
  drivers/net/spider_net.c |   24 +++-
  1 file changed, 11 insertions(+), 13 deletions(-)
 
 diff -ruNp a/drivers/net/spider_net.c b/drivers/net/spider_net.c
 --- a/drivers/net/spider_net.c2007-09-22 06:26:39.0 +0530
 +++ b/drivers/net/spider_net.c2007-09-22 12:12:23.0 +0530
 @@ -795,6 +795,7 @@ spider_net_set_low_watermark(struct spid
  static int
  spider_net_release_tx_chain(struct spider_net_card *card, int brutal)
  {
 + struct net_device *dev = card-netdev;
   struct spider_net_descr_chain *chain = card-tx_chain;
   struct spider_net_descr *descr;
   struct spider_net_hw_descr *hwdescr;
 @@ -919,7 +920,7 @@ spider_net_xmit(struct sk_buff *skb, str
   spider_net_release_tx_chain(card, 0);
  
   if (spider_net_prepare_tx_descr(card, skb) != 0) {
 - dev-stats.tx_dropped++;
 + netdev-stats.tx_dropped++;
   netif_stop_queue(netdev);
   return NETDEV_TX_BUSY;
   }
 @@ -979,16 +980,12 @@ static void
  spider_net_pass_skb_up(struct spider_net_descr *descr,
  struct spider_net_card *card)
  {
 - struct spider_net_hw_descr *hwdescr= descr-hwdescr;
 - struct sk_buff *skb;
 - struct net_device *netdev;
 - u32 data_status, data_error;
 -
 - data_status = hwdescr-data_status;
 - data_error = hwdescr-data_error;
 - netdev = card-netdev;
 + struct spider_net_hw_descr *hwdescr = descr-hwdescr;
 + struct sk_buff *skb = descr-skb;
 + struct net_device *netdev = card-netdev;
 + u32 data_status = hwdescr-data_status;
 + u32 data_error = hwdescr-data_error;
  
 - skb = descr-skb;
   skb_put(skb, hwdescr-valid_size);
  
   /* the card seems to add 2 bytes of junk in front
 @@ -1015,8 +1012,8 @@ spider_net_pass_skb_up(struct spider_net
   }
  
   /* update netdevice statistics */
 - dev-stats.rx_packets++;
 - dev-stats.rx_bytes += skb-len;
 + netdev-stats.rx_packets++;
 + netdev-stats.rx_bytes += skb-len;
  
   /* pass skb up to stack */
   netif_receive_skb(skb);
 @@ -1184,6 +1181,7 @@ static int spider_net_resync_tail_ptr(st
  static int
  spider_net_decode_one_descr(struct spider_net_card *card)
  {
 + struct net_device *dev = card-netdev;
   struct spider_net_descr_chain *chain = card-rx_chain;
   struct spider_net_descr *descr = chain-tail;
   struct spider_net_hw_descr *hwdescr = descr-hwdescr;
 @@ -1210,7 +1208,7 @@ spider_net_decode_one_descr(struct spide
(status == SPIDER_NET_DESCR_PROTECTION_ERROR) ||
(status == SPIDER_NET_DESCR_FORCE_END) ) {
   if (netif_msg_rx_err(card))
 - dev_err(card-netdev-dev,
 + dev_err(dev-dev,
  dropping RX descriptor with state %d\n, 
 status);
   dev-stats.rx_dropped++;
   goto bad_desc;
 

-- 
-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1

2007-09-24 Thread Mel Gorman

On (22/09/07 14:11), Satyam Sharma didst pronounce:
 
  -static volatile int kgdb_hwbreak_sstep[NR_CPUS];
  +volatile int kgdb_hwbreak_sstep[NR_CPUS];
 
 That looks fishy to me. Why is it volatile-qualified?

It turned out that it was unnecessary. A follow-up patch removed the volatile
and kept the static.

 And does that
 actually want to be a per-cpu var?
 

-- 
-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1

2007-09-24 Thread Greg KH

On Sat, Sep 22, 2007 at 02:51:54PM +0530, Satyam Sharma wrote:
 Hi Greg,
 
 
 On Tue, 18 Sep 2007, Greg KH wrote:
  
  On Tue, Sep 18, 2007 at 03:04:48PM +0530, Satyam Sharma wrote:
   
   But wait ... isn't that a statically-allocated kobject, which were
   supposed to be naughty in the first place?
  
  Yes it is, if you want to dynamically create it, please do.
 
 Sorry for being late to reply, but do you still want such a patch (i.e.
 convert static to dynamic allocation)?

Yes, I'll gladly take such patches.

 I read elsewhere on this thread that you'd merge Kamalesh's patch and
 fix it up to also use kobject_name() yourself. But it's a small/trivial
 driver, so I think just converting it to dynamic allocation right now
 itself (when we've noticed it already) is probably better (?)

Sure that would be great to have.

 [ BTW I don't see the fix in your git trees or quilt queue. So I'll
   make a patch on top of 2.6.23-rc6-mm1 itself. ]

I'm behind in updating my patch queue, sorry, other things came up :(

thanks,

greg k-h
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 - make access to tasks nsproxy ligther (fix)

2007-09-23 Thread Serge E. Hallyn

Quoting Cedric Le Goater ([EMAIL PROTECTED]):
> Cedric Le Goater wrote:
> > Pavel Emelyanov wrote:
> >> Looks sane :)
> >>
> >> [snip]
> >>
> >>> Index: 2.6.23-rc6-mm1/kernel/exit.c
> >>> ===
> >>> --- 2.6.23-rc6-mm1.orig/kernel/exit.c
> >>> +++ 2.6.23-rc6-mm1/kernel/exit.c
> >>> @@ -408,6 +408,8 @@ void daemonize(const char *name, ...)
> >>>   current->fs = fs;
> >>>   atomic_inc(>count);
> >>>  
> >>> + if (current->nsproxy != init_task.nsproxy)
> >>> + get_nsproxy(init_task.nsproxy);
> >>>   switch_task_namespaces(current, init_task.nsproxy);
> >> shouldn't we make the switch under this if() as well?
> > 
> > right. we can probably simplify switch_task_namespaces() and remove :
> > 
> > if (ns == new)
> > return;
> > 
> > I'll cook a better one today.
> 
> So I removed this test in
> 
> * daemonize() bc it is already done 
> * sys_unshare() bc the nsproxy is always new one 
> * exit_task_namespaces() bc it is called with NULL and the
>   task will die right after that.
> 
> C.
> 
> 
> make-access-to-tasks-nsproxy-lighter.patch breaks unshare()
> 
> when called from unshare(), switch_task_namespaces() takes an 
> extra refcount on the nsproxy, leading to a memory leak of 
> nsproxy objects. 
> 
> Now the problem is that we still need that extra ref when called 
> from daemonize(). Here's an ugly fix for it.
> 
> Signed-off-by: Cedric Le Goater <[EMAIL PROTECTED]>
> Cc: Serge E. Hallyn <[EMAIL PROTECTED]>

Looks good.  Thanks for catching the leak.

Acked-by: Serge E. Hallyn <[EMAIL PROTECTED]>

> Cc: Pavel Emelyanov <[EMAIL PROTECTED]>
> Cc: Eric W. Biederman <[EMAIL PROTECTED]>
> Cc: Oleg Nesterov <[EMAIL PROTECTED]>
> Cc: Paul E. McKenney <[EMAIL PROTECTED]>
> 
> ---
>  include/linux/nsproxy.h |5 +
>  kernel/exit.c   |5 -
>  kernel/nsproxy.c|9 -
>  3 files changed, 9 insertions(+), 10 deletions(-)
> 
> Index: 2.6.23-rc6-mm1/kernel/nsproxy.c
> ===
> --- 2.6.23-rc6-mm1.orig/kernel/nsproxy.c
> +++ 2.6.23-rc6-mm1/kernel/nsproxy.c
> @@ -25,11 +25,6 @@ static struct kmem_cache *nsproxy_cachep
> 
>  struct nsproxy init_nsproxy = INIT_NSPROXY(init_nsproxy);
> 
> -static inline void get_nsproxy(struct nsproxy *ns)
> -{
> - atomic_inc(>count);
> -}
> -
>  /*
>   * creates a copy of "orig" with refcount 1.
>   */
> @@ -205,11 +200,7 @@ void switch_task_namespaces(struct task_
>   might_sleep();
> 
>   ns = p->nsproxy;
> - if (ns == new)
> - return;
> 
> - if (new)
> - get_nsproxy(new);
>   rcu_assign_pointer(p->nsproxy, new);
> 
>   if (ns && atomic_dec_and_test(>count)) {
> Index: 2.6.23-rc6-mm1/kernel/exit.c
> ===
> --- 2.6.23-rc6-mm1.orig/kernel/exit.c
> +++ 2.6.23-rc6-mm1/kernel/exit.c
> @@ -408,7 +408,10 @@ void daemonize(const char *name, ...)
>   current->fs = fs;
>   atomic_inc(>count);
> 
> - switch_task_namespaces(current, init_task.nsproxy);
> + if (current->nsproxy != init_task.nsproxy) {
> + get_nsproxy(init_task.nsproxy);
> + switch_task_namespaces(current, init_task.nsproxy);
> + }
> 
>   exit_files(current);
>   current->files = init_task.files;
> Index: 2.6.23-rc6-mm1/include/linux/nsproxy.h
> ===
> --- 2.6.23-rc6-mm1.orig/include/linux/nsproxy.h
> +++ 2.6.23-rc6-mm1/include/linux/nsproxy.h
> @@ -77,6 +77,11 @@ static inline void put_nsproxy(struct ns
>   }
>  }
> 
> +static inline void get_nsproxy(struct nsproxy *ns)
> +{
> + atomic_inc(>count);
> +}
> +
>  #ifdef CONFIG_CONTAINER_NS
>  int ns_container_clone(struct task_struct *tsk);
>  #else
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-23 Thread Fengguang Wu

On Sun, Sep 23, 2007 at 03:02:35PM +0200, Peter Zijlstra wrote:
> On Sun, 23 Sep 2007 09:20:49 +0800 Fengguang Wu <[EMAIL PROTECTED]>
> wrote:
> 
> > On Sat, Sep 22, 2007 at 03:16:22PM +0200, Peter Zijlstra wrote:
> > > On Sat, 22 Sep 2007 09:55:09 +0800 Fengguang Wu <[EMAIL PROTECTED]>
> > > wrote:
> > > 
> > > > --- linux-2.6.22.orig/mm/page-writeback.c
> > > > +++ linux-2.6.22/mm/page-writeback.c
> > > > @@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
> > > > bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> > > > }
> > > >  
> > > > +   printk(KERN_DEBUG "balance_dirty_pages written %lu %lu 
> > > > congested %d limits %lu %lu %lu %lu %lu %ld\n",
> > > > +   pages_written,
> > > > +   write_chunk - wbc.nr_to_write,
> > > > +   bdi_write_congested(bdi),
> > > > +   background_thresh, dirty_thresh,
> > > > +   bdi_thresh, bdi_nr_reclaimable, 
> > > > bdi_nr_writeback,
> > > > +   bdi_thresh - bdi_nr_reclaimable - 
> > > > bdi_nr_writeback);
> > > > +
> > > > if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
> > > > break;
> > > > if (pages_written >= write_chunk)
> > > > 
> > > 
> > > > [ 1305.361511] balance_dirty_pages written 0 0 congested 0 limits 48869 
> > > > 195477 5801 5760 288 -247
> > > 
> > > 
> > > 
> > > Could you perhaps instrument the writeback_inodes() path to see why
> > > nothing is written out? - the attached patch would be a nice start.
> > 
> > Curiously the lockup problem disappeared after upgrading to 2.6.23-rc6-mm1.
> > (need to watch it in a longer time window).
> > 
> > Anyway here's the output of your patch:
> > sb_locked 0
> > sb_empty 97011
> 
> It this the delta during one of these lockups? If so, it would seem

delta since boot time, for 2.6.23-rc6-mm1, no lockups ;-)

> that although dirty pages are reported against the BDI, no actual dirty
> inodes could be found.

no lockups, therefore not necessarily.
There are many other calls into writeback_inodes().

> [ note to self: writeback_inodes() seems to write out to any superblock
>   in the system. Might want to limit that to superblocks on wbc->bdi ]

generic_sync_sb_inodes() does have something like:

if (wbc->bdi && bdi != wbc->bdi)
continue;

> You say that switching to .23-rc6-mm1 solved it in your case. You are
> developing in the writeback_inodes() path, right? Could it be one of
> your local changes that confused it here?

There are a lot of changes between them:
- bdi-v9 vs bdi-v10;
- a lot writeback patches in -mm
- some writeback patches maintained locally
I just rebased my patches to .23-rc6-mm1...

> > > Most peculiar. It seems writeback_inodes() doesn't even attempt to
> > > write out stuff. Nor are outstanding writeback pages completed.
> > 
> > Still true. Another problem is that balance_dirty_pages() is being called 
> > even
> > when there are only 54 dirty pages. That could slow down writers 
> > unnecessarily.
> > 
> > balance_dirty_pages() should not be entered at all with small nr_dirty.
> > 
> > Look at these lines:
> > [  197.471619] balance_dirty_pages for tar written 405 405 congested 0 
> > global 196554 54 403 196097 bdi 0 0 398 -398
> > [  197.472196] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196554 54 372 196128 bdi 0 0 380 -380
> > [  197.472893] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196554 54 372 196128 bdi 23 0 369 -346
> > [  197.473158] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196554 54 372 196128 bdi 23 0 366 -343
> > [  197.473403] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196554 54 372 196128 bdi 23 0 365 -342
> > [  197.473674] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196554 54 372 196128 bdi 23 0 364 -341
> > [  197.474265] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196554 54 372 196128 bdi 23 0 362 -339
> > [  197.475440] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196554 54 341 196159 bdi 47 0 327 -280
> > [  197.476970] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196546 54 279 196213 bdi 95 0 279 -184
> > [  197.43] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196546 54 248 196244 bdi 95 0 255 -160
> > [  197.479463] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196546 54 217 196275 bdi 143 0 210 -67
> > [  197.479656] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196546 54 217 196275 bdi 143 0 209 -66
> > [  197.481159] balance_dirty_pages for tar written 405 0 congested 0 global 
> > 196546 54 155 196337 bdi 167 0 163 4
> 
> That is an interesting idea

Re: 2.6.23-rc6-mm1 -- ipg.c don't compile on i386 with CONFIG_HIGHMEM64G

2007-09-23 Thread trem

Hi

I've tried to compile 2.6.23-rc6-mm1, but it fails on ipg.c with the error :
Setup is 10964 bytes (padded to 11264 bytes).
System is 1640 kB
Kernel: arch/i386/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 2030 modules
WARNING: Can't handle masks in drivers/mtd/nand/cafe_nand:0
ERROR: "__udivdi3" [drivers/net/ipg.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
error: Bad exit status from /home/trem/rpm/tmp/rpm-tmp.23895 (%build)

I've instigated a bit, and I've found this code in ipg.c :

static void ipg_nic_txfree(struct net_device *dev)
{
   struct ipg_nic_private *sp = netdev_priv(dev);
   void __iomem *ioaddr = sp->ioaddr;
   const unsigned int curr = ipg_r32(TFD_LIST_PTR_0) -
   (sp->txd_map / sizeof(struct ipg_tx)) - 1;
   unsigned int released, pending;



sp->txd_map is an u64
because :
dma_addr_t txd_map;

And in asm-i386/types.h, I see :
#ifdef CONFIG_HIGHMEM64G
typedef u64 dma_addr_t;
#else
typedef u32 dma_addr_t;
#endif
I my config, I use CONFIG_HIGHMEM64G

sizeof(struct ipg_tx) is an u32
So the div failed on i386 because of u64 / u32.

I propose the following patch :

--- linux-2.6.22.old/drivers/net/ipg.c  2007-09-21 20:28:57.0 -0400
+++ linux-2.6.22.new/drivers/net/ipg.c  2007-09-21 20:22:15.0 -0400
@@ -837,7 +837,7 @@ static void ipg_nic_txfree(struct net_de
struct ipg_nic_private *sp = netdev_priv(dev);
void __iomem *ioaddr = sp->ioaddr;
const unsigned int curr = ipg_r32(TFD_LIST_PTR_0) -
-   (sp->txd_map / sizeof(struct ipg_tx)) - 1;
+   do_div(sp->txd_map , sizeof(struct ipg_tx)) - 1;
unsigned int released, pending;

IPG_DEBUG_MSG("_nic_txfree\n");


With this patch, it compiles on i386 and x86_64, but I haven't tested if
this network card works (I don't have it).


regards,
trem

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-23 Thread Peter Zijlstra

On Sun, 23 Sep 2007 09:20:49 +0800 Fengguang Wu <[EMAIL PROTECTED]>
wrote:

> On Sat, Sep 22, 2007 at 03:16:22PM +0200, Peter Zijlstra wrote:
> > On Sat, 22 Sep 2007 09:55:09 +0800 Fengguang Wu <[EMAIL PROTECTED]>
> > wrote:
> > 
> > > --- linux-2.6.22.orig/mm/page-writeback.c
> > > +++ linux-2.6.22/mm/page-writeback.c
> > > @@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
> > >   bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> > >   }
> > >  
> > > + printk(KERN_DEBUG "balance_dirty_pages written %lu %lu 
> > > congested %d limits %lu %lu %lu %lu %lu %ld\n",
> > > + pages_written,
> > > + write_chunk - wbc.nr_to_write,
> > > + bdi_write_congested(bdi),
> > > + background_thresh, dirty_thresh,
> > > + bdi_thresh, bdi_nr_reclaimable, 
> > > bdi_nr_writeback,
> > > + bdi_thresh - bdi_nr_reclaimable - 
> > > bdi_nr_writeback);
> > > +
> > >   if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
> > >   break;
> > >   if (pages_written >= write_chunk)
> > > 
> > 
> > > [ 1305.361511] balance_dirty_pages written 0 0 congested 0 limits 48869 
> > > 195477 5801 5760 288 -247
> > 
> > 
> > 
> > Could you perhaps instrument the writeback_inodes() path to see why
> > nothing is written out? - the attached patch would be a nice start.
> 
> Curiously the lockup problem disappeared after upgrading to 2.6.23-rc6-mm1.
> (need to watch it in a longer time window).
> 
> Anyway here's the output of your patch:
> sb_locked 0
> sb_empty 97011

It this the delta during one of these lockups? If so, it would seem
that although dirty pages are reported against the BDI, no actual dirty
inodes could be found.

[ note to self: writeback_inodes() seems to write out to any superblock
  in the system. Might want to limit that to superblocks on wbc->bdi ]

You say that switching to .23-rc6-mm1 solved it in your case. You are
developing in the writeback_inodes() path, right? Could it be one of
your local changes that confused it here?

> > Most peculiar. It seems writeback_inodes() doesn't even attempt to
> > write out stuff. Nor are outstanding writeback pages completed.
> 
> Still true. Another problem is that balance_dirty_pages() is being called even
> when there are only 54 dirty pages. That could slow down writers 
> unnecessarily.
> 
> balance_dirty_pages() should not be entered at all with small nr_dirty.
> 
> Look at these lines:
> [  197.471619] balance_dirty_pages for tar written 405 405 congested 0 global 
> 196554 54 403 196097 bdi 0 0 398 -398
> [  197.472196] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196554 54 372 196128 bdi 0 0 380 -380
> [  197.472893] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196554 54 372 196128 bdi 23 0 369 -346
> [  197.473158] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196554 54 372 196128 bdi 23 0 366 -343
> [  197.473403] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196554 54 372 196128 bdi 23 0 365 -342
> [  197.473674] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196554 54 372 196128 bdi 23 0 364 -341
> [  197.474265] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196554 54 372 196128 bdi 23 0 362 -339
> [  197.475440] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196554 54 341 196159 bdi 47 0 327 -280
> [  197.476970] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196546 54 279 196213 bdi 95 0 279 -184
> [  197.43] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196546 54 248 196244 bdi 95 0 255 -160
> [  197.479463] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196546 54 217 196275 bdi 143 0 210 -67
> [  197.479656] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196546 54 217 196275 bdi 143 0 209 -66
> [  197.481159] balance_dirty_pages for tar written 405 0 congested 0 global 
> 196546 54 155 196337 bdi 167 0 163 4

That is an interesting idea how about this:

---
Subject: mm: speed up writeback ramp-up on clean systems

We allow violation of bdi limits if there is a lot of room on the
system. Once we hit half the total limit we start enforcing bdi limits
and bdi ramp-up should happen. Doing it this way avoids many small
writeouts on an otherwise idle system and should also speed up the
ramp-up.

Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---

Index: linux-2.6/mm/page-writeback.c
===
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -355,8 +355,8 @@ get_dirty_limits(long *pbackground, long
  */
 static void balance_dirty_pages(struct address_space *mapping)
 {
-   long bdi_nr_reclaimable;
-   long bdi_nr_writeback;
+   long nr_reclaimable,

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-23 Thread Peter Zijlstra

On Sun, 23 Sep 2007 09:20:49 +0800 Fengguang Wu [EMAIL PROTECTED]
wrote:

 On Sat, Sep 22, 2007 at 03:16:22PM +0200, Peter Zijlstra wrote:
  On Sat, 22 Sep 2007 09:55:09 +0800 Fengguang Wu [EMAIL PROTECTED]
  wrote:
  
   --- linux-2.6.22.orig/mm/page-writeback.c
   +++ linux-2.6.22/mm/page-writeback.c
   @@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
 bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
 }

   + printk(KERN_DEBUG balance_dirty_pages written %lu %lu 
   congested %d limits %lu %lu %lu %lu %lu %ld\n,
   + pages_written,
   + write_chunk - wbc.nr_to_write,
   + bdi_write_congested(bdi),
   + background_thresh, dirty_thresh,
   + bdi_thresh, bdi_nr_reclaimable, 
   bdi_nr_writeback,
   + bdi_thresh - bdi_nr_reclaimable - 
   bdi_nr_writeback);
   +
 if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh)
 break;
 if (pages_written = write_chunk)
   
  
   [ 1305.361511] balance_dirty_pages written 0 0 congested 0 limits 48869 
   195477 5801 5760 288 -247
  
  snip long series of mostly identical lines
  
  Could you perhaps instrument the writeback_inodes() path to see why
  nothing is written out? - the attached patch would be a nice start.
 
 Curiously the lockup problem disappeared after upgrading to 2.6.23-rc6-mm1.
 (need to watch it in a longer time window).
 
 Anyway here's the output of your patch:
 sb_locked 0
 sb_empty 97011

It this the delta during one of these lockups? If so, it would seem
that although dirty pages are reported against the BDI, no actual dirty
inodes could be found.

[ note to self: writeback_inodes() seems to write out to any superblock
  in the system. Might want to limit that to superblocks on wbc-bdi ]

You say that switching to .23-rc6-mm1 solved it in your case. You are
developing in the writeback_inodes() path, right? Could it be one of
your local changes that confused it here?

  Most peculiar. It seems writeback_inodes() doesn't even attempt to
  write out stuff. Nor are outstanding writeback pages completed.
 
 Still true. Another problem is that balance_dirty_pages() is being called even
 when there are only 54 dirty pages. That could slow down writers 
 unnecessarily.
 
 balance_dirty_pages() should not be entered at all with small nr_dirty.
 
 Look at these lines:
 [  197.471619] balance_dirty_pages for tar written 405 405 congested 0 global 
 196554 54 403 196097 bdi 0 0 398 -398
 [  197.472196] balance_dirty_pages for tar written 405 0 congested 0 global 
 196554 54 372 196128 bdi 0 0 380 -380
 [  197.472893] balance_dirty_pages for tar written 405 0 congested 0 global 
 196554 54 372 196128 bdi 23 0 369 -346
 [  197.473158] balance_dirty_pages for tar written 405 0 congested 0 global 
 196554 54 372 196128 bdi 23 0 366 -343
 [  197.473403] balance_dirty_pages for tar written 405 0 congested 0 global 
 196554 54 372 196128 bdi 23 0 365 -342
 [  197.473674] balance_dirty_pages for tar written 405 0 congested 0 global 
 196554 54 372 196128 bdi 23 0 364 -341
 [  197.474265] balance_dirty_pages for tar written 405 0 congested 0 global 
 196554 54 372 196128 bdi 23 0 362 -339
 [  197.475440] balance_dirty_pages for tar written 405 0 congested 0 global 
 196554 54 341 196159 bdi 47 0 327 -280
 [  197.476970] balance_dirty_pages for tar written 405 0 congested 0 global 
 196546 54 279 196213 bdi 95 0 279 -184
 [  197.43] balance_dirty_pages for tar written 405 0 congested 0 global 
 196546 54 248 196244 bdi 95 0 255 -160
 [  197.479463] balance_dirty_pages for tar written 405 0 congested 0 global 
 196546 54 217 196275 bdi 143 0 210 -67
 [  197.479656] balance_dirty_pages for tar written 405 0 congested 0 global 
 196546 54 217 196275 bdi 143 0 209 -66
 [  197.481159] balance_dirty_pages for tar written 405 0 congested 0 global 
 196546 54 155 196337 bdi 167 0 163 4

That is an interesting idea how about this:

---
Subject: mm: speed up writeback ramp-up on clean systems

We allow violation of bdi limits if there is a lot of room on the
system. Once we hit half the total limit we start enforcing bdi limits
and bdi ramp-up should happen. Doing it this way avoids many small
writeouts on an otherwise idle system and should also speed up the
ramp-up.

Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]
---

Index: linux-2.6/mm/page-writeback.c
===
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -355,8 +355,8 @@ get_dirty_limits(long *pbackground, long
  */
 static void balance_dirty_pages(struct address_space *mapping)
 {
-   long bdi_nr_reclaimable;
-   long bdi_nr_writeback;
+   long nr_reclaimable, bdi_nr_reclaimable;
+   long nr_writeback, bdi_nr_writeback;
long background_thresh;

Re: 2.6.23-rc6-mm1 -- ipg.c don't compile on i386 with CONFIG_HIGHMEM64G

2007-09-23 Thread trem

Hi

I've tried to compile 2.6.23-rc6-mm1, but it fails on ipg.c with the error :
Setup is 10964 bytes (padded to 11264 bytes).
System is 1640 kB
Kernel: arch/i386/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 2030 modules
WARNING: Can't handle masks in drivers/mtd/nand/cafe_nand:0
ERROR: __udivdi3 [drivers/net/ipg.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
error: Bad exit status from /home/trem/rpm/tmp/rpm-tmp.23895 (%build)

I've instigated a bit, and I've found this code in ipg.c :

static void ipg_nic_txfree(struct net_device *dev)
{
   struct ipg_nic_private *sp = netdev_priv(dev);
   void __iomem *ioaddr = sp-ioaddr;
   const unsigned int curr = ipg_r32(TFD_LIST_PTR_0) -
   (sp-txd_map / sizeof(struct ipg_tx)) - 1;
   unsigned int released, pending;



sp-txd_map is an u64
because :
dma_addr_t txd_map;

And in asm-i386/types.h, I see :
#ifdef CONFIG_HIGHMEM64G
typedef u64 dma_addr_t;
#else
typedef u32 dma_addr_t;
#endif
I my config, I use CONFIG_HIGHMEM64G

sizeof(struct ipg_tx) is an u32
So the div failed on i386 because of u64 / u32.

I propose the following patch :

--- linux-2.6.22.old/drivers/net/ipg.c  2007-09-21 20:28:57.0 -0400
+++ linux-2.6.22.new/drivers/net/ipg.c  2007-09-21 20:22:15.0 -0400
@@ -837,7 +837,7 @@ static void ipg_nic_txfree(struct net_de
struct ipg_nic_private *sp = netdev_priv(dev);
void __iomem *ioaddr = sp-ioaddr;
const unsigned int curr = ipg_r32(TFD_LIST_PTR_0) -
-   (sp-txd_map / sizeof(struct ipg_tx)) - 1;
+   do_div(sp-txd_map , sizeof(struct ipg_tx)) - 1;
unsigned int released, pending;

IPG_DEBUG_MSG(_nic_txfree\n);


With this patch, it compiles on i386 and x86_64, but I haven't tested if
this network card works (I don't have it).


regards,
trem

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-23 Thread Fengguang Wu

On Sun, Sep 23, 2007 at 03:02:35PM +0200, Peter Zijlstra wrote:
 On Sun, 23 Sep 2007 09:20:49 +0800 Fengguang Wu [EMAIL PROTECTED]
 wrote:
 
  On Sat, Sep 22, 2007 at 03:16:22PM +0200, Peter Zijlstra wrote:
   On Sat, 22 Sep 2007 09:55:09 +0800 Fengguang Wu [EMAIL PROTECTED]
   wrote:
   
--- linux-2.6.22.orig/mm/page-writeback.c
+++ linux-2.6.22/mm/page-writeback.c
@@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
}
 
+   printk(KERN_DEBUG balance_dirty_pages written %lu %lu 
congested %d limits %lu %lu %lu %lu %lu %ld\n,
+   pages_written,
+   write_chunk - wbc.nr_to_write,
+   bdi_write_congested(bdi),
+   background_thresh, dirty_thresh,
+   bdi_thresh, bdi_nr_reclaimable, 
bdi_nr_writeback,
+   bdi_thresh - bdi_nr_reclaimable - 
bdi_nr_writeback);
+
if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh)
break;
if (pages_written = write_chunk)

   
[ 1305.361511] balance_dirty_pages written 0 0 congested 0 limits 48869 
195477 5801 5760 288 -247
   
   snip long series of mostly identical lines
   
   Could you perhaps instrument the writeback_inodes() path to see why
   nothing is written out? - the attached patch would be a nice start.
  
  Curiously the lockup problem disappeared after upgrading to 2.6.23-rc6-mm1.
  (need to watch it in a longer time window).
  
  Anyway here's the output of your patch:
  sb_locked 0
  sb_empty 97011
 
 It this the delta during one of these lockups? If so, it would seem

delta since boot time, for 2.6.23-rc6-mm1, no lockups ;-)

 that although dirty pages are reported against the BDI, no actual dirty
 inodes could be found.

no lockups, therefore not necessarily.
There are many other calls into writeback_inodes().

 [ note to self: writeback_inodes() seems to write out to any superblock
   in the system. Might want to limit that to superblocks on wbc-bdi ]

generic_sync_sb_inodes() does have something like:

if (wbc-bdi  bdi != wbc-bdi)
continue;

 You say that switching to .23-rc6-mm1 solved it in your case. You are
 developing in the writeback_inodes() path, right? Could it be one of
 your local changes that confused it here?

There are a lot of changes between them:
- bdi-v9 vs bdi-v10;
- a lot writeback patches in -mm
- some writeback patches maintained locally
I just rebased my patches to .23-rc6-mm1...

   Most peculiar. It seems writeback_inodes() doesn't even attempt to
   write out stuff. Nor are outstanding writeback pages completed.
  
  Still true. Another problem is that balance_dirty_pages() is being called 
  even
  when there are only 54 dirty pages. That could slow down writers 
  unnecessarily.
  
  balance_dirty_pages() should not be entered at all with small nr_dirty.
  
  Look at these lines:
  [  197.471619] balance_dirty_pages for tar written 405 405 congested 0 
  global 196554 54 403 196097 bdi 0 0 398 -398
  [  197.472196] balance_dirty_pages for tar written 405 0 congested 0 global 
  196554 54 372 196128 bdi 0 0 380 -380
  [  197.472893] balance_dirty_pages for tar written 405 0 congested 0 global 
  196554 54 372 196128 bdi 23 0 369 -346
  [  197.473158] balance_dirty_pages for tar written 405 0 congested 0 global 
  196554 54 372 196128 bdi 23 0 366 -343
  [  197.473403] balance_dirty_pages for tar written 405 0 congested 0 global 
  196554 54 372 196128 bdi 23 0 365 -342
  [  197.473674] balance_dirty_pages for tar written 405 0 congested 0 global 
  196554 54 372 196128 bdi 23 0 364 -341
  [  197.474265] balance_dirty_pages for tar written 405 0 congested 0 global 
  196554 54 372 196128 bdi 23 0 362 -339
  [  197.475440] balance_dirty_pages for tar written 405 0 congested 0 global 
  196554 54 341 196159 bdi 47 0 327 -280
  [  197.476970] balance_dirty_pages for tar written 405 0 congested 0 global 
  196546 54 279 196213 bdi 95 0 279 -184
  [  197.43] balance_dirty_pages for tar written 405 0 congested 0 global 
  196546 54 248 196244 bdi 95 0 255 -160
  [  197.479463] balance_dirty_pages for tar written 405 0 congested 0 global 
  196546 54 217 196275 bdi 143 0 210 -67
  [  197.479656] balance_dirty_pages for tar written 405 0 congested 0 global 
  196546 54 217 196275 bdi 143 0 209 -66
  [  197.481159] balance_dirty_pages for tar written 405 0 congested 0 global 
  196546 54 155 196337 bdi 167 0 163 4
 
 That is an interesting idea how about this:

It looks like a workaround, but it does solve the most important problem.
And it is a good logic by itself.  So I'd vote for it.

The fundamental problem is that the

Re: 2.6.23-rc6-mm1 - make access to tasks nsproxy ligther (fix)

2007-09-23 Thread Serge E. Hallyn

Quoting Cedric Le Goater ([EMAIL PROTECTED]):
 Cedric Le Goater wrote:
  Pavel Emelyanov wrote:
  Looks sane :)
 
  [snip]
 
  Index: 2.6.23-rc6-mm1/kernel/exit.c
  ===
  --- 2.6.23-rc6-mm1.orig/kernel/exit.c
  +++ 2.6.23-rc6-mm1/kernel/exit.c
  @@ -408,6 +408,8 @@ void daemonize(const char *name, ...)
current-fs = fs;
atomic_inc(fs-count);
   
  + if (current-nsproxy != init_task.nsproxy)
  + get_nsproxy(init_task.nsproxy);
switch_task_namespaces(current, init_task.nsproxy);
  shouldn't we make the switch under this if() as well?
  
  right. we can probably simplify switch_task_namespaces() and remove :
  
  if (ns == new)
  return;
  
  I'll cook a better one today.
 
 So I removed this test in
 
 * daemonize() bc it is already done 
 * sys_unshare() bc the nsproxy is always new one 
 * exit_task_namespaces() bc it is called with NULL and the
   task will die right after that.
 
 C.
 
 
 make-access-to-tasks-nsproxy-lighter.patch breaks unshare()
 
 when called from unshare(), switch_task_namespaces() takes an 
 extra refcount on the nsproxy, leading to a memory leak of 
 nsproxy objects. 
 
 Now the problem is that we still need that extra ref when called 
 from daemonize(). Here's an ugly fix for it.
 
 Signed-off-by: Cedric Le Goater [EMAIL PROTECTED]
 Cc: Serge E. Hallyn [EMAIL PROTECTED]

Looks good.  Thanks for catching the leak.

Acked-by: Serge E. Hallyn [EMAIL PROTECTED]

 Cc: Pavel Emelyanov [EMAIL PROTECTED]
 Cc: Eric W. Biederman [EMAIL PROTECTED]
 Cc: Oleg Nesterov [EMAIL PROTECTED]
 Cc: Paul E. McKenney [EMAIL PROTECTED]
 
 ---
  include/linux/nsproxy.h |5 +
  kernel/exit.c   |5 -
  kernel/nsproxy.c|9 -
  3 files changed, 9 insertions(+), 10 deletions(-)
 
 Index: 2.6.23-rc6-mm1/kernel/nsproxy.c
 ===
 --- 2.6.23-rc6-mm1.orig/kernel/nsproxy.c
 +++ 2.6.23-rc6-mm1/kernel/nsproxy.c
 @@ -25,11 +25,6 @@ static struct kmem_cache *nsproxy_cachep
 
  struct nsproxy init_nsproxy = INIT_NSPROXY(init_nsproxy);
 
 -static inline void get_nsproxy(struct nsproxy *ns)
 -{
 - atomic_inc(ns-count);
 -}
 -
  /*
   * creates a copy of orig with refcount 1.
   */
 @@ -205,11 +200,7 @@ void switch_task_namespaces(struct task_
   might_sleep();
 
   ns = p-nsproxy;
 - if (ns == new)
 - return;
 
 - if (new)
 - get_nsproxy(new);
   rcu_assign_pointer(p-nsproxy, new);
 
   if (ns  atomic_dec_and_test(ns-count)) {
 Index: 2.6.23-rc6-mm1/kernel/exit.c
 ===
 --- 2.6.23-rc6-mm1.orig/kernel/exit.c
 +++ 2.6.23-rc6-mm1/kernel/exit.c
 @@ -408,7 +408,10 @@ void daemonize(const char *name, ...)
   current-fs = fs;
   atomic_inc(fs-count);
 
 - switch_task_namespaces(current, init_task.nsproxy);
 + if (current-nsproxy != init_task.nsproxy) {
 + get_nsproxy(init_task.nsproxy);
 + switch_task_namespaces(current, init_task.nsproxy);
 + }
 
   exit_files(current);
   current-files = init_task.files;
 Index: 2.6.23-rc6-mm1/include/linux/nsproxy.h
 ===
 --- 2.6.23-rc6-mm1.orig/include/linux/nsproxy.h
 +++ 2.6.23-rc6-mm1/include/linux/nsproxy.h
 @@ -77,6 +77,11 @@ static inline void put_nsproxy(struct ns
   }
  }
 
 +static inline void get_nsproxy(struct nsproxy *ns)
 +{
 + atomic_inc(ns-count);
 +}
 +
  #ifdef CONFIG_CONTAINER_NS
  int ns_container_clone(struct task_struct *tsk);
  #else
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-22 Thread Fengguang Wu

On Sat, Sep 22, 2007 at 03:16:22PM +0200, Peter Zijlstra wrote:
> On Sat, 22 Sep 2007 09:55:09 +0800 Fengguang Wu <[EMAIL PROTECTED]>
> wrote:
> 
> > --- linux-2.6.22.orig/mm/page-writeback.c
> > +++ linux-2.6.22/mm/page-writeback.c
> > @@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
> > bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> > }
> >  
> > +   printk(KERN_DEBUG "balance_dirty_pages written %lu %lu 
> > congested %d limits %lu %lu %lu %lu %lu %ld\n",
> > +   pages_written,
> > +   write_chunk - wbc.nr_to_write,
> > +   bdi_write_congested(bdi),
> > +   background_thresh, dirty_thresh,
> > +   bdi_thresh, bdi_nr_reclaimable, 
> > bdi_nr_writeback,
> > +   bdi_thresh - bdi_nr_reclaimable - 
> > bdi_nr_writeback);
> > +
> > if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
> > break;
> > if (pages_written >= write_chunk)
> > 
> 
> > [ 1305.361511] balance_dirty_pages written 0 0 congested 0 limits 48869 
> > 195477 5801 5760 288 -247
> 
> 
> 
> Could you perhaps instrument the writeback_inodes() path to see why
> nothing is written out? - the attached patch would be a nice start.

Curiously the lockup problem disappeared after upgrading to 2.6.23-rc6-mm1.
(need to watch it in a longer time window).

Anyway here's the output of your patch:
sb_locked 0
sb_empty 97011

> Most peculiar. It seems writeback_inodes() doesn't even attempt to
> write out stuff. Nor are outstanding writeback pages completed.

Still true. Another problem is that balance_dirty_pages() is being called even
when there are only 54 dirty pages. That could slow down writers unnecessarily.

balance_dirty_pages() should not be entered at all with small nr_dirty.

Look at these lines:
[  197.471619] balance_dirty_pages for tar written 405 405 congested 0 global 
196554 54 403 196097 bdi 0 0 398 -398
[  197.472196] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 0 0 380 -380
[  197.472893] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 369 -346
[  197.473158] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 366 -343
[  197.473403] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 365 -342
[  197.473674] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 364 -341
[  197.474265] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 362 -339
[  197.475440] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 341 196159 bdi 47 0 327 -280
[  197.476970] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 279 196213 bdi 95 0 279 -184
[  197.43] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 248 196244 bdi 95 0 255 -160
[  197.479463] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 217 196275 bdi 143 0 210 -67
[  197.479656] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 217 196275 bdi 143 0 209 -66
[  197.481159] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 155 196337 bdi 167 0 163 4

The trace messages are generated by the following code:

--- linux-2.6.23-rc6-mm1.orig/mm/page-writeback.c
+++ linux-2.6.23-rc6-mm1/mm/page-writeback.c
@@ -421,6 +421,18 @@ static void balance_dirty_pages(struct a
bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
}

+   printk(KERN_DEBUG "balance_dirty_pages for %s written %lu %lu 
congested %d "
+   "global %lu %lu %lu %ld bdi %lu %lu %lu %ld\n",
+   current->comm,
+   pages_written, write_chunk - wbc.nr_to_write,
+   bdi_write_congested(bdi),
+   dirty_thresh,
+   global_dirty_unstable_pages(), 
global_page_state(NR_WRITEBACK),
+   dirty_thresh -
+   global_dirty_unstable_pages() - 
global_page_state(NR_WRITEBACK),
+   bdi_thresh, bdi_nr_reclaimable, 
bdi_nr_writeback,
+   bdi_thresh - bdi_nr_reclaimable - 
bdi_nr_writeback);
+   
if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
break;
if (pages_written >= write_chunk)


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-22 Thread Peter Zijlstra

On Sat, 22 Sep 2007 09:55:09 +0800 Fengguang Wu <[EMAIL PROTECTED]>
wrote:

> --- linux-2.6.22.orig/mm/page-writeback.c
> +++ linux-2.6.22/mm/page-writeback.c
> @@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
>   bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
>   }
>  
> + printk(KERN_DEBUG "balance_dirty_pages written %lu %lu 
> congested %d limits %lu %lu %lu %lu %lu %ld\n",
> + pages_written,
> + write_chunk - wbc.nr_to_write,
> + bdi_write_congested(bdi),
> + background_thresh, dirty_thresh,
> + bdi_thresh, bdi_nr_reclaimable, 
> bdi_nr_writeback,
> + bdi_thresh - bdi_nr_reclaimable - 
> bdi_nr_writeback);
> +
>   if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
>   break;
>   if (pages_written >= write_chunk)
> 

> [ 1305.361511] balance_dirty_pages written 0 0 congested 0 limits 48869 
> 195477 5801 5760 288 -247



Most peculiar. It seems writeback_inodes() doesn't even attempt to
write out stuff. Nor are outstanding writeback pages completed.

Could you perhaps instrument the writeback_inodes() path to see why
nothing is written out? - the attached patch would be a nice start.

> Here are some messages when doing large dds:

> [  511.733791] balance_dirty_pages written 1540 1540 congested 0 limits 49728 
> 198913 10999 18288 0 -7289
> [  511.735048] balance_dirty_pages written 1540 1540 congested 0 limits 49728 
> 198913 12012 16752 0 -4740
> [  511.736506] balance_dirty_pages written 1540 1540 congested 0 limits 49728 
> 198913 12306 15192 1056 -3942
> [  512.002169] balance_dirty_pages written 1547 1547 congested 2 limits 49726 
> 198905 13471 12624 1848 -1001
> [  512.003795] balance_dirty_pages written 1540 1540 congested 2 limits 49723 
> 198892 13470 11088 3384 -1002
> [  512.083517] balance_dirty_pages written 1540 1540 congested 2 limits 49712 
> 198850 13572 9336 4992 -756
> [  512.085145] balance_dirty_pages written 1540 1540 congested 2 limits 49706 
> 198825 13569 7800 6528 -759
> [  512.086773] balance_dirty_pages written 1540 1540 congested 2 limits 49702 
> 198808 13568 6240 8064 -736
> [  512.184267] balance_dirty_pages written 1539 1539 congested 2 limits 49697 
> 198791 13649 5592 8592 -535
> [  512.185903] balance_dirty_pages written 1540 1540 congested 2 limits 49694 
> 198778 13649 4056 10152 -559
> [  512.187506] balance_dirty_pages written 1540 1540 congested 2 limits 49688 
> 198753 13647 2496 11688 -537
> [  512.259848] balance_dirty_pages written 1546 1546 congested 2 limits 49682 
> 198728 13769 744 13248 -223
> [  512.554646] balance_dirty_pages written 618 618 congested 2 limits 49678 
> 198712 14242 1 13368 873
> [  512.585204] balance_dirty_pages written 794 794 congested 2 limits 49657 
> 198630 14500 1 12936 1563
> [  527.714294] balance_dirty_pages written 1540 1540 congested 0 limits 49608 
> 198432 29502 28080 0 1422

This looks like a sane series, we have a surplus of reclaimable pages,
start writeout, which increases writeback pages, and congests the
device, and eventually all subsides and we finish congestion and quit.

> [  529.298022] balance_dirty_pages written 1540 1540 congested 0 limits 49579 
> 198318 30307 34704 0 -4397
> [  529.304975] balance_dirty_pages written 1539 1539 congested 0 limits 49575 
> 198302 32451 30600 0 1851
> [  529.305205] balance_dirty_pages written 1538 1538 congested 0 limits 49576 
> 198306 32571 30384 0 2187
> [  529.306988] balance_dirty_pages written 1537 1537 congested 0 limits 49580 
> 198320 32702 30120 0 2582
> [  530.893830] balance_dirty_pages written 1541 1541 congested 0 limits 49553 
> 198214 34216 35352 0 -1136
> [  530.893970] balance_dirty_pages written 1538 1538 congested 0 limits 49553 
> 198214 34290 35160 0 -870
> [  530.899873] balance_dirty_pages written 1546 1546 congested 0 limits 49556 
> 198227 36248 31248 0 5000
> [  530.900282] balance_dirty_pages written 1546 1546 congested 0 limits 49557 
> 198231 36442 30864 0 5578
> [  530.900586] balance_dirty_pages written 1539 1539 congested 0 limits 49558 
> 198235 36601 30552 0 6049
> [  532.343097] balance_dirty_pages written 1541 1541 congested 0 limits 49530 
> 198120 37862 36072 0 1790
> [  532.343595] balance_dirty_pages written 1547 1547 congested 0 limits 49533 
> 198132 38081 35640 0 2441
> [  533.872355] balance_dirty_pages written 1540 1540 congested 0 limits 49502 
> 198009 41148 37224 0 3924
> [  542.566083] balance_dirty_pages written 1542 1542 congested 0 limits 49367 
> 197469 51948 52680 0 -732
> [  542.567093] balance_dirty_pages written 1537 1537 congested 0 limits 49370 
> 197482 52663 50952 0 1711
> [  542.586552] balance_dirty_pages written 1540 1540 congested 0 limits 49352 
> 197410 54545 46032 0 8513
> [  542.606002] balance_dirty_pages written 1540 1540

Re: 2.6.23-rc6-mm1

2007-09-22 Thread Satyam Sharma

Hi Greg,

On Tue, 18 Sep 2007, Greg KH wrote:
> 
> On Tue, Sep 18, 2007 at 03:04:48PM +0530, Satyam Sharma wrote:
> > 
> > But wait ... isn't that a statically-allocated kobject, which were
> > supposed to be "naughty" in the first place?
> 
> Yes it is, if you want to dynamically create it, please do.

Sorry for being late to reply, but do you still want such a patch (i.e.
convert static to dynamic allocation)?

I read elsewhere on this thread that you'd merge Kamalesh's patch and
fix it up to also use kobject_name() yourself. But it's a small/trivial
driver, so I think just converting it to dynamic allocation right now
itself (when we've noticed it already) is probably better (?)

[ BTW I don't see the fix in your git trees or quilt queue. So I'll
  make a patch on top of 2.6.23-rc6-mm1 itself. ]

Thanks,

Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1

2007-09-22 Thread Satyam Sharma


> -static volatile int kgdb_hwbreak_sstep[NR_CPUS];
> +volatile int kgdb_hwbreak_sstep[NR_CPUS];

That looks fishy to me. Why is it volatile-qualified? And does that
actually want to be a per-cpu var?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
> 
> BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
> IIRC I got build failures in:

> drivers/net/spider_net.c

Fixing the above showed up another problem in another file of the
same driver (drivers/net/spider_net_ethtool.c)


[PATCH -mm] spider_net_ethtool: Keep up with recent netdev stats changes

Unbreak the following:

drivers/net/spider_net_ethtool.c: In function 'spider_net_get_ethtool_stats':
drivers/net/spider_net_ethtool.c:160: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:161: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:162: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:163: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:164: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:165: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:166: error: structure has no member named 
'netdev_stats'
make[2]: *** [drivers/net/spider_net_ethtool.o] Error 1

Also do another ARRAY_SIZE() cleanup while at it.

Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>

---

 drivers/net/spider_net_ethtool.c |   18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff -ruNp a/drivers/net/spider_net_ethtool.c b/drivers/net/spider_net_ethtool.c
--- a/drivers/net/spider_net_ethtool.c  2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/spider_net_ethtool.c  2007-09-22 12:43:51.0 +0530
@@ -28,8 +28,6 @@
 #include "spider_net.h"
 
 
-#define SPIDER_NET_NUM_STATS 13
-
 static struct {
const char str[ETH_GSTRING_LEN];
 } ethtool_stats_keys[] = {
@@ -149,7 +147,7 @@ spider_net_ethtool_get_ringparam(struct 
 
 static int spider_net_get_stats_count(struct net_device *netdev)
 {
-   return SPIDER_NET_NUM_STATS;
+   return ARRAY_SIZE(ethtool_stats_keys);
 }
 
 static void spider_net_get_ethtool_stats(struct net_device *netdev,
@@ -157,13 +155,13 @@ static void spider_net_get_ethtool_stats
 {
struct spider_net_card *card = netdev->priv;
 
-   data[0] = card->netdev_stats.tx_packets;
-   data[1] = card->netdev_stats.tx_bytes;
-   data[2] = card->netdev_stats.rx_packets;
-   data[3] = card->netdev_stats.rx_bytes;
-   data[4] = card->netdev_stats.tx_errors;
-   data[5] = card->netdev_stats.tx_dropped;
-   data[6] = card->netdev_stats.rx_dropped;
+   data[0] = netdev->stats.tx_packets;
+   data[1] = netdev->stats.tx_bytes;
+   data[2] = netdev->stats.rx_packets;
+   data[3] = netdev->stats.rx_bytes;
+   data[4] = netdev->stats.tx_errors;
+   data[5] = netdev->stats.tx_dropped;
+   data[6] = netdev->stats.rx_dropped;
data[7] = card->spider_stats.rx_desc_error;
data[8] = card->spider_stats.tx_timeouts;
data[9] = card->spider_stats.alloc_rx_skb_error;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
> 
> BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
> IIRC I got build failures in:

> drivers/net/spider_net.c


[PATCH -mm] spider_net: Misc build fixes after recent netdev stats changes

Unbreak the following:

drivers/net/spider_net.c: In function 'spider_net_release_tx_chain':
drivers/net/spider_net.c:818: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c:818: error: (Each undeclared identifier is reported 
only once
drivers/net/spider_net.c:818: error: for each function it appears in.)
drivers/net/spider_net.c: In function 'spider_net_xmit':
drivers/net/spider_net.c:922: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c: In function 'spider_net_pass_skb_up':
drivers/net/spider_net.c:1018: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c: In function 'spider_net_decode_one_descr':
drivers/net/spider_net.c:1215: error: 'dev' undeclared (first use in this 
function)
make[2]: *** [drivers/net/spider_net.o] Error 1

Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>

---

 drivers/net/spider_net.c |   24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff -ruNp a/drivers/net/spider_net.c b/drivers/net/spider_net.c
--- a/drivers/net/spider_net.c  2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/spider_net.c  2007-09-22 12:12:23.0 +0530
@@ -795,6 +795,7 @@ spider_net_set_low_watermark(struct spid
 static int
 spider_net_release_tx_chain(struct spider_net_card *card, int brutal)
 {
+   struct net_device *dev = card->netdev;
struct spider_net_descr_chain *chain = >tx_chain;
struct spider_net_descr *descr;
struct spider_net_hw_descr *hwdescr;
@@ -919,7 +920,7 @@ spider_net_xmit(struct sk_buff *skb, str
spider_net_release_tx_chain(card, 0);
 
if (spider_net_prepare_tx_descr(card, skb) != 0) {
-   dev->stats.tx_dropped++;
+   netdev->stats.tx_dropped++;
netif_stop_queue(netdev);
return NETDEV_TX_BUSY;
}
@@ -979,16 +980,12 @@ static void
 spider_net_pass_skb_up(struct spider_net_descr *descr,
   struct spider_net_card *card)
 {
-   struct spider_net_hw_descr *hwdescr= descr->hwdescr;
-   struct sk_buff *skb;
-   struct net_device *netdev;
-   u32 data_status, data_error;
-
-   data_status = hwdescr->data_status;
-   data_error = hwdescr->data_error;
-   netdev = card->netdev;
+   struct spider_net_hw_descr *hwdescr = descr->hwdescr;
+   struct sk_buff *skb = descr->skb;
+   struct net_device *netdev = card->netdev;
+   u32 data_status = hwdescr->data_status;
+   u32 data_error = hwdescr->data_error;
 
-   skb = descr->skb;
skb_put(skb, hwdescr->valid_size);
 
/* the card seems to add 2 bytes of junk in front
@@ -1015,8 +1012,8 @@ spider_net_pass_skb_up(struct spider_net
}
 
/* update netdevice statistics */
-   dev->stats.rx_packets++;
-   dev->stats.rx_bytes += skb->len;
+   netdev->stats.rx_packets++;
+   netdev->stats.rx_bytes += skb->len;
 
/* pass skb up to stack */
netif_receive_skb(skb);
@@ -1184,6 +1181,7 @@ static int spider_net_resync_tail_ptr(st
 static int
 spider_net_decode_one_descr(struct spider_net_card *card)
 {
+   struct net_device *dev = card->netdev;
struct spider_net_descr_chain *chain = >rx_chain;
struct spider_net_descr *descr = chain->tail;
struct spider_net_hw_descr *hwdescr = descr->hwdescr;
@@ -1210,7 +1208,7 @@ spider_net_decode_one_descr(struct spide
 (status == SPIDER_NET_DESCR_PROTECTION_ERROR) ||
 (status == SPIDER_NET_DESCR_FORCE_END) ) {
if (netif_msg_rx_err(card))
-   dev_err(>netdev->dev,
+   dev_err(>dev,
   "dropping RX descriptor with state %d\n", 
status);
dev->stats.rx_dropped++;
goto bad_desc;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma

On Thu, 20 Sep 2007, Satyam Sharma wrote:
> 
> BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
> IIRC I got build failures in:

> drivers/md/raid6int8.c

This turned out to be a gcc bug -- I was using an old cross-compiler.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
> 
> BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
> IIRC I got build failures in:

> drivers/ata/pata_scc.c

http://lkml.org/lkml/2007/9/21/557
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
 
 BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
 IIRC I got build failures in:

 drivers/ata/pata_scc.c

http://lkml.org/lkml/2007/9/21/557
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
 
 BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
 IIRC I got build failures in:

 drivers/md/raid6int8.c

This turned out to be a gcc bug -- I was using an old cross-compiler.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
 
 BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
 IIRC I got build failures in:

 drivers/net/spider_net.c


[PATCH -mm] spider_net: Misc build fixes after recent netdev stats changes

Unbreak the following:

drivers/net/spider_net.c: In function 'spider_net_release_tx_chain':
drivers/net/spider_net.c:818: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c:818: error: (Each undeclared identifier is reported 
only once
drivers/net/spider_net.c:818: error: for each function it appears in.)
drivers/net/spider_net.c: In function 'spider_net_xmit':
drivers/net/spider_net.c:922: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c: In function 'spider_net_pass_skb_up':
drivers/net/spider_net.c:1018: error: 'dev' undeclared (first use in this 
function)
drivers/net/spider_net.c: In function 'spider_net_decode_one_descr':
drivers/net/spider_net.c:1215: error: 'dev' undeclared (first use in this 
function)
make[2]: *** [drivers/net/spider_net.o] Error 1

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/spider_net.c |   24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff -ruNp a/drivers/net/spider_net.c b/drivers/net/spider_net.c
--- a/drivers/net/spider_net.c  2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/spider_net.c  2007-09-22 12:12:23.0 +0530
@@ -795,6 +795,7 @@ spider_net_set_low_watermark(struct spid
 static int
 spider_net_release_tx_chain(struct spider_net_card *card, int brutal)
 {
+   struct net_device *dev = card-netdev;
struct spider_net_descr_chain *chain = card-tx_chain;
struct spider_net_descr *descr;
struct spider_net_hw_descr *hwdescr;
@@ -919,7 +920,7 @@ spider_net_xmit(struct sk_buff *skb, str
spider_net_release_tx_chain(card, 0);
 
if (spider_net_prepare_tx_descr(card, skb) != 0) {
-   dev-stats.tx_dropped++;
+   netdev-stats.tx_dropped++;
netif_stop_queue(netdev);
return NETDEV_TX_BUSY;
}
@@ -979,16 +980,12 @@ static void
 spider_net_pass_skb_up(struct spider_net_descr *descr,
   struct spider_net_card *card)
 {
-   struct spider_net_hw_descr *hwdescr= descr-hwdescr;
-   struct sk_buff *skb;
-   struct net_device *netdev;
-   u32 data_status, data_error;
-
-   data_status = hwdescr-data_status;
-   data_error = hwdescr-data_error;
-   netdev = card-netdev;
+   struct spider_net_hw_descr *hwdescr = descr-hwdescr;
+   struct sk_buff *skb = descr-skb;
+   struct net_device *netdev = card-netdev;
+   u32 data_status = hwdescr-data_status;
+   u32 data_error = hwdescr-data_error;
 
-   skb = descr-skb;
skb_put(skb, hwdescr-valid_size);
 
/* the card seems to add 2 bytes of junk in front
@@ -1015,8 +1012,8 @@ spider_net_pass_skb_up(struct spider_net
}
 
/* update netdevice statistics */
-   dev-stats.rx_packets++;
-   dev-stats.rx_bytes += skb-len;
+   netdev-stats.rx_packets++;
+   netdev-stats.rx_bytes += skb-len;
 
/* pass skb up to stack */
netif_receive_skb(skb);
@@ -1184,6 +1181,7 @@ static int spider_net_resync_tail_ptr(st
 static int
 spider_net_decode_one_descr(struct spider_net_card *card)
 {
+   struct net_device *dev = card-netdev;
struct spider_net_descr_chain *chain = card-rx_chain;
struct spider_net_descr *descr = chain-tail;
struct spider_net_hw_descr *hwdescr = descr-hwdescr;
@@ -1210,7 +1208,7 @@ spider_net_decode_one_descr(struct spide
 (status == SPIDER_NET_DESCR_PROTECTION_ERROR) ||
 (status == SPIDER_NET_DESCR_FORCE_END) ) {
if (netif_msg_rx_err(card))
-   dev_err(card-netdev-dev,
+   dev_err(dev-dev,
   dropping RX descriptor with state %d\n, 
status);
dev-stats.rx_dropped++;
goto bad_desc;
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failures on ppc64_defconfig

2007-09-22 Thread Satyam Sharma



On Thu, 20 Sep 2007, Satyam Sharma wrote:
 
 BTW ppc64_defconfig didn't quite like 2.6.23-rc6-mm1 either ...
 IIRC I got build failures in:

 drivers/net/spider_net.c

Fixing the above showed up another problem in another file of the
same driver (drivers/net/spider_net_ethtool.c)


[PATCH -mm] spider_net_ethtool: Keep up with recent netdev stats changes

Unbreak the following:

drivers/net/spider_net_ethtool.c: In function 'spider_net_get_ethtool_stats':
drivers/net/spider_net_ethtool.c:160: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:161: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:162: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:163: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:164: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:165: error: structure has no member named 
'netdev_stats'
drivers/net/spider_net_ethtool.c:166: error: structure has no member named 
'netdev_stats'
make[2]: *** [drivers/net/spider_net_ethtool.o] Error 1

Also do another ARRAY_SIZE() cleanup while at it.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/spider_net_ethtool.c |   18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff -ruNp a/drivers/net/spider_net_ethtool.c b/drivers/net/spider_net_ethtool.c
--- a/drivers/net/spider_net_ethtool.c  2007-09-22 06:26:39.0 +0530
+++ b/drivers/net/spider_net_ethtool.c  2007-09-22 12:43:51.0 +0530
@@ -28,8 +28,6 @@
 #include spider_net.h
 
 
-#define SPIDER_NET_NUM_STATS 13
-
 static struct {
const char str[ETH_GSTRING_LEN];
 } ethtool_stats_keys[] = {
@@ -149,7 +147,7 @@ spider_net_ethtool_get_ringparam(struct 
 
 static int spider_net_get_stats_count(struct net_device *netdev)
 {
-   return SPIDER_NET_NUM_STATS;
+   return ARRAY_SIZE(ethtool_stats_keys);
 }
 
 static void spider_net_get_ethtool_stats(struct net_device *netdev,
@@ -157,13 +155,13 @@ static void spider_net_get_ethtool_stats
 {
struct spider_net_card *card = netdev-priv;
 
-   data[0] = card-netdev_stats.tx_packets;
-   data[1] = card-netdev_stats.tx_bytes;
-   data[2] = card-netdev_stats.rx_packets;
-   data[3] = card-netdev_stats.rx_bytes;
-   data[4] = card-netdev_stats.tx_errors;
-   data[5] = card-netdev_stats.tx_dropped;
-   data[6] = card-netdev_stats.rx_dropped;
+   data[0] = netdev-stats.tx_packets;
+   data[1] = netdev-stats.tx_bytes;
+   data[2] = netdev-stats.rx_packets;
+   data[3] = netdev-stats.rx_bytes;
+   data[4] = netdev-stats.tx_errors;
+   data[5] = netdev-stats.tx_dropped;
+   data[6] = netdev-stats.rx_dropped;
data[7] = card-spider_stats.rx_desc_error;
data[8] = card-spider_stats.tx_timeouts;
data[9] = card-spider_stats.alloc_rx_skb_error;
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1

2007-09-22 Thread Satyam Sharma


 -static volatile int kgdb_hwbreak_sstep[NR_CPUS];
 +volatile int kgdb_hwbreak_sstep[NR_CPUS];

That looks fishy to me. Why is it volatile-qualified? And does that
actually want to be a per-cpu var?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1

2007-09-22 Thread Satyam Sharma

Hi Greg,


On Tue, 18 Sep 2007, Greg KH wrote:
 
 On Tue, Sep 18, 2007 at 03:04:48PM +0530, Satyam Sharma wrote:
  
  But wait ... isn't that a statically-allocated kobject, which were
  supposed to be naughty in the first place?
 
 Yes it is, if you want to dynamically create it, please do.

Sorry for being late to reply, but do you still want such a patch (i.e.
convert static to dynamic allocation)?

I read elsewhere on this thread that you'd merge Kamalesh's patch and
fix it up to also use kobject_name() yourself. But it's a small/trivial
driver, so I think just converting it to dynamic allocation right now
itself (when we've noticed it already) is probably better (?)

[ BTW I don't see the fix in your git trees or quilt queue. So I'll
  make a patch on top of 2.6.23-rc6-mm1 itself. ]


Thanks,

Satyam
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-22 Thread Peter Zijlstra

On Sat, 22 Sep 2007 09:55:09 +0800 Fengguang Wu [EMAIL PROTECTED]
wrote:

 --- linux-2.6.22.orig/mm/page-writeback.c
 +++ linux-2.6.22/mm/page-writeback.c
 @@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
   bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
   }
  
 + printk(KERN_DEBUG balance_dirty_pages written %lu %lu 
 congested %d limits %lu %lu %lu %lu %lu %ld\n,
 + pages_written,
 + write_chunk - wbc.nr_to_write,
 + bdi_write_congested(bdi),
 + background_thresh, dirty_thresh,
 + bdi_thresh, bdi_nr_reclaimable, 
 bdi_nr_writeback,
 + bdi_thresh - bdi_nr_reclaimable - 
 bdi_nr_writeback);
 +
   if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh)
   break;
   if (pages_written = write_chunk)
 

 [ 1305.361511] balance_dirty_pages written 0 0 congested 0 limits 48869 
 195477 5801 5760 288 -247

snip long series of mostly identical lines

Most peculiar. It seems writeback_inodes() doesn't even attempt to
write out stuff. Nor are outstanding writeback pages completed.

Could you perhaps instrument the writeback_inodes() path to see why
nothing is written out? - the attached patch would be a nice start.

 Here are some messages when doing large dds:

 [  511.733791] balance_dirty_pages written 1540 1540 congested 0 limits 49728 
 198913 10999 18288 0 -7289
 [  511.735048] balance_dirty_pages written 1540 1540 congested 0 limits 49728 
 198913 12012 16752 0 -4740
 [  511.736506] balance_dirty_pages written 1540 1540 congested 0 limits 49728 
 198913 12306 15192 1056 -3942
 [  512.002169] balance_dirty_pages written 1547 1547 congested 2 limits 49726 
 198905 13471 12624 1848 -1001
 [  512.003795] balance_dirty_pages written 1540 1540 congested 2 limits 49723 
 198892 13470 11088 3384 -1002
 [  512.083517] balance_dirty_pages written 1540 1540 congested 2 limits 49712 
 198850 13572 9336 4992 -756
 [  512.085145] balance_dirty_pages written 1540 1540 congested 2 limits 49706 
 198825 13569 7800 6528 -759
 [  512.086773] balance_dirty_pages written 1540 1540 congested 2 limits 49702 
 198808 13568 6240 8064 -736
 [  512.184267] balance_dirty_pages written 1539 1539 congested 2 limits 49697 
 198791 13649 5592 8592 -535
 [  512.185903] balance_dirty_pages written 1540 1540 congested 2 limits 49694 
 198778 13649 4056 10152 -559
 [  512.187506] balance_dirty_pages written 1540 1540 congested 2 limits 49688 
 198753 13647 2496 11688 -537
 [  512.259848] balance_dirty_pages written 1546 1546 congested 2 limits 49682 
 198728 13769 744 13248 -223
 [  512.554646] balance_dirty_pages written 618 618 congested 2 limits 49678 
 198712 14242 1 13368 873
 [  512.585204] balance_dirty_pages written 794 794 congested 2 limits 49657 
 198630 14500 1 12936 1563
 [  527.714294] balance_dirty_pages written 1540 1540 congested 0 limits 49608 
 198432 29502 28080 0 1422

This looks like a sane series, we have a surplus of reclaimable pages,
start writeout, which increases writeback pages, and congests the
device, and eventually all subsides and we finish congestion and quit.

 [  529.298022] balance_dirty_pages written 1540 1540 congested 0 limits 49579 
 198318 30307 34704 0 -4397
 [  529.304975] balance_dirty_pages written 1539 1539 congested 0 limits 49575 
 198302 32451 30600 0 1851
 [  529.305205] balance_dirty_pages written 1538 1538 congested 0 limits 49576 
 198306 32571 30384 0 2187
 [  529.306988] balance_dirty_pages written 1537 1537 congested 0 limits 49580 
 198320 32702 30120 0 2582
 [  530.893830] balance_dirty_pages written 1541 1541 congested 0 limits 49553 
 198214 34216 35352 0 -1136
 [  530.893970] balance_dirty_pages written 1538 1538 congested 0 limits 49553 
 198214 34290 35160 0 -870
 [  530.899873] balance_dirty_pages written 1546 1546 congested 0 limits 49556 
 198227 36248 31248 0 5000
 [  530.900282] balance_dirty_pages written 1546 1546 congested 0 limits 49557 
 198231 36442 30864 0 5578
 [  530.900586] balance_dirty_pages written 1539 1539 congested 0 limits 49558 
 198235 36601 30552 0 6049
 [  532.343097] balance_dirty_pages written 1541 1541 congested 0 limits 49530 
 198120 37862 36072 0 1790
 [  532.343595] balance_dirty_pages written 1547 1547 congested 0 limits 49533 
 198132 38081 35640 0 2441
 [  533.872355] balance_dirty_pages written 1540 1540 congested 0 limits 49502 
 198009 41148 37224 0 3924
 [  542.566083] balance_dirty_pages written 1542 1542 congested 0 limits 49367 
 197469 51948 52680 0 -732
 [  542.567093] balance_dirty_pages written 1537 1537 congested 0 limits 49370 
 197482 52663 50952 0 1711
 [  542.586552] balance_dirty_pages written 1540 1540 congested 0 limits 49352 
 197410 54545 46032 0 8513
 [  542.606002] balance_dirty_pages written 1540 1540 congested 0 limits 49337 
 197350 55132 44520 0 10612

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-22 Thread Fengguang Wu

On Sat, Sep 22, 2007 at 03:16:22PM +0200, Peter Zijlstra wrote:
 On Sat, 22 Sep 2007 09:55:09 +0800 Fengguang Wu [EMAIL PROTECTED]
 wrote:
 
  --- linux-2.6.22.orig/mm/page-writeback.c
  +++ linux-2.6.22/mm/page-writeback.c
  @@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
  bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
  }
   
  +   printk(KERN_DEBUG balance_dirty_pages written %lu %lu 
  congested %d limits %lu %lu %lu %lu %lu %ld\n,
  +   pages_written,
  +   write_chunk - wbc.nr_to_write,
  +   bdi_write_congested(bdi),
  +   background_thresh, dirty_thresh,
  +   bdi_thresh, bdi_nr_reclaimable, 
  bdi_nr_writeback,
  +   bdi_thresh - bdi_nr_reclaimable - 
  bdi_nr_writeback);
  +
  if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh)
  break;
  if (pages_written = write_chunk)
  
 
  [ 1305.361511] balance_dirty_pages written 0 0 congested 0 limits 48869 
  195477 5801 5760 288 -247
 
 snip long series of mostly identical lines
 
 Could you perhaps instrument the writeback_inodes() path to see why
 nothing is written out? - the attached patch would be a nice start.

Curiously the lockup problem disappeared after upgrading to 2.6.23-rc6-mm1.
(need to watch it in a longer time window).

Anyway here's the output of your patch:
sb_locked 0
sb_empty 97011

 Most peculiar. It seems writeback_inodes() doesn't even attempt to
 write out stuff. Nor are outstanding writeback pages completed.

Still true. Another problem is that balance_dirty_pages() is being called even
when there are only 54 dirty pages. That could slow down writers unnecessarily.

balance_dirty_pages() should not be entered at all with small nr_dirty.

Look at these lines:
[  197.471619] balance_dirty_pages for tar written 405 405 congested 0 global 
196554 54 403 196097 bdi 0 0 398 -398
[  197.472196] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 0 0 380 -380
[  197.472893] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 369 -346
[  197.473158] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 366 -343
[  197.473403] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 365 -342
[  197.473674] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 364 -341
[  197.474265] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 372 196128 bdi 23 0 362 -339
[  197.475440] balance_dirty_pages for tar written 405 0 congested 0 global 
196554 54 341 196159 bdi 47 0 327 -280
[  197.476970] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 279 196213 bdi 95 0 279 -184
[  197.43] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 248 196244 bdi 95 0 255 -160
[  197.479463] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 217 196275 bdi 143 0 210 -67
[  197.479656] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 217 196275 bdi 143 0 209 -66
[  197.481159] balance_dirty_pages for tar written 405 0 congested 0 global 
196546 54 155 196337 bdi 167 0 163 4

The trace messages are generated by the following code:

--- linux-2.6.23-rc6-mm1.orig/mm/page-writeback.c
+++ linux-2.6.23-rc6-mm1/mm/page-writeback.c
@@ -421,6 +421,18 @@ static void balance_dirty_pages(struct a
bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
}

+   printk(KERN_DEBUG balance_dirty_pages for %s written %lu %lu 
congested %d 
+   global %lu %lu %lu %ld bdi %lu %lu %lu %ld\n,
+   current-comm,
+   pages_written, write_chunk - wbc.nr_to_write,
+   bdi_write_congested(bdi),
+   dirty_thresh,
+   global_dirty_unstable_pages(), 
global_page_state(NR_WRITEBACK),
+   dirty_thresh -
+   global_dirty_unstable_pages() - 
global_page_state(NR_WRITEBACK),
+   bdi_thresh, bdi_nr_reclaimable, 
bdi_nr_writeback,
+   bdi_thresh - bdi_nr_reclaimable - 
bdi_nr_writeback);
+   
if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh)
break;
if (pages_written = write_chunk)


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: Build failure on ppc64 drivers/ata/pata_scc.c

2007-09-21 Thread Satyam Sharma

Hi,


On Thu, 20 Sep 2007, Alan Cox wrote:
> 
> On Thu, 20 Sep 2007 14:13:15 +0100
> [EMAIL PROTECTED] (Mel Gorman) wrote:
> 
> > PPC64 building allmodconfig fails to compile drivers/ata/pata_scc.c . It
> > doesn't show up on other arches because this driver is specific to the
> > architecture.
> > 
> > drivers/ata/pata_scc.c: In function `scc_bmdma_status'
> 
> Its not been updated to match the libata core changes. Try something like
> this. Whoever is maintaining it should also remove the prereset cable handling
> code and use the proper cable detect method.

It appears you forgot to fix scc_std_softreset() and one of its callsites
in scc_bdma_stop(). A complete patch is attempted below -- please review.


[PATCH -mm] pata_scc: Keep up with libata core API changes

Little fixlets, that the build started erroring / warning about:

drivers/ata/pata_scc.c: In function 'scc_bmdma_status':
drivers/ata/pata_scc.c:734: error: structure has no member named 'active_tag'
drivers/ata/pata_scc.c: In function 'scc_pata_prereset':
drivers/ata/pata_scc.c:866: warning: passing arg 1 of 'ata_std_prereset' from 
incompatible pointer type
drivers/ata/pata_scc.c: In function 'scc_error_handler':
drivers/ata/pata_scc.c:908: warning: passing arg 2 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
drivers/ata/pata_scc.c:908: warning: passing arg 3 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
drivers/ata/pata_scc.c:908: warning: passing arg 5 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
make[2]: *** [drivers/ata/pata_scc.o] Error 1

Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>
Cc: Alan Cox <[EMAIL PROTECTED]>
Cc: Mel Gorman <[EMAIL PROTECTED]>

---

Andrew, this includes (supercedes) the previous two ones from Mel / Alan.

 drivers/ata/pata_scc.c |   21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff -ruNp a/drivers/ata/pata_scc.c b/drivers/ata/pata_scc.c
--- a/drivers/ata/pata_scc.c2007-09-22 06:26:37.0 +0530
+++ b/drivers/ata/pata_scc.c2007-09-22 08:04:42.0 +0530
@@ -594,16 +594,17 @@ static unsigned int scc_bus_softreset(st
  * Note: Original code is ata_std_softreset().
  */
 
-static int scc_std_softreset (struct ata_port *ap, unsigned int *classes,
-  unsigned long deadline)
+static int scc_std_softreset(struct ata_link *link, unsigned int *classes,
+ unsigned long deadline)
 {
+   struct ata_port *ap = link->ap;
unsigned int slave_possible = ap->flags & ATA_FLAG_SLAVE_POSS;
unsigned int devmask = 0, err_mask;
u8 err;
 
DPRINTK("ENTER\n");
 
-   if (ata_link_offline(>link)) {
+   if (ata_link_offline(link)) {
classes[0] = ATA_DEV_NONE;
goto out;
}
@@ -692,7 +693,7 @@ static void scc_bmdma_stop (struct ata_q
printk(KERN_WARNING "%s: Internal Bus Error\n", 
DRV_NAME);
out_be32(bmid_base + SCC_DMA_INTST, INTSTS_BMSINT);
/* TBD: SW reset */
-   scc_std_softreset(ap, , deadline);
+   scc_std_softreset(>link, , deadline);
continue;
}
 
@@ -731,7 +732,7 @@ static u8 scc_bmdma_status (struct ata_p
void __iomem *mmio = ap->ioaddr.bmdma_addr;
u8 host_stat = in_be32(mmio + SCC_DMA_STATUS);
u32 int_status = in_be32(mmio + SCC_DMA_INTST);
-   struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->active_tag);
+   struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->link.active_tag);
static int retry = 0;
 
/* return if IOS_SS is cleared */
@@ -860,10 +861,10 @@ static void scc_bmdma_freeze (struct ata
  * @deadline: deadline jiffies for the operation
  */
 
-static int scc_pata_prereset(struct ata_port *ap, unsigned long deadline)
+static int scc_pata_prereset(struct ata_link *link, unsigned long deadline)
 {
-   ap->cbl = ATA_CBL_PATA80;
-   return ata_std_prereset(ap, deadline);
+   link->ap->cbl = ATA_CBL_PATA80;
+   return ata_std_prereset(link, deadline);
 }
 
 /**
@@ -874,8 +875,10 @@ static int scc_pata_prereset(struct ata_
  * Note: Original code is ata_std_postreset().
  */
 
-static void scc_std_postreset (struct ata_port *ap, unsigned int *classes)
+static void scc_std_postreset(struct ata_link *link, unsigned int *classes)
 {
+   struct ata_port *ap = link->ap;
+
DPRINTK("ENTER\n");
 
/* is double-select really necessary? */
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-21 Thread Fengguang Wu

On Thu, Sep 20, 2007 at 12:31:39PM +0100, Hugh Dickins wrote:
> On Wed, 19 Sep 2007, Peter Zijlstra wrote:
> > On Wed, 19 Sep 2007 21:03:19 +0100 (BST) Hugh Dickins
> > <[EMAIL PROTECTED]> wrote:
> > 
> > > On Wed, 19 Sep 2007, Andy Whitcroft wrote:
> > > > Seems I have a case of a largish i386 NUMA (NUMA-Q) which has a mkfs
> > > > stuck in a 'D' wait:
> > > > 
> > > >  ===
> > > > mkfs.ext2 D c10220f4 0  6233   6222
> > > >  [] io_schedule_timeout+0x1e/0x28
> > > >  [] congestion_wait+0x62/0x7a
> > > >  [] get_dirty_limits+0x16a/0x172
> > > >  [] balance_dirty_pages+0x154/0x1be
> > > >  [] generic_perform_write+0x168/0x18a
> > > >  [] generic_file_buffered_write+0x73/0x107
> > > >  [] __generic_file_aio_write_nolock+0x47a/0x4a5
> > > >  [] generic_file_aio_write_nolock+0x48/0x9b
> > > >  [] do_sync_write+0xbf/0xfc
> > > >  [] vfs_write+0x8d/0x108
> > > >  [] sys_write+0x41/0x67
> > > >  [] syscall_call+0x7/0xb
> > > >  ===
> > > 
> > > [edited out some bogus lines from stale stack]
> > > 
> > > > This machine and others have run numerous test runs on this kernel and
> > > > this is the first time I've see a hang like this.
> > > 
> > > I've been seeing something like that on 4-way PPC64: in my case I've
> > > shells hanging in D state trying to append to kernel build log on ext3
> > > (the builds themselves going on elsewhere, in tmpfs): one of the shells
> > > holding i_mutex and stuck doing congestion_waits from balance_dirty_pages.
> > > 
> > > > I wonder if this is the ultimate cause of the couple of mainline hangs
> > > > which were seen, but not diagnosed.
> > > 
> > > My *guess* is that this is peculiar to 2.6.23-rc6-mm1, and from Peter's
> > > mm-per-device-dirty-threshold.patch.  printks showed bdi_nr_reclaimable
> > > 0, bdi_nr_writeback 24, bdi_thresh 1 in balance_dirty_pages (though I've
> > > not done enough to check if those really correlate with the hangs),
> > > and I'm wondering if the bdi_stat_sum business is needed on the
> > > !nr_reclaimable path.
> > 
> > FWIW my tired brain seems to think it the !nr_reclaimable path needs it
> > just the same. So this change seems to make sense for now :-)
> 
> Thanks.
> 
> > > So I'm running now with the patch below, good so far, but can't judge
> > > until tomorrow whether it has actually addressed the problem seen.
> 
> Last night's run went well: that patch does indeed seem to have fixed it.
> Looking at the timings (some variance but _very_ much less than the night
> before), there does appear to be some other occasional slight slowdown -
> but I've no reason to suspect your patch for it, nor to suppose it's
> something new: it may just be an artifact of my heavy swap thrashing.
> 
> 
> [PATCH mm] mm per-device dirty threshold fix
> 
> Fix occasional hang when a task couldn't get out of balance_dirty_pages:
> mm-per-device-dirty-threshold.patch needs to reevaluate bdi_nr_writeback
> across all cpus when bdi_thresh is low, even in the case when there was
> no bdi_nr_reclaimable.
> 
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>

Thank you Hugh. I ran into similar problems with many dd(large file)
operations.  This patch seems to fix it.

But now my desktop was locked up again when writing a lot of small
files. The problem is repeatable with the command
 $ ketchup 2.6.23-rc6-mm1

I writeup two debug patches:

---
 mm/page-writeback.c |9 +
 1 file changed, 9 insertions(+)

--- linux-2.6.22.orig/mm/page-writeback.c
+++ linux-2.6.22/mm/page-writeback.c
@@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
}
 
+   printk(KERN_DEBUG "balance_dirty_pages written %lu %lu 
congested %d limits %lu %lu %lu %lu %lu %ld\n",
+   pages_written,
+   write_chunk - wbc.nr_to_write,
+   bdi_write_congested(bdi),
+   background_thresh, dirty_thresh,
+   bdi_thresh, bdi_nr_reclaimable, 
bdi_nr_writeback,
+   bdi_thresh - bdi_nr_reclaimable - 
bdi_nr_writeback);
+
if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
break;
if (pages_written >= write_chunk)

---
 mm/page-writeback.c |5 +
 1 file changed, 5 insertions(+)

--- linux-2.6.22.orig/mm/page-writeback.c
+++ linux-2.6.22/mm/page-writeback.c
@@ -373,6 +373,7 @@ static void balance_dirty_pages(struct a
long bdi_thresh;
unsigned long pages_written = 0;
unsigned long write_chunk = sync_writeback_pages();
+   int i = 0;
 
struct backing_dev_info *bdi = mapping->backing_dev_info;
 
@@ -434,6 +435,10 @@ static void balance_dirty_pages(struct a
bdi_thresh, bdi_nr_reclaimable, 
bdi_nr_writeback,
bdi_thresh -

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

Thomas,

On Friday, 21 September 2007 21:16, Thomas Gleixner wrote:
> Rafael,
> 
> On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
> > On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> > > I simply rmmod'ed the processor module before suspend and the problem is
> > > solved as well. The cpuidle patches make this problem more prominent due
> > > to the possible more direct switch into lower power states, when we wait 
> > > for
> > > a long time on something. 
> > 
> > So, perhaps we can add a .suspend()/.resume() routines to the processor 
> > driver
> > and use them to disable/enable the cpuidle functionality during a
> > suspend/resume?
> 
> http://tglx.de/private/tglx/p.diff
> 
> untested yet, but I'm on the way to do that :)

Heh, I thought of the same thing. :-)

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

Rafael,

On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
> On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> > I simply rmmod'ed the processor module before suspend and the problem is
> > solved as well. The cpuidle patches make this problem more prominent due
> > to the possible more direct switch into lower power states, when we wait for
> > a long time on something. 
> 
> So, perhaps we can add a .suspend()/.resume() routines to the processor driver
> and use them to disable/enable the cpuidle functionality during a
> suspend/resume?

http://tglx.de/private/tglx/p.diff

untested yet, but I'm on the way to do that :)

tglx


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
> Rafael,
> 
> On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
> > > > If you need any help from me with that, please let me know.
> > > 
> > > I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> > > After debugging the swsusp_suspend() code path I figured out, that we
> > > end up in C2 or deeper power states while we run the suspend code. The
> > > same happens when we come back on resume. It looks like we disable stuff
> > > in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
> > 
> > Hm, can you please run the test I've suggested in another branch of the
> > thread, ie.
> > 
> > # echo shutdown > /sys/power/disk
> > # echo disk > /sys/power/state
> > 
> > without your debugging code in disk.c?
> > 
> > This makes the hibernation code omit the major ACPI hooks, so if it works,
> > we'll know that these hooks are responsible for the problem.
> 
> Yes, this works fine. We still go into C3, but this seems not longer to
> brick the box.
> 
> > > I hacked the idle loop arch code to use halt() right before we call
> > > device_suspend() and switch back to the acpi idle code right after
> > > device_resume(). This solves the problem as well.
> > 
> > Well, that seems less intrusive than changing the code ordering right before
> > the major kernel release, but I think we should do our best to understand 
> > what
> > _exactly_ is happening here.
> 
> I found some other subtle thinko in the clock events code while I was
> heading down the swsusp_suspend code path. I wait for confirmation that
> it does not brick some endangered boxen, though. Still with this change
> in the clock events code, my VAIO goes into C2 or C3 and causes the box
> to wait for a helping keystroke.
> 
> The correct solution would be, that the ACPI code ignores the lower
> C-states during suspend / resume.

Yes, certainly.

> I simply rmmod'ed the processor module before suspend and the problem is
> solved as well. The cpuidle patches make this problem more prominent due
> to the possible more direct switch into lower power states, when we wait for
> a long time on something. 

So, perhaps we can add a .suspend()/.resume() routines to the processor driver
and use them to disable/enable the cpuidle functionality during a
suspend/resume?

> I think we really should not fiddle with the various cpu states during
> the critical parts of suspend / resume. Let's keep it simple. We have
> the same policy during boot and I think the suspend / resume critical
> parts have similar constraints.

I completely agree.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

Rafael,

On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
> > > If you need any help from me with that, please let me know.
> > 
> > I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> > After debugging the swsusp_suspend() code path I figured out, that we
> > end up in C2 or deeper power states while we run the suspend code. The
> > same happens when we come back on resume. It looks like we disable stuff
> > in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
> 
> Hm, can you please run the test I've suggested in another branch of the
> thread, ie.
> 
> # echo shutdown > /sys/power/disk
> # echo disk > /sys/power/state
> 
> without your debugging code in disk.c?
> 
> This makes the hibernation code omit the major ACPI hooks, so if it works,
> we'll know that these hooks are responsible for the problem.

Yes, this works fine. We still go into C3, but this seems not longer to
brick the box.

> > I hacked the idle loop arch code to use halt() right before we call
> > device_suspend() and switch back to the acpi idle code right after
> > device_resume(). This solves the problem as well.
> 
> Well, that seems less intrusive than changing the code ordering right before
> the major kernel release, but I think we should do our best to understand what
> _exactly_ is happening here.

I found some other subtle thinko in the clock events code while I was
heading down the swsusp_suspend code path. I wait for confirmation that
it does not brick some endangered boxen, though. Still with this change
in the clock events code, my VAIO goes into C2 or C3 and causes the box
to wait for a helping keystroke.

The correct solution would be, that the ACPI code ignores the lower
C-states during suspend / resume. I simply rmmod'ed the processor module
before suspend and the problem is solved as well. The cpuidle patches
make this problem more prominent due to the possible more direct switch
into lower power states, when we wait for a long time on something.

I think we really should not fiddle with the various cpu states during
the critical parts of suspend / resume. Let's keep it simple. We have
the same policy during boot and I think the suspend / resume critical
parts have similar constraints.

tglx

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

Thomas,

On Friday, 21 September 2007 14:59, Thomas Gleixner wrote:
> Rafael,
> 
> On Fri, 2007-09-21 at 00:30 +0200, Rafael J. Wysocki wrote:
> > > -ETOOTIRED led me too a wrong conclusion, but still it is a valuable
> > > hint that this change is making things work again.
> > 
> > Yes, it is.
> > 
> > > I need to go down into the details of the swsusp_suspend() code path to
> > > figure out, what's the root cause. 
> > 
> > If you need any help from me with that, please let me know.
> 
> I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
> After debugging the swsusp_suspend() code path I figured out, that we
> end up in C2 or deeper power states while we run the suspend code. The
> same happens when we come back on resume. It looks like we disable stuff
> in the ACPI BIOS, which makes the C2 and deeper power states misbehave.

Hm, can you please run the test I've suggested in another branch of the
thread, ie.

# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state

without your debugging code in disk.c?

This makes the hibernation code omit the major ACPI hooks, so if it works,
we'll know that these hooks are responsible for the problem.

> I hacked the idle loop arch code to use halt() right before we call
> device_suspend() and switch back to the acpi idle code right after
> device_resume(). This solves the problem as well.

Well, that seems less intrusive than changing the code ordering right before
the major kernel release, but I think we should do our best to understand what
_exactly_ is happening here.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 sparc build error

2007-09-21 Thread Mathieu Desnoyers

* Guennadi Liakhovetski ([EMAIL PROTECTED]) wrote:
> Provide {enable,disable}_irq_wakeup dummies for undefined 
> CONFIG_GENERIC_HARDIRQS case. Completely untested, as I don't even have 
> cross-compilers for platforms without CONFIG_GENERIC_HARDIRQS.
> 
> Signed-off-by: Guennadi Liakhovetski <[EMAIL PROTECTED]>
> 

It builds fine now.

Tested-by: Mathieu Desnoyers <[EMAIL PROTECTED]>

> ---
> 
> On Tue, 18 Sep 2007, Andrew Morton wrote:
> 
> > On Tue, 18 Sep 2007 16:54:03 -0400
> > Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:
> > 
> > > I got the following error when building 2.6.23-rc6-mm1 on sparc:
> > > 
> > > 
> > >   
> > > /opt/crosstool/gcc-4.1.1-glibc-2.3.6/sparc-unknown-linux-gnu/bin/sparc-unknown-linux-gnu-gcc
> > >  -Wp,-MD,drivers/serial/.serial_core.o.d  -nostdinc -isystem 
> > > /opt/crosstool/gcc-4.1.1-glibc-2.3.6/sparc-unknown-linux-gnu/lib/gcc/sparc-unknown-linux-gnu/4.1.1/include
> > >  -D__KERNEL__ -Iinclude -Iinclude2 
> > > -I/home/compudj/git/linux-2.6-lttng/include -include 
> > > include/linux/autoconf.h  
> > > -I/home/compudj/git/linux-2.6-lttng/drivers/serial -Idrivers/serial -Wall 
> > > -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing 
> > > -fno-common -Werror-implicit-function-declaration -Os -m32 -pipe -mno-fpu 
> > > -fcall-used-g5 -fcall-used-g7 -fomit-frame-pointer -fno-stack-protector 
> > > -Wdeclaration-after-statement -Wno-pointer-sign  -D"KBUILD_STR(s)=#s" 
> > > -D"KBUILD_BASENAME=KBUILD_STR(serial_core)"  
> > > -D"KBUILD_MODNAME=KBUILD_STR(serial_core)" -c -o 
> > > drivers/serial/.tmp_serial_core.o 
> > > /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c
> > > /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c: In 
> > > function 'uart_suspend_port':
> > > /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c:1980: 
> > > error: implicit declaration of function 'enable_irq_wake'
> > > /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c: In 
> > > function 'uart_resume_port':
> > > /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c:2035: 
> > > error: implicit declaration of function 'disable_irq_wake'
> > 
> > hm, I wonder why I didn't hit that.
> > 
> > enable_irq_wake() was added by wake-up-from-a-serial-port.patch
> > 
> > I note that git-input adds a call too, and might have a problem
> > with !CONFIG_GENERIC_HARDIRQS.
> > 
> > Not sure what the best fix is here.  We could sprinkle ifdefs all
> > over the code, or just add the suitable empty stubs for enable_irq_wake(),
> > etc.
> 
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 5323f62..ecade41 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -205,6 +205,9 @@ static inline int disable_irq_wake(unsigned int irq)
>   enable_irq(irq)
>  # endif
>  
> +#define enable_irq_wake(irq) ({ (void)(irq); 0; })
> +#define disable_irq_wake(irq) ({ (void)(irq); 0; })
> +
>  #endif /* CONFIG_GENERIC_HARDIRQS */
>  
>  #ifndef __ARCH_SET_SOFTIRQ_PENDING

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

Rafael,

On Fri, 2007-09-21 at 00:30 +0200, Rafael J. Wysocki wrote:
> > -ETOOTIRED led me too a wrong conclusion, but still it is a valuable
> > hint that this change is making things work again.
> 
> Yes, it is.
> 
> > I need to go down into the details of the swsusp_suspend() code path to
> > figure out, what's the root cause. 
> 
> If you need any help from me with that, please let me know.

I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
After debugging the swsusp_suspend() code path I figured out, that we
end up in C2 or deeper power states while we run the suspend code. The
same happens when we come back on resume. It looks like we disable stuff
in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
I hacked the idle loop arch code to use halt() right before we call
device_suspend() and switch back to the acpi idle code right after
device_resume(). This solves the problem as well.

Len, any opinion on this one ?

tglx

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Jarek Poplawski

On Fri, Sep 21, 2007 at 01:03:47PM +0200, Jarek Poplawski wrote:
...
> any of them does ipc_rcu_getref() a bit later and sees old (cached)

Should be:
"any of them does ipc_rcu_putref() a bit later and sees old (cached)"

Sorry,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Jarek Poplawski

On Fri, Sep 21, 2007 at 12:11:15PM +0200, Nadia Derbey wrote:
> Jarek Poplawski wrote:
...
> >2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
> >safe (memory barriers): it's not atomic, so locking is needed, but
> >e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
> >freeque() calls ipc_rcu_putref() with ipc_ids mutex only.
> 
> OK, but freeque() freeary() and shm_destroy() are special cases:
> we have the following path:
> 
> mutex_lock(ipc_ids.mutex)
> ...
> ipc_lock(ipcp)
> ... do whatever cleaning is needed ...
> ipc_rmid(ipcp)
> ipc_unlock(ipcp)
> 
> ipc_rcu_putref(ipcp)
> 
> Once the rmid has been done the ipc structure is considered as not 
> visible anymore from the user side ==> any syscall called with the 
> corresponding id will return invalid.
> The only thing that could happen is that this structure be reused for a 
> newly allocated ipc structure. But this too cannot happen since we are 
> under the ipc_ids mutex lock.
> 
> Am I wrong?

I hope not! But, then it would be probably another logical trick:
ipc_rcu_getref/putref() seems to prevent kfreeing of a structure, so
if it's used in do_msgsnd() there should be a risk something can do
this kfree at this moment, and it seems freeque() is the only one,
which both: can do this and cares for this refcount. Then, e.g., if
any of them does ipc_rcu_getref() a bit later and sees old (cached)
value - kfree can be skipped forever. On the other hand, if there is
no such possibility because of other reasons, this all refcounting
looks like a code beautifier only...

Thanks,
Jarek P.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Nadia Derbey


Jarek Poplawski wrote:

On Thu, Sep 20, 2007 at 03:08:42PM +0200, Nadia Derbey wrote:


Nadia Derbey wrote:


Jarek Poplawski wrote:



On Thu, Sep 20, 2007 at 08:24:58AM +0200, Nadia Derbey wrote:


...

Actually, ipc_lock() is called most of the time without the 
ipc_ids.mutex held and without refcounting (maybe you didn't look for 
the msg_lock() sem_lock() and shm_lock() too).

So I think disabling preemption is needed, isn't it?



so, these rcu_read_locks() don't
work here at all. So, probably I miss something again, but IMHO,
these rcu_read_locks/unlocks could be removed here or in
ipc_lock_by_ptr() and it should be enough to use them directly, where
really needed, e.g., in msg.c do_msgrcv().



I have to check for the ipc_lock_by_ptr(): may be you're right!



So, here is the ipc_lock_by_ptr() status:
1) do_msgsnd(), semctl_main(GETALL), semctl_main(SETALL) and find_undo() 
call it inside a refcounting.

 ==> no rcu read section needed.

2) *_exit_ns(), ipc_findkey() and sysvipc_find_ipc() call it under the 
ipc_ids mutex lock.

 ==> no rcu read section needed.

3) do_msgrcv() is the only path where ipc_lock_by_ptr() is not called 
under refcounting

 ==> rcu read section + some more checks needed once the spnlock is
 taken.

So I completely agree with you: we might remove the rcu_read_lock() from 
the ipc_lock_by_ptr() and explicitley  call it when needed (actually, it 
is already explicitly called in do_msgrcv()).



Yes, IMHO, it should be at least more readable when we can see where
this RCU is really needed.

But, after 3-rd look, I have a few more doubts (btw., 3 looks are
still not enough for me with this code, so I cerainly can miss many
things here, and, alas, I manged to see util and msg code only):

1. ipc_lock() and ipc_lock_check() are used without ipc_ids.mutex,
but it's probably wrong: they call idr_find() with ipc_ids pointer
which needs this mutex, just like in similar code in: ipc_findkey(),
ipc_get_maxid() or sysvipc_find_ipc().

2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
safe (memory barriers): it's not atomic, so locking is needed, but
e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
freeque() calls ipc_rcu_putref() with ipc_ids mutex only.


OK, but freeque() freeary() and shm_destroy() are special cases:
we have the following path:

mutex_lock(ipc_ids.mutex)
...
ipc_lock(ipcp)
... do whatever cleaning is needed ...
ipc_rmid(ipcp)
ipc_unlock(ipcp)

ipc_rcu_putref(ipcp)

Once the rmid has been done the ipc structure is considered as not 
visible anymore from the user side ==> any syscall called with the 
corresponding id will return invalid.
The only thing that could happen is that this structure be reused for a 
newly allocated ipc structure. But this too cannot happen since we are 
under the ipc_ids mutex lock.


Am I wrong?

Answers to the other questions in separate e-mails

Regards,
Nadia

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Nadia Derbey


Andrew Morton wrote:

On Tue, 18 Sep 2007 16:55:19 +0200 Nadia Derbey <[EMAIL PROTECTED]> wrote:



This patch fixes the missing rcu_read_(un)lock in the ipc code



Thanks.  Could you please check the code comments in ipc/util.c for
accuracy and completeness sometime?



Done - see attachment.
Hope I didn't forget / add some wrong stuff ;-)
BTW this patch also fixes a missing lock in shm_get_stat().

Regards,
Nadia
This patch fixes the wrong / obsolete comments in the ipc code.
Also adds a missing lock around ipc_get_maxid() in shm_get_stat().

Signed-off-by: Nadia Derbey <[EMAIL PROTECTED]>

---

This patch applies on top of 2.6.23-rc6-mm1 + the previous IPC patches.




 ipc/msg.c  |   14 +-
 ipc/sem.c  |   14 ++
 ipc/shm.c  |   39 --
 ipc/util.c |   78 +++--
 ipc/util.h |   18 --
 5 files changed, 118 insertions(+), 45 deletions(-)

Index: linux-2.6.23-rc6-mm1/ipc/util.c
===
--- linux-2.6.23-rc6-mm1.orig/ipc/util.c	2007-09-21 08:00:41.0 +0200
+++ linux-2.6.23-rc6-mm1/ipc/util.c	2007-09-21 10:32:03.0 +0200
@@ -194,7 +194,7 @@ void __init ipc_init_proc_interface(cons
  *	Requires ipc_ids.mutex locked.
  *	Returns the LOCKED pointer to the ipc structure if found or NULL
  *	if not.
- *	If key is found ipc contains its ipc structure
+ *	If key is found ipc points to the owning ipc structure
  */
  
 static struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)
@@ -258,10 +258,10 @@ int ipc_get_maxid(struct ipc_ids *ids)
  *	@new: new IPC permission set
  *	@size: limit for the number of used ids
  *
- *	Add an entry 'new' to the IPC idr. The permissions object is
+ *	Add an entry 'new' to the IPC ids idr. The permissions object is
  *	initialised and the first free entry is set up and the id assigned
- *	is returned. The list is returned in a locked state on success.
- *	On failure the list is not locked and -1 is returned.
+ *	is returned. The 'new' entry is returned in a locked state on success.
+ *	On failure the entry is not locked and -1 is returned.
  *
  *	Called with ipc_ids.mutex held.
  */
@@ -270,10 +270,6 @@ int ipc_addid(struct ipc_ids* ids, struc
 {
 	int id, err;
 
-	/*
-	 * rcu_dereference()() is not needed here since
-	 * ipc_ids.mutex is held
-	 */
 	if (size > IPCMNI)
 		size = IPCMNI;
 
@@ -303,12 +299,12 @@ int ipc_addid(struct ipc_ids* ids, struc
 /**
  *	ipcget_new	-	create a new ipc object
  *	@ns: namespace
- *	@ids: identifer set
+ *	@ids: IPC identifer set
  *	@ops: the actual creation routine to call
  *	@params: its parameters
  *
- *	This routine is called sys_msgget, sys_semget() and sys_shmget() when
- *	the key is IPC_PRIVATE
+ *	This routine is called by sys_msgget, sys_semget() and sys_shmget()
+ *	when the key is IPC_PRIVATE.
  */
 int ipcget_new(struct ipc_namespace *ns, struct ipc_ids *ids,
 		struct ipc_ops *ops, struct ipc_params *params)
@@ -330,9 +326,16 @@ int ipcget_new(struct ipc_namespace *ns,
 /**
  *	ipc_check_perms	-	check security and permissions for an IPC
  *	@ipcp: ipc permission set
- *	@ids: identifer set
  *	@ops: the actual security routine to call
  *	@params: its parameters
+ *
+ *	This routine is called by sys_msgget(), sys_semget() and sys_shmget()
+ *  when the key is not IPC_PRIVATE and that key already exists in the
+ *  ids IDR.
+ *
+ *	On success, the IPC id is returned.
+ *
+ *	It is called with ipc_ids.mutex and ipcp->lock held.
  */
 static int ipc_check_perms(struct kern_ipc_perm *ipcp, struct ipc_ops *ops,
 			struct ipc_params *params)
@@ -353,12 +356,16 @@ static int ipc_check_perms(struct kern_i
 /**
  *	ipcget_public	-	get an ipc object or create a new one
  *	@ns: namespace
- *	@ids: identifer set
+ *	@ids: IPC identifer set
  *	@ops: the actual creation routine to call
  *	@params: its parameters
  *
- *	This routine is called sys_msgget, sys_semget() and sys_shmget() when
- *	the key is not IPC_PRIVATE
+ *	This routine is called by sys_msgget, sys_semget() and sys_shmget()
+ *	when the key is not IPC_PRIVATE.
+ *	It adds a new entry if the key is not found and does some permission
+ *  / security checkings if the key is found.
+ *
+ *	On success, the ipc id is returned.
  */
 int ipcget_public(struct ipc_namespace *ns, struct ipc_ids *ids,
 		struct ipc_ops *ops, struct ipc_params *params)
@@ -389,6 +396,10 @@ int ipcget_public(struct ipc_namespace *
 			if (ops->more_checks)
 err = ops->more_checks(ipcp, params);
 			if (!err)
+/*
+ * ipc_check_perms returns the IPC id on
+ * success
+ */
 err = ipc_check_perms(ipcp, ops, params);
 		}
 		ipc_unlock(ipcp);
@@ -401,12 +412,9 @@ int ipcget_public(struct ipc_namespace *
 
 /**
  *	ipc_rmid	-	remove an IPC identifier
- *	@ids: identifier set
- *	@id: ipc perm structure containing the identifier to remove
+ *	@ids: IPC identifier set
+ *	@ipcp:

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

On Friday, 21 September 2007 09:56, Thomas Gleixner wrote:
> On Thu, 2007-09-20 at 19:35 -0400, Len Brown wrote:
> > > > (Btw, the above commit message points to just my response with a 
> > > > testing 
> > > > patch to the real email: the actual explanation of the INSANE ordering 
> > > > is 
> > > > from Len Brown in
> > > > 
> > > > 
> > > > https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> > > > 
> > > > and there Len claims that we *must* wake up CPU's early).
> > > 
> > > ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
> > > turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
> > > 
> > > Howerver, it seems that bugzilla entry may just be bogus. It talks about 
> > > "it appears that some firmware in the future may depend on that sequence 
> > > for correction operation"
> > > 
> > > Len, Shaohua, what are the real issues here? 
> > 
> > Intel's reference BIOS for Core Duo performs some re-initialization
> > in _WAK that will get blow away if INIT follows _WAK.
> > IIR, it is related to re-initializing the thermal sensors.
> > I opened bug 5651 when the BIOS team informed me of this issue.
> > 
> > Yes, bringing a processor offline and then online again w/o
> > an intervening suspend or reset would not evaluate _WAK,
> > and thus may still run into the issue.
> 
> If this is true, then we should disable the sys//cpu/online entry
> right away.

Or drop the execution of _INI from the CPU hotplug, if possible ...

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

Thomas,

On Thursday, 20 September 2007 23:53, Thomas Gleixner wrote:
> Rafael,
> 
> On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
> > > We disable everything in device_suspend()
> > 
> > No, we don't.  sysdevs are _not_ suspended in device_suspend().
> > They are suspended in device_power_down(), which is called
> > _after_ disable_nonboot_cpus() (from swsusp_suspend()).
> > 
> > > including timekeeping,
> > 
> > No, the timekeeping is suspended in device_power_down() (or at least it 
> > should
> > be).
> 
> Damn, you are right. Reading through 30 different logs confused me.
> 
> > >   enable_nonboot_cpus();
> > 
> > Actually, we can't do this here, because of ACPI and some interrupt handling
> > related problems.  Unfortunately, platform_finish() needs to go _after_
> > enable_nonboot_cpus() and device_resume() needs to go after 
> > platform_finish().
> > Analogously, disable_nonboot_cpus() has to go after platform_prepare().
> >
> > Otherwise, some systems will break.
> 
> Well, I don't buy this one. The system would break in the same way, when
> I take CPU#1 offline before I initiate the suspend.
> 
> > > and non-surprisingly the "my VAIO needs help from keyboard" problem went
> > > away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
> > > work at all on my VAIO due to some yet not identified wreckage)
> > 
> > Hm, I really don't know why it helps, but that's not because of the 
> > timekeeping
> > suspend, IMO.
> 
> It is related. We rely on some subtle thing which is not up when we
> resume the non boot cpu.
> 
> > > I did not yet look into the suspend to ram code, but I guess that there
> > > is an equivalent problem.
> > 
> > Yes, the code ordering is the same, but it's not totally wrong, IMHO.
> > 
> > > But I have no idea why this affects Andrews jinxed VAIO (UP machine),
> > > though I suspect that we have more timekeeping/timer depending code
> > > somewhere waiting to bite us.
> > 
> > That's possible.
> > 
> > > Also I still need to debug why the HIBERNATION_TEST code path (which has
> > > a msleep(5000) in it) does not fail,
> > 
> > See above. :-)
> 
> Yes. It makes sense. When I change the TEST code path to:
> 
> - printk("swsusp debug: Waiting for 5 seconds.\n");
> - msleep(5000);
> + printk("swsusp debug: before swsusp_suspend\n");
> + error = swsusp_suspend();
> 
> then I have the same effect as I get from real hibernation. And we
> actually shut down time keeping somewhere in that code path.
> 
> ACPI: PCI interrupt for device :00:1b.0 disabled
> swsusp debug: before swsusp_suspend
> Suspend timekeeping
> swsusp: critical section: 
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> ACPI: PCI Interrupt :00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
> -> works fine
> 
> This is with my patch applied. Without that I get:
> 
> CPU1 is down
> swsusp debug: before swsusp_suspend
> Suspend timekeeping
> swsusp: critical section: 
> swsusp: Need to copy 112429 pages
> swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
> swsusp: critical section: done (112429 pages copied)
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Resume timekeeping
> Enabling non-boot CPUs
> --> Waits for ever until a key is pressed

Can you please run one more test?

Namely, without your debugging code in disk.c, please try

# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Jarek Poplawski

On Thu, Sep 20, 2007 at 03:08:42PM +0200, Nadia Derbey wrote:
> Nadia Derbey wrote:
> >Jarek Poplawski wrote:
> >
> >>On Thu, Sep 20, 2007 at 08:24:58AM +0200, Nadia Derbey wrote:
...
> >Actually, ipc_lock() is called most of the time without the 
> >ipc_ids.mutex held and without refcounting (maybe you didn't look for 
> >the msg_lock() sem_lock() and shm_lock() too).
> >So I think disabling preemption is needed, isn't it?
> >
> >>so, these rcu_read_locks() don't
> >>work here at all. So, probably I miss something again, but IMHO,
> >>these rcu_read_locks/unlocks could be removed here or in
> >>ipc_lock_by_ptr() and it should be enough to use them directly, where
> >>really needed, e.g., in msg.c do_msgrcv().
> >>
> >
> >I have to check for the ipc_lock_by_ptr(): may be you're right!
> >
> 
> So, here is the ipc_lock_by_ptr() status:
> 1) do_msgsnd(), semctl_main(GETALL), semctl_main(SETALL) and find_undo() 
> call it inside a refcounting.
>   ==> no rcu read section needed.
> 
> 2) *_exit_ns(), ipc_findkey() and sysvipc_find_ipc() call it under the 
> ipc_ids mutex lock.
>   ==> no rcu read section needed.
> 
> 3) do_msgrcv() is the only path where ipc_lock_by_ptr() is not called 
> under refcounting
>   ==> rcu read section + some more checks needed once the spnlock is
>   taken.
> 
> So I completely agree with you: we might remove the rcu_read_lock() from 
> the ipc_lock_by_ptr() and explicitley  call it when needed (actually, it 
> is already explicitly called in do_msgrcv()).

Yes, IMHO, it should be at least more readable when we can see where
this RCU is really needed.

But, after 3-rd look, I have a few more doubts (btw., 3 looks are
still not enough for me with this code, so I cerainly can miss many
things here, and, alas, I manged to see util and msg code only):

1. ipc_lock() and ipc_lock_check() are used without ipc_ids.mutex,
but it's probably wrong: they call idr_find() with ipc_ids pointer
which needs this mutex, just like in similar code in: ipc_findkey(),
ipc_get_maxid() or sysvipc_find_ipc().

2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
safe (memory barriers): it's not atomic, so locking is needed, but
e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
freeque() calls ipc_rcu_putref() with ipc_ids mutex only.

3. Probably similar problem is possible with msr_d.r_msg which is
read in do_msgrcv() under rcu_read_lock() only.

Regards,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

On Fri, 2007-09-21 at 14:51 +1000, Paul Mackerras wrote:
> Linus Torvalds writes:
> 
> > It would indeed be nice if we could just take CPU's down early (while 
> > everything is working), and run the whole suspend code with just one CPU, 
> > rather than having to worry about the ordering between CPU and device 
> > takedown.
> 
> That is certainly what we want to do on powerpc.

I would have expected that we do it exactly this way and it took me by
surprise, that we do not.

tglx


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

On Thu, 2007-09-20 at 19:35 -0400, Len Brown wrote:
> > > (Btw, the above commit message points to just my response with a testing 
> > > patch to the real email: the actual explanation of the INSANE ordering is 
> > > from Len Brown in
> > > 
> > >   
> > > https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
> > > 
> > > and there Len claims that we *must* wake up CPU's early).
> > 
> > ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
> > turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
> > 
> > Howerver, it seems that bugzilla entry may just be bogus. It talks about 
> > "it appears that some firmware in the future may depend on that sequence 
> > for correction operation"
> > 
> > Len, Shaohua, what are the real issues here? 
> 
> Intel's reference BIOS for Core Duo performs some re-initialization
> in _WAK that will get blow away if INIT follows _WAK.
> IIR, it is related to re-initializing the thermal sensors.
> I opened bug 5651 when the BIOS team informed me of this issue.
> 
> Yes, bringing a processor offline and then online again w/o
> an intervening suspend or reset would not evaluate _WAK,
> and thus may still run into the issue.

If this is true, then we should disable the sys//cpu/online entry
right away.

tglx


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: BUG kmalloc-16: Object padding overwritten (sysfs?)

2007-09-21 Thread Alexey Dobriyan

On Thu, Sep 20, 2007 at 10:36:13AM -0700, Christoph Lameter wrote:
> On Thu, 20 Sep 2007, Alexey Dobriyan wrote:
> > The winner is 
> > slub-avoid-touching-page-struct-when-freeing-to-per-cpu-slab.patch
> > Blind bisecting pointed to it and reverting the patch from full -mm makes
> > the problem go away
> 
> Hmmm.. This means likely that the c->node is used somewhere for 
> indexing Ahhh... If we count objects for sysfs output then c->node may 
> be used to index into the statistics array. The offset from the poison 
> also makes sense now since we increment values there.
> 
> Does this patch fix the issue?

Yes, it does.

> SLUB: Fix slab object counting.
> 
> We can only use the node value of the per cpu structure for counting if it 
> is positive. A negative value indicates that the slab is not valid.
> 
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> 
> ---
>  mm/slub.c |8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6.23-rc6-mm1/mm/slub.c
> ===
> --- linux-2.6.23-rc6-mm1.orig/mm/slub.c   2007-09-20 10:31:04.0 
> -0700
> +++ linux-2.6.23-rc6-mm1/mm/slub.c2007-09-20 10:32:19.0 -0700
> @@ -3412,12 +3412,16 @@ static unsigned long slab_objects(struct
>  
>   for_each_possible_cpu(cpu) {
>   struct page *page;
> + int node;
>   struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
>  
>   if (!c)
>   continue;
>  
>   page = c->page;
> + node = c->node;
> + if (node < 0)
> + continue;
>   if (page) {
>   if (flags & SO_CPU) {
>   int x = 0;
> @@ -3427,9 +3431,9 @@ static unsigned long slab_objects(struct
>   else
>   x = 1;
>   total += x;
> - nodes[c->node] += x;
> + nodes[node] += x;
>   }
> - per_cpu[c->node]++;
> + per_cpu[node]++;
>   }
>   }
>  

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: BUG kmalloc-16: Object padding overwritten (sysfs?)

2007-09-21 Thread Alexey Dobriyan

On Thu, Sep 20, 2007 at 10:36:13AM -0700, Christoph Lameter wrote:
 On Thu, 20 Sep 2007, Alexey Dobriyan wrote:
  The winner is 
  slub-avoid-touching-page-struct-when-freeing-to-per-cpu-slab.patch
  Blind bisecting pointed to it and reverting the patch from full -mm makes
  the problem go away
 
 Hmmm.. This means likely that the c-node is used somewhere for 
 indexing Ahhh... If we count objects for sysfs output then c-node may 
 be used to index into the statistics array. The offset from the poison 
 also makes sense now since we increment values there.
 
 Does this patch fix the issue?

Yes, it does.

 SLUB: Fix slab object counting.
 
 We can only use the node value of the per cpu structure for counting if it 
 is positive. A negative value indicates that the slab is not valid.
 
 Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
 
 ---
  mm/slub.c |8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)
 
 Index: linux-2.6.23-rc6-mm1/mm/slub.c
 ===
 --- linux-2.6.23-rc6-mm1.orig/mm/slub.c   2007-09-20 10:31:04.0 
 -0700
 +++ linux-2.6.23-rc6-mm1/mm/slub.c2007-09-20 10:32:19.0 -0700
 @@ -3412,12 +3412,16 @@ static unsigned long slab_objects(struct
  
   for_each_possible_cpu(cpu) {
   struct page *page;
 + int node;
   struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
  
   if (!c)
   continue;
  
   page = c-page;
 + node = c-node;
 + if (node  0)
 + continue;
   if (page) {
   if (flags  SO_CPU) {
   int x = 0;
 @@ -3427,9 +3431,9 @@ static unsigned long slab_objects(struct
   else
   x = 1;
   total += x;
 - nodes[c-node] += x;
 + nodes[node] += x;
   }
 - per_cpu[c-node]++;
 + per_cpu[node]++;
   }
   }
  

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

On Thu, 2007-09-20 at 19:35 -0400, Len Brown wrote:
   (Btw, the above commit message points to just my response with a testing 
   patch to the real email: the actual explanation of the INSANE ordering is 
   from Len Brown in
   
 
   https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html
   
   and there Len claims that we *must* wake up CPU's early).
  
  ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
  turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
  
  Howerver, it seems that bugzilla entry may just be bogus. It talks about 
  it appears that some firmware in the future may depend on that sequence 
  for correction operation
  
  Len, Shaohua, what are the real issues here? 
 
 Intel's reference BIOS for Core Duo performs some re-initialization
 in _WAK that will get blow away if INIT follows _WAK.
 IIR, it is related to re-initializing the thermal sensors.
 I opened bug 5651 when the BIOS team informed me of this issue.
 
 Yes, bringing a processor offline and then online again w/o
 an intervening suspend or reset would not evaluate _WAK,
 and thus may still run into the issue.

If this is true, then we should disable the sys//cpu/online entry
right away.

tglx


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

On Fri, 2007-09-21 at 14:51 +1000, Paul Mackerras wrote:
 Linus Torvalds writes:
 
  It would indeed be nice if we could just take CPU's down early (while 
  everything is working), and run the whole suspend code with just one CPU, 
  rather than having to worry about the ordering between CPU and device 
  takedown.
 
 That is certainly what we want to do on powerpc.

I would have expected that we do it exactly this way and it took me by
surprise, that we do not.

tglx


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Jarek Poplawski

On Thu, Sep 20, 2007 at 03:08:42PM +0200, Nadia Derbey wrote:
 Nadia Derbey wrote:
 Jarek Poplawski wrote:
 
 On Thu, Sep 20, 2007 at 08:24:58AM +0200, Nadia Derbey wrote:
...
 Actually, ipc_lock() is called most of the time without the 
 ipc_ids.mutex held and without refcounting (maybe you didn't look for 
 the msg_lock() sem_lock() and shm_lock() too).
 So I think disabling preemption is needed, isn't it?
 
 so, these rcu_read_locks() don't
 work here at all. So, probably I miss something again, but IMHO,
 these rcu_read_locks/unlocks could be removed here or in
 ipc_lock_by_ptr() and it should be enough to use them directly, where
 really needed, e.g., in msg.c do_msgrcv().
 
 
 I have to check for the ipc_lock_by_ptr(): may be you're right!
 
 
 So, here is the ipc_lock_by_ptr() status:
 1) do_msgsnd(), semctl_main(GETALL), semctl_main(SETALL) and find_undo() 
 call it inside a refcounting.
   == no rcu read section needed.
 
 2) *_exit_ns(), ipc_findkey() and sysvipc_find_ipc() call it under the 
 ipc_ids mutex lock.
   == no rcu read section needed.
 
 3) do_msgrcv() is the only path where ipc_lock_by_ptr() is not called 
 under refcounting
   == rcu read section + some more checks needed once the spnlock is
   taken.
 
 So I completely agree with you: we might remove the rcu_read_lock() from 
 the ipc_lock_by_ptr() and explicitley  call it when needed (actually, it 
 is already explicitly called in do_msgrcv()).

Yes, IMHO, it should be at least more readable when we can see where
this RCU is really needed.

But, after 3-rd look, I have a few more doubts (btw., 3 looks are
still not enough for me with this code, so I cerainly can miss many
things here, and, alas, I manged to see util and msg code only):

1. ipc_lock() and ipc_lock_check() are used without ipc_ids.mutex,
but it's probably wrong: they call idr_find() with ipc_ids pointer
which needs this mutex, just like in similar code in: ipc_findkey(),
ipc_get_maxid() or sysvipc_find_ipc().

2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
safe (memory barriers): it's not atomic, so locking is needed, but
e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
freeque() calls ipc_rcu_putref() with ipc_ids mutex only.

3. Probably similar problem is possible with msr_d.r_msg which is
read in do_msgrcv() under rcu_read_lock() only.

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

Thomas,

On Thursday, 20 September 2007 23:53, Thomas Gleixner wrote:
 Rafael,
 
 On Thu, 2007-09-20 at 23:45 +0200, Rafael J. Wysocki wrote:
   We disable everything in device_suspend()
  
  No, we don't.  sysdevs are _not_ suspended in device_suspend().
  They are suspended in device_power_down(), which is called
  _after_ disable_nonboot_cpus() (from swsusp_suspend()).
  
   including timekeeping,
  
  No, the timekeeping is suspended in device_power_down() (or at least it 
  should
  be).
 
 Damn, you are right. Reading through 30 different logs confused me.
 
 enable_nonboot_cpus();
  
  Actually, we can't do this here, because of ACPI and some interrupt handling
  related problems.  Unfortunately, platform_finish() needs to go _after_
  enable_nonboot_cpus() and device_resume() needs to go after 
  platform_finish().
  Analogously, disable_nonboot_cpus() has to go after platform_prepare().
 
  Otherwise, some systems will break.
 
 Well, I don't buy this one. The system would break in the same way, when
 I take CPU#1 offline before I initiate the suspend.
 
   and non-surprisingly the my VAIO needs help from keyboard problem went
   away immediately. See patch below. (on top of rc7-hrt1, -mm1 does not
   work at all on my VAIO due to some yet not identified wreckage)
  
  Hm, I really don't know why it helps, but that's not because of the 
  timekeeping
  suspend, IMO.
 
 It is related. We rely on some subtle thing which is not up when we
 resume the non boot cpu.
 
   I did not yet look into the suspend to ram code, but I guess that there
   is an equivalent problem.
  
  Yes, the code ordering is the same, but it's not totally wrong, IMHO.
  
   But I have no idea why this affects Andrews jinxed VAIO (UP machine),
   though I suspect that we have more timekeeping/timer depending code
   somewhere waiting to bite us.
  
  That's possible.
  
   Also I still need to debug why the HIBERNATION_TEST code path (which has
   a msleep(5000) in it) does not fail,
  
  See above. :-)
 
 Yes. It makes sense. When I change the TEST code path to:
 
 - printk(swsusp debug: Waiting for 5 seconds.\n);
 - msleep(5000);
 + printk(swsusp debug: before swsusp_suspend\n);
 + error = swsusp_suspend();
 
 then I have the same effect as I get from real hibernation. And we
 actually shut down time keeping somewhere in that code path.
 
 ACPI: PCI interrupt for device :00:1b.0 disabled
 swsusp debug: before swsusp_suspend
 Suspend timekeeping
 swsusp: critical section: 
 swsusp: Need to copy 112429 pages
 swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
 swsusp: critical section: done (112429 pages copied)
 Intel machine check architecture supported.
 Intel machine check reporting enabled on CPU#0.
 Resume timekeeping
 ACPI: PCI Interrupt :00:02.0[A] - GSI 16 (level, low) - IRQ 16
 - works fine
 
 This is with my patch applied. Without that I get:
 
 CPU1 is down
 swsusp debug: before swsusp_suspend
 Suspend timekeeping
 swsusp: critical section: 
 swsusp: Need to copy 112429 pages
 swsusp: Normal pages needed: 35399 + 1024 + 40, available pages: 193876
 swsusp: critical section: done (112429 pages copied)
 Intel machine check architecture supported.
 Intel machine check reporting enabled on CPU#0.
 Resume timekeeping
 Enabling non-boot CPUs
 -- Waits for ever until a key is pressed

Can you please run one more test?

Namely, without your debugging code in disk.c, please try

# echo shutdown  /sys/power/disk
# echo disk  /sys/power/state

Greetings,
Rafael
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

On Friday, 21 September 2007 09:56, Thomas Gleixner wrote:
 On Thu, 2007-09-20 at 19:35 -0400, Len Brown wrote:
(Btw, the above commit message points to just my response with a 
testing 
patch to the real email: the actual explanation of the INSANE ordering 
is 
from Len Brown in


https://lists.linux-foundation.org/pipermail/linux-pm/2006-November/004161.html

and there Len claims that we *must* wake up CPU's early).
   
   ..and points to commit 1a38416cea8ac801ae8f261074721f35317613dc which in 
   turn talks about http://bugzilla.kernel.org/show_bug.cgi?id=5651 
   
   Howerver, it seems that bugzilla entry may just be bogus. It talks about 
   it appears that some firmware in the future may depend on that sequence 
   for correction operation
   
   Len, Shaohua, what are the real issues here? 
  
  Intel's reference BIOS for Core Duo performs some re-initialization
  in _WAK that will get blow away if INIT follows _WAK.
  IIR, it is related to re-initializing the thermal sensors.
  I opened bug 5651 when the BIOS team informed me of this issue.
  
  Yes, bringing a processor offline and then online again w/o
  an intervening suspend or reset would not evaluate _WAK,
  and thus may still run into the issue.
 
 If this is true, then we should disable the sys//cpu/online entry
 right away.

Or drop the execution of _INI from the CPU hotplug, if possible ...

Greetings,
Rafael
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Nadia Derbey


Andrew Morton wrote:

On Tue, 18 Sep 2007 16:55:19 +0200 Nadia Derbey [EMAIL PROTECTED] wrote:



This patch fixes the missing rcu_read_(un)lock in the ipc code



Thanks.  Could you please check the code comments in ipc/util.c for
accuracy and completeness sometime?



Done - see attachment.
Hope I didn't forget / add some wrong stuff ;-)
BTW this patch also fixes a missing lock in shm_get_stat().

Regards,
Nadia
This patch fixes the wrong / obsolete comments in the ipc code.
Also adds a missing lock around ipc_get_maxid() in shm_get_stat().

Signed-off-by: Nadia Derbey [EMAIL PROTECTED]

---

This patch applies on top of 2.6.23-rc6-mm1 + the previous IPC patches.




 ipc/msg.c  |   14 +-
 ipc/sem.c  |   14 ++
 ipc/shm.c  |   39 --
 ipc/util.c |   78 +++--
 ipc/util.h |   18 --
 5 files changed, 118 insertions(+), 45 deletions(-)

Index: linux-2.6.23-rc6-mm1/ipc/util.c
===
--- linux-2.6.23-rc6-mm1.orig/ipc/util.c	2007-09-21 08:00:41.0 +0200
+++ linux-2.6.23-rc6-mm1/ipc/util.c	2007-09-21 10:32:03.0 +0200
@@ -194,7 +194,7 @@ void __init ipc_init_proc_interface(cons
  *	Requires ipc_ids.mutex locked.
  *	Returns the LOCKED pointer to the ipc structure if found or NULL
  *	if not.
- *	If key is found ipc contains its ipc structure
+ *	If key is found ipc points to the owning ipc structure
  */
  
 static struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)
@@ -258,10 +258,10 @@ int ipc_get_maxid(struct ipc_ids *ids)
  *	@new: new IPC permission set
  *	@size: limit for the number of used ids
  *
- *	Add an entry 'new' to the IPC idr. The permissions object is
+ *	Add an entry 'new' to the IPC ids idr. The permissions object is
  *	initialised and the first free entry is set up and the id assigned
- *	is returned. The list is returned in a locked state on success.
- *	On failure the list is not locked and -1 is returned.
+ *	is returned. The 'new' entry is returned in a locked state on success.
+ *	On failure the entry is not locked and -1 is returned.
  *
  *	Called with ipc_ids.mutex held.
  */
@@ -270,10 +270,6 @@ int ipc_addid(struct ipc_ids* ids, struc
 {
 	int id, err;
 
-	/*
-	 * rcu_dereference()() is not needed here since
-	 * ipc_ids.mutex is held
-	 */
 	if (size  IPCMNI)
 		size = IPCMNI;
 
@@ -303,12 +299,12 @@ int ipc_addid(struct ipc_ids* ids, struc
 /**
  *	ipcget_new	-	create a new ipc object
  *	@ns: namespace
- *	@ids: identifer set
+ *	@ids: IPC identifer set
  *	@ops: the actual creation routine to call
  *	@params: its parameters
  *
- *	This routine is called sys_msgget, sys_semget() and sys_shmget() when
- *	the key is IPC_PRIVATE
+ *	This routine is called by sys_msgget, sys_semget() and sys_shmget()
+ *	when the key is IPC_PRIVATE.
  */
 int ipcget_new(struct ipc_namespace *ns, struct ipc_ids *ids,
 		struct ipc_ops *ops, struct ipc_params *params)
@@ -330,9 +326,16 @@ int ipcget_new(struct ipc_namespace *ns,
 /**
  *	ipc_check_perms	-	check security and permissions for an IPC
  *	@ipcp: ipc permission set
- *	@ids: identifer set
  *	@ops: the actual security routine to call
  *	@params: its parameters
+ *
+ *	This routine is called by sys_msgget(), sys_semget() and sys_shmget()
+ *  when the key is not IPC_PRIVATE and that key already exists in the
+ *  ids IDR.
+ *
+ *	On success, the IPC id is returned.
+ *
+ *	It is called with ipc_ids.mutex and ipcp-lock held.
  */
 static int ipc_check_perms(struct kern_ipc_perm *ipcp, struct ipc_ops *ops,
 			struct ipc_params *params)
@@ -353,12 +356,16 @@ static int ipc_check_perms(struct kern_i
 /**
  *	ipcget_public	-	get an ipc object or create a new one
  *	@ns: namespace
- *	@ids: identifer set
+ *	@ids: IPC identifer set
  *	@ops: the actual creation routine to call
  *	@params: its parameters
  *
- *	This routine is called sys_msgget, sys_semget() and sys_shmget() when
- *	the key is not IPC_PRIVATE
+ *	This routine is called by sys_msgget, sys_semget() and sys_shmget()
+ *	when the key is not IPC_PRIVATE.
+ *	It adds a new entry if the key is not found and does some permission
+ *  / security checkings if the key is found.
+ *
+ *	On success, the ipc id is returned.
  */
 int ipcget_public(struct ipc_namespace *ns, struct ipc_ids *ids,
 		struct ipc_ops *ops, struct ipc_params *params)
@@ -389,6 +396,10 @@ int ipcget_public(struct ipc_namespace *
 			if (ops-more_checks)
 err = ops-more_checks(ipcp, params);
 			if (!err)
+/*
+ * ipc_check_perms returns the IPC id on
+ * success
+ */
 err = ipc_check_perms(ipcp, ops, params);
 		}
 		ipc_unlock(ipcp);
@@ -401,12 +412,9 @@ int ipcget_public(struct ipc_namespace *
 
 /**
  *	ipc_rmid	-	remove an IPC identifier
- *	@ids: identifier set
- *	@id: ipc perm structure containing the identifier to remove
+ *	@ids: IPC identifier set
+ *	@ipcp: ipc perm

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Nadia Derbey


Jarek Poplawski wrote:

On Thu, Sep 20, 2007 at 03:08:42PM +0200, Nadia Derbey wrote:


Nadia Derbey wrote:


Jarek Poplawski wrote:



On Thu, Sep 20, 2007 at 08:24:58AM +0200, Nadia Derbey wrote:


...

Actually, ipc_lock() is called most of the time without the 
ipc_ids.mutex held and without refcounting (maybe you didn't look for 
the msg_lock() sem_lock() and shm_lock() too).

So I think disabling preemption is needed, isn't it?



so, these rcu_read_locks() don't
work here at all. So, probably I miss something again, but IMHO,
these rcu_read_locks/unlocks could be removed here or in
ipc_lock_by_ptr() and it should be enough to use them directly, where
really needed, e.g., in msg.c do_msgrcv().



I have to check for the ipc_lock_by_ptr(): may be you're right!



So, here is the ipc_lock_by_ptr() status:
1) do_msgsnd(), semctl_main(GETALL), semctl_main(SETALL) and find_undo() 
call it inside a refcounting.

 == no rcu read section needed.

2) *_exit_ns(), ipc_findkey() and sysvipc_find_ipc() call it under the 
ipc_ids mutex lock.

 == no rcu read section needed.

3) do_msgrcv() is the only path where ipc_lock_by_ptr() is not called 
under refcounting

 == rcu read section + some more checks needed once the spnlock is
 taken.

So I completely agree with you: we might remove the rcu_read_lock() from 
the ipc_lock_by_ptr() and explicitley  call it when needed (actually, it 
is already explicitly called in do_msgrcv()).



Yes, IMHO, it should be at least more readable when we can see where
this RCU is really needed.

But, after 3-rd look, I have a few more doubts (btw., 3 looks are
still not enough for me with this code, so I cerainly can miss many
things here, and, alas, I manged to see util and msg code only):

1. ipc_lock() and ipc_lock_check() are used without ipc_ids.mutex,
but it's probably wrong: they call idr_find() with ipc_ids pointer
which needs this mutex, just like in similar code in: ipc_findkey(),
ipc_get_maxid() or sysvipc_find_ipc().

2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
safe (memory barriers): it's not atomic, so locking is needed, but
e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
freeque() calls ipc_rcu_putref() with ipc_ids mutex only.


OK, but freeque() freeary() and shm_destroy() are special cases:
we have the following path:

mutex_lock(ipc_ids.mutex)
...
ipc_lock(ipcp)
... do whatever cleaning is needed ...
ipc_rmid(ipcp)
ipc_unlock(ipcp)

ipc_rcu_putref(ipcp)

Once the rmid has been done the ipc structure is considered as not 
visible anymore from the user side == any syscall called with the 
corresponding id will return invalid.
The only thing that could happen is that this structure be reused for a 
newly allocated ipc structure. But this too cannot happen since we are 
under the ipc_ids mutex lock.


Am I wrong?

Answers to the other questions in separate e-mails

Regards,
Nadia

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Jarek Poplawski

On Fri, Sep 21, 2007 at 12:11:15PM +0200, Nadia Derbey wrote:
 Jarek Poplawski wrote:
...
 2. I'm not sure this refcounting with ipc_rcu_getref/putref is SMP
 safe (memory barriers): it's not atomic, so locking is needed, but
 e.g. in do_msgsnd() kern_ipc_perm lock is used for this, while
 freeque() calls ipc_rcu_putref() with ipc_ids mutex only.
 
 OK, but freeque() freeary() and shm_destroy() are special cases:
 we have the following path:
 
 mutex_lock(ipc_ids.mutex)
 ...
 ipc_lock(ipcp)
 ... do whatever cleaning is needed ...
 ipc_rmid(ipcp)
 ipc_unlock(ipcp)
 
 ipc_rcu_putref(ipcp)
 
 Once the rmid has been done the ipc structure is considered as not 
 visible anymore from the user side == any syscall called with the 
 corresponding id will return invalid.
 The only thing that could happen is that this structure be reused for a 
 newly allocated ipc structure. But this too cannot happen since we are 
 under the ipc_ids mutex lock.
 
 Am I wrong?

I hope not! But, then it would be probably another logical trick:
ipc_rcu_getref/putref() seems to prevent kfreeing of a structure, so
if it's used in do_msgsnd() there should be a risk something can do
this kfree at this moment, and it seems freeque() is the only one,
which both: can do this and cares for this refcount. Then, e.g., if
any of them does ipc_rcu_getref() a bit later and sees old (cached)
value - kfree can be skipped forever. On the other hand, if there is
no such possibility because of other reasons, this all refcounting
looks like a code beautifier only...

Thanks,
Jarek P.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: IPC: sleeping function called ...

2007-09-21 Thread Jarek Poplawski

On Fri, Sep 21, 2007 at 01:03:47PM +0200, Jarek Poplawski wrote:
...
 any of them does ipc_rcu_getref() a bit later and sees old (cached)

Should be:
any of them does ipc_rcu_putref() a bit later and sees old (cached)

Sorry,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

Rafael,

On Fri, 2007-09-21 at 00:30 +0200, Rafael J. Wysocki wrote:
  -ETOOTIRED led me too a wrong conclusion, but still it is a valuable
  hint that this change is making things work again.
 
 Yes, it is.
 
  I need to go down into the details of the swsusp_suspend() code path to
  figure out, what's the root cause. 
 
 If you need any help from me with that, please let me know.

I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
After debugging the swsusp_suspend() code path I figured out, that we
end up in C2 or deeper power states while we run the suspend code. The
same happens when we come back on resume. It looks like we disable stuff
in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
I hacked the idle loop arch code to use halt() right before we call
device_suspend() and switch back to the acpi idle code right after
device_resume(). This solves the problem as well.

Len, any opinion on this one ?

tglx


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

Thomas,

On Friday, 21 September 2007 14:59, Thomas Gleixner wrote:
 Rafael,
 
 On Fri, 2007-09-21 at 00:30 +0200, Rafael J. Wysocki wrote:
   -ETOOTIRED led me too a wrong conclusion, but still it is a valuable
   hint that this change is making things work again.
  
  Yes, it is.
  
   I need to go down into the details of the swsusp_suspend() code path to
   figure out, what's the root cause. 
  
  If you need any help from me with that, please let me know.
 
 I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
 After debugging the swsusp_suspend() code path I figured out, that we
 end up in C2 or deeper power states while we run the suspend code. The
 same happens when we come back on resume. It looks like we disable stuff
 in the ACPI BIOS, which makes the C2 and deeper power states misbehave.

Hm, can you please run the test I've suggested in another branch of the
thread, ie.

# echo shutdown  /sys/power/disk
# echo disk  /sys/power/state

without your debugging code in disk.c?

This makes the hibernation code omit the major ACPI hooks, so if it works,
we'll know that these hooks are responsible for the problem.

 I hacked the idle loop arch code to use halt() right before we call
 device_suspend() and switch back to the acpi idle code right after
 device_resume(). This solves the problem as well.

Well, that seems less intrusive than changing the code ordering right before
the major kernel release, but I think we should do our best to understand what
_exactly_ is happening here.

Greetings,
Rafael
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 sparc build error

2007-09-21 Thread Mathieu Desnoyers

* Guennadi Liakhovetski ([EMAIL PROTECTED]) wrote:
 Provide {enable,disable}_irq_wakeup dummies for undefined 
 CONFIG_GENERIC_HARDIRQS case. Completely untested, as I don't even have 
 cross-compilers for platforms without CONFIG_GENERIC_HARDIRQS.
 
 Signed-off-by: Guennadi Liakhovetski [EMAIL PROTECTED]
 

It builds fine now.

Tested-by: Mathieu Desnoyers [EMAIL PROTECTED]

 ---
 
 On Tue, 18 Sep 2007, Andrew Morton wrote:
 
  On Tue, 18 Sep 2007 16:54:03 -0400
  Mathieu Desnoyers [EMAIL PROTECTED] wrote:
  
   I got the following error when building 2.6.23-rc6-mm1 on sparc:
   
   
 
   /opt/crosstool/gcc-4.1.1-glibc-2.3.6/sparc-unknown-linux-gnu/bin/sparc-unknown-linux-gnu-gcc
-Wp,-MD,drivers/serial/.serial_core.o.d  -nostdinc -isystem 
   /opt/crosstool/gcc-4.1.1-glibc-2.3.6/sparc-unknown-linux-gnu/lib/gcc/sparc-unknown-linux-gnu/4.1.1/include
-D__KERNEL__ -Iinclude -Iinclude2 
   -I/home/compudj/git/linux-2.6-lttng/include -include 
   include/linux/autoconf.h  
   -I/home/compudj/git/linux-2.6-lttng/drivers/serial -Idrivers/serial -Wall 
   -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing 
   -fno-common -Werror-implicit-function-declaration -Os -m32 -pipe -mno-fpu 
   -fcall-used-g5 -fcall-used-g7 -fomit-frame-pointer -fno-stack-protector 
   -Wdeclaration-after-statement -Wno-pointer-sign  -DKBUILD_STR(s)=#s 
   -DKBUILD_BASENAME=KBUILD_STR(serial_core)  
   -DKBUILD_MODNAME=KBUILD_STR(serial_core) -c -o 
   drivers/serial/.tmp_serial_core.o 
   /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c
   /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c: In 
   function 'uart_suspend_port':
   /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c:1980: 
   error: implicit declaration of function 'enable_irq_wake'
   /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c: In 
   function 'uart_resume_port':
   /home/compudj/git/linux-2.6-lttng/drivers/serial/serial_core.c:2035: 
   error: implicit declaration of function 'disable_irq_wake'
  
  hm, I wonder why I didn't hit that.
  
  enable_irq_wake() was added by wake-up-from-a-serial-port.patch
  
  I note that git-input adds a call too, and might have a problem
  with !CONFIG_GENERIC_HARDIRQS.
  
  Not sure what the best fix is here.  We could sprinkle ifdefs all
  over the code, or just add the suitable empty stubs for enable_irq_wake(),
  etc.
 
 diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
 index 5323f62..ecade41 100644
 --- a/include/linux/interrupt.h
 +++ b/include/linux/interrupt.h
 @@ -205,6 +205,9 @@ static inline int disable_irq_wake(unsigned int irq)
   enable_irq(irq)
  # endif
  
 +#define enable_irq_wake(irq) ({ (void)(irq); 0; })
 +#define disable_irq_wake(irq) ({ (void)(irq); 0; })
 +
  #endif /* CONFIG_GENERIC_HARDIRQS */
  
  #ifndef __ARCH_SET_SOFTIRQ_PENDING

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

Rafael,

On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
   If you need any help from me with that, please let me know.
  
  I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
  After debugging the swsusp_suspend() code path I figured out, that we
  end up in C2 or deeper power states while we run the suspend code. The
  same happens when we come back on resume. It looks like we disable stuff
  in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
 
 Hm, can you please run the test I've suggested in another branch of the
 thread, ie.
 
 # echo shutdown  /sys/power/disk
 # echo disk  /sys/power/state
 
 without your debugging code in disk.c?
 
 This makes the hibernation code omit the major ACPI hooks, so if it works,
 we'll know that these hooks are responsible for the problem.

Yes, this works fine. We still go into C3, but this seems not longer to
brick the box.

  I hacked the idle loop arch code to use halt() right before we call
  device_suspend() and switch back to the acpi idle code right after
  device_resume(). This solves the problem as well.
 
 Well, that seems less intrusive than changing the code ordering right before
 the major kernel release, but I think we should do our best to understand what
 _exactly_ is happening here.

I found some other subtle thinko in the clock events code while I was
heading down the swsusp_suspend code path. I wait for confirmation that
it does not brick some endangered boxen, though. Still with this change
in the clock events code, my VAIO goes into C2 or C3 and causes the box
to wait for a helping keystroke.

The correct solution would be, that the ACPI code ignores the lower
C-states during suspend / resume. I simply rmmod'ed the processor module
before suspend and the problem is solved as well. The cpuidle patches
make this problem more prominent due to the possible more direct switch
into lower power states, when we wait for a long time on something.

I think we really should not fiddle with the various cpu states during
the critical parts of suspend / resume. Let's keep it simple. We have
the same policy during boot and I think the suspend / resume critical
parts have similar constraints.

tglx






-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
 Rafael,
 
 On Fri, 2007-09-21 at 16:20 +0200, Rafael J. Wysocki wrote:
If you need any help from me with that, please let me know.
   
   I'm zooming in. It seems, that the ACPI idle code plays tricks with us.
   After debugging the swsusp_suspend() code path I figured out, that we
   end up in C2 or deeper power states while we run the suspend code. The
   same happens when we come back on resume. It looks like we disable stuff
   in the ACPI BIOS, which makes the C2 and deeper power states misbehave.
  
  Hm, can you please run the test I've suggested in another branch of the
  thread, ie.
  
  # echo shutdown  /sys/power/disk
  # echo disk  /sys/power/state
  
  without your debugging code in disk.c?
  
  This makes the hibernation code omit the major ACPI hooks, so if it works,
  we'll know that these hooks are responsible for the problem.
 
 Yes, this works fine. We still go into C3, but this seems not longer to
 brick the box.
 
   I hacked the idle loop arch code to use halt() right before we call
   device_suspend() and switch back to the acpi idle code right after
   device_resume(). This solves the problem as well.
  
  Well, that seems less intrusive than changing the code ordering right before
  the major kernel release, but I think we should do our best to understand 
  what
  _exactly_ is happening here.
 
 I found some other subtle thinko in the clock events code while I was
 heading down the swsusp_suspend code path. I wait for confirmation that
 it does not brick some endangered boxen, though. Still with this change
 in the clock events code, my VAIO goes into C2 or C3 and causes the box
 to wait for a helping keystroke.
 
 The correct solution would be, that the ACPI code ignores the lower
 C-states during suspend / resume.

Yes, certainly.

 I simply rmmod'ed the processor module before suspend and the problem is
 solved as well. The cpuidle patches make this problem more prominent due
 to the possible more direct switch into lower power states, when we wait for
 a long time on something. 

So, perhaps we can add a .suspend()/.resume() routines to the processor driver
and use them to disable/enable the cpuidle functionality during a
suspend/resume?

 I think we really should not fiddle with the various cpu states during
 the critical parts of suspend / resume. Let's keep it simple. We have
 the same policy during boot and I think the suspend / resume critical
 parts have similar constraints.

I completely agree.

Greetings,
Rafael
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Rafael J. Wysocki

Thomas,

On Friday, 21 September 2007 21:16, Thomas Gleixner wrote:
 Rafael,
 
 On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
  On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
   I simply rmmod'ed the processor module before suspend and the problem is
   solved as well. The cpuidle patches make this problem more prominent due
   to the possible more direct switch into lower power states, when we wait 
   for
   a long time on something. 
  
  So, perhaps we can add a .suspend()/.resume() routines to the processor 
  driver
  and use them to disable/enable the cpuidle functionality during a
  suspend/resume?
 
 http://tglx.de/private/tglx/p.diff
 
 untested yet, but I'm on the way to do that :)

Heh, I thought of the same thing. :-)

Greetings,
Rafael
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-21 Thread Thomas Gleixner

Rafael,

On Fri, 2007-09-21 at 21:20 +0200, Rafael J. Wysocki wrote:
 On Friday, 21 September 2007 18:27, Thomas Gleixner wrote:
  I simply rmmod'ed the processor module before suspend and the problem is
  solved as well. The cpuidle patches make this problem more prominent due
  to the possible more direct switch into lower power states, when we wait for
  a long time on something. 
 
 So, perhaps we can add a .suspend()/.resume() routines to the processor driver
 and use them to disable/enable the cpuidle functionality during a
 suspend/resume?

http://tglx.de/private/tglx/p.diff

untested yet, but I'm on the way to do that :)

tglx


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1 -- mkfs stuck in 'D'

2007-09-21 Thread Fengguang Wu

On Thu, Sep 20, 2007 at 12:31:39PM +0100, Hugh Dickins wrote:
 On Wed, 19 Sep 2007, Peter Zijlstra wrote:
  On Wed, 19 Sep 2007 21:03:19 +0100 (BST) Hugh Dickins
  [EMAIL PROTECTED] wrote:
  
   On Wed, 19 Sep 2007, Andy Whitcroft wrote:
Seems I have a case of a largish i386 NUMA (NUMA-Q) which has a mkfs
stuck in a 'D' wait:

 ===
mkfs.ext2 D c10220f4 0  6233   6222
 [c12194da] io_schedule_timeout+0x1e/0x28
 [c10454b4] congestion_wait+0x62/0x7a
 [c10402af] get_dirty_limits+0x16a/0x172
 [c104040b] balance_dirty_pages+0x154/0x1be
 [c103bda3] generic_perform_write+0x168/0x18a
 [c103be38] generic_file_buffered_write+0x73/0x107
 [c103c346] __generic_file_aio_write_nolock+0x47a/0x4a5
 [c103c3b9] generic_file_aio_write_nolock+0x48/0x9b
 [c105d2d6] do_sync_write+0xbf/0xfc
 [c105d3a0] vfs_write+0x8d/0x108
 [c105d4c3] sys_write+0x41/0x67
 [c100260a] syscall_call+0x7/0xb
 ===
   
   [edited out some bogus lines from stale stack]
   
This machine and others have run numerous test runs on this kernel and
this is the first time I've see a hang like this.
   
   I've been seeing something like that on 4-way PPC64: in my case I've
   shells hanging in D state trying to append to kernel build log on ext3
   (the builds themselves going on elsewhere, in tmpfs): one of the shells
   holding i_mutex and stuck doing congestion_waits from balance_dirty_pages.
   
I wonder if this is the ultimate cause of the couple of mainline hangs
which were seen, but not diagnosed.
   
   My *guess* is that this is peculiar to 2.6.23-rc6-mm1, and from Peter's
   mm-per-device-dirty-threshold.patch.  printks showed bdi_nr_reclaimable
   0, bdi_nr_writeback 24, bdi_thresh 1 in balance_dirty_pages (though I've
   not done enough to check if those really correlate with the hangs),
   and I'm wondering if the bdi_stat_sum business is needed on the
   !nr_reclaimable path.
  
  FWIW my tired brain seems to think it the !nr_reclaimable path needs it
  just the same. So this change seems to make sense for now :-)
 
 Thanks.
 
   So I'm running now with the patch below, good so far, but can't judge
   until tomorrow whether it has actually addressed the problem seen.
 
 Last night's run went well: that patch does indeed seem to have fixed it.
 Looking at the timings (some variance but _very_ much less than the night
 before), there does appear to be some other occasional slight slowdown -
 but I've no reason to suspect your patch for it, nor to suppose it's
 something new: it may just be an artifact of my heavy swap thrashing.
 
 
 [PATCH mm] mm per-device dirty threshold fix
 
 Fix occasional hang when a task couldn't get out of balance_dirty_pages:
 mm-per-device-dirty-threshold.patch needs to reevaluate bdi_nr_writeback
 across all cpus when bdi_thresh is low, even in the case when there was
 no bdi_nr_reclaimable.
 
 Signed-off-by: Hugh Dickins [EMAIL PROTECTED]

Thank you Hugh. I ran into similar problems with many dd(large file)
operations.  This patch seems to fix it.

But now my desktop was locked up again when writing a lot of small
files. The problem is repeatable with the command
 $ ketchup 2.6.23-rc6-mm1

I writeup two debug patches:

---
 mm/page-writeback.c |9 +
 1 file changed, 9 insertions(+)

--- linux-2.6.22.orig/mm/page-writeback.c
+++ linux-2.6.22/mm/page-writeback.c
@@ -426,6 +426,14 @@ static void balance_dirty_pages(struct a
bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
}
 
+   printk(KERN_DEBUG balance_dirty_pages written %lu %lu 
congested %d limits %lu %lu %lu %lu %lu %ld\n,
+   pages_written,
+   write_chunk - wbc.nr_to_write,
+   bdi_write_congested(bdi),
+   background_thresh, dirty_thresh,
+   bdi_thresh, bdi_nr_reclaimable, 
bdi_nr_writeback,
+   bdi_thresh - bdi_nr_reclaimable - 
bdi_nr_writeback);
+
if (bdi_nr_reclaimable + bdi_nr_writeback = bdi_thresh)
break;
if (pages_written = write_chunk)

---
 mm/page-writeback.c |5 +
 1 file changed, 5 insertions(+)

--- linux-2.6.22.orig/mm/page-writeback.c
+++ linux-2.6.22/mm/page-writeback.c
@@ -373,6 +373,7 @@ static void balance_dirty_pages(struct a
long bdi_thresh;
unsigned long pages_written = 0;
unsigned long write_chunk = sync_writeback_pages();
+   int i = 0;
 
struct backing_dev_info *bdi = mapping-backing_dev_info;
 
@@ -434,6 +435,10 @@ static void balance_dirty_pages(struct a
bdi_thresh, bdi_nr_reclaimable, 
bdi_nr_writeback,
bdi_thresh - bdi_nr_reclaimable - 
bdi_nr_writeback);
 
+   if (i++  200) {
+

Re: 2.6.23-rc6-mm1: Build failure on ppc64 drivers/ata/pata_scc.c

2007-09-21 Thread Satyam Sharma

Hi,


On Thu, 20 Sep 2007, Alan Cox wrote:
 
 On Thu, 20 Sep 2007 14:13:15 +0100
 [EMAIL PROTECTED] (Mel Gorman) wrote:
 
  PPC64 building allmodconfig fails to compile drivers/ata/pata_scc.c . It
  doesn't show up on other arches because this driver is specific to the
  architecture.
  
  drivers/ata/pata_scc.c: In function `scc_bmdma_status'
 
 Its not been updated to match the libata core changes. Try something like
 this. Whoever is maintaining it should also remove the prereset cable handling
 code and use the proper cable detect method.

It appears you forgot to fix scc_std_softreset() and one of its callsites
in scc_bdma_stop(). A complete patch is attempted below -- please review.


[PATCH -mm] pata_scc: Keep up with libata core API changes

Little fixlets, that the build started erroring / warning about:

drivers/ata/pata_scc.c: In function 'scc_bmdma_status':
drivers/ata/pata_scc.c:734: error: structure has no member named 'active_tag'
drivers/ata/pata_scc.c: In function 'scc_pata_prereset':
drivers/ata/pata_scc.c:866: warning: passing arg 1 of 'ata_std_prereset' from 
incompatible pointer type
drivers/ata/pata_scc.c: In function 'scc_error_handler':
drivers/ata/pata_scc.c:908: warning: passing arg 2 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
drivers/ata/pata_scc.c:908: warning: passing arg 3 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
drivers/ata/pata_scc.c:908: warning: passing arg 5 of 'ata_bmdma_drive_eh' from 
incompatible pointer type
make[2]: *** [drivers/ata/pata_scc.o] Error 1

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Cc: Alan Cox [EMAIL PROTECTED]
Cc: Mel Gorman [EMAIL PROTECTED]

---

Andrew, this includes (supercedes) the previous two ones from Mel / Alan.

 drivers/ata/pata_scc.c |   21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff -ruNp a/drivers/ata/pata_scc.c b/drivers/ata/pata_scc.c
--- a/drivers/ata/pata_scc.c2007-09-22 06:26:37.0 +0530
+++ b/drivers/ata/pata_scc.c2007-09-22 08:04:42.0 +0530
@@ -594,16 +594,17 @@ static unsigned int scc_bus_softreset(st
  * Note: Original code is ata_std_softreset().
  */
 
-static int scc_std_softreset (struct ata_port *ap, unsigned int *classes,
-  unsigned long deadline)
+static int scc_std_softreset(struct ata_link *link, unsigned int *classes,
+ unsigned long deadline)
 {
+   struct ata_port *ap = link-ap;
unsigned int slave_possible = ap-flags  ATA_FLAG_SLAVE_POSS;
unsigned int devmask = 0, err_mask;
u8 err;
 
DPRINTK(ENTER\n);
 
-   if (ata_link_offline(ap-link)) {
+   if (ata_link_offline(link)) {
classes[0] = ATA_DEV_NONE;
goto out;
}
@@ -692,7 +693,7 @@ static void scc_bmdma_stop (struct ata_q
printk(KERN_WARNING %s: Internal Bus Error\n, 
DRV_NAME);
out_be32(bmid_base + SCC_DMA_INTST, INTSTS_BMSINT);
/* TBD: SW reset */
-   scc_std_softreset(ap, classes, deadline);
+   scc_std_softreset(ap-link, classes, deadline);
continue;
}
 
@@ -731,7 +732,7 @@ static u8 scc_bmdma_status (struct ata_p
void __iomem *mmio = ap-ioaddr.bmdma_addr;
u8 host_stat = in_be32(mmio + SCC_DMA_STATUS);
u32 int_status = in_be32(mmio + SCC_DMA_INTST);
-   struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap-active_tag);
+   struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap-link.active_tag);
static int retry = 0;
 
/* return if IOS_SS is cleared */
@@ -860,10 +861,10 @@ static void scc_bmdma_freeze (struct ata
  * @deadline: deadline jiffies for the operation
  */
 
-static int scc_pata_prereset(struct ata_port *ap, unsigned long deadline)
+static int scc_pata_prereset(struct ata_link *link, unsigned long deadline)
 {
-   ap-cbl = ATA_CBL_PATA80;
-   return ata_std_prereset(ap, deadline);
+   link-ap-cbl = ATA_CBL_PATA80;
+   return ata_std_prereset(link, deadline);
 }
 
 /**
@@ -874,8 +875,10 @@ static int scc_pata_prereset(struct ata_
  * Note: Original code is ata_std_postreset().
  */
 
-static void scc_std_postreset (struct ata_port *ap, unsigned int *classes)
+static void scc_std_postreset(struct ata_link *link, unsigned int *classes)
 {
+   struct ata_port *ap = link-ap;
+
DPRINTK(ENTER\n);
 
/* is double-select really necessary? */
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1: failure to boot on HP nx6325, no sound when booted, USB-related WARNING

2007-09-20 Thread Paul Mackerras

Linus Torvalds writes:

> It would indeed be nice if we could just take CPU's down early (while 
> everything is working), and run the whole suspend code with just one CPU, 
> rather than having to worry about the ordering between CPU and device 
> takedown.

That is certainly what we want to do on powerpc.

Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.23-rc6-mm1

2007-09-20 Thread Joseph Fannin

On Thu, Sep 20, 2007 at 09:42:44PM +0530, Kamalesh Babulal wrote:
> On 9/20/07, Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
> > On 9/20/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > >
> > > On Wed, 19 Sep 2007 19:58:28 -0400
> > > [EMAIL PROTECTED] (Joseph Fannin) wrote:
> > >
> > > > On Tue, Sep 18, 2007 at 01:18:41AM -0700, Andrew Morton wrote:
> > >
> > > ---
> > > a/include/asm-powerpc/smp.h~convert-cpu_sibling_map-to-a-per_cpu-data-array-ppc64-fix-2
> > >
> > > +++ a/include/asm-powerpc/smp.h
> > > @@ -25,8 +25,8 @@
> > >
> > > #ifdef CONFIG_PPC64
> > > #include 
> > > -#include 
> > > #endif
> > > +#include 
> > >
> > > extern int boot_cpuid;


> > Signed-off-by: Kamalesh Babulal <[EMAIL PROTECTED]>
> > ---
> > --- linux-2.6.23-rc6 /drivers/net/mace.c 2007-09-20 17:16:50.0+0530
> > +++ linux-2.6.23-rc6/drivers/net/~mace.c2007-09-20 17:12:
> > 47.0 +0530
> > @@ -633,7 +633,7 @@ static void mace_set_multicast(struct ne
> >  spin_unlock_irqrestore(>lock, flags);
> >  }
> >
> > -static void mace_handle_misc_intrs(struct mace_data *mp, int intr)
> > +static void mace_handle_misc_intrs(struct mace_data *mp, int intr, struct
> > net_device *dev)
> >  {
> >  volatile struct mace __iomem *mb = mp->mace;
> >  static int mace_babbles, mace_jabbers;
> > @@ -669,7 +669,7 @@ static irqreturn_t mace_interrupt(int ir
> >  spin_lock_irqsave(>lock, flags);
> >  intr = in_8(>ir);  /* read interrupt register */
> >  in_8(>xmtrc);  /* get retries */
> > -mace_handle_misc_intrs(mp, intr);
> > +mace_handle_misc_intrs(mp, intr, dev);
> >
> >  i = mp->tx_empty;
> >  while (in_8(>pr) & XMTSV) {
> > @@ -682,7 +682,7 @@ static irqreturn_t mace_interrupt(int ir
> >  */
> > intr = in_8(>ir);
> > if (intr != 0)
> > -   mace_handle_misc_intrs(mp, intr);
> > +   mace_handle_misc_intrs(mp, intr, dev);
> > if (mp->tx_bad_runt) {
> > fs = in_8(>xmtfs);
> > mp->tx_bad_runt = 0;
> > @@ -817,7 +817,7 @@ static void mace_tx_timeout(unsigned lon
> > goto out;
> >
> >  /* update various counters */
> > -mace_handle_misc_intrs(mp, in_8(>ir));
> > +mace_handle_misc_intrs(mp, in_8(>ir), dev);
> >
> >  cp = mp->tx_cmds + NCMDS_TX * mp->tx_empty;

Both these patches have built and booted for me.

I will send a patch for the following error separately, in what
will hopefully be canonical patch submission format, in case that's of
any use.

Thanks.

> drivers/net/mv643xx_eth.c: In function 'mv643xx_eth_int_handler':
> drivers/net/mv643xx_eth.c:564: error: 'bp' undeclared (first use in this
> function)
> drivers/net/mv643xx_eth.c:564: error: (Each undeclared identifier is
> reported only once
> drivers/net/mv643xx_eth.c:564: error: for each function it appears in.)
> drivers/net/mv643xx_eth.c: At top level:
> drivers/net/mv643xx_eth.c:1010: error: conflicting types for 'mv643xx_poll'
> drivers/net/mv643xx_eth.c:68: error: previous declaration of 'mv643xx_poll'
> was here
> make[2]: *** [drivers/net/mv643xx_eth.o] Error 1
> make[1]: *** [drivers/net] Error 2
> make: *** [drivers] Error 2


--
Joseph Fannin
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

1 2 3 4 5 >

1 - 100 of 409 matches

Mail list logo