Re: "BUG: held lock freed!" lock validator tripped by kswapd & xfs

2006-12-01 Thread Stephen Pollei

On 12/1/06, Mike Mattie <[EMAIL PROTECTED]> wrote:


In an attempt to debug another kernel issue I turned on the lock validator and
managed to generate this report.

As a side note the first attempt to boot with the lock validator failed with
a message indicating I had exceeded MAX_LOCK_DEPTH. To get this trace
I patched sched.h: MAX_LOCK_DEPTH to 60.

Dec  1 08:35:41 reforged [ 3052.513931] =
Dec  1 08:35:41 reforged [ 3052.513937] [ BUG: held lock freed! ]
Dec  1 08:35:41 reforged [ 3052.513939] -
Dec  1 08:35:41 reforged [ 3052.513943] kswapd0/183 is freeing memory c3458000-c3458fff, with a lock still held there!
Dec  1 08:35:41 reforged [ 3052.513947]  (&(&ip->i_iolock)->mr_lock){}, at: [c089] xfs_ilock+0x20/0x75
Dec  1 08:35:41 reforged [ 3052.513959] 28 locks held by kswapd0/183:
Dec  1 08:35:41 reforged [ 3052.513961]  #0:  (&(&ip->i_iolock)->mr_lock){}, at: [c089] xfs_ilock+0x20/0x75
Dec  1 08:35:41 reforged [ 3052.513968]  #1:  (&(&ip->i_lock)->mr_lock){}, at: [c0bb] xfs_ilock+0x52/0x75
Dec  1 08:35:41 reforged [ 3052.513975]


The list of held locks seems to alternate between the same two locks.
But neither c089 nor c0bb falls within the page (0xfff = 4095, so about
4k) that kswapd is trying to get rid of, c3458000-c3458fff.
I think this trace is on crack somehow.
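
The question above boils down to a range test: does a given address fall inside the region being freed? A minimal userspace sketch follows, assuming plain `uintptr_t` addresses; `addr_in_range` and `first_lock_in_freed_range` are hypothetical helpers for illustration, not the kernel's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if addr lies in [start, start + len). */
static bool addr_in_range(uintptr_t addr, uintptr_t start, size_t len)
{
    /* Subtracting first avoids overflow in start + len. */
    return addr >= start && addr - start < len;
}

/*
 * Hypothetical helper: return the index of the first address in held[]
 * that falls inside the freed region, or -1 if none does.
 */
static int first_lock_in_freed_range(const uintptr_t *held, int depth,
                                     uintptr_t mem_from, size_t mem_len)
{
    for (int i = 0; i < depth; i++)
        if (addr_in_range(held[i], mem_from, mem_len))
            return i;
    return -1;
}
```

For the page in the report, `mem_from` would be 0xc3458000 and `mem_len` 0x1000 (4096 bytes), so 0xc3458fff is the last address covered.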


[ 3052.514136] stack backtrace:
Dec  1 08:35:41 reforged [ 3052.514139]  [c0103cb9] show_trace+0x16/0x19
Dec  1 08:35:41 reforged [ 3052.514146]  [c01040f7] dump_stack+0x1a/0x1f
Dec  1 08:35:41 reforged [ 3052.514150]  [c012be74] debug_check_no_locks_freed+0xe0/0xff
Dec  1 08:35:41 reforged [ 3052.514159]  [c014122d] free_hot_cold_page+0x96/0x109
Dec  1 08:35:41 reforged [ 3052.514166]  [c01412bc] __pagevec_free+0x1c/0x27
Dec  1 08:35:41 reforged [ 3052.514170]  [c01435dc] __pagevec_release_nonlru+0x65/0x71
Dec  1 08:35:41 reforged [ 3052.514176]  [c0144702] shrink_inactive_list+0x4b1/0x722
Dec  1 08:35:41 reforged [ 3052.514181]  [c0144a2d] shrink_zone+0xba/0xd9
Dec  1 08:35:41 reforged [ 3052.514185]  [c0144e9e] kswapd+0x26a/0x361
Dec  1 08:35:41 reforged [ 3052.514189]  [c012742b] kthread+0xb0/0xe1
Dec  1 08:35:41 reforged [ 3052.514192]  [c0101005] kernel_thread_helper+0x5/0xb
reforged log #




Linux reforged 2.6.18.3 #4 PREEMPT Fri Dec 1 06:15:05 PST 2006 i686 AMD Athlon(tm) XP 3000+ AuthenticAMD GNU/Linux


I know you are running PREEMPT on a UP machine. I'd try running 2.6.18.4
with a small patch like this and see whether you can get it to crash
again. print_freed_lock_bug uses printk, which in theory might be
causing a preemption.

diff -urp linux-2.6.18.4/include/linux/sched.h linux-debug/include/linux/sched.h
--- linux-2.6.18.4/include/linux/sched.h        2006-11-29 11:28:40.0 -0800
+++ linux-debug/include/linux/sched.h   2006-12-01 13:25:23.0 -0800
@@ -936,7 +936,7 @@ struct task_struct {
        int softirq_context;
 #endif
 #ifdef CONFIG_LOCKDEP
-# define MAX_LOCK_DEPTH 30UL
+# define MAX_LOCK_DEPTH (60UL)
        u64 curr_chain_key;
        int lockdep_depth;
        struct held_lock held_locks[MAX_LOCK_DEPTH];
diff -urp linux-2.6.18.4/kernel/lockdep.c linux-debug/kernel/lockdep.c
--- linux-2.6.18.4/kernel/lockdep.c     2006-11-29 11:28:40.0 -0800
+++ linux-debug/kernel/lockdep.c        2006-12-01 14:22:14.0 -0800
@@ -2608,6 +2608,7 @@ void debug_check_no_locks_freed(const vo
                return;

        local_irq_save(flags);
+       preempt_disable();
        for (i = 0; i < curr->lockdep_depth; i++) {
                hlock = curr->held_locks + i;

@@ -2621,6 +2622,7 @@ void debug_check_no_locks_freed(const vo
                print_freed_lock_bug(curr, mem_from, mem_to, hlock);
                break;
        }
+       preempt_enable();
        local_irq_restore(flags);
 }


--
http://dmoz.org/profiles/pollei.html
http://sourceforge.net/users/stephen_pollei/
http://www.orkut.com/Profile.aspx?uid=2455954990164098214
http://stephen_pollei.home.comcast.net/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6 patch] Tigran Aivazian: remove bouncing email addresses

2006-12-01 Thread Stephen Pollei

On 12/1/06, Arjan van de Ven <[EMAIL PROTECTED]> wrote:

On Thu, 2006-11-30 at 22:00 -0800, Hua Zhong wrote:
> I am curious, what's the point?
>
> These email addresses serve a "historical" purpose: they tell when the
> contribution was made, what the author's email addresses
> were at that point.


As for approximately when: I wish the copyright dates were comma-separated
ISO 8601 date ranges myself.
I am also not likely to care what their email address was back then;
I want current information in the current kernel sources.
If I want an old email address, I can at least get it from old tarballs.


.. and which company owns the copyright.


Not in the USA according to http://www.copyright.gov/title17/92chap4.html#401 .
[[ ... § 401. Notice of copyright: Visually perceptible copies ...
b) Form of Notice. — If a notice appears on the copies, it shall
consist of the following three elements:

(1) the symbol (c) (the letter C in a circle), or the word
"Copyright", or the abbreviation "Copr."; and

(2) the year of first publication of the work; in the case of
compilations or derivative works incorporating previously published
material, the year date of first publication of the compilation or
derivative work is sufficient. The year date may be omitted where a
pictorial, graphic, or sculptural work, with accompanying text matter,
if any, is reproduced in or on greeting cards, postcards, stationery,
jewelry, dolls, toys, or any useful articles; and

(3) the name of the owner of copyright in the work, or an abbreviation
by which the name can be recognized, or a generally known alternative
designation of the owner. ]]

For source code there are generally a few changes to the typical copyright notice:
They use "Copyright (C)" because ASCII and EBCDIC didn't have a native
copyright symbol like Unicode does now.
They include every year in which the work was published, not just the
year in which this version was first published.
The name of the copyright owner typically also includes an email address.

Copyright (C) 1999,2000 Tigran Aivazian <[EMAIL PROTECTED]>
Copyright (C) 1999 Tigran Aivazian <[EMAIL PROTECTED]>
etc. It seems the only copyright notices being changed affect Tigran, and
if Tigran had meant for the work to be copyrighted by Veritas he would have
written
Copyright (C) 1999 Veritas Inc. http://www.veritas.com/
However, he did not do so.

Of course I'd prefer something closer to
Copyright (C) 1999-07-05/2000-03-12 Tigran Aivazian <[EMAIL PROTECTED]>
or at least
Copyright (C) 1999-07-05/2000-03 Tigran Aivazian <[EMAIL PROTECTED]>
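
To make the date-range form concrete, here is a minimal sketch that prints a notice using an ISO 8601 interval (start/end). The `struct iso_date` type and the `format_notice` helper are hypothetical names chosen for illustration only.

```c
#include <stdio.h>

/* Hypothetical type: an ISO 8601 calendar date such as 1999-07-05. */
struct iso_date { int year, month, day; };

/*
 * Hypothetical helper: write "Copyright (C) <from>/<to> <owner>" into buf,
 * using the ISO 8601 interval form. Returns whatever snprintf returns,
 * i.e. the length the full notice needs (excluding the terminating NUL).
 */
static int format_notice(char *buf, size_t n, struct iso_date from,
                         struct iso_date to, const char *owner)
{
    return snprintf(buf, n,
                    "Copyright (C) %04d-%02d-%02d/%04d-%02d-%02d %s",
                    from.year, from.month, from.day,
                    to.year, to.month, to.day, owner);
}
```

Called with the dates 1999-07-05 and 2000-03-12 and the owner name, it produces the first notice form shown above, minus the email address.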

That would matter especially if the laws ever get changed to make
copyright durations shorter, say 14 years instead of 50 years, 70 years,
or however old Disney's Steamboat Willie is.



Lets not remove historical email addresses. Just make sure there's a
current one in MODULE_AUTHOR / MAINTAINERS.


I think someone should either remove or update the email addresses.


Re: [Patch] Support UTF-8 scripts

2005-08-14 Thread Stephen Pollei
On 8/14/05, Lee Revell <[EMAIL PROTECTED]> wrote:
> I know the alternatives are available.  That doesn't make it any less
> idiotic to use non ASCII characters as operators.  I think it's a very
> slippery slope.  We write code in ASCII, dammit.

Yes, you and I might write 99.9% of our code in good ol' **American**
Standard Code for Information Interchange -- however, not all the world
is the USA. For instance, notice the http://de.wikipedia.org/wiki/Umlaut/
in "Löwis"... It seems like lots of Europeans might want a bigger
charset, not to mention Asians, Hindus, and whoever else.
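
As a small illustration of the point: in UTF-8 the "ö" in "Löwis" is the two bytes 0xC3 0xB6, both outside the 7-bit ASCII range. The sketch below, with the hypothetical helper name `is_pure_ascii`, checks whether a string stays within ASCII.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if every byte of s is 7-bit ASCII (0x00-0x7f). */
static bool is_pure_ascii(const char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++)
        if ((unsigned char)s[i] > 0x7f)
            return false;
    return true;
}
```

A UTF-8 encoded "Löwis" fails this check, while the transliteration "Lowis" passes.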


Re: [Patch] Support UTF-8 scripts

2005-08-13 Thread Stephen Pollei
On 8/13/05, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> This patch adds support for UTF-8 signatures (aka BOM, byte order
> mark) to binfmt_script. 

> With such support, creating scripts that reliably carry non-ASCII
> characters is simplified. 
> the approach would naturally extend to Perl to enhance/replace
> the "use utf8" pragma. 

That's great for the perl6 people.
http://dev.perl.org/perl6/doc/design/syn/S03.html says they are going
to be using « and » as operators... So I'd imagine that a lot of perl6
scripts would be UTF-8.
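
For reference, the UTF-8 signature is the three bytes EF BB BF. The following is a minimal userspace sketch of recognizing a "#!" line that may be preceded by that signature; `utf8_bom_len` and `is_script_header` are hypothetical helpers, and this is not the code from the patch under discussion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The UTF-8 signature (BOM) is the byte sequence EF BB BF. */
static const unsigned char utf8_bom[3] = { 0xEF, 0xBB, 0xBF };

/* Hypothetical helper: number of BOM bytes to skip at the start of buf (0 or 3). */
static size_t utf8_bom_len(const char *buf, size_t len)
{
    if (len >= sizeof utf8_bom && memcmp(buf, utf8_bom, sizeof utf8_bom) == 0)
        return sizeof utf8_bom;
    return 0;
}

/* Hypothetical helper: true if buf starts with "#!", optionally after a BOM. */
static bool is_script_header(const char *buf, size_t len)
{
    size_t skip = utf8_bom_len(buf, len);

    return len - skip >= 2 && buf[skip] == '#' && buf[skip + 1] == '!';
}
```

A script saved with a signature then looks the same to the loader as one without it, once the first three bytes are skipped.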


Re: [PATCH] kernel: use kcalloc instead kmalloc/memset

2005-08-05 Thread Stephen Pollei
On 8/5/05, Christoph Lameter <[EMAIL PROTECTED]> wrote:

> Hmm. If we had kcmalloc then we may be able to add a zero bit to the slab
> allocator. If we would obtain zeroed pages for the slab then we may skip
> zeroing of individual entries. However, the cache warming effect of the
> current zeroing is then not occurring. Not sure if this would make sense
> but this is a possible optimization if we had kcmalloc.

Well, there is kzalloc and kcalloc. I just thought a safe non-zeroing
version would be nice.
You could warm the cache with prefetch, but you'd need to profile the
different cases to see what is worth doing and what isn't.
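
A rough userspace sketch of the prefetch idea, using the GCC/Clang builtin `__builtin_prefetch`. The helper name `warm_cache` and the 64-byte `CACHE_LINE` value are assumptions for illustration; whether this helps at all is exactly what profiling would have to show.

```c
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache-line size in bytes */

/*
 * Hypothetical helper: issue one prefetch per cache line in [p, p + len)
 * and return how many prefetches were issued.
 */
static size_t warm_cache(const void *p, size_t len)
{
    const char *c = p;
    size_t issued = 0;

    for (size_t off = 0; off < len; off += CACHE_LINE) {
        /* rw = 1: a write is expected; locality = 3: keep in all cache levels. */
        __builtin_prefetch(c + off, 1, 3);
        issued++;
    }
    return issued;
}
```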


Re: [PATCH] kernel: use kcalloc instead kmalloc/memset

2005-08-05 Thread Stephen Pollei
On 8/5/05, Roman Zippel <[EMAIL PROTECTED]> wrote:
> On Fri, 5 Aug 2005, Arjan van de Ven wrote:
> > > This would imply a similiar kmalloc() would be useful as well.
> > > Second, how relevant is it for the kernel?

> > we've had a non-negliable amount of security holes because of this

> So why don't we have a similiar kmalloc()?

You mean something like:

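/*
 * Each __bad_* function is deliberately left undefined, so a call that
 * the compiler cannot optimize away breaks the build.
 */
static void __bad_kmalloc_safe_nonconstant_size(void);
static void __bad_kmalloc_safe_zero_size(void);
static void __bad_kmalloc_safe_too_large_size(void);
static void __bad_kmalloc_safe_too_large(void);

static inline void *kmalloc_safe(size_t nmemb, size_t size, int flags)
{
    /* The element size must be a nonzero, reasonably small constant. */
    if (!__builtin_constant_p(size))
        __bad_kmalloc_safe_nonconstant_size();
    if (!size)
        __bad_kmalloc_safe_zero_size();
    if (size > 0x1)
        __bad_kmalloc_safe_too_large_size();
    /* A constant element count can also be rejected at build time. */
    if (__builtin_constant_p(nmemb) && nmemb > 0x2/size)
        __bad_kmalloc_safe_too_large();
    /* Otherwise check at run time before multiplying. */
    if (nmemb <= 0x2/size)
        return kmalloc(nmemb * size, flags);
    else
        return NULL;
}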
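
The run-time half of that idea, dividing before multiplying so that `nmemb * size` can never wrap, can be tried in userspace. This is a minimal sketch assuming `malloc` in place of `kmalloc` and `SIZE_MAX` as the limit; `alloc_array` is a hypothetical name.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate room for nmemb elements of size bytes
 * each, without zeroing. Returns NULL if size is zero or if
 * nmemb * size would overflow size_t.
 */
static void *alloc_array(size_t nmemb, size_t size)
{
    if (size == 0 || nmemb > SIZE_MAX / size)
        return NULL;
    return malloc(nmemb * size);
}
```

The division happens before the multiplication, so the product is only computed once it is known to fit.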


Re: QoS scheduler

2005-07-29 Thread Stephen Pollei
On 7/29/05, Vitor Curado <[EMAIL PROTECTED]> wrote:
> You assumed right, Stephen: I'm interested in QoS process scheduling,
> sorry for not specifying it...
> 
> I'm taking a deeper look at the qlinux, ckrm and the plugsched
> schedulers, if you have any more links, please send them to me...
Also you didn't specify what kind of clustering you are doing and for
what ultimate purpose.

http://www.beowulf.org/
http://www-unix.mcs.anl.gov/mpi/implementations.html
http://www.csm.ornl.gov/pvm/pvm_home.html
http://www.open-mpi.org/

http://openmosix.sourceforge.net/
http://www.mosix.org/

http://www.remote-dba.cc/teas_aegis_rac06.htm
http://www.dba-oracle.com/bp/bp_book1_rac.htm
Oracle DB Real Application Clusters (RAC)
transparent application failover (TAF)

http://pgcluster.projects.postgresql.org/feature.html
http://dev.mysql.com/doc/mysql/en/replication.html

High Availability (HA)
High Performance Computing (HPC)

That can strongly affect which solutions you would want to look at.
For instance, if you were running a render farm or a scientific
compute Beowulf cluster, then
your "scheduling" would perhaps be handled more in the MPI or PVM code.
The running processes themselves would most likely be using something
like SCHED_BATCH, with larger than usual time slices. Maybe you
monitor how many MIPS actually get consumed and then adjust which
nodes get scheduled with what, or how many work units get handed out,
to get back to fairness.
 
clock_t times(struct tms *buf);
int getrusage(int who, struct rusage *usage);
Using these to track system and user time is about on track, but I think
someone might be able to fool you if that's all you could use to account
for CPU time taken from another userland process.

So maybe you just need better reporting/accounting hooks and then you
can do the rest in userland?
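
As a minimal sketch of the userland side, this is roughly what reading a process's own CPU time with `getrusage` looks like. The helper name `cpu_seconds_used` is hypothetical; a real accounting scheme would need more than this.

```c
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Hypothetical helper: user + system CPU seconds consumed so far by the
 * calling process, or -1.0 if getrusage fails.
 */
static double cpu_seconds_used(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6 +
           (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
}
```

Sampling this before and after a work unit gives the CPU time that unit consumed, which a userland scheduler could then use when handing out further work.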

> On 7/28/05, Wes Felter <[EMAIL PROTECTED]> wrote:
> > Vitor Curado wrote:
> > > I'm working on a research about QoS schedulers for Linux clusters.
> > > Moreover, the ideal would be that the scheduler is implemented
> > > altering the native kernel scheduler. I'm kind of having trouble to
> > > find such schedulers, can anybody help me out?
> >
> > http://lass.cs.umass.edu/software/qlinux/
> > http://ckrm.sourceforge.net/

That qlinux one is new to me. I notice that the 2.6 kernel now has
support for modular, pluggable disk I/O and network schedulers.
So a Hierarchical Start Time Fair Queuing (H-SFQ) network packet
scheduler module could be made.

I wonder how that Cello scheduler would stack up against AS, Deadline,
CFQ, noop, etc.

The qlinux CPU scheduler would be best ported to 2.6.x using plugsched.


Re: ZyXEL Kernel /BusyBox GPL violation?

2005-07-25 Thread Stephen Pollei
On 7/25/05, Lee Revell <[EMAIL PROTECTED]> wrote:
> On Mon, 2005-07-25 at 23:21 -0400, Mace Moneta wrote:
> > The response seems meaningless; does this constitute a violation of
> > GPL?
> > If so what, if any, action needs to be taken?

http://gpl-violations.org/
http://www.fsf.org/licensing/licenses/gpl-faq.html#ReportingViolation
http://www.fsf.org/licensing/licenses/gpl-violation.html
[[[You should report it. First, check the facts as best you can. Then
tell the publisher or copyright holder of the specific GPL-covered
program. If that is the Free Software Foundation, write to
<[EMAIL PROTECTED]>. Otherwise, the program's maintainer may
be the copyright holder, or else could tell you how to contact the
copyright holder, so report it to the maintainer.]]]

> Also if they didn't modify the kernel, they don't have to give you
> source, they can just refer you to kernel.org.

Wrong.

http://www.fsf.org/licensing/licenses/gpl-faq.html#DistributeWithSourceOnInternet
[[[I want to distribute binaries without accompanying sources. Can I
provide source code by FTP instead of by mail order?
You're supposed to provide the source code by mail-order on a
physical medium, if someone orders it. You are welcome to offer people
a way to copy the corresponding source code by FTP, in addition to the
mail-order option, but FTP access to the source is not sufficient to
satisfy section 3 of the GPL.

When a user orders the source, you have to make sure to get the
source to that user. If a particular user can conveniently get the
source from you by anonymous FTP, fine--that does the job. But not
every user can do such a download. The rest of the users are just as
entitled to get the source code from you, which means you must be
prepared to send it to them by post.

If the FTP access is convenient enough, perhaps no one will choose
to mail-order a copy. If so, you will never have to ship one. But you
cannot assume that.

Of course, it's easiest to just send the source with the binary in
the first place. ]]]

http://www.fsf.org/licensing/licenses/gpl-faq.html#TOCSourceAndBinaryOnDifferentSites
[[[Can I put the binaries on my Internet server and put the source on
a different Internet site?
The GPL says you must offer access to copy the source code "from
the same place"; that is, next to the binaries. However, if you make
arrangements with another site to keep the necessary source code
available, and put a link or cross-reference to the source code next
to the binaries, we think that qualifies as "from the same place".

Note, however, that it is not enough to find some site that
happens to have the appropriate source code today, and tell people to
look there. Tomorrow that site may have deleted that source code, or
simply replaced it with a newer version of the same program. Then you
would no longer be complying with the GPL requirements. To make a
reasonable effort to comply, you need to make a positive arrangement
with the other site, and thus ensure that the source will be available
there for as long as you keep the binaries available. ]]]

http://www.fsf.org/licensing/licenses/gpl.html
Section 3 gives three choices for what you must do to copy and distribute:
a) Provide the source from the same location. They have not.
b) Provide a written offer good for three years. None such was mentioned.
c) Be noncommercial and pass along the information you received about the
source offer. zyxel.com, a "seller of routers", sounds like a commercial
enterprise to me.

So no: they must take responsibility for making the sources available
even if they didn't modify them.

-- 
http://dmoz.org/profiles/pollei.html
http://sourceforge.net/users/stephen_pollei/
http://www.orkut.com/Profile.aspx?uid=2455954990164098214
http://stephen_pollei.home.comcast.net/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt

2005-07-15 Thread Stephen Pollei
On 7/14/05, Eric St-Laurent <[EMAIL PROTECTED]> wrote:
> On Thu, 2005-07-14 at 17:24 -0700, Linus Torvalds wrote:
> > Trust me. When I say that the right thing to do is to just have a fixed
> > (but high) HZ value, and just changing the timer rate, I'm -right-.

> Of course you are, jiffies are simple and efficient.

> If i sum-up the discussion from my POV:

> - use a 32-bit tick counter on 32-bit platforms and use a 64-bit counter
> on 64-bit platforms
If the 64bit counter doesn't have any overhead then sure.

> - keep the constant HZ=1000 (mS resolution) on 32-bit platforms
Which HZ is that? CONFIG_JIFFIES_HZ or CONFIG_FIXED_PIT_HZ?
I think you meant CONFIG_JIFFIES_HZ, which I think even for 32-bit
counters could go up to 1e4 to 5e4, with some patching going on in
some places of course.
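
To put rough numbers on that range, here is a small sketch of how long a
32-bit tick counter lasts before it wraps at a given rate. The function
name wrap_seconds_32 is made up for illustration and is not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds until a 32-bit tick counter wraps at jiffies_hz ticks per second.
 * Hypothetical helper for illustration only, not part of any kernel.
 */
static inline uint64_t wrap_seconds_32(uint32_t jiffies_hz)
{
    assert(jiffies_hz > 0);
    return (UINT64_C(1) << 32) / jiffies_hz;
}

/*
 * wrap_seconds_32(1000)  -> 4294967 s, about 49.7 days
 * wrap_seconds_32(10000) ->  429496 s, about 5 days
 * wrap_seconds_32(50000) ->   85899 s, a bit under 24 hours
 */
```

Wrap-safe comparisons like time_after() only work for intervals shorter
than half the counter range, so the usable window is half of each figure.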

> - remove the assumption that timer interrupts and jiffies are 1:1 thing
> (jiffies may be incremented by >1 ticks at timer interrupt)
Yes, maybe nuke CONFIG_HZ and replace it with CONFIG_JIFFIES_HZ and
CONFIG_(FIXED|DEFAULT|DYNAMIC)_PIT_HZ. Start with just
CONFIG_FIXED_PIT_HZ and add the others as needed.

The extreme option might be to also just nuke HZ and replace it with JHZ
and PHZ, or whatever, so that people are *crystal* clear about the difference.
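
Something like the following is what I have in mind for splitting the two
rates. It is only a userspace sketch: JIFFIES_HZ and PIT_HZ stand in for the
proposed config options, and jiffies_sim and timer_interrupt_sim are made-up
stand-ins for the real jiffies and timer handler.

```c
#include <assert.h>
#include <stdint.h>

#define JIFFIES_HZ 2000u  /* assumed: rate at which jiffies counts */
#define PIT_HZ       50u  /* assumed: rate of actual timer interrupts */

static uint32_t jiffies_sim;  /* stand-in for the kernel's jiffies */

/* Number of jiffies to add per timer interrupt. */
static inline uint32_t jiffies_increment(uint32_t jiffies_hz, uint32_t pit_hz)
{
    /* Keep it simple: require the rates to divide evenly. */
    assert(pit_hz > 0 && jiffies_hz % pit_hz == 0);
    return jiffies_hz / pit_hz;
}

/* One simulated timer interrupt: advance jiffies by more than one tick. */
static void timer_interrupt_sim(void)
{
    jiffies_sim += jiffies_increment(JIFFIES_HZ, PIT_HZ);
}
```

With these values each interrupt adds 40 jiffies, so 50 interrupts (one
second) still advance the counter by 2000.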

> - determine jiffies_increment at boot
So CONFIG_foo_PIT_HZ could be a per-boot thing maybe.
So you'd have CONFIG_DEFAULT_PIT_HZ if it was a per-boot or runtime thing.
CONFIG_DYNAMIC_PIT_HZ if it was changeable as the system is running --
like Windows.
CONFIG_FIXED_PIT_HZ if it is a compile-time constant.
Or something like that?

> - have a slow clock mode to help power management (adjust
> jiffies_increment by the slowdown factor)
CONFIG_DYNAMIC_PIT_HZ, unless its overhead is so low that everyone
just wants it by default.

> - it may be useful to bump up HZ to 1e6 (uS res.) or 1e9 (nS res.) on
> 64-bit platforms, if there are benefits such as better accuracy during
> time units conversions or if a higher frequency timer hardware is
> available/viable.
Too high starts to cause other troubles. I think that the real-time
people want 10uS scheduling, but even ipipe and rt-preempt have
18uS-70uS delays at times IIRC. So 5e4 to 1e5 is about the extreme end
of the road for CONFIG_JIFFIES_HZ. I think even long term that 1e5 to
1e6 would be extreme because of speed of light issues, etc. Hpet is
only 1.4e7 IIRC.

I think that you should start with:
1) CONFIG_FIXED_PIT_HZ=50 CONFIG_JIFFIES_HZ=2000
2) try it out and fix any bugs, send the fixes to Linus to see how
much he bitches.
3) if you still need CONFIG_JIFFIES_HZ to be larger, double it and then goto 2.
4) enjoy your higher frequency jiffies

I bet that even going to somewhere between 2e3 and 1e5 will
make you want to change a few things for performance and sanity
reasons. So I'd focus on that before I even thought about 1e6 through
1e10. Plus I think the interest level really falls off when you go that
extreme.

Just making JIFFIES_HZ != PIT_HZ will require patches.
Dynamic pit hz or lazy update of jiffies based on tsc/hpet/other are
other patches.

> - it may be also useful to bump HZ on -RT (Real-time) kernels,
Yes, they sound like they want JIFFIES_HZ to be 1e3 through 1e5
depending on the task. They also want hpet (or other), vertical retrace
interrupts (so xsync works for video), perhaps a NIST mini atomic
clock, and a few other goodies AFAIK.
> -HRT (High-resolution timers support).
Yes, tsc or hpet or whatever users might benefit in several ways.
1) both tsc and hpet might be able to bump jiffies up to a more accurate
value on entry to idle and then test to see if anything got scheduled.
2) hpet can set one-shot timers for the next upcoming event on
idle if it's sooner than when the PIT interrupt is supposed to come in.
Of course update the jiffies when that hpet interrupt comes.
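
For point 2, the decision on entry to idle could look roughly like this.
It is a plain C sketch, not real hpet code: need_oneshot and its
nanosecond timestamps are assumptions for illustration, and actually
programming the hpet comparator is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether a one-shot timer is worth arming before going idle.
 * Returns true and stores the delay if the next pending event is due
 * before the next periodic PIT interrupt; returns false if the PIT
 * interrupt will arrive in time anyway.
 * Hypothetical helper; all times are absolute nanoseconds.
 */
static bool need_oneshot(uint64_t now_ns, uint64_t next_event_ns,
                         uint64_t next_pit_ns, uint64_t *delay_ns)
{
    assert(delay_ns != NULL);

    if (next_event_ns >= next_pit_ns)
        return false;  /* the periodic tick covers this event */

    /* An event already in the past should fire immediately. */
    *delay_ns = next_event_ns > now_ns ? next_event_ns - now_ns : 0;
    return true;
}
```

The handler for that one-shot interrupt would then bring jiffies up to
date before running whatever was waiting.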

>Users of those kernel are willing
> to pay the cost of the overhead to have better resolution
Yes realtime users with something like hpet might not vary the pit
timer, but place hooks to update the jiffies between pit interrupts
like idle, scheduler(task switch), etc. And use the hpet one shot
interrupts as well.

> - avoid direct usage of the jiffies variable, instead use jiffies()
> (inline or MACRO), IMO monotonic_clock() would be a better name
I don't know, I think it could remain a variable. You usually just want it
to be a light-weight memory read, not a call out to an hpet and then a
math conversion, or a call out to tsc that then has to know whether
the tsc represents work or time, and whether the cpu has been slowed for
power save reasons etc etc etc. I think you want a symbol exported gpl
of something like void force_update_jiffies(void); that you can call
in different hook locations to force the update of jiffies from
non-interrupt sources. Actually you might want more than one version of
that function or have it take an argument, because some people might
want to be super lazy and only update it when they enter or leave idle,
while others (real timers) might want
