[patch 6/9] signalfd/timerfd v3 - timerfd core ...

2007-03-11 Thread Davide Libenzi
This patch introduces a new system call for timer events delivered
through file descriptors. This allows timer events to be used with
standard POSIX poll(2), select(2) and read(2). As a consequence of
supporting the Linux f_op->poll subsystem, they can be used with
epoll(2) too.
The system call is defined as:

int timerfd(int ufd, int clockid, int tmrtype, const struct timespec *utmr);

The "ufd" parameter allows for re-use (re-programming) of an existing
timerfd w/out going through the close/open cycle (same as signalfd).
If "ufd" is -1, a new file descriptor will be created, otherwise the
existing "ufd" will be re-programmed.
The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME.
The "tmrtype" parameter specifies the timer type. The following
values are supported:

TFD_TIMER_REL
The time specified in the "utmr" parameter is a relative time
from NOW.

TFD_TIMER_ABS
The time specified in the "utmr" parameter is an absolute time.

TFD_TIMER_SEQ
The time specified in the "utmr" parameter is an interval at
which a continuous clock rate will be generated.

The function returns the new (or same, in case "ufd" is a valid timerfd
descriptor) file descriptor, or -1 in case of error.
As stated before, the timerfd file descriptor supports poll(2), select(2)
and epoll(2). When a timer event has occurred on the timerfd, a POLLIN mask
will be returned.
The read(2) call can be used, and it will return a u32 variable holding
the number of "ticks" that happened on the interface since the last call
to read(2). The read(2) call supports the O_NONBLOCK flag too, and EAGAIN
will be returned if no ticks happened.
A quick test program shows timerfd working correctly on my amd64 box:

http://www.xmailserver.org/timerfd-test.c




Signed-off-by: Davide Libenzi 



- Davide



Index: linux-2.6.20.ep2/fs/timerfd.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.20.ep2/fs/timerfd.c   2007-03-11 14:32:47.0 -0700
@@ -0,0 +1,295 @@
+/*
+ *  fs/timerfd.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi 
+ *
+ *
+ *  Thanks to Thomas Gleixner for code review and useful comments.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+
+
+struct timerfd_ctx {
+   struct hrtimer tmr;
+   int clockid;
+   enum hrtimer_mode htmode;
+   ktime_t texp, tintv;
+   int tmrtype;
+   spinlock_t lock;
+   wait_queue_head_t wqh;
+   unsigned long ticks;
+};
+
+
+static int timerfd_tmrproc(struct hrtimer *htmr);
+static int timerfd_setup(struct timerfd_ctx *ctx, int clockid, int tmrtype,
+const struct itimerspec *ktmr);
+static void timerfd_cleanup(struct timerfd_ctx *ctx);
+static int timerfd_close(struct inode *inode, struct file *file);
+static unsigned int timerfd_poll(struct file *file, poll_table *wait);
+static ssize_t timerfd_read(struct file *file, char __user *buf, size_t count,
+   loff_t *ppos);
+
+
+
+static const struct file_operations timerfd_fops = {
+   .release= timerfd_close,
+   .poll   = timerfd_poll,
+   .read   = timerfd_read,
+};
+static struct kmem_cache *timerfd_ctx_cachep;
+
+
+
+static int timerfd_tmrproc(struct hrtimer *htmr)
+{
+   struct timerfd_ctx *ctx = container_of(htmr, struct timerfd_ctx, tmr);
+   int rval = HRTIMER_NORESTART;
+   unsigned long flags;
+
+   spin_lock_irqsave(&ctx->lock, flags);
+   ctx->ticks++;
+   wake_up_locked(&ctx->wqh);
+   if (ctx->tmrtype == TFD_TIMER_SEQ) {
+   hrtimer_forward(htmr, htmr->base->softirq_time, ctx->tintv);
+   rval = HRTIMER_RESTART;
+   }
+   spin_unlock_irqrestore(&ctx->lock, flags);
+
+   return rval;
+}
+
+
+static int timerfd_setup(struct timerfd_ctx *ctx, int clockid, int tmrtype,
+const struct itimerspec *ktmr)
+{
+   enum hrtimer_mode htmode;
+   ktime_t texp, tintv;
+
+   if (clockid != CLOCK_MONOTONIC &&
+   clockid != CLOCK_REALTIME)
+   return -EINVAL;
+   switch (tmrtype) {
+   case TFD_TIMER_SEQ:
+   if (!timespec_valid(&ktmr->it_interval))
+   return -EINVAL;
+   tintv = timespec_to_ktime(ktmr->it_interval);
+   case TFD_TIMER_ABS:
+   if (!timespec_valid(&ktmr->it_value))
+   return -EINVAL;
+   htmode = HRTIMER_ABS;
+   texp = timespec_to_ktime(ktmr->it_value);
+   break;
+   case TFD_TIMER_REL:
+   if (!timespec_valid(&ktmr->it_interval))
+   return -EINVAL;
+   texp = timespec_to_ktime(ktmr->it_interval);
+   tintv = ktime_set(0, 0);
+   htmode = HRTIMER_REL;
+

[patch 2/9] signalfd/timerfd v3 - signalfd core ...

2007-03-11 Thread Davide Libenzi
This patch series implements the new signalfd() system call.
I took part of the original Linus code (and you know how
badly it can be broken :), and I added even more breakage ;)
Signals are fetched from the same signal queue used by the process,
so signalfd will compete with standard kernel delivery in dequeue_signal().
If you want to reliably fetch signals on the signalfd file, you need to
block them with sigprocmask(SIG_BLOCK).
This seems to be working fine on my Dual Opteron machine. I made a quick 
test program for it:

http://www.xmailserver.org/signafd-test.c

The signalfd() system call implements signal delivery into a file 
descriptor receiver. The signalfd file descriptor is created with the 
following API:

int signalfd(int ufd, const sigset_t *mask, size_t masksize);

The "ufd" parameter allows changing the sigmask of an existing signalfd w/out
going through a close/create cycle (Linus' idea). Use "ufd" == -1 if you want a
brand new signalfd file.
The "mask" parameter specifies the set of signals that we are
interested in. The "masksize" parameter is the size of "mask".
The signalfd fd supports the poll(2) and read(2) system calls. The poll(2)
call will return POLLIN when signals are available to be dequeued. As a direct
consequence of supporting the Linux poll subsystem, the signalfd fd can be
used together with epoll(2) too.
The read(2) system call will return a "struct signalfd_siginfo" structure
in the userspace supplied buffer. The return value is the number of bytes
copied in the supplied buffer, or -1 in case of error. The read(2) call
can also return 0, in case the sighand structure to which the signalfd
was attached, has been orphaned. The O_NONBLOCK flag is also supported, and
read(2) will fail with EAGAIN in case no signal is available.
The format of struct signalfd_siginfo is shown below. The valid fields depend
on the (->code & __SI_MASK) value, in the same way they would for a struct siginfo:

struct signalfd_siginfo {
    __u32 signo;    /* si_signo */
    __s32 err;      /* si_errno */
    __s32 code;     /* si_code */
    __u32 pid;      /* si_pid */
    __u32 uid;      /* si_uid */
    __s32 fd;       /* si_fd */
    __u32 tid;      /* si_tid */
    __u32 band;     /* si_band */
    __u32 overrun;  /* si_overrun */
    __u32 trapno;   /* si_trapno */
    __s32 status;   /* si_status */
    __s32 svint;    /* si_int */
    __u64 svptr;    /* si_ptr */
    __u64 utime;    /* si_utime */
    __u64 stime;    /* si_stime */
    __u64 addr;     /* si_addr */
};



Signed-off-by: Davide Libenzi 



- Davide



Index: linux-2.6.20.ep2/fs/signalfd.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.20.ep2/fs/signalfd.c  2007-03-11 14:28:37.0 -0700
@@ -0,0 +1,381 @@
+/*
+ *  fs/signalfd.c
+ *
+ *  Copyright (C) 2003  Linus Torvalds
+ *
+ *  Mon Mar 5, 2007: Davide Libenzi 
+ *  Changed ->read() to return a siginfo structure instead of signal number.
+ *  Fixed locking in ->poll().
+ *  Added sighand-detach notification.
+ *  Added fd re-use in sys_signalfd() syscall.
+ *  Now using anonymous inode source.
+ *  Thanks to Oleg Nesterov for useful code review and suggestions.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+
+
+struct signalfd_ctx {
+   struct list_head lnk;
+   wait_queue_head_t wqh;
+   sigset_t sigmask;
+   struct task_struct *tsk;
+};
+
+
+
+static struct sighand_struct *signalfd_get_sighand(struct signalfd_ctx *ctx,
+  unsigned long *flags);
+static void signalfd_put_sighand(struct signalfd_ctx *ctx,
+struct sighand_struct *sighand,
+unsigned long *flags);
+static void signalfd_cleanup(struct signalfd_ctx *ctx);
+static int signalfd_close(struct inode *inode, struct file *file);
+static unsigned int signalfd_poll(struct file *file, poll_table *wait);
+static int signalfd_copyinfo(struct signalfd_siginfo __user *uinfo,
+siginfo_t const *kinfo);
+static ssize_t signalfd_read(struct file *file, char __user *buf, size_t count,
+loff_t *ppos);
+
+
+
+static const struct file_operations signalfd_fops = {
+   .release= signalfd_close,
+   .poll   = signalfd_poll,
+   .read   = signalfd_read,
+};
+static struct kmem_cache *signalfd_ctx_cachep;
+
+
+
+static struct sighand_struct *signalfd_get_sighand(struct signalfd_ctx *ctx,
+  unsigned long *flags)
+{
+   struct sighand_struct *sighand;
+
+   rcu_read_lock();
+   sighand = lock_task_sighand(ctx->tsk, flags);
+   rcu_read_unlock();
+
+   if (sighand && list_empty(&ctx->lnk)) {
+

[patch 9/9] signalfd/timerfd v3 - timerfd compat code ...

2007-03-11 Thread Davide Libenzi
This patch implements the necessary compat code for the timerfd system call.


Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.20.ep2/fs/compat.c
===
--- linux-2.6.20.ep2.orig/fs/compat.c   2007-03-11 14:28:48.0 -0700
+++ linux-2.6.20.ep2/fs/compat.c    2007-03-11 14:35:22.0 -0700
@@ -2257,3 +2257,23 @@
return sys_signalfd(ufd, ksigmask, sizeof(sigset_t));
 }
 
+
+asmlinkage long compat_sys_timerfd(int ufd, int clockid, int tmrtype,
+  const struct compat_itimerspec __user *utmr)
+{
+   long res;
+   struct itimerspec t;
+   struct itimerspec __user *ut;
+
+   res = -EFAULT;
+   if (get_compat_itimerspec(&t, utmr))
+   goto err_exit;
+   ut = compat_alloc_user_space(sizeof(*ut));
+   if (copy_to_user(ut, &t, sizeof(t)) )
+   goto err_exit;
+
+   res = sys_timerfd(ufd, clockid, tmrtype, ut);
+err_exit:
+   return res;
+}
+
Index: linux-2.6.20.ep2/include/linux/compat.h
===
--- linux-2.6.20.ep2.orig/include/linux/compat.h    2007-03-11 14:39:53.0 -0700
+++ linux-2.6.20.ep2/include/linux/compat.h 2007-03-11 14:45:07.0 -0700
@@ -225,6 +225,11 @@
return lhs->tv_nsec - rhs->tv_nsec;
 }
 
+extern int get_compat_itimerspec(struct itimerspec *dst,
+const struct compat_itimerspec __user *src);
+extern int put_compat_itimerspec(struct compat_itimerspec __user *dst,
+const struct itimerspec *src);
+
 asmlinkage long compat_sys_adjtimex(struct compat_timex __user *utp);
 
 extern int compat_printk(const char *fmt, ...);
Index: linux-2.6.20.ep2/kernel/compat.c
===
--- linux-2.6.20.ep2.orig/kernel/compat.c   2007-03-11 14:39:18.0 -0700
+++ linux-2.6.20.ep2/kernel/compat.c    2007-03-11 14:45:13.0 -0700
@@ -475,8 +475,8 @@
return min_length;
 }
 
-static int get_compat_itimerspec(struct itimerspec *dst, 
-struct compat_itimerspec __user *src)
+int get_compat_itimerspec(struct itimerspec *dst,
+ const struct compat_itimerspec __user *src)
 { 
if (get_compat_timespec(&dst->it_interval, &src->it_interval) ||
get_compat_timespec(&dst->it_value, &src->it_value))
@@ -484,8 +484,8 @@
return 0;
 } 
 
-static int put_compat_itimerspec(struct compat_itimerspec __user *dst, 
-struct itimerspec *src)
+int put_compat_itimerspec(struct compat_itimerspec __user *dst,
+ const struct itimerspec *src)
 { 
if (put_compat_timespec(&src->it_interval, &dst->it_interval) ||
put_compat_timespec(&src->it_value, &dst->it_value))

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] fix read past end of array in md/linear.c

2007-03-11 Thread Neil Brown
On Thursday March 8, [EMAIL PROTECTED] wrote:
> On Thu, Mar 08, 2007 at 12:52:04PM -0800, Andy Isaacson wrote:
> > Index: linus/drivers/md/linear.c
> > ===
> > --- linus.orig/drivers/md/linear.c  2007-03-02 11:35:55.0 -0800
> > +++ linus/drivers/md/linear.c   2007-03-07 13:10:30.0 -0800
> > @@ -188,7 +188,7 @@
> > for (i=0; i < cnt-1 ; i++) {
> > sector_t sz = 0;
> > int j;
> > -   for (j=i; i<cnt-1 && sz < min_spacing ; j++)
> > +   for (j=i; j<cnt-1 && sz < min_spacing ; j++)
> > sz += conf->disks[j].size;
> > if (sz >= min_spacing && sz < conf->hash_spacing)
> > conf->hash_spacing = sz;
> 
> Forgot to add:
> 
> Signed-off-by: Andrew Isaacson <[EMAIL PROTECTED]>

And
 Acked-by: NeilBrown <[EMAIL PROTECTED]>

Thanks!

I would have replied earlier but I wanted to make sure I understood
exactly what the possible consequences of this bug were.. and they are
quite benign.
The worst possible outcome is going so far off the end of the array
that you hit un-mapped memory and Oops.

If that doesn't happen, then the next worst option is that the hash
table is sized poorly and you spend a few more cycles than needed
choosing the target device for the request (we still always choose the
right device).
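
To make the failure mode concrete, here is a standalone sketch of a loop in
the same shape as the fixed one. It is not the md code: span_from() and its
parameters are hypothetical, and the bound is simplified to the full array
length.

```c
#include <stddef.h>

/*
 * Sum sizes[] starting at index i until the running total reaches
 * min_spacing. The bound is tested on j, the index that actually
 * advances. Testing i instead (the bug) never stops the scan at the end
 * of the array while the total stays below min_spacing.
 * Hypothetical helper, not the code from drivers/md/linear.c.
 */
unsigned long span_from(const unsigned long *sizes, size_t cnt, size_t i,
			unsigned long min_spacing)
{
	unsigned long sz = 0;
	size_t j;

	for (j = i; j < cnt && sz < min_spacing; j++)
		sz += sizes[j];
	return sz;
}
```

With the condition written against i, the second case in a call such as
span_from(sizes, 3, 1, 1000) would keep reading past sizes[2], which is the
read past the end of the array that the patch removes.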

Thanks,
NeilBrown


[patch 8/9] signalfd/timerfd v3 - timerfd wire up x86_64 arch ...

2007-03-11 Thread Davide Libenzi
This patch wires the timerfd system call to the x86_64 architecture.



Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.20.ep2/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.20.ep2.orig/arch/x86_64/ia32/ia32entry.S  2007-03-11 14:28:46.0 -0700
+++ linux-2.6.20.ep2/arch/x86_64/ia32/ia32entry.S   2007-03-11 14:33:56.0 -0700
@@ -720,4 +720,5 @@
.quad sys_getcpu
.quad sys_epoll_pwait
.quad sys_signalfd  /* 320 */
+   .quad sys_timerfd
 ia32_syscall_end:
Index: linux-2.6.20.ep2/include/asm-x86_64/unistd.h
===
--- linux-2.6.20.ep2.orig/include/asm-x86_64/unistd.h   2007-03-11 14:28:46.0 -0700
+++ linux-2.6.20.ep2/include/asm-x86_64/unistd.h    2007-03-11 14:33:56.0 -0700
@@ -621,8 +621,10 @@
 __SYSCALL(__NR_move_pages, sys_move_pages)
 #define __NR_signalfd  280
 __SYSCALL(__NR_signalfd, sys_signalfd)
+#define __NR_timerfd   281
+__SYSCALL(__NR_timerfd, sys_timerfd)
 
-#define __NR_syscall_max __NR_signalfd
+#define __NR_syscall_max __NR_timerfd
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR



Re: [patch 6/9] signalfd/timerfd v3 - timerfd core ...

2007-03-11 Thread Davide Libenzi
On Sun, 11 Mar 2007, Davide Libenzi wrote:

> This patch introduces a new system call for timers events delivered
> though file descriptors. This allows timer event to be used with
> standard POSIX poll(2), select(2) and read(2). As a consequence of
> supporting the Linux f_op->poll subsystem, they can be used with
> epoll(2) too.
> The system call is defined as:
> 
> int timerfd(int ufd, int clockid, int tmrtype, const struct timespec *utmr);
> 
> The "ufd" parameter allows for re-use (re-programming) of an existing
> timerfd w/out going through the close/open cycle (same as signalfd).
> If "ufd" is -1, s new file descriptor will be created, otherwise the
> existing "ufd" will be re-programmed.
> The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME.
> The "tmrtype" parameter allows to specify the timer type. The following
> values are supported:
> 
> TFD_TIMER_REL
> The time specified in the "utmr" parameter is a relative time
>   from NOW.
> 
> TFD_TIMER_ABS
> The timer specified in the "utmr" parameter is an absolute time.
> 
> TFD_TIMER_SEQ
> The time specified in the "utmr" parameter is an interval at
>   which a continuous clock rate will be generated.
> 

Duh! Forgot to update the documentation. Now timerfd() gets an itimerspec.
For TFD_TIMER_REL, only it_interval is valid, and it's the relative
time. For TFD_TIMER_ABS, only it_value is valid, and that's the absolute
expiry time. For TFD_TIMER_SEQ, it_value tells when the first tick
should be generated, and it_interval tells the period of the following
ticks.



- Davide



Re: Style Question

2007-03-11 Thread Jan Engelhardt

On Mar 11 2007 18:01, Kyle Moffett wrote:
> On Mar 11, 2007, at 16:41:51, Daniel Hazelton wrote:
>> On Sunday 11 March 2007 16:35:50 Jan Engelhardt wrote:
>> > On Mar 11 2007 22:15, Cong WANG wrote:
>> > > So can I say using NULL is better than 0 in kernel?
>> > 
>> > On what basis? Do you even know what NULL is defined as in (C, not
>> > C++) userspace? Think about it.
>> 
>> IIRC, the glibc and GCC headers define NULL as (void*)0  :)
>
> On the other hand when __cplusplus is defined they define it to the
> "__null" builtin, which GCC uses to give type conversion errors for
> "int foo = NULL" but not "char *foo = NULL".  A "((void *)0)"
> definition gives C++ type errors for both due to the broken C++
> void pointer conversion problems.

I think that the primary reason they use __null is so that you can
actually do

class foo *ptr = NULL;

because

class foo *ptr = (void *)0;

would throw an error or at least a warning (implicit cast from void*
to class foo*).
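
For contrast, the C side of this can be checked with a tiny sketch. It uses
"struct foo" in place of "class foo" since this is C; null_forms_agree() is a
made-up name.

```c
#include <stddef.h>

struct foo { int x; };

/*
 * In C, a void * converts implicitly to any object pointer type, so
 * both initializers below compile without a cast. C++ has no such
 * implicit conversion, so it rejects the second form.
 * Returns 1 if both pointers are null and compare equal.
 */
int null_forms_agree(void)
{
	struct foo *a = NULL;
	struct foo *b = (void *)0;

	return a == NULL && b == NULL && a == b;
}
```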


Jan
-- 


[PATCH] BUILD_BUG_ON_ZERO -> BUILD_BUG_OR_ZERO

2007-03-11 Thread Rusty Russell
BUILD_BUG_ON_ZERO is named perfectly wrong, and BUILD_BUG_ON_RETURN_ZERO
is too long.  Flip three bits, and the name is much more suitable.
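
A quick illustration of why "or zero" describes the macro better: it either
breaks the build or evaluates to 0. The sketch below copies the macro body
from the patch. DEMO_PERM and demo_perm_check are hypothetical names used only
for the example.

```c
#include <stddef.h>

/* Same body as in the patch: 0 if e is false, build error if e is true. */
#define BUILD_BUG_OR_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)

/* Hypothetical permission constant, checked the way moduleparam.h does. */
#define DEMO_PERM 0644

/* Usable in a static initializer, where a statement-style check is not. */
static const size_t demo_perm_check =
	BUILD_BUG_OR_ZERO(DEMO_PERM < 0 || DEMO_PERM > 0777 || (DEMO_PERM & 2));

/* Returns the value the macro produced at build time (always 0 if it built). */
size_t demo_perm_check_value(void)
{
	return demo_perm_check;
}
```

Changing DEMO_PERM to 0666 (which has the "2" bit set) makes the array size
negative, so the file no longer compiles.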

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r 6fb745a5bb51 include/linux/compiler-gcc.h
--- a/include/linux/compiler-gcc.h  Mon Mar 12 09:12:20 2007 +1100
+++ b/include/linux/compiler-gcc.h  Mon Mar 12 09:51:18 2007 +1100
@@ -24,7 +24,7 @@
 
 /* &a[0] degrades to a pointer: a different type from an array */
 #define __must_be_array(a) \
-  BUILD_BUG_ON_ZERO(__builtin_types_compatible_p(typeof(a), typeof(&a[0])))
+  BUILD_BUG_OR_ZERO(__builtin_types_compatible_p(typeof(a), typeof(&a[0])))
 
 #define inline inline  __attribute__((always_inline))
 #define __inline__ __inline__  __attribute__((always_inline))
diff -r 6fb745a5bb51 include/linux/kernel.h
--- a/include/linux/kernel.h    Mon Mar 12 09:12:20 2007 +1100
+++ b/include/linux/kernel.h    Mon Mar 12 09:51:25 2007 +1100
@@ -341,7 +341,7 @@ struct sysinfo {
result (of value 0 and type size_t), so the expression can be used
e.g. in a structure initializer (or where-ever else comma expressions
aren't permitted). */
-#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
+#define BUILD_BUG_OR_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
 
 /* Trap pasters of __FUNCTION__ at compile-time */
 #define __FUNCTION__ (__func__)
diff -r 6fb745a5bb51 include/linux/moduleparam.h
--- a/include/linux/moduleparam.h   Mon Mar 12 09:12:20 2007 +1100
+++ b/include/linux/moduleparam.h   Mon Mar 12 09:51:42 2007 +1100
@@ -65,7 +65,7 @@ struct kparam_array
 #define __module_param_call(prefix, name, set, get, arg, perm) \
/* Default value instead of permissions? */ \
static int __param_perm_check_##name __attribute__((unused)) =  \
-   BUILD_BUG_ON_ZERO((perm) < 0 || (perm) > 0777 || ((perm) & 2)); \
+   BUILD_BUG_OR_ZERO((perm) < 0 || (perm) > 0777 || ((perm) & 2)); \
static char __param_str_##name[] = prefix #name;\
static struct kernel_param const __param_##name \
__attribute_used__  \




Re: [PATCH] MMC: Clean up low voltage range handling

2007-03-11 Thread Pierre Ossman
Philip Langdale wrote:
> Clean up the handling of low voltage MMC cards.
>
>   
> The latest MMC and SD specs both agree that the low
> voltage range is defined as 1.65-1.95V and is signified
> by bit 7 in the OCR. An old Sandisk spec implied that
> bits 7-0 represented voltages below 2.0V in 1V increments,
> and the code was accordingly written with that expectation.
>
>   

We must not have the same specs. My simplified SD 2.0 physical spec
defines everything below bit 15 as reserved.

> This change switches the code to conform to the specs and
> fixes the SDHCI driver. It also removes the explicit
> defines for the host vdd and updates the SDHCI driver
> to convert the bit number back to the mask value
> for comparisons. Having only a single set of defines
> ensures there's nothing to get out of sync.
>
>   

Although this is a nice change, it confuses things to have two changes
in one commit. Could you split them up and base it on my "for-andrew"
branch?

Rgds

-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org



[PATCH] [RSDL] sched: rsdl accounting fixes

2007-03-11 Thread Con Kolivas
Andrew, the following patch can be rolled into the
sched-implement-rsdl-cpu-scheduler.patch file or added separately if
that's easier. All the oopses and bitmap errors of previous versions of rsdl
were fixed by v0.29 so I think RSDL is ready for another round in -mm.

Thanks.
---
Higher priority tasks should always preempt lower priority tasks if they
are queued higher than their static priority as non-rt tasks. Fix it.

The deadline mechanism can be triggered before tasks' quota ever gets added
to the runqueue priority level's quota. Add 1 to the quota in anticipation
of this.

The deadline mechanism should only be triggered if the quota is overrun
instead of as soon as the quota is expired allowing some aliasing errors in
scheduler_tick accounting. Fix that.
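
The difference between "expired" and "overrun" is one tick, which a toy model
makes visible. This is a hypothetical userspace sketch, not scheduler code;
ticks_until_rotation() only imitates the comparison changed in
task_running_tick().

```c
/*
 * Toy model (hypothetical, not kernel code): count scheduler ticks until
 * the rotation test fires for a level that starts with "quota" ticks.
 * strict == 0 models the old test (--quota <= 0),
 * strict != 0 models the new test (--quota < 0).
 */
int ticks_until_rotation(int quota, int strict)
{
	int ticks = 0;

	for (;;) {
		ticks++;
		--quota;
		if (strict ? quota < 0 : quota <= 0)
			return ticks;
	}
}
```

With a quota of 3, the old test rotates on the third tick and the new test on
the fourth, i.e. only once the quota has actually been overrun.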

Signed-off-by: Con Kolivas <[EMAIL PROTECTED]>

---
 kernel/sched.c |   24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

Index: linux-2.6.21-rc3-mm2/kernel/sched.c
===
--- linux-2.6.21-rc3-mm2.orig/kernel/sched.c    2007-03-12 08:47:43.0 +1100
+++ linux-2.6.21-rc3-mm2/kernel/sched.c 2007-03-12 09:10:33.0 +1100
@@ -96,10 +96,9 @@ unsigned long long __attribute__((weak))
  * provided it is not a realtime comparison.
  */
 #define TASK_PREEMPTS_CURR(p, curr) \
-   (((p)->prio < (curr)->prio) || (((p)->prio == (curr)->prio) && \
+   (((p)->prio < (curr)->prio) || (!rt_task(p) && \
((p)->static_prio < (curr)->static_prio && \
-   ((curr)->static_prio > (curr)->prio)) && \
-   !rt_task(p)))
+   ((curr)->static_prio > (curr)->prio))))
 
 /*
  * This is the time all tasks within the same priority round robin.
@@ -3323,7 +3322,7 @@ static inline void major_prio_rotation(s
  */
 static inline void rotate_runqueue_priority(struct rq *rq)
 {
-   int new_prio_level, remaining_quota;
+   int new_prio_level;
struct prio_array *array;
 
/*
@@ -3334,7 +3333,6 @@ static inline void rotate_runqueue_prior
if (unlikely(sched_find_first_bit(rq->dyn_bitmap) < rq->prio_level))
return;
 
-   remaining_quota = rq_quota(rq, rq->prio_level);
array = rq->active;
if (rq->prio_level > MAX_PRIO - 2) {
/* Major rotation required */
@@ -3368,10 +3366,11 @@ static inline void rotate_runqueue_prior
}
rq->prio_level = new_prio_level;
/*
-* While we usually rotate with the rq quota being 0, it is possible
-* to be negative so we subtract any deficit from the new level.
+* As we are merging to a prio_level that may not have anything in
+* its quota we add 1 to ensure the tasks get to run in schedule() to
+* add their quota to it.
 */
-   rq_quota(rq, new_prio_level) += remaining_quota;
+   rq_quota(rq, new_prio_level) += 1;
 }
 
 static void task_running_tick(struct rq *rq, struct task_struct *p)
@@ -3397,12 +3396,11 @@ static void task_running_tick(struct rq 
if (!--p->time_slice)
task_expired_entitlement(rq, p);
/*
-* The rq quota can become negative due to a task being queued in
-* scheduler without any quota left at that priority level. It is
-* cheaper to allow it to run till this scheduler tick and then
-* subtract it from the quota of the merged queues.
+* We only employ the deadline mechanism if we run over the quota.
+* It allows aliasing problems around the scheduler_tick to be
+* less harmful.
 */
-   if (!rt_task(p) && --rq_quota(rq, rq->prio_level) <= 0) {
+   if (!rt_task(p) && --rq_quota(rq, rq->prio_level) < 0) {
if (unlikely(p->first_time_slice))
p->first_time_slice = 0;
rotate_runqueue_priority(rq);


-- 
-ck


Re: [patch 6/9] signalfd/timerfd v3 - timerfd core ...

2007-03-11 Thread Nicholas Miell
On Sun, 2007-03-11 at 16:13 -0700, Davide Libenzi wrote:
> On Sun, 11 Mar 2007, Davide Libenzi wrote:
> 
> > This patch introduces a new system call for timers events delivered
> > though file descriptors. This allows timer event to be used with
> > standard POSIX poll(2), select(2) and read(2). As a consequence of
> > supporting the Linux f_op->poll subsystem, they can be used with
> > epoll(2) too.
> > The system call is defined as:
> > 
> > int timerfd(int ufd, int clockid, int tmrtype, const struct timespec *utmr);
> > 
> > The "ufd" parameter allows for re-use (re-programming) of an existing
> > timerfd w/out going through the close/open cycle (same as signalfd).
> > If "ufd" is -1, s new file descriptor will be created, otherwise the
> > existing "ufd" will be re-programmed.
> > The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME.
> > The "tmrtype" parameter allows to specify the timer type. The following
> > values are supported:
> > 
> > TFD_TIMER_REL
> > The time specified in the "utmr" parameter is a relative time
> > from NOW.
> > 
> > TFD_TIMER_ABS
> > The timer specified in the "utmr" parameter is an absolute time.
> > 
> > TFD_TIMER_SEQ
> > The time specified in the "utmr" parameter is an interval at
> > which a continuous clock rate will be generated.
> > 
> 
> Duh! Forgot to update the documenation. Now timerfd() gets an itimerspec.
> For TFD_TIMER_REL only the it_interval is valid, and it's the relative 
> time. For TFD_TIMER_ABS, only the it_value is valid, and that the expiry 
> absolute time. For TFD_TIMER_SEQ, it_value tells when the first tick 
> should be generated, and it_interval tells the period of the following 
> ticks.
> 

You should probably make it behave like the other things that use
itimerspec, just to avoid confusion -- i.e. timers are relative by
default, there's a flag that makes them absolute, they expire when
it_value specifies, and repeat every it_interval nanoseconds if
it_interval is non-zero.

i.e.

int timerfd(int ufd, int clockid, int flags, const struct timespec
*utmr);

with TFD_TIMER_ABS in flags making the timer absolute instead of
relative (and no TFD_TIMER_REL or TFD_TIMER_SEQ at all).

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: [patch 6/9] signalfd/timerfd v3 - timerfd core ...

2007-03-11 Thread Nicholas Miell
On Sun, 2007-03-11 at 16:50 -0700, Nicholas Miell wrote:
> You should probably make it behave like the other things that use
> itimerspec, just to avoid confusion -- i.e. timers are relative by
> default, there's a flag that makes them absolute, they expire when
> it_value specifies, and repeat every it_interval nanoseconds if
> it_interval is non-zero.
> 
> i.e.
> 
> int timerfd(int ufd, int clockid, int flags, const struct timespec
> *utmr);
> 
> with TFD_TIMER_ABS in flags making the timer absolute instead of
> relative (and no TFD_TIMER_REL or TFD_TIMER_SEQ at all).
> 

Sorry, that should be

int timerfd(int ufd, int clockid, int flags, const struct itimerspec
*utmr);

and TFD_TIMER_ABSTIME.

-- 
Nicholas Miell <[EMAIL PROTECTED]>



RSDL v0.30 cpu scheduler for mainline kernels

2007-03-11 Thread Con Kolivas
There are updated patches for 2.6.20, 2.6.20.2, 2.6.21-rc3 and 2.6.21-rc3-mm2 
to bring RSDL up to version 0.30 for download here:

Full patches:

http://ck.kolivas.org/patches/staircase-deadline/2.6.20-sched-rsdl-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.20.2-rsdl-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-sched-rsdl-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-mm2-rsdl-0.30.patch

incrementals:

http://ck.kolivas.org/patches/staircase-deadline/2.6.20/2.6.20.2-rsdl-0.29-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.20.2/2.6.20.2-rsdl-0.29-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3/2.6.21-rc3-rsdl-0.29-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-mm2/2.6.21-rc3-mm2-rsdl-0.29-0.30.patch

-- 
-ck


[PATCH] cosmetic adaption of drivers/ide/Kconfig concerning SATA

2007-03-11 Thread Patrick Ringl

Hello,
since Serial ATA has its own menu entry now, I guess we can change the 
description of the deprecated SATA driver as well, since the new S-ATA 
subsystem is not configured through a SCSI low-level driver anymore.


The following patch is against 2.6.21-rc3:

--- linux-2.6.20.orig/drivers/ide/Kconfig    2007-03-12 01:34:38.0 +0100
+++ linux-2.6.20/drivers/ide/Kconfig2007-03-12 01:47:10.0 +0100
@@ -103,7 +103,7 @@
---help---
  There are two drivers for Serial ATA controllers.

-  The main driver, "libata", exists inside the SCSI subsystem
+  The main driver, "libata", exists in the "Serial ATA subsystem"
  and supports most modern SATA controllers.

  The IDE driver (which you are currently configuring) supports


Since I am not subscribed to the list, I'd find it great if I were 
personally CC'ed. :-)



Best regards
Patrick




Re: [PATCH] MMC: Clean up low voltage range handling

2007-03-11 Thread Philip Langdale
Pierre Ossman wrote:
> 
> We must not have the same specs. My simplified SD 2.0 physical spec
> defines everything below bit 15 as reserved.

I was a little unclear. Both specs define bit 7 as the low-voltage
range but only the MMC spec defines the actual voltage. As such, there
is no complete definition of a low voltage SD card. That's why I added
the sanity check in the actual code.

> Although this is a nice change, it confuses things to have two changes
> in one commit. Could you split them up and base it on my "for-andrew"
> branch?

Yeah, I thought you'd think that :-) I'll post the two diffs shortly.

Thanks,

--phil


[PATCH] MMC: Consolidate voltage definitions

2007-03-11 Thread Philip Langdale
Consolidate the list of available voltages.

Up until now, a separate set of defines has been
used for host->vdd than that used for the OCR
voltage mask values. Having two sets of defines
allows them to get out of sync and the current
sets are already inconsistent with one claiming
to describe ranges and the other specific voltages.

Only the SDHCI driver uses the host->vdd defines and
it is easily fixed to use the OCR defines.
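
A minimal user-space sketch of the relationship this relies on: host->vdd
holds a bit number, and shifting 1 by that number yields the OCR range mask,
which is why the SDHCI switch can become "switch (1 << power)". The mask
values are the ones listed for the OCR defines in include/linux/mmc/host.h;
the helper names vdd_to_ocr_mask() and ocr_lowest_vdd() are hypothetical and
exist only for this illustration.

```c
#include <stdint.h>
#include <strings.h>   /* ffs() */

/* OCR range masks, as listed in include/linux/mmc/host.h at the time. */
#define MMC_VDD_17_18  0x00000020u  /* bit 5 */
#define MMC_VDD_18_19  0x00000040u  /* bit 6 */
#define MMC_VDD_20_21  0x00000100u  /* bit 8 */

/* Hypothetical helper: turn the bit number kept in ios->vdd into a mask. */
static uint32_t vdd_to_ocr_mask(unsigned short vdd)
{
	return (uint32_t)1 << vdd;
}

/* Hypothetical helper: lowest set OCR bit as a bit number, or -1 if none. */
static int ocr_lowest_vdd(uint32_t ocr)
{
	return ffs((int)ocr) - 1;
}
```

With a single set of defines, a value such as vdd == 6 can be compared
directly against MMC_VDD_18_19 after the shift, so there is no second table
to keep in sync.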

Signed-off-by: Philip Langdale <[EMAIL PROTECTED]>

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 86d0957..2f34ae3 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -668,20 +668,17 @@ static void sdhci_set_power(struct sdhci

pwr = SDHCI_POWER_ON;

-   switch (power) {
-   case MMC_VDD_170:
-   case MMC_VDD_180:
-   case MMC_VDD_190:
+   switch (1 << power) {
+   case MMC_VDD_17_18:
+   case MMC_VDD_18_19:
pwr |= SDHCI_POWER_180;
break;
-   case MMC_VDD_290:
-   case MMC_VDD_300:
-   case MMC_VDD_310:
+   case MMC_VDD_29_30:
+   case MMC_VDD_30_31:
pwr |= SDHCI_POWER_300;
break;
-   case MMC_VDD_320:
-   case MMC_VDD_330:
-   case MMC_VDD_340:
+   case MMC_VDD_32_33:
+   case MMC_VDD_33_34:
pwr |= SDHCI_POWER_330;
break;
default:
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 43bf6a5..496f540 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -16,30 +16,7 @@ struct mmc_ios {
unsigned int clock;  /* clock rate */
unsigned short  vdd;

-#define MMC_VDD_150 0
-#define MMC_VDD_155 1
-#define MMC_VDD_160 2
-#define MMC_VDD_165 3
-#define MMC_VDD_170 4
-#define MMC_VDD_180 5
-#define MMC_VDD_190 6
-#define MMC_VDD_200 7
-#define MMC_VDD_210 8
-#define MMC_VDD_220 9
-#define MMC_VDD_230 10
-#define MMC_VDD_240 11
-#define MMC_VDD_250 12
-#define MMC_VDD_260 13
-#define MMC_VDD_270 14
-#define MMC_VDD_280 15
-#define MMC_VDD_290 16
-#define MMC_VDD_300 17
-#define MMC_VDD_310 18
-#define MMC_VDD_320 19
-#define MMC_VDD_330 20
-#define MMC_VDD_340 21
-#define MMC_VDD_350 22
-#define MMC_VDD_360 23
+/* vdd stores the bit number of the selected voltage range from below. */

unsigned char   bus_mode;   /* command output mode */



Re: [patch 6/9] signalfd/timerfd v3 - timerfd core ...

2007-03-11 Thread Davide Libenzi
On Sun, 11 Mar 2007, Nicholas Miell wrote:

> You should probably make it behave like the other things that use
> itimerspec, just to avoid confusion -- i.e. timers are relative by
> default, there's a flag that makes them absolute, they expire when
> it_value specifies, and repeat every it_interval nanoseconds if
> it_interval is non-zero.
> 
> i.e.
> 
> int timerfd(int ufd, int clockid, int flags, const struct timespec
> *utmr);
> 
> with TFD_TIMER_ABS in flags making the timer absolute instead of
> relative (and no TFD_TIMER_REL or TFD_TIMER_SEQ at all).

Sounds sane to me. Will do...
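
The itimerspec convention described above can be sketched as a pure function:
given the current time, an initial value, an interval and an "absolute" flag,
compute when the n-th expiry falls. This is only a model under stated
assumptions: times are plain nanosecond counts, and struct tspec,
TFD_ABS_FLAG and nth_expiry() are hypothetical names, not part of the
proposed timerfd interface. The "zero value disarms" rule follows POSIX
timer_settime().

```c
#include <stdint.h>

#define TFD_ABS_FLAG 0x1  /* hypothetical stand-in for TFD_TIMER_ABS */

struct tspec {
	int64_t value_ns;     /* first expiry: relative to now unless absolute */
	int64_t interval_ns;  /* repeat period; 0 means one-shot */
};

/*
 * Return the time of the n-th expiry (n = 0 is the first), or -1 if the
 * timer is disarmed (value 0) or is one-shot and n > 0.
 */
static int64_t nth_expiry(int64_t now_ns, const struct tspec *ts,
			  int flags, unsigned n)
{
	int64_t first;

	if (ts->value_ns == 0)
		return -1;                  /* a zero value disarms the timer */
	first = (flags & TFD_ABS_FLAG) ? ts->value_ns : now_ns + ts->value_ns;
	if (n == 0)
		return first;
	if (ts->interval_ns == 0)
		return -1;                  /* one-shot: no further expiries */
	return first + (int64_t)n * ts->interval_ns;
}
```

One flag plus two time fields covers all three of the original tmrtype
values: relative one-shot, absolute one-shot, and periodic.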


- Davide




[PATCH] MMC: Fix handling of low-voltage cards

2007-03-11 Thread Philip Langdale
Fix handling of low voltage MMC cards.

The latest MMC and SD specs both agree that support for
low-voltage operations is indicated by bit 7 in the OCR.
The MMC spec states that the low voltage range is
1.65-1.95V while the SD spec leaves the actual voltage
range undefined - meaning that there is still no such
thing as a low voltage SD card.

However, an old Sandisk spec implied that bits 7..0
represented voltages below 2.0V in 0.1V or 0.05V increments,
and the code was accordingly written with that expectation.

This confusion meant that host drivers attempting to support
the typical low voltage (1.8V) would set the wrong bits in
the host OCR mask (usually bits 5 and/or 6), resulting in
the low voltage mode never being used.

This change corrects the low voltage range and adds sanity
checks on the reserved bits (0-6) and for SD cards that
claim to support low-voltage operations.
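
As a rough user-space sketch of the two checks (without the printk
warnings): reserved bits 0-6 are always dropped, bit 7 is dropped for SD
cards, and the result is then masked with what the host offers. The function
name sanitize_ocr() and the OCR_RESERVED_LOW macro are hypothetical; only the
0x7F mask and MMC_VDD_165_195 come from the patch.

```c
#include <stdint.h>
#include <stdbool.h>

#define OCR_RESERVED_LOW  0x0000007Fu  /* bits 0-6, reserved */
#define MMC_VDD_165_195   0x00000080u  /* bit 7, low-voltage range */

/* Hypothetical helper mirroring the checks added to mmc_select_voltage(). */
static uint32_t sanitize_ocr(uint32_t ocr, bool is_sd, uint32_t host_ocr_avail)
{
	ocr &= ~OCR_RESERVED_LOW;          /* drop voltages below the defined range */
	if (is_sd)
		ocr &= ~MMC_VDD_165_195;   /* low-voltage range is undefined for SD */
	return ocr & host_ocr_avail;       /* keep only what the host can supply */
}
```

A host that advertised bits 5 and 6 for 1.8V would never match a card setting
bit 7, which is the mismatch the sdhci.c hunks below correct.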

Signed-off-by: Philip Langdale <[EMAIL PROTECTED]>

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index c87ce56..74ebd97 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -317,6 +317,24 @@ static u32 mmc_select_voltage(struct mmc
 {
int bit;

+   /*
+* Sanity check the voltages that the card claims to
+* support.
+*/
+   if (ocr & 0x7F) {
+   printk("%s: card claims to support voltages below "
+  "the defined range. These will be ignored.\n",
+  mmc_hostname(host));
+   ocr &= ~0x7F;
+   }
+
+   if (host->mode == MMC_MODE_SD && (ocr & MMC_VDD_165_195)) {
+   printk("%s: SD card claims to support the incompletely "
+  "defined 'low voltage range'. This will be ignored.\n",
+  mmc_hostname(host));
+   ocr &= ~MMC_VDD_165_195;
+   }
+
ocr &= host->ocr_avail;

bit = ffs(ocr);
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 2f34ae3..a80c043 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -669,8 +669,7 @@ static void sdhci_set_power(struct sdhci
pwr = SDHCI_POWER_ON;

switch (1 << power) {
-   case MMC_VDD_17_18:
-   case MMC_VDD_18_19:
+   case MMC_VDD_165_195:
pwr |= SDHCI_POWER_180;
break;
case MMC_VDD_29_30:
@@ -1290,7 +1289,7 @@ static int __devinit sdhci_probe_slot(st
if (caps & SDHCI_CAN_VDD_300)
mmc->ocr_avail |= MMC_VDD_29_30|MMC_VDD_30_31;
if (caps & SDHCI_CAN_VDD_180)
-   mmc->ocr_avail |= MMC_VDD_17_18|MMC_VDD_18_19;
+   mmc->ocr_avail |= MMC_VDD_165_195;

if (mmc->ocr_avail == 0) {
printk(KERN_ERR "%s: Hardware doesn't report any "
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 496f540..2aac62a 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -65,14 +65,7 @@ struct mmc_host {
unsigned int f_max;
u32 ocr_avail;

-#define MMC_VDD_145_150 0x0001  /* VDD voltage 1.45 - 1.50 */
-#define MMC_VDD_150_155 0x0002  /* VDD voltage 1.50 - 1.55 */
-#define MMC_VDD_155_160 0x0004  /* VDD voltage 1.55 - 1.60 */
-#define MMC_VDD_160_165 0x0008  /* VDD voltage 1.60 - 1.65 */
-#define MMC_VDD_165_170 0x0010  /* VDD voltage 1.65 - 1.70 */
-#define MMC_VDD_17_18  0x0020  /* VDD voltage 1.7 - 1.8 */
-#define MMC_VDD_18_19  0x0040  /* VDD voltage 1.8 - 1.9 */
-#define MMC_VDD_19_20  0x0080  /* VDD voltage 1.9 - 2.0 */
+#define MMC_VDD_165_195 0x0080  /* VDD voltage 1.65 - 1.95 */
 #define MMC_VDD_20_21  0x0100  /* VDD voltage 2.0 ~ 2.1 */
 #define MMC_VDD_21_22  0x0200  /* VDD voltage 2.1 ~ 2.2 */
 #define MMC_VDD_22_23  0x0400  /* VDD voltage 2.2 ~ 2.3 */


Re: _proxy_pda still makes linking modules fail

2007-03-11 Thread Andi Kleen
Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> writes:
> 
> I've heard that it now builds with gcc-4.2.0 snapshots. This is strange:
> if the problem has been fixed for gcc-4.2.0, why doesn't it work for
> gcc-4.1.2? arch/i386/kernel/vmlinux.lds.S does contain _proxy_pda = 0;

Hmm, it probably needs an EXPORT_SYMBOL. The previous change only
fixed the in-kernel build.

Does it work with this patch?

-Andi

Export _proxy_pda for gcc 4.2

The symbol is not actually used, but the compiler unfortunately generates
an (unused) reference to it. This can happen even in modules, so export it.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

Index: linux/arch/i386/kernel/i386_ksyms.c
===
--- linux.orig/arch/i386/kernel/i386_ksyms.c
+++ linux/arch/i386/kernel/i386_ksyms.c
@@ -28,3 +28,5 @@ EXPORT_SYMBOL(__read_lock_failed);
 #endif
 
 EXPORT_SYMBOL(csum_partial);
+
+EXPORT_SYMBOL(_proxy_pda);
Index: linux/arch/x86_64/kernel/x8664_ksyms.c
===
--- linux.orig/arch/x86_64/kernel/x8664_ksyms.c
+++ linux/arch/x86_64/kernel/x8664_ksyms.c
@@ -61,3 +61,4 @@ EXPORT_SYMBOL(empty_zero_page);
 EXPORT_SYMBOL(init_level4_pgt);
 EXPORT_SYMBOL(load_gs_index);
 
+EXPORT_SYMBOL(_proxy_pda);


Re: _proxy_pda still makes linking modules fail

2007-03-11 Thread Jeremy Fitzhardinge
Andi Kleen wrote:
> Hmm, it probably needs a EXPORT_SYMBOL. The previous change only
> fixed the in kernel build.
>
> Does it work with this patch?
>
> -Andi
>
> Export _proxy_pda for gcc 4.2
>   

Gak.  It seemed like such a good idea at the time.

Rusty's pda->per_cpu patch will deal with this once and for all; have
you picked it up yet?

J


Re: irda rmmod lockdep trace.

2007-03-11 Thread Samuel Ortiz
Hi Dave,

On Sat, Mar 10, 2007 at 07:43:26PM +0200, Samuel Ortiz wrote:
> Hi Dave,
> 
> On Thu, Mar 08, 2007 at 05:54:36PM -0500, Dave Jones wrote:
> > modprobe irda ; rmmod irda in 2.6.21rc3 gets me the spew below..
> Well it seems that we call __irias_delete_object() from hashbin_delete(). Then
> __irias_delete_object() calls itself hashbin_delete() again. We're trying to
> get the lock recursively.
Looking at the code more carefully, this seems to be a false positive:
iriap_cleanup and __irias_delete_object are taking 2 different locks from
2 different hashbin instances. The locks belong to the same lock class but
they are hierarchically different. We need to tell the validator about it and
the following patch does that. Comments are welcome as I'm planning to push
it to netdev soon:

 include/net/irda/irqueue.h |4 +++-
 net/irda/irias_object.c|3 ++-
 net/irda/irqueue.c |   13 +
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/net/irda/irqueue.h b/include/net/irda/irqueue.h
index 335b0ac..ce9fa7c 100644
--- a/include/net/irda/irqueue.h
+++ b/include/net/irda/irqueue.h
@@ -77,7 +77,8 @@ typedef struct hashbin_t {
 } hashbin_t;
 
 hashbin_t *hashbin_new(int type);
-int  hashbin_delete(hashbin_t* hashbin, FREE_FUNC func);
+int  hashbin_delete_nested(hashbin_t* hashbin, FREE_FUNC func,
+  u8 nested_depth);
 int  hashbin_clear(hashbin_t* hashbin, FREE_FUNC free_func);
 void hashbin_insert(hashbin_t* hashbin, irda_queue_t* entry, long hashv, const char* name);
@@ -92,5 +93,6 @@ irda_queue_t *hashbin_get_first(hashbin_t *hashbin);
 irda_queue_t *hashbin_get_next(hashbin_t *hashbin);
 
 #define HASHBIN_GET_SIZE(hashbin) hashbin->hb_size
+#define hashbin_delete(hashbin, func) hashbin_delete_nested(hashbin, func, 0)
 
 #endif
diff --git a/net/irda/iriap.c b/net/irda/iriap.c
diff --git a/net/irda/irias_object.c b/net/irda/irias_object.c
index 4adaae2..4238d23 100644
--- a/net/irda/irias_object.c
+++ b/net/irda/irias_object.c
@@ -142,7 +142,8 @@ void __irias_delete_object(struct ias_object *obj)
 
kfree(obj->name);
 
-   hashbin_delete(obj->attribs, (FREE_FUNC) __irias_delete_attrib);
+   hashbin_delete_nested(obj->attribs, (FREE_FUNC) __irias_delete_attrib,
+ SINGLE_DEPTH_NESTING);
 
obj->magic = ~IAS_OBJECT_MAGIC;
 
diff --git a/net/irda/irqueue.c b/net/irda/irqueue.c
index 9266233..c669a86 100644
--- a/net/irda/irqueue.c
+++ b/net/irda/irqueue.c
@@ -378,13 +378,14 @@ EXPORT_SYMBOL(hashbin_new);
 
 
 /*
- * Function hashbin_delete (hashbin, free_func)
+ * Function hashbin_delete_nested (hashbin, free_func, nested_depth)
  *
  *Destroy hashbin, the free_func can be a user supplied special routine
  *for deallocating this structure if it's complex. If not the user can
  *just supply kfree, which should take care of the job.
  */
-int hashbin_delete( hashbin_t* hashbin, FREE_FUNC free_func)
+int hashbin_delete_nested( hashbin_t* hashbin, FREE_FUNC free_func,
+  u8 nested_depth)
 {
irda_queue_t* queue;
unsigned long flags = 0;
@@ -395,7 +396,11 @@ int hashbin_delete( hashbin_t* hashbin, FREE_FUNC free_func)
 
/* Synchronize */
if ( hashbin->hb_type & HB_LOCK ) {
-   spin_lock_irqsave(&hashbin->hb_spinlock, flags);
+   if (nested_depth > 0)
+   spin_lock_irqsave_nested(&hashbin->hb_spinlock, flags,
+nested_depth);
+   else
+   spin_lock_irqsave(&hashbin->hb_spinlock, flags);
}
 
/*
@@ -428,7 +433,7 @@ int hashbin_delete( hashbin_t* hashbin, FREE_FUNC free_func)
 
return 0;
 }
-EXPORT_SYMBOL(hashbin_delete);
+EXPORT_SYMBOL(hashbin_delete_nested);
 
 /* HASHBIN LIST OPERATIONS */
 



> I'll try to fix that soon, thanks for the report.
> 
> Cheers,
> Samuel.
> 
> 
> > Dave
> > 
> > NET: Registered protocol family 23
> > NET: Unregistered protocol family 23
> > 
> > =
> > [ INFO: possible recursive locking detected ]
> > 2.6.20-1.2966.fc7 #1
> > -
> > rmmod/16712 is trying to acquire lock:
> >  (>hb_spinlock){}, at: [] 
> > hashbin_delete+0x29/0x94 [irda]
> > 
> > but task is already holding lock:
> >  (>hb_spinlock){}, at: [] 
> > hashbin_delete+0x29/0x94 [irda]
> > 
> > other info that might help us debug this:
> > 1 lock held by rmmod/16712:
> >  #0:  (>hb_spinlock){}, at: [] 
> > hashbin_delete+0x29/0x94 [irda]
> > 
> > stack backtrace:
> > 
> > Call Trace:
> >  [] __lock_acquire+0x151/0xbc4
> >  [] :irda:__irias_delete_attrib+0x0/0x31
> >  [] lock_acquire+0x4c/0x65
> >  [] :irda:hashbin_delete+0x29/0x94
> >  [] _spin_lock_irqsave+0x2c/0x3c
> >  [] :irda:hashbin_delete+0x29/0x94
> >  [] 

Re: [RFC][PATCH 2/7] RSS controller core

2007-03-11 Thread Herbert Poetzl
On Sun, Mar 11, 2007 at 06:04:28PM +0300, Pavel Emelianov wrote:
> Herbert Poetzl wrote:
> > On Sun, Mar 11, 2007 at 12:08:16PM +0300, Pavel Emelianov wrote:
> >> Herbert Poetzl wrote:
> >>> On Tue, Mar 06, 2007 at 02:00:36PM -0800, Andrew Morton wrote:
>  On Tue, 06 Mar 2007 17:55:29 +0300
>  Pavel Emelianov <[EMAIL PROTECTED]> wrote:
> 
> > +struct rss_container {
> > +   struct res_counter res;
> > +   struct list_head page_list;
> > +   struct container_subsys_state css;
> > +};
> > +
> > +struct page_container {
> > +   struct page *page;
> > +   struct rss_container *cnt;
> > +   struct list_head list;
> > +};
>  ah. This looks good. I'll find a hunk of time to go through this
>  work and through Paul's patches. It'd be good to get both patchsets
>  lined up in -mm within a couple of weeks. But..
> >>> doesn't look so good for me, mainly because of the 
> >>> additional per page data and per page processing
> >>>
> >>> on 4GB memory, with 100 guests, 50% shared for each
> >>> guest, this basically means ~1mio pages, 500k shared
> >>> and 1500k x sizeof(page_container) entries, which
> >>> roughly boils down to ~25MB of wasted memory ...
> >>>
> >>> increase the amount of shared pages and it starts
> >>> getting worse, but maybe I'm missing something here
> >> You are. Each page has only one page_container associated
> >> with it despite the number of containers it is shared
> >> between.
> >>
>  We need to decide whether we want to do per-container memory
>  limitation via these data structures, or whether we do it via
>  a physical scan of some software zone, possibly based on Mel's
>  patches.
> >>> why not do simple page accounting (as done currently
> >>> in Linux) and use that for the limits, without
> >>> keeping the reference from container to page?
> >> As I've already answered in my previous letter simple
> >> limiting w/o per-container reclamation and per-container
> >> oom killer isn't a good memory management. It doesn't allow
> >> to handle resource shortage gracefully.
> > 
> > per container OOM killer does not require any container
> > page reference, you know _what_ tasks belong to the 
> > container, and you know their _badness_ from the normal
> > OOM calculations, so doing them for a container is really
> > straight forward without having any page 'tagging'
> 
> That's true. If you look at the patches you'll
> find out that no code in oom killer uses page 'tag'.

so why do we keep the context -> page reference
at all then?

> > for the reclamation part, please elaborate how that will
> > differ in a (shared memory) guest from what the kernel
> > currently does ...
> 
> This is all described in the code and in the
> discussions we had before.

must have missed some of them, please can you
point me to the relevant threads ...

TIA,
Herbert

> > TIA,
> > Herbert
> > 
> >> This patchset provides a more graceful way to handle this, but
> >> full memory management includes accounting of VMA-length
> >> as well (returning ENOMEM from system call) but we've decided
> >> to start with RSS.
> >>
> >>> best,
> >>> Herbert
> >>>
>  ___
>  Containers mailing list
>  [EMAIL PROTECTED]
>  https://lists.osdl.org/mailman/listinfo/containers
> >>> -
> >>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >>> the body of a message to [EMAIL PROTECTED]
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>> Please read the FAQ at  http://www.tux.org/lkml/
> >>>
> > 
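
The memory estimate quoted above can be checked with a few lines of
arithmetic. This is only a back-of-the-envelope sketch under assumptions the
thread does not state: 4 KiB pages, and a page_container of two pointers plus
a list_head (two more pointers) with no padding, i.e. 16 bytes on a 32-bit
kernel and 32 bytes on a 64-bit one. The helper names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Number of pages of page_size bytes in mem_bytes of RAM. */
static uint64_t page_count(uint64_t mem_bytes, uint64_t page_size)
{
	return mem_bytes / page_size;
}

/* Bytes spent on tracking entries of entry_size bytes each. */
static uint64_t tracking_bytes(uint64_t entries, size_t entry_size)
{
	return entries * (uint64_t)entry_size;
}
```

Under these assumptions, 1500k entries of 16 bytes come to 24,000,000 bytes,
which matches the "~25MB" figure. With one page_container per page, as in
the correction quoted above, 4 GiB gives 1,048,576 entries, or 16 MiB at
16 bytes each (32 MiB at 32 bytes).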


Re: [RFC][PATCH 2/7] RSS controller core

2007-03-11 Thread Herbert Poetzl
On Sun, Mar 11, 2007 at 04:51:11AM -0800, Andrew Morton wrote:
> > On Sun, 11 Mar 2007 15:26:41 +0300 Kirill Korotaev <[EMAIL PROTECTED]> 
> > wrote:
> > Andrew Morton wrote:
> > > On Tue, 06 Mar 2007 17:55:29 +0300
> > > Pavel Emelianov <[EMAIL PROTECTED]> wrote:
> > > 
> > > 
> > >>+struct rss_container {
> > >>+ struct res_counter res;
> > >>+ struct list_head page_list;
> > >>+ struct container_subsys_state css;
> > >>+};
> > >>+
> > >>+struct page_container {
> > >>+ struct page *page;
> > >>+ struct rss_container *cnt;
> > >>+ struct list_head list;
> > >>+};
> > > 
> > > 
> > > ah. This looks good. I'll find a hunk of time to go through
> > > this work and through Paul's patches. It'd be good to get both
> > > patchsets lined up in -mm within a couple of weeks. But..
> > >
> > > We need to decide whether we want to do per-container memory
> > > limitation via these data structures, or whether we do it via
> > > a physical scan of some software zone, possibly based on Mel's
> > > patches.
> > i.e. a separate memzone for each container?
> 
> Yep. Straightforward machine partitioning. An attractive thing is that
> it 100% reuses existing page reclaim, unaltered.
> 
> > imho memzone approach is inconvenient for pages sharing and shares
> > accounting. it also makes memory management more strict, forbids
> > overcommitting per-container etc.
> 
> umm, who said they were requirements?

well, I guess all existing OS-Level virtualizations
(Linux-VServer, OpenVZ, and FreeVPS) have stated more
than one time that _sharing_ of resources is a central
element, and one especially important resource to share
is memory (RAM) ...

if your aim is full partitioning, we do not need to
bother with OS-Level isolation, we can simply use
Paravirtualization and be done ...

> > Maybe you have some ideas how we can decide on this?
> 
> We need to work out what the requirements are before we can 
> settle on an implementation.

Linux-VServer (and probably OpenVZ):

 - shared mappings of 'shared' files (binaries 
   and libraries) to allow for reduced memory
   footprint when N identical guests are running

 - virtual 'physical' limit should not cause
   swap out when there are still pages left on
   the host system (but pages of over limit guests
   can be preferred for swapping)

 - accounting and limits have to be consistent
   and should roughly represent the actual used
   memory/swap (modulo optimizations, I can go
   into detail here, if necessary)

 - OOM handling on a per guest basis, i.e. some
   out of memory condition in guest A must not
   affect guest B

HTC,
Herbert

> Sigh.  Who is running this show?   Anyone?
> 
> You can actually do a form of overcommitment by allowing multiple
> containers to share one or more of the zones. Whether that is
> sufficient or suitable I don't know. That depends on the requirements,
> and we haven't even discussed those, let alone agreed to them.
> 
> ___
> Containers mailing list
> [EMAIL PROTECTED]
> https://lists.osdl.org/mailman/listinfo/containers


[PATCH] [REVISED] drivers/media/video/videocodec.c: check kmalloc() return value.

2007-03-11 Thread Amit Choudhary
Description: Check the return value of kmalloc() in function 
videocodec_build_table(), in file drivers/media/video/videocodec.c.

Signed-off-by: Amit Choudhary <[EMAIL PROTECTED]>

diff --git a/drivers/media/video/videocodec.c b/drivers/media/video/videocodec.c
index 290e641..f2bbd7a 100644
--- a/drivers/media/video/videocodec.c
+++ b/drivers/media/video/videocodec.c
@@ -348,6 +348,9 @@ #define LINESIZE 100
kfree(videocodec_buf);
videocodec_buf = kmalloc(size, GFP_KERNEL);
 
+   if (!videocodec_buf)
+   return 0;
+
i = 0;
i += scnprintf(videocodec_buf + i, size - 1,
  "<S>lave or attached <M>aster name  type flags    magic    ");


Re: [ckrm-tech] [PATCH 0/2] resource control file system - aka containers on top of nsproxy!

2007-03-11 Thread Paul Jackson
Sam, responding to Herbert:
> > from my personal PoV the following would be fine:
> >
> >  spaces (for the various 'spaces')
> >...
> >  container (for resource accounting/limits)
> >...
> 
> I like these a lot ...

Hmmm ... ok ...

Let me see if I understand this.

We have actors, known as threads, tasks or processes, which use things,
which are instances of such classes of things as disk partitions,
file systems, memory, cpus, and semaphores.

We assign names to these things, such as SysV id's to the semaphores,
mount points to the file systems, pathnames to files and file
descriptors to open files.  These names provide handles that
are typically more convenient and efficient to use, but alas less
persistent, less ubiquitous, and needing of some dereferencing when
used, to identify the underlying thing.

Any particular assignment of names to some of the things in a particular
class forms one namespace (aka 'space', above).  For each class of
things, a given task is assigned one such namespace.  Typically many
related tasks (such as all those of a login session or a job) will be
assigned the same set of namespaces, leading to various opportunities
for optimizing the management of namespaces in the kernel.

This assignment of names to things is neither injective nor surjective
nor even a complete map.

For example, not all file systems are mounted, certainly not all
possible mount points (all directories) serve as mount points,
sometimes the same file system is mounted in multiple places, and
sometimes more than one file system is mounted on the same mount point,
one hiding the other.

In so far as the code managing this naming is concerned, the names are
usually fairly arbitrary, except that there seems to be a tendency
toward properly virtualizing these namespaces, presenting to a task
the namespaces assigned it as if that was all there was, hiding the
presence of alternative namespaces, and intentionally not providing a
'global view' that encompasses all namespaces of a given class.

This tendency culminates in the full blown virtual machines, such as
Xen and KVM, which virtualize more or less all namespaces.

Because the essential semantics relating one namespace to another are
rather weak (the namespaces for any given class of things are or can
be pretty much independent of each other), there is a preference and
a tradition to keep such sets of namespaces a simple flat space.

Conclusions regarding namespaces, aka spaces:

A namespace provides a set of convenient handles for things of a
particular class.

For each class of things, every task gets one namespace (perhaps
a Null or Default one.)

Namespaces are partial virtualizations, the 'space of namespaces'
is pretty flat, and the assignment of names in one namespace is
pretty independent of the next.

===

That much covers what I understand (perhaps in error) of namespaces.

So what's this resource accounting/limits stuff?

I think this depends on adding one more category to our universe.

For the purposes of introducing yet more terms, I will call this
new category a "metered class."

Each time we set about to manage some resource, we tend to construct
some more elaborate "metered classes" out of the elemental classes
of things (partitions, cpus, ...) listed above.

Examples of these more elaborate metered classes include percentages
of a network's bandwidth, fractions of a node's memory (the fake numa
patch), subsets of the system's cpus and nodes (cpusets), ...

These more elaborate metered classes each have fairly 'interesting'
and specialized forms.  Their semantics are closely adapted to the
underlying class of things from which they are formed, and to the
usually challenging, often conflicting, constraints on managing the
usage of such a resource.

For example, the rules that apply to percentages of a network's
bandwidth have little in common with the rules that apply to sets of
subsets of a system's cpus and nodes.

We then attach tasks to these metered classes.  Each task is assigned
one metered instance from each metered class.  For example, each task
is assigned to a cpuset.

For metered classes that are visible across the system, we tend
to name these classes, and then use those names when attaching
tasks to them.  See for example cpusets.

For metered classes that are only privately visible within the
current context of a task, such as setrlimit, set_mempolicy and
mbind, we tend to implicitly attach each task to its current
metered class and provide it explicit means to manipulate the
individual attributes of that metered class by direct system calls.

Conclusions regarding metered classes, aka containers:

Unlike namespaces, metered classes have rich and varied semantics,
sometimes elaborate inheritance and transfer rules, and frequently
non-flat topologies.

Depending on the scope of visibility of a metered class, it may
or may not have much of a formal name space.


[PATCH] [REVISED] drivers/media/video/stv680.c: check kmalloc() return value.

2007-03-11 Thread Amit Choudhary
Description: Check the return value of kmalloc() in function 
stv680_start_stream(), in file drivers/media/video/stv680.c.

Signed-off-by: Amit Choudhary <[EMAIL PROTECTED]>

diff --git a/drivers/media/video/stv680.c b/drivers/media/video/stv680.c
index 6d1ef1e..f35c664 100644
--- a/drivers/media/video/stv680.c
+++ b/drivers/media/video/stv680.c
@@ -687,7 +687,11 @@ static int stv680_start_stream (struct u
stv680->sbuf[i].data = kmalloc (stv680->rawbufsize, GFP_KERNEL);
if (stv680->sbuf[i].data == NULL) {
PDEBUG (0, "STV(e): Could not kmalloc raw data buffer %i", i);
-   return -1;
+   for (i = i - 1; i >= 0; i--) {
+   kfree(stv680->sbuf[i].data);
+   stv680->sbuf[i].data = NULL;
+   }
+   return -ENOMEM;
}
}
 
@@ -698,15 +702,25 @@ static int stv680_start_stream (struct u
stv680->scratch[i].data = kmalloc (stv680->rawbufsize, GFP_KERNEL);
if (stv680->scratch[i].data == NULL) {
PDEBUG (0, "STV(e): Could not kmalloc raw scratch buffer %i", i);
-   return -1;
+   for (i = i - 1; i >= 0; i--) {
+   kfree(stv680->scratch[i].data);
+   stv680->scratch[i].data = NULL;
+   }
+   goto nomem_sbuf;
}
stv680->scratch[i].state = BUFFER_UNUSED;
}
 
for (i = 0; i < STV680_NUMSBUF; i++) {
urb = usb_alloc_urb (0, GFP_KERNEL);
-   if (!urb)
-   return -ENOMEM;
+   if (!urb) {
+   for (i = i - 1; i >= 0; i--) {
+   usb_kill_urb(stv680->urb[i]);
+   usb_free_urb(stv680->urb[i]);
+   stv680->urb[i] = NULL;
+   }
+   goto nomem_scratch;
+   }
 
/* sbuf is urb->transfer_buffer, later gets memcpyed to scratch */
usb_fill_bulk_urb (urb, stv680->udev,
@@ -721,6 +735,18 @@ static int stv680_start_stream (struct u
 
stv680->framecount = 0;
return 0;
+
+ nomem_scratch:
+   for (i=0; i<STV680_NUMSCRATCH; i++) {
+   kfree(stv680->scratch[i].data);
+   stv680->scratch[i].data = NULL;
+   }
+ nomem_sbuf:
+   for (i=0; i<STV680_NUMSBUF; i++) {
+   kfree(stv680->sbuf[i].data);
+   stv680->sbuf[i].data = NULL;
+   }
+   return -ENOMEM;
 }
 
 static int stv680_stop_stream (struct usb_stv *stv680)
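
The unwinding in the patch above follows the usual staged-allocation
pattern: each failure point releases what earlier stages acquired and falls
through shared labels. Here is a user-space sketch of that pattern, not the
driver code itself. It uses malloc/free instead of kmalloc/kfree, and
struct stream, NUM_SBUF, NUM_SCRATCH, limited_alloc() and stream_start() are
hypothetical names; the allocator is passed in so that a failure can be
injected.

```c
#include <stdlib.h>

#define NUM_SBUF    2   /* hypothetical counts, not the driver's values */
#define NUM_SCRATCH 3

struct stream {
	void *sbuf[NUM_SBUF];
	void *scratch[NUM_SCRATCH];
};

static int allocs_left;  /* hypothetical fault-injection knob */

/* Allocator that starts failing once allocs_left reaches zero. */
static void *limited_alloc(size_t size)
{
	if (allocs_left <= 0)
		return NULL;
	allocs_left--;
	return malloc(size);
}

/* Free the first count buffers of bufs[] and clear the slots. */
static void free_bufs(void **bufs, int count)
{
	for (int i = 0; i < count; i++) {
		free(bufs[i]);
		bufs[i] = NULL;
	}
}

/* Returns 0 on success, -1 with every buffer released on failure. */
static int stream_start(struct stream *s, size_t size, void *(*alloc)(size_t))
{
	int i;

	for (i = 0; i < NUM_SBUF; i++) {
		s->sbuf[i] = alloc(size);
		if (!s->sbuf[i])
			goto nomem_sbuf;
	}
	for (i = 0; i < NUM_SCRATCH; i++) {
		s->scratch[i] = alloc(size);
		if (!s->scratch[i])
			goto nomem_scratch;
	}
	return 0;

nomem_scratch:
	free_bufs(s->scratch, i);   /* only the scratch slots filled so far */
	i = NUM_SBUF;               /* every sbuf slot was filled by then */
nomem_sbuf:
	free_bufs(s->sbuf, i);
	return -1;
}
```

Clearing each slot after freeing it, as the patch also does, keeps a later
stop or retry path from freeing the same pointer twice.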


Re: [RFC][PATCH 1/7] Resource counters

2007-03-11 Thread Herbert Poetzl
On Sun, Mar 11, 2007 at 01:00:15PM -0600, Eric W. Biederman wrote:
> Herbert Poetzl <[EMAIL PROTECTED]> writes:
> 
> >
> > Linux-VServer does the accounting with atomic counters,
> > so that works quite fine, just do the checks at the
> > beginning of whatever resource allocation and the
> > accounting once the resource is acquired ...
> 
> Atomic operations versus locks is only a granularity thing.
> You still need the cache line which is the cost on SMP.
> 
> Are you using atomic_add_return or atomic_add_unless or 
> are you performing your actions in two separate steps 
> which is racy? What I have seen indicates you are using 
> a racy two separate operation form.

yes, this is the current implementation which
is more than sufficient, but I'm aware of the
potential issues here, and I have an experimental
patch sitting here which removes this race with
the following change:

 - doesn't store the accounted value but
   limit - accounted (i.e. the free resource)
 - uses atomic_add_return() 
 - when negative, an error is returned and
   the resource amount is added back

changes to the limit have to adjust the 'current'
value too, but that is again simple and atomic
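
A minimal user-space sketch of that scheme, using C11 atomics in place of
the kernel's atomic_t. The counter holds the free amount (limit minus
accounted); a charge subtracts first and adds the amount back if the result
went negative. Note that C11 atomic_fetch_sub() returns the old value,
whereas the kernel's atomic_add_return() returns the new one, so the sketch
subtracts once more to obtain it. All names here (struct res_pool,
res_charge() and so on) are hypothetical and not taken from Linux-VServer.

```c
#include <stdatomic.h>

struct res_pool {
	atomic_long free;   /* limit - accounted, i.e. what is still available */
};

static void res_init(struct res_pool *p, long limit)
{
	atomic_init(&p->free, limit);
}

/* Charge amount; returns 0 on success, -1 if the limit would be exceeded. */
static int res_charge(struct res_pool *p, long amount)
{
	/* fetch_sub yields the old value, so subtract again for the new one */
	long left = atomic_fetch_sub(&p->free, amount) - amount;

	if (left < 0) {
		atomic_fetch_add(&p->free, amount);   /* undo: add it back */
		return -1;
	}
	return 0;
}

static void res_uncharge(struct res_pool *p, long amount)
{
	atomic_fetch_add(&p->free, amount);
}

/* Changing the limit by delta shifts the free amount by the same delta. */
static void res_adjust_limit(struct res_pool *p, long delta)
{
	atomic_fetch_add(&p->free, delta);
}

static long res_available(struct res_pool *p)
{
	return atomic_load(&p->free);
}
```

One trade-off of this form: while a failed charge is being undone, a
concurrent charge can briefly see a lower free value and fail even though it
would have fit. It never allows the limit to be exceeded, though.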

best,
Herbert

PS: atomic_add_unless() didn't exist back then
(at least I think so) but that might be an option
too ...

> >> If we'll remove failcnt this would look like
> >>while (atomic_cmpxchg(...))
> >> which is also not that good.
> >> 
> >> Moreover - in RSS accounting patches I perform page list
> >> manipulations under this lock, so this also saves one atomic op.
> >
> > it still hasn't been shown that this kind of RSS limit
> > doesn't add big time overhead to normal operations
> > (inside and outside of such a resource container)
> >
> > note that the 'usual' memory accounting is much more
> > lightweight and serves similar purposes ...
> 
> Perhaps
> 
> Eric
> ___
> Containers mailing list
> [EMAIL PROTECTED]
> https://lists.osdl.org/mailman/listinfo/containers


Re: Style Question

2007-03-11 Thread Kyle Moffett

On Mar 11, 2007, at 19:16:59, Jan Engelhardt wrote:
> On Mar 11 2007 18:01, Kyle Moffett wrote:
>> On the other hand when __cplusplus is defined they define it to
>> the "__null" builtin, which GCC uses to give type conversion
>> errors for "int foo = NULL" but not "char *foo = NULL".  A "((void
>> *)0)" definition gives C++ type errors for both due to the broken
>> C++ void pointer conversion problems.
>
> I think that the primary reason they use __null is so that you can
> actually do
>
> class foo *ptr = NULL;
>
> because
>
> class foo *ptr = (void *)0;
>
> would throw an error or at least a warning (implicit cast from void*
> to class foo*).


Isn't that what I said? :-D

Cheers,
Kyle Moffett



Re: Style Question

2007-03-11 Thread Jan Engelhardt

On Mar 11 2007 21:27, Kyle Moffett wrote:
> On Mar 11, 2007, at 19:16:59, Jan Engelhardt wrote:
>> On Mar 11 2007 18:01, Kyle Moffett wrote:
>> > On the other hand when __cplusplus is defined they define it to the
>> > "__null" builtin, which GCC uses to give type conversion errors for
>> > "int foo = NULL" but not "char *foo = NULL".

>> I think that the primary reason they use __null is so that you can
>> actually do[...]
>
> Isn't that what I said? :-D

Ya. Though I was picking at

|"__null" builtin, which GCC uses to give type conversion errors for
|"int foo = NULL"

since C's (void *)0 would also barf when being assigned to int.
So it's not a genuine __null feature ;-)


Jan


Re: Style Question

2007-03-11 Thread Kyle Moffett

On Mar 11, 2007, at 21:32:00, Jan Engelhardt wrote:
> On Mar 11 2007 21:27, Kyle Moffett wrote:
>> On Mar 11, 2007, at 19:16:59, Jan Engelhardt wrote:
>>> On Mar 11 2007 18:01, Kyle Moffett wrote:
>>>> On the other hand when __cplusplus is defined they define it to the
>>>> "__null" builtin, which GCC uses to give type conversion errors for
>>>> "int foo = NULL" but not "char *foo = NULL".
>>>
>>> I think that the primary reason they use __null is so that you can
>>> actually do[...]
>>
>> Isn't that what I said? :-D
>
> Ya. Though I was picking at
>
> |"__null" builtin, which GCC uses to give type conversion errors for
> |"int foo = NULL"
>
> since C's (void *)0 would also barf when being assigned to int.  So
> it's not a genuine __null feature ;-)

You chopped my sentence in half! :-D  What I *really* said was:
...give type conversion errors for 'int foo = NULL' but not 'char
*foo = NULL'.

The pseudo-standard "#define NULL (0)" that the C++ standards ask for
does *NOT* give an error for "int foo = NULL;", and in C++ the
C-standard "#define NULL ((void *)0)" *does* give an error for "char
*foo = NULL;"  Ergo I think I was correct when I said "GCC uses
[__null] to give type conversion errors for <first> but not <second>"
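
The C side of this argument can be checked with a small C11 sketch. The macro and function names below (`C_STYLE_NULL`, `CXX_STYLE_NULL`, `IS_VOID_PTR`, and so on) are made up for illustration, and the block only shows C behaviour; the C++ errors discussed above cannot be demonstrated in a program that has to compile.

```c
#define C_STYLE_NULL   ((void *)0)  /* the usual C definition */
#define CXX_STYLE_NULL 0            /* a plain integer constant */

/* 1 if the expression has type void *, 0 otherwise (C11 _Generic) */
#define IS_VOID_PTR(x) _Generic((x), void *: 1, default: 0)

static int c_null_is_pointer(void)
{
	return IS_VOID_PTR(C_STYLE_NULL);
}

static int zero_is_pointer(void)
{
	return IS_VOID_PTR(CXX_STYLE_NULL);
}

static int assignments_ok(void)
{
	char *p = C_STYLE_NULL;   /* fine in C: void * converts implicitly */
	char *q = CXX_STYLE_NULL; /* fine: 0 is a null pointer constant */
	int i = CXX_STYLE_NULL;   /* also accepted silently: 0 is just an int */

	return p == q && i == 0;
}
```

The point of the sketch is that a NULL defined as plain 0 carries no pointer type at all, so nothing can flag "int foo = NULL", while the "(void *)0" form does carry a pointer type.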


Cheers,
Kyle Moffett


Re: [PATCH] Add nForce MCP61 support to i2c-nforce2

2007-03-11 Thread Petr Vandrovec

Jean Delvare wrote:
> Hi Petr,
>
> On Sat, 10 Mar 2007 09:00:03 +0100, Petr Vandrovec wrote:
>> Hello,
>>   patch below adds support for nVidia's SMBus adapter present on Gateway's GT5414E
>> motherboard (ECS's MCP61 PM-AM).  Patch is for current Linus's git tree.
>
> We already have a patch doing exactly this in -mm:
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc3/2.6.21-rc3-mm2/broken-out/jdelvare-i2c-i2c-nforce2-add-mcp61-mcp65-support.patch

Thanks.

>> 00:01.1 SMBus: nVidia Corporation MCP61 SMBus (rev a2)
>> Subsystem: Elitegroup Computer Systems Unknown device 2601
>> Flags: 66MHz, fast devsel, IRQ 10
>> I/O ports at fc00 [size=64]
>> I/O ports at 1c00 [size=64]
>> I/O ports at f400 [size=64]
>> Capabilities: [44] Power Management version 2
>
> BTW, note how the MCP61 has not 2 but 3 64-byte I/O areas declared. The
> previous chips used BAR 4 and 5, this new one additionally uses BAR 0.
> Without documentation it's hard to be sure this is a 3rd SMBus channel,
> but it sure looks so. Maybe you'll want to hack the i2c-nforce2 driver
> a bit to confirm or refute this theory.

I had the same idea as you, so I tried to modify the driver to use BAR0 as
well, and (1) i2cdump then said that nobody is there and (2) the dump of
range fc00 was quite different from ranges 1c00 and f400.

So for my hardware I'm sure that BAR0 is of no use to me - if it is a 3rd
channel then either it uses a different interface from nforce2, or nothing
is connected to it.

Petr


Re: [PATCH v2] Bitbanging i2c bus driver using the GPIO API

2007-03-11 Thread Wu, Bryan
On Sat, 2007-03-10 at 14:13 +0100, Haavard Skinnemoen wrote:
> This is a very simple bitbanging i2c bus driver utilizing the new
> arch-neutral GPIO API. Useful for chips that don't have a built-in
> i2c controller, additional i2c busses, or testing purposes.
> 

Sorry for missing this hot discussion. Your idea is exactly what I want.
Many arch-specific GPIO-based I2C adapter implementations will benefit
from this.

> To use, include something similar to the following in the
> board-specific setup code:
> 
>   #include 
> 
>   static struct i2c_gpio_platform_data i2c_gpio_data = {
>   .sda_pin= GPIO_PIN_FOO,
>   .scl_pin= GPIO_PIN_BAR,
>   };

Is this usage still right? Three flags have been added to this structure,
as below:

struct i2c_gpio_platform_data {
	unsigned int sda_pin;
	unsigned int scl_pin;
	unsigned int sda_is_open_drain:1;
	unsigned int scl_is_open_drain:1;
	unsigned int scl_is_output_only:1;
};
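
As far as the C language is concerned, the quoted usage stays valid: members that a designated initializer does not name are zero-initialized, so the three flags default to 0. A minimal userspace sketch, in which the pin numbers and the `flags_all_clear()` helper are placeholders invented for illustration:

```c
struct i2c_gpio_platform_data {
	unsigned int sda_pin;
	unsigned int scl_pin;
	unsigned int sda_is_open_drain:1;
	unsigned int scl_is_open_drain:1;
	unsigned int scl_is_output_only:1;
};

/* Placeholder pin numbers, assumed for illustration only. */
#define GPIO_PIN_FOO 17
#define GPIO_PIN_BAR 18

static struct i2c_gpio_platform_data i2c_gpio_data = {
	.sda_pin = GPIO_PIN_FOO,
	.scl_pin = GPIO_PIN_BAR,
	/* the three flag bitfields are not named, so they start out as 0 */
};

/* Hypothetical helper: true when none of the optional flags is set. */
static int flags_all_clear(const struct i2c_gpio_platform_data *d)
{
	return !d->sda_is_open_drain && !d->scl_is_open_drain &&
	       !d->scl_is_output_only;
}
```

Whether all-zero flags are the right defaults for a given board is a separate question that only the hardware can answer.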

>   static struct platform_device i2c_gpio_device = {
>           .name   = "i2c-gpio",
>           .id     = 0,
>           .dev    = {
>                   .platform_data  = &i2c_gpio_data,
>           },
>   };
> 
> Register this platform_device, set up the i2c pins as GPIO if
> required and you're ready to go.
> 
> Signed-off-by: Haavard Skinnemoen <[EMAIL PROTECTED]>
> ---
> This patch is different from the first patch in the following ways:
>   * Handles pins set up as open drain (aka multidrive) by toggling
> the output value instead of the direction
>   * Handles output-only SCL pins the same way, and also does not
> install a getscl() callback for such pins
>   * Does not add anything to include/linux/i2c-ids.h
>   * Sets the output value explicitly after changing the direction to
> output.
>   * Plugs a memory leak in remove() -- algo_data wasn't freed.
>   * Prints out the pin IDs in decimal, with an extra note when clock
> stretching isn't supported
> 
> This version has been compile-tested only. I'll give it a spin when I
> get back to work on monday.
> 
> Dave, does this address your concerns?
> 
> Haavard   

Thanks a lot,  I will drop our GPIO based I2C driver and try this one on
our platform.

> + if (!pdata->scl_is_output_only)
> + bit_data->getscl = i2c_gpio_getscl,
> +
> + bit_data->getsda= i2c_gpio_getsda,
> + bit_data->udelay= 5,/* 100 kHz */
> + bit_data->timeout   = HZ / 10,  /* 100 ms */

Can we add udelay/timeout fields to struct i2c_gpio_platform_data and
let customers choose them according to their specific requirements? We
used Kconfig to do this, but Jean and David don't like the idea :-(
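
A rough sketch of what is being asked for, written as standalone C. Everything here is hypothetical: the `udelay` and `timeout` members, the `i2c_gpio_pdata_sketch` and `bit_timing` structures, and the `pick_timing()` helper are not part of the quoted patch, and `HZ` is fixed at 100 just so the example is self-contained.

```c
#define HZ 100 /* assumed tick rate for this userspace sketch */

/* Hypothetical platform data with two optional timing fields. */
struct i2c_gpio_pdata_sketch {
	unsigned int sda_pin;
	unsigned int scl_pin;
	int udelay;  /* half clock period in us, 0 means "use default" */
	int timeout; /* clock stretching timeout in jiffies, 0 means default */
};

struct bit_timing {
	int udelay;
	int timeout;
};

/* Fall back to the values hardcoded in the quoted patch when unset. */
static struct bit_timing pick_timing(const struct i2c_gpio_pdata_sketch *pdata)
{
	struct bit_timing t;

	t.udelay = pdata->udelay ? pdata->udelay : 5;          /* 100 kHz */
	t.timeout = pdata->timeout ? pdata->timeout : HZ / 10; /* 100 ms */
	return t;
}
```

Boards that leave both fields at 0 would keep the current behaviour, so existing platform data would not need to change.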

Regards,
-Bryan Wu


Re: i.MX/MX1 SDHC fix/workaround of SD card recognition problems

2007-03-11 Thread Pavel Pisa
On Monday 12 March 2007 00:36, you wrote:
> Pavel Pisa wrote:
> > The SDHC controllers cannot process shorter transfers.
> > They have to be handled as longer ones, but in such a case a CRC
> > error is reported. There was still a case in the code
> > where this error was not ignored, as it has to be in order to process
> > these transfers.
> >
> > Signed-off-by: Pavel Pisa <[EMAIL PROTECTED]>
>
> Thanks, applied. Is this something critical that should be in 2.6.21?
>
> Rgds

Hello Pierre,

this should go into 2.6.21. I have held it back for some
months and I have discussed it in the thread
"Re: CRC Errors with SD cards in 4bits mode (on i.MXl)".
You have been CCed. This is not a solution for the observed data CRC
problem, but it solves problems with card recognition,
which has sometimes been timing sensitive.

I have sent it to Russell's patch queue with my other
MX1 fixes that I intended to be included in 2.6.21.
That was probably a mistake for this one, because it should
go through your tree. If you send it to mainline
yourself, I will discard the patch from the patch daemon.

We have spoken about MX1 SDHC maintainership.
I am attaching my MAINTAINERS entry.
I am not sure about the mailing list field there.
Do you suggest this one, ALKML or another?

Best wishes

  Pavel Pisa

--
Subject: i.MX/MX1 SDHC maintainer

I am taking responsibility for i.MX MMC driver
bugs and for coordinating the fight against the problems
of this hardware beast.

Signed-off-by: Pavel Pisa <[EMAIL PROTECTED]>

 MAINTAINERS |7 +++
 1 file changed, 7 insertions(+)

Index: linux-2.6.21-rc1/MAINTAINERS
===
--- linux-2.6.21-rc1.orig/MAINTAINERS
+++ linux-2.6.21-rc1/MAINTAINERS
@@ -1713,6 +1713,13 @@ M:   [EMAIL PROTECTED]
 L: [EMAIL PROTECTED] (subscribers-only)
 S: Maintained
 
+IMX MMC/SD HOST CONTROLLER INTERFACE DRIVER
+P: Pavel Pisa
+M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
+W: http://mmc.drzeus.cx/wiki/Controllers/Freescale/SDHC
+S: Maintained
+
 INFINIBAND SUBSYSTEM
 P: Roland Dreier
 M: [EMAIL PROTECTED]


Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd

2007-03-11 Thread Joe Jin
> 
> This is a bug actually in the megaraid.

Aha, I'll track it.

> 
> And this is a direct command submission path:  it already passed both
> online check gates in this path *after* the device was offlined, so
> adding a third won't fix this. 

Yeah, I have noticed that. However, according to the logs the device was
already offline, so why could a cmd still be sent to it? Isn't the sequence
of printks suspicious?

> single disk, so the I/O was definitely bound for sda?  Secondly, can you
> reproduce with a modern (2.6.20) kernel.  Your trace strongly suggests
> that the device came back online for some reason and then the megaraid
> driver died.

It's hard to update the kernel because this is a production system, and we
cannot debug on the box :(

I don't know if you have noticed, but the logs come from diskdump. Could it
be caused by diskdump?

Thanks,
Joe


Re: [patch 2/3] fs: introduce perform_write aop

2007-03-11 Thread Mark Fasheh
On Sat, Mar 10, 2007 at 09:25:41AM +, Christoph Hellwig wrote:
> On Fri, Mar 09, 2007 at 03:33:01PM -0800, Mark Fasheh wrote:
> > ->kernel_write() as opposed to genericizing ->perform_write() would be fine
> > with me. Just so long as we get rid of ->prepare_write and ->commit_write in
> > that other kernel code doesn't call them directly. That interface just
> > doesn't work for Ocfs2.
> 
> It doesn't work for any filesystem that needs slightly fancy locking.
> That and the reason that's an interface that doesn't fit into our
> layering is why I want to get rid of it.  Note that fops->kernel_write
> might in fact use ->perform_write with an actor as Nick suggested.
> I'm not quite sure how it'll look like - I'd rather take care of the
> buffered write path first and then handle this issue once the first
> changes have stabilized.
> 
> > Right now I've got Ocfs2 implementing its own lowest-level buffered write
> > code - think generic_file_buffered_write() replacement for Ocfs2. With some
> > duplicated code above that layer. What's nice is that I can abstract away
> > the "copy data into some target pages" bits such that the majority of that
> > code is re-usable for ocfs2's splice write operation. I'm not sure we could
> > have that low a level of abstraction for anything above the individual file
> > system though, which also has to deal with non-kernel writes. That's
> > where a ->kernel_write() might come in handy.
> 
> Why do you need your own low level buffered write functionality?  As in
> past times when filesystems want to come up I'd like to have a very
> good explanation on why you think it's needed and whether we shouldn't
> improve the generic buffered write code instead.

Fair enough - I personally tried everything I could before coming to the
conclusion that for the time being, Ocfs2 should have a separate write path.

As you know, I've been adding sparse file support for Ocfs2. Putting aside
all the reasons to have real support for sparse files (as opposed to zeroing
allocated regions), the tree code changes alone have gotten us 90% of the way
to supporting unwritten extents (much like xfs does).

Ocfs2 supports atomic data allocation units ('clusters', to use an
overloaded term) which can range in size from 4k to 1 meg. This means that
for allocating writes on page size < cluster size file systems, we have to
zero pages adjacent to the one being written so that a re-read doesn't
return dirty data. This alone requires page locking which we can't
realistically achieve via ->prepare_write() and ->commit_write(). I believe
NTFS has a similar restriction, which has led to their own file write path.
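
To make the arithmetic concrete, here is a small standalone sketch of which pages share a cluster with the page being written. It is not taken from ocfs2; `struct page_range` and `cluster_pages()` are hypothetical names, and sizes are passed as shifts (12 for 4k, 20 for 1 meg).

```c
#include <stdint.h>

/* Hypothetical result type: inclusive range of page indexes. */
struct page_range {
	uint64_t first;
	uint64_t last;
};

/*
 * Return the pages that fall in the same cluster as page_index.
 * When clusters are not larger than pages, only the page itself.
 */
static struct page_range cluster_pages(uint64_t page_index,
				       unsigned int page_shift,
				       unsigned int cluster_shift)
{
	struct page_range r = { page_index, page_index };

	if (cluster_shift > page_shift) {
		uint64_t pages_per_cluster =
			1ULL << (cluster_shift - page_shift);

		/* round down to the first page of the cluster */
		r.first = page_index & ~(pages_per_cluster - 1);
		r.last = r.first + pages_per_cluster - 1;
	}
	return r;
}
```

With 4k pages and 1 meg clusters there are 256 pages per cluster, so an allocating write to page 300 touches pages 256 through 511, every one of which may need locking and zeroing.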

So, page locking was definitely the straw that broke the camel's back. Some
other things which were awkward or slightly less critically broken than the
page locking:

Since ocfs2 has a rather large (compared to a local file system) context to
build up during an allocating write, it became uncomfortable to pass that
around ->prepare_write() and ->commit_write() without putting that context
on our struct inode and protecting it with a lock. And since the existing
interfaces were so rigid, it actually required a lot more context to be
passed around than in my current code.

There's also the cluster lock / page lock inversion which we have to deal
with (it gets even worse if we fault in pages in the middle of the user copy
for a write). Granted, we fixed a lot of that before merging, but allocating
in write means taking even more cluster locks and I don't really feel
comfortable nesting so many of those within the page locks.

Finally, we get to the optimization problem - writing stuff one page at a
time. To be fair, my current stuff doesn't do a very good job of optimizing
the amount of data written in a given pass, but the groundwork is there to
easily write at least one cluster's worth of user data at a time. My priority
has been mostly to stabilize it as opposed to performance tuning.

So, quite possibly, I overstated what Ocfs2 was doing earlier - we still
make use of as much generic code as we can. The O_DIRECT path for instance
wasn't touched. Ocfs2 still makes use of block_commit_write(), the standard
jbd mechanisms for ordered data mode, and though we got rid of
block_prepare_write() (for zeroing reasons), what we do is a much simpler
version.

By the way, the code in question can be found in the sparse_files branch of
ocfs2.git:

http://git.kernel.org/?p=linux/kernel/git/mfasheh/ocfs2.git;a=log;h=sparse_files

Your review has been extremely useful in the past, so I welcome any comments
you might have.

Though it's getting close to being put in ALL (for a spin in -mm), it's
definitely a work in progress branch. There are 3 patches to generic code
which I need to push out for review (it's pretty much just exporting symbols
which we'd need in any case). Also, some of the bug fixes and feature
adjustments need to get folded back into their respective patches.

> This codepath is so nasty that any 

Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd

2007-03-11 Thread Joe Jin
> The 2.6.9 base is very old in mainline terms.  Are you sure the bug hasn't
> been fixed in mainline by other means?

I cannot confirm whether it has been fixed in the latest kernel. The server is
a production system, so it's hard to debug it and try to reproduce.



Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd

2007-03-11 Thread Andrew Morton
> On Mon, 12 Mar 2007 10:52:22 +0800 Joe Jin <[EMAIL PROTECTED]> wrote:
> > The 2.6.9 base is very old in mainline terms.  Are you sure the bug hasn't
> > been fixed in mainline by other means?
> 
> I cannot confirm whether it has been fixed in the latest kernel. The server is
> a production system, so it's hard to debug it and try to reproduce.

Well.  That makes it hard to run tests, but perhaps it can be determined
from code review..


RE: [git patches] libata fixes

2007-03-11 Thread Linus Torvalds


On Sun, 11 Mar 2007, Paul Rolland wrote:
>
> My machine is having two problems : the one you are describing above,
> which is due to a SIL controller being connected to one port of the ICH7
> (at least, it seems to be), and probing it times out, but nothing is
> connected to it.

Ok, so that's just a message irritation, not actually bothersome 
otherwise?

> The second problem is a Jmicron363 controller that is failing to detect
> the DVD-RW that is connected, unless I use the irqpoll option as Tejun has
> suggested.

.. and this one has never worked without irqpoll?

> But, as you suggest it, I'm adding pci=nomsi to the command line
> rebooting... no change for this part of the problem.
> 
> OK, the /proc/interrupt for this config, and the dmesg attached.
> 
> 3 [23:22] [EMAIL PROTECTED]:~> cat /proc/interrupts 
>CPU0   CPU1   
>   0: 297549  0   IO-APIC-edge  timer
>   1:  7  0   IO-APIC-edge  i8042
>   4: 13  0   IO-APIC-edge  serial
>   6:  5  0   IO-APIC-edge  floppy
>   8:  1  0   IO-APIC-edge  rtc
>   9:  0  0   IO-APIC-fasteoi   acpi
>  12:126  0   IO-APIC-edge  i8042
>  14:   8313  0   IO-APIC-edge  libata
>  15:  0  0   IO-APIC-edge  libata
>  16:  0  0   IO-APIC-fasteoi   eth1, libata

So it's the irq16 one that is the Jmicron controller and just isn't 
getting any interrupts?

Since all the other interrupts work (and MSI worked for other 
controllers), I don't think it's interrupt-routing related. Especially as 
MSI shouldn't even care about things like that.

And since it all works when "irqpoll" is used, that implies that the 
*only* thing that is broken is literally irq delivery.

Is there possibly some jmicron-specific "enable interrupts" bit? 

> PS : I'd like to try 2.6.21-rc3, but it seems that this is breaking my
> config : disk naming is no more the same, and I end up with a panic
> Warning: unable to open an initial console
> though i've been compiling with the same .config I was using for 2.6.21-rc2

Gaah. Can you get a log through serial console or netconsole to see what 
changed?

Linus


Re: SATA resume slowness, e1000 MSI warning

2007-03-11 Thread Michael S. Tsirkin
> Quoting Eric W. Biederman <[EMAIL PROTECTED]>:
> Subject: Re: SATA resume slowness, e1000 MSI warning
> 
> "Michael S. Tsirkin" <[EMAIL PROTECTED]> writes:
> 
> > OK I guess. I gather we assume writing read-only registers has no side 
> > effects?
> > Are there rumors circulating wrt to these?
> 
> I haven't heard anything about that, and if we are writing the same value back
> it should be pretty safe.
> 
> I have heard it asserted that at least one version of the pci spec
> only required 32bit accesses to be supported by the hardware.  One of
> these days I will have to look that and see if it is true.

Maybe. But surely before the PCI-X days.

> I do know
> it can be weird for hardware developers to support multiple kinds of
> decode.

Is this the only place where Linux uses 
pci_read_config_word/pci_read_config_dword?
I think such hardware will be pretty much DOA on all OS-es.  Why don't we wait
and see whether someone reports a broken config?

> As I recall for pci and pci-x at the hardware level the only
> difference in between 32bit transactions and smaller ones is the state
> of the byte-enable lines.

True, and same holds for PCI-Express.

So let's assume hardware implements RO correctly but ignores the BE bits -
nothing bad happens then, right?
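
For illustration, this is how a 16-bit config access maps onto the aligned 32-bit register that contains it. The helpers are hypothetical userspace code, not the kernel's accessors; they only show which byte lanes are involved if a device ignores the byte enables and always handles the whole dword.

```c
#include <stdint.h>

/* Pick the 16-bit word at config offset `where` out of its aligned dword. */
static uint16_t word_from_dword(uint32_t dword, unsigned int where)
{
	/* offsets 0 and 2 within the dword select the low and high half */
	return (uint16_t)(dword >> ((where & 2) * 8));
}

/* Replace only that word, keeping the other half of the dword as read. */
static uint32_t merge_word(uint32_t dword, unsigned int where, uint16_t val)
{
	unsigned int shift = (where & 2) * 8;

	return (dword & ~(0xffffu << shift)) | ((uint32_t)val << shift);
}
```

If the untouched half is written back with the value that was just read, a read-only register in that half sees its own value again, which matches the "writing the same value back" case discussed above.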

-- 
MST


Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-11 Thread Al Boldi
Con Kolivas wrote:
> On Monday 12 March 2007 08:52, Con Kolivas wrote:
> > And thank you! I think I know what's going on now. I think each rotation
> > is followed by another rotation before the higher priority task is
> > getting a look in in schedule() to even get quota and add it to the
> > runqueue quota. I'll try a simple change to see if that helps. Patch
> > coming up shortly.
>
> Can you try the following patch and see if it helps. There's also one
> minor preemption logic fix in there that I'm planning on including.
> Thanks!

Applied on top of v0.28 mainline, and there is no difference.

What's it look like on your machine?


Thanks!

--
Al



Re: [PATCH 1/2] xfs: use xfs_get_buf_noaddr for iclogs

2007-03-11 Thread David Chinner
On Wed, Mar 07, 2007 at 11:13:14AM +0100, Christoph Hellwig wrote:
> xfs_buf_get_noaddr.  There's a subtle change because
> xfs_buf_get_empty returns the buffer locked, but xfs_buf_get_noaddr
> returns it unlocked.  From my auditing and testing nothing in the
> log I/O code cares about this distinction, but I'd be happy if
> someone could try to prove this independently.

Looks safe to me - we initialise all the fields in the xfs_buf_t
when we allocate out of the slab, so it doesn't really matter what
state the buffer is in when we free it.

OTOH, all other buffers are supposed to be locked when under I/O.
This change makes a special case for the log buffers, and I'd prefer
not to have to remember that this behaviour changed for log buffers
at some point in time.

I suggest that adding:

> - iclog->hic_data = (xlog_in_core_2_t *)
> -   kmem_zalloc(iclogsize, KM_SLEEP | KM_LARGE);
> -
>   iclog->ic_prev = prev_iclog;
>   prev_iclog = iclog;
> +
> + bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp);
> + XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone);
> + XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb);
> + XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1);

+   XFS_BUF_PSEMA(bp, PRIBIO);

> + iclog->ic_bp = bp;
> + iclog->hic_data = bp->b_addr;
> +
>   log->l_iclog_bak[i] = (xfs_caddr_t)&(iclog->ic_header);
>  
>   head = &iclog->ic_header;

To lock the buffer should be added here. That way we don't change
any semantics of the code at all.

> @@ -1216,11 +1221,6 @@
>   INT_SET(head->h_fmt, ARCH_CONVERT, XLOG_FMT);
>   memcpy(&head->h_fs_uuid, &mp->m_sb.sb_uuid, sizeof(uuid_t));
>  
> - bp = xfs_buf_get_empty(log->l_iclog_size, mp->m_logdev_targp);
> - XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone);
> - XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb);
> - XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1);
> - iclog->ic_bp = bp;
>  
>   iclog->ic_size = XFS_BUF_SIZE(bp) - log->l_iclog_hsize;
>   iclog->ic_state = XLOG_STATE_ACTIVE;
> @@ -1229,7 +1229,6 @@
>   iclog->ic_datap = (char *)iclog->hic_data + log->l_iclog_hsize;
>  
>   ASSERT(XFS_BUF_ISBUSY(iclog->ic_bp));
> - ASSERT(XFS_BUF_VALUSEMA(iclog->ic_bp) <= 0);

And this assert can then stay...

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: Null pointer in autofs4 (_spin_lock) in 2.6.21-rc2

2007-03-11 Thread Ian Kent
On Sun, 11 Mar 2007, Thomas Renninger wrote:

> On Thu, 2007-03-08 at 19:39 +0900, Ian Kent wrote:
> > On Thu, 2007-03-08 at 11:12 +0100, Thomas Renninger wrote:
> > > On Thu, 2007-03-08 at 01:28 -0800, Andrew Morton wrote:
> > > > > On Thu, 08 Mar 2007 09:57:56 +0100 Thomas Renninger <[EMAIL 
> > > > > PROTECTED]> wrote:
> > > > > I saw this happening several times on 2.6.21-rc2.
> > > > > Tell me how I can help...
> > > > > Some nfs partitions are mounted via nfs using autofs.
> > > > > It takes some hours to run into this:
> > > > > 
> > > > > Unable to handle kernel NULL pointer dereference at 0008
> > > > > RIP:
> > > > >  [] _spin_lock+0x0/0xf
> > > > > PGD 1dde23067 PUD 1d3060067 PMD 0
> > > > > Oops: 0002 [1] SMP
> > > > > CPU 3
> > > > > Modules linked in: autofs4 nfs lockd nfs_acl sunrpc asus_acpi 
> > > > > af_packet
> > > > > tg3 ipv6 button battery ac ext2 mbcache loop dm_mod floppy parport_pc 
> > > > > lp
> > > > > parport reiserfs pata_amd edd fan thermal sg processor sata_sil libata
> > > > > amd74xx sd_mod scsi_mod ide_disk ide_core
> > > > > Pid: 11373, comm: touch Not tainted 2.6.21-rc2-default #6
> > > > > RIP: 0010:[]  [] 
> > > > > _spin_lock+0x0/0xf
> > > > > RSP: 0018:8101c50a5a50  EFLAGS: 00010202
> > > > > RAX: 8100eb8916f8 RBX: 81010007dcd8 RCX: 8100ea45b280
> > > > > RDX: 10e58c2e RSI: 810163bf9e50 RDI: 0008
> > > > > RBP: 810163bf9e50 R08: 8101c50a4000 R09: 8101c50a5ea8
> > > > > R10: 81010003fca8 R11: 802299ad R12: 
> > > > > R13: 8100eb891680 R14: 0005 R15: 8101c50a5b48
> > > > > FS:  2b8ae744bf20() GS:81010016a7c0()
> > > > > knlGS:b7bd88d0
> > > > > CS:  0010 DS:  ES:  CR0: 8005003b
> > > > > CR2: 0008 CR3: 0001b925f000 CR4: 06e0
> > > > > Process touch (pid: 11373, threadinfo 8101c50a4000, task
> > > > > 8101b78bd100)
> > > > > Stack:  882d5f38 8101c50a5ea8 8100ec8df4b0
> > > > > 00d0
> > > > >  8100eb8916f8 810163bf9efc 10e58c2eea45b220 8100ea45b220
> > > > >  810163bf9e50 8100ea45b220 8100ec8df4b0 8100ec8df568
> > > > > Call Trace:
> > > > >  [] :autofs4:autofs4_lookup+0xcb/0x311
> > > > >  [] do_lookup+0xc4/0x1ae
> > > > >  [] __link_path_walk+0x8ec/0xd9d
> > > > >  [] :sunrpc:rpcauth_lookup_credcache+0x12e/0x24a
> > > > >  [] link_path_walk+0x58/0xe0
> > > > >  [] __strncpy_from_user+0x17/0x41
> > > > >  [] __link_path_walk+0x5c9/0xd9d
> > > > >  [] link_path_walk+0x58/0xe0
> > > > >  [] __strncpy_from_user+0x17/0x41
> > > > >  [] do_path_lookup+0x1b6/0x217
> > > > >  [] __path_lookup_intent_open+0x56/0x97
> > > > >  [] open_namei+0xa9/0x64c
> > > > >  [] do_page_fault+0x45e/0x7ad
> > > > >  [] do_filp_open+0x1c/0x38
> > > > >  [] __strncpy_from_user+0x17/0x41
> > > > >  [] do_sys_open+0x44/0xc1
> > > > >  [] system_call+0x7e/0x83
> > > > > 
> > > > > 
> > > > > Code: f0 ff 0f 79 09 f3 90 83 3f 00 7e f9 eb f2 c3 f0 81 2f 00 00
> > > > > RIP  [] _spin_lock+0x0/0xf
> > > > >  RSP 
> > > > > CR2: 0008
> > > > 
> > > > I assume 2.6.20 is OK?
> > > Can't say for sure, I expect yes.
> > > Set up with 2.6.20 now and let it run for a day or two.
> > > Maybe someone has worked in that area and has an idea meanwhile...
> > 
> > Do we have any idea on what was being opened here?
> > Might be useful to see the autofs maps if possible.
> I sent that stuff to Ian...
> 
> However, I couldn't run into that with 2.6.20 and also not with
> *2.6.21-rc3* (yet). Maybe it already got fixed?
> Machine still running, I'll report back if this should happen again.

I suspect the problem is still present but maybe a bit hard to trigger.
I'm not convinced this is needed but it is the only thing that looks at 
all suspicious so if (when) you see this again could you give the patch 
below a try please.

Ian

---

--- linux-2.6.21-rc3/fs/autofs4/root.c.sbi-check  2007-03-12 13:29:42.0 +0900
+++ linux-2.6.21-rc3/fs/autofs4/root.c  2007-03-12 13:30:04.0 +0900
@@ -503,6 +503,9 @@ static struct dentry *autofs4_lookup_unh
const unsigned char *str = name->name;
struct list_head *p, *head;
 
+   if (!sbi)
+   return NULL;
+
spin_lock(&dcache_lock);
spin_lock(&sbi->rehash_lock);
head = &sbi->rehash_list;


Re: [PATCH] kthread_should_stop_check_freeze (was: Re: [PATCH -mm 3/7] Freezer: Remove PF_NOFREEZE from rcutorture thread)

2007-03-11 Thread Paul E. McKenney
On Sun, Mar 11, 2007 at 06:49:08PM +0100, Rafael J. Wysocki wrote:
> On Saturday, 3 March 2007 18:32, Oleg Nesterov wrote:
> > On 03/02, Paul E. McKenney wrote:
> > >
> > > On Sat, Mar 03, 2007 at 02:33:37AM +0300, Oleg Nesterov wrote:
> > > > On 03/02, Paul E. McKenney wrote:
> > > > >
> > > > > One way to embed try_to_freeze() into kthread_should_stop() might be
> > > > > as follows:
> > > > > 
> > > > >   int kthread_should_stop(void)
> > > > >   {
> > > > >   if (kthread_stop_info.k == current)
> > > > >   return 1;
> > > > >   try_to_freeze();
> > > > >   return 0;
> > > > >   }
> > > > 
> > > > I think this is dangerous. For example, worker_thread() will probably
> > > > need some special actions after return from refrigerator. Also, a kernel
> > > > thread may check kthread_should_stop() in the place where 
> > > > try_to_freeze()
> > > > is not safe.
> > > > 
> > > > Perhaps we should introduce a new helper which does this.
> > > 
> > > Good point -- the return value from try_to_freeze() is lost if one uses
> > > the above approach.  About one third of the calls to try_to_freeze()
> > > in 2.6.20 pay attention to the return value.
> > > 
> > > One approach would be to have a kthread_should_stop_nofreeze() for those
> > > cases, and let the default be to try to freeze.
> > 
> > I personally think we should do the opposite, add 
> > kthread_should_stop_check_freeze()
> > or something. kthread_should_stop() is like signal_pending(), we can use
> > it under spin_lock (and it is probably used this way by some out-of-tree
> > driver). The new helper is obviously "might_sleep()".
> 
> Something like this, perhaps:

Looks good to me!  The other kthread_should_stop() calls in
rcutorture.c should also become kthread_should_stop_check_freeze().

Acked-by: Paul E. McKenney <[EMAIL PROTECTED]>

>  include/linux/kthread.h |1 +
>  kernel/kthread.c|   16 
>  kernel/rcutorture.c |5 ++---
>  3 files changed, 19 insertions(+), 3 deletions(-)
> 
> Index: linux-2.6.21-rc3-mm2/kernel/kthread.c
> ===
> --- linux-2.6.21-rc3-mm2.orig/kernel/kthread.c  2007-03-08 21:58:48.0 +0100
> +++ linux-2.6.21-rc3-mm2/kernel/kthread.c  2007-03-11 18:32:59.0 +0100
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  /*
> @@ -60,6 +61,21 @@ int kthread_should_stop(void)
>  }
>  EXPORT_SYMBOL(kthread_should_stop);
> 
> +/**
> + * kthread_should_stop_check_freeze - check if the thread should return now and
> + * if not, check if there is a freezing request pending for it.
> + */
> +int kthread_should_stop_check_freeze(void)
> +{
> +	might_sleep();
> +	if (kthread_stop_info.k == current)
> +		return 1;
> +
> +	try_to_freeze();
> +	return 0;
> +}
> +EXPORT_SYMBOL(kthread_should_stop_check_freeze);
> +
>  static void kthread_exit_files(void)
>  {
>   struct fs_struct *fs;
> Index: linux-2.6.21-rc3-mm2/include/linux/kthread.h
> ===
> --- linux-2.6.21-rc3-mm2.orig/include/linux/kthread.h	2007-02-04 19:44:54.0 +0100
> +++ linux-2.6.21-rc3-mm2/include/linux/kthread.h	2007-03-11 18:37:10.0 +0100
> @@ -29,5 +29,6 @@ struct task_struct *kthread_create(int (
>  void kthread_bind(struct task_struct *k, unsigned int cpu);
>  int kthread_stop(struct task_struct *k);
>  int kthread_should_stop(void);
> +int kthread_should_stop_check_freeze(void);
> 
>  #endif /* _LINUX_KTHREAD_H */
> Index: linux-2.6.21-rc3-mm2/kernel/rcutorture.c
> ===
> --- linux-2.6.21-rc3-mm2.orig/kernel/rcutorture.c	2007-03-11 11:39:06.0 +0100
> +++ linux-2.6.21-rc3-mm2/kernel/rcutorture.c	2007-03-11 18:45:00.0 +0100
> @@ -540,10 +540,9 @@ rcu_torture_writer(void *arg)
>   }
>   rcu_torture_current_version++;
>   oldbatch = cur_ops->completed();
> - try_to_freeze();
> - } while (!kthread_should_stop() && !fullstop);
> + } while (!kthread_should_stop_check_freeze() && !fullstop);
>   VERBOSE_PRINTK_STRING("rcu_torture_writer task stopping");
> - while (!kthread_should_stop())
> + while (!kthread_should_stop_check_freeze())
>   schedule_timeout_uninterruptible(1);
>   return 0;
>  }
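
For readers following the argument about why the freezing variant is a separate helper, here is a minimal user-space sketch of the two checks. Everything in it is a hypothetical stand-in (the model_* names and the flag variables are not kernel APIs); it only illustrates that the plain check has no side effects, while the combined check may "sleep" in the refrigerator and so must not be called under a spinlock.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for kernel state; not kernel APIs. */
bool stop_requested;
bool freeze_requested;
int times_frozen;

/* Models try_to_freeze(): returns 1 if the caller went through the "refrigerator". */
int model_try_to_freeze(void)
{
	if (!freeze_requested)
		return 0;
	times_frozen++;			/* the real call would sleep here */
	freeze_requested = false;
	return 1;
}

/* Models kthread_should_stop(): a pure check, usable where sleeping is forbidden. */
int model_should_stop(void)
{
	return stop_requested;
}

/* Models the proposed helper: checks for stop first, otherwise may freeze. */
int model_should_stop_check_freeze(void)
{
	if (stop_requested)
		return 1;
	model_try_to_freeze();
	return 0;
}
```

A worker loop in this model would look like `while (!model_should_stop_check_freeze()) do_work();`, which is the shape the rcutorture hunk above moves to.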
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: RSDL v0.30 cpu scheduler for ... 2.6.18.8 kernel

2007-03-11 Thread Con Kolivas
On Monday 12 March 2007 19:17, Vincent Fortier wrote:
> > There are updated patches for 2.6.20, 2.6.20.2, 2.6.21-rc3 and
> > 2.6.21-rc3-mm2 to bring RSDL up to version 0.30 for download here:
> >
> > Full patches:
> >
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.20-sched-rsdl-0.30.patch
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.2-rsdl-0.30.patch
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-sched-rsdl-0.30.patch
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-mm2-rsdl-0.30.patch
> >
> > incrementals:
> >
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.20/2.6.20.2-rsdl-0.29-0.30.patch
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.2/2.6.20.2-rsdl-0.29-0.30.patch
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3/2.6.21-rc3-rsdl-0.29-0.30.patch
> > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-mm2/2.6.21-rc3-mm2-rsdl-0.29-0.30.patch

> And here are the backported RSDL 0.30 patches in case any of you would
> still be running an older 2.6.18.8 kernel ...

Thanks, your efforts are appreciated as it would take me quite a while to do a 
variety of backports that people are already requesting.

> Just for info, version 0.30 seems around 2 seconds faster than 0.26-0.29
> versions at boot time.  I used to have around 2-3 seconds of difference
> between a vanilla and a rsdl patched kernel.  Now it looks more like 5
> seconds faster!  Wow.. nice work CK!
>
> 2.6.18.8 vanilla kernel:
> [   68.514248] ACPI: Power Button (CM) [PWRB]

> 2.6.18.8-rsdl-0.30:
> [   63.739337] ACPI: Power Button (CM) [PWRB]

Indeed there's almost 5 seconds difference there. To be honest, the boot time 
speedups are an unexpected bonus, but everyone seems to be reporting them on 
all flavours so perhaps all those timeout related driver setups are 
inadvertently benefiting.

> - vin

Thanks

-- 
-ck


Re: 2.6.21-rc3-mm1

2007-03-11 Thread Paul E. McKenney
On Sun, Mar 11, 2007 at 06:02:31PM +0100, Michal Piotrowski wrote:
> On 10/03/07, Paul E. McKenney <[EMAIL PROTECTED]> wrote:
> >On Fri, Mar 09, 2007 at 06:18:51PM -0800, Andrew Morton wrote:
> >> > On Thu, 08 Mar 2007 21:50:29 +0100 Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> >> > Andrew Morton wrote:
> >> > > Temporarily at
> >> > >
> >> > >   http://userweb.kernel.org/~akpm/2.6.21-rc3-mm1/
> >> > >
> >> > > Will appear later at
> >> > >
> >> > >   ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc3/2.6.21-rc3-mm1/
> >> > >
> >> >
> >> > cpu_hotplug (AutoTest) hangs at this
> >> >
> >> > =
> >> > [ INFO: possible recursive locking detected ]
> >> > 2.6.21-rc3-mm1 #2
> >> > -
> >> > sh/7213 is trying to acquire lock:
> >> >  (sched_hotcpu_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
> >> >
> >> > but task is already holding lock:
> >> >  (sched_hotcpu_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
> >> >
> >> > other info that might help us debug this:
> >> > 4 locks held by sh/7213:
> >> >  #0:  (cpu_add_remove_lock){--..}, at: [] mutex_lock+0x1c/0x1f
> >> >  #1:  (sched_hotcpu_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
> >> >  #2:  (cache_chain_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
> >> >  #3:  (workqueue_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
> >>
> >> That's pretty useless, isn't it?  We need to know the mutex_lock() caller
> >> here.
> >>
> >> > stack backtrace
> >> >  [] show_trace_log_lvl+0x1a/0x2f
> >> >  [] show_trace+0x12/0x14
> >> >  [] dump_stack+0x16/0x18
> >> >  [] __lock_acquire+0x1aa/0xceb
> >> >  [] lock_acquire+0x79/0x93
> >> >  [] __mutex_lock_slowpath+0x107/0x349
> >> >  [] mutex_lock+0x1c/0x1f
> >> >  [] sched_getaffinity+0x14/0x91
> >> >  [] __synchronize_sched+0x11/0x5f
> >> >  [] detach_destroy_domains+0x2c/0x30
> >> >  [] update_sched_domains+0x27/0x3a
> >> >  [] notifier_call_chain+0x2b/0x4a
> >> >  [] __raw_notifier_call_chain+0x19/0x1e
> >> >  [] _cpu_down+0x70/0x282
> >> >  [] cpu_down+0x26/0x38
> >> >  [] store_online+0x27/0x5a
> >> >  [] sysdev_store+0x20/0x25
> >> >  [] sysfs_write_file+0xc1/0xe9
> >> >  [] vfs_write+0xd1/0x15a
> >> >  [] sys_write+0x3d/0x72
> >> >  [] syscall_call+0x7/0xb
> >> >
> >> > l *0xc033883a
> >> > 0xc033883a is in mutex_lock (/mnt/md0/devel/linux-mm/kernel/mutex.c:92).
> >> > 87  /*
> >> > 88   * The locking fastpath is the 1->0 transition from
> >> > 89   * 'unlocked' into 'locked' state.
> >> > 90   */
> >> > 91  __mutex_fastpath_lock(&lock->count, __mutex_lock_slowpath);
> >> > 92  }
> >> > 93
> >> > 94  EXPORT_SYMBOL(mutex_lock);
> >> > 95
> >> > 96  static void fastcall noinline __sched
> >> >
> >> > I didn't test other -mm's with this test.
> >> >
> >> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3-mm1/console.log
> >> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3-mm1/mm-config
> >>
> >> I can't immediately spot the bug.  Probably it's caused by rcu-preempt's
> >> changes to synchronize_sched(): that function now does a heap more than it
> >> used to, including taking sched_hotcpu_mutex.
> >>
> >> So, what to do about this.  Paul, I'm thinking that I should drop
> >> rcu-preempt for now - I don't think we ended up being able to identify any
> >> particular benefit which it brings to current mainline, and I suspect that
> >> things will become simpler if/when we start using the process freezer for
> >> CPU hotplug.
> >
> >It certainly makes sense for Michal to try backing out rcu-preempt using
> >your broken-out list of patches.  If that makes the problem go away,
> 
> Problem is caused by rcu-preempt.patch.

OK, clearly we need to fix this.  You might be right about the freezer
code having to go in first, Andrew -- will see!

Thanx, Paul

> >then I would certainly have a hard time arguing with you.  We are working
> >on getting measurements showing benefit of rcu-preempt, but aren't there
> >yet.
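
The lockdep report quoted above boils down to one task taking sched_hotcpu_mutex and then reaching a path (sched_getaffinity() via __synchronize_sched() in the backtrace) that takes the same mutex again. Below is a rough single-threaded model of that shape. The toy_mutex type and the inner_path/outer_path names are hypothetical and not kernel code; where a real non-recursive mutex would deadlock, the toy lock returns -1 so the problem is observable.

```c
#include <stddef.h>

/* Hypothetical toy mutex that remembers its owner; not the kernel's struct mutex. */
struct toy_mutex {
	const void *owner;	/* NULL when unlocked */
};

/*
 * Returns 0 on success, or -1 if the caller already owns the lock
 * (the case where a real non-recursive mutex would deadlock).
 * Single-threaded model: contention between tasks is not simulated.
 */
int toy_mutex_lock(struct toy_mutex *m, const void *task)
{
	if (m->owner == task)
		return -1;
	m->owner = task;
	return 0;
}

void toy_mutex_unlock(struct toy_mutex *m)
{
	m->owner = NULL;
}

/* Stand-in for the inner path that takes the lock on its own. */
int inner_path(struct toy_mutex *hotcpu, const void *task)
{
	int ret = toy_mutex_lock(hotcpu, task);

	if (ret == 0)
		toy_mutex_unlock(hotcpu);
	return ret;
}

/* Stand-in for the outer path that holds the lock across the inner call. */
int outer_path(struct toy_mutex *hotcpu, const void *task)
{
	int ret;

	if (toy_mutex_lock(hotcpu, task) != 0)
		return -1;
	ret = inner_path(hotcpu, task);	/* recursive acquisition happens here */
	toy_mutex_unlock(hotcpu);
	return ret;
}
```

Called on its own, inner_path() succeeds; called from outer_path() it fails, which is the pattern lockdep flags as "possible recursive locking".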




Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-11 Thread Con Kolivas
On Monday 12 March 2007 15:42, Al Boldi wrote:
> Con Kolivas wrote:
> > On Monday 12 March 2007 08:52, Con Kolivas wrote:
> > > And thank you! I think I know what's going on now. I think each
> > > rotation is followed by another rotation before the higher priority
> > > task is getting a look in in schedule() to even get quota and add it to
> > > the runqueue quota. I'll try a simple change to see if that helps.
> > > Patch coming up shortly.
> >
> > Can you try the following patch and see if it helps. There's also one
> > minor preemption logic fix in there that I'm planning on including.
> > Thanks!
>
> Applied on top of v0.28 mainline, and there is no difference.
>
> What's it look like on your machine?

The higher priority one always gets 6-7ms whereas the lower priority one runs 
6-7ms and then one larger perfectly bound expiration amount. Basically 
exactly as I'd expect. The higher priority task gets precisely RR_INTERVAL 
maximum latency whereas the lower priority task gets RR_INTERVAL min and full 
expiration (according to the virtual deadline) as a maximum. That's exactly 
how I intend it to work. Yes I realise that the max latency ends up being 
longer intermittently on the niced task but that's -in my opinion- perfectly 
fine as a compromise to ensure the nice 0 one always gets low latency.

Eg:
nice 0 vs nice 10

nice 0:
pid 6288, prio   0, out for    7 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms
pid 6288, prio   0, out for    6 ms

nice 10:
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for   66 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms
pid 6290, prio  10, out for    6 ms

exactly as I'd expect. If you want fixed latencies _of niced tasks_ in the 
presence of less niced tasks you will not get them with this scheduler. What 
you will get, though, is a perfectly bound relationship knowing exactly what 
the maximum latency will ever be.
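
As a small worked check of the numbers in the listing above, the sketch below takes the "out for" samples as arrays and computes the worst observed latency per task. The array and function names are made up for illustration; only the sample values come from the listing.

```c
#include <stddef.h>

/* Samples copied from the listing above, in milliseconds. */
const int nice0_ms[]  = { 7, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6 };
const int nice10_ms[] = { 6, 6, 6, 6, 6, 6, 6, 6, 6, 66, 6, 6, 6 };

/* Returns the largest sample, i.e. the worst observed latency. */
int max_latency_ms(const int *samples, size_t n)
{
	int max = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (samples[i] > max)
			max = samples[i];
	return max;
}
```

On these samples the nice 0 task never waits more than 7 ms, while the nice 10 task sees a single 66 ms gap among otherwise 6 ms waits.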

Thanks for the test case. It's interesting and nice that it confirms this 
scheduler works as I expect it to.

-- 
-ck


Re: Style Question

2007-03-11 Thread Cong WANG

2007/3/12, Jan Engelhardt <[EMAIL PROTECTED]>:
>
> On Mar 11 2007 22:15, Cong WANG wrote:
> >
> > I have a question about coding style in linux kernel. In
> > Documentation/CodingStyle, it is said that "Linux style for comments is
> > the C89 "/* ... */" style. Don't use C99-style "// ..." comments."
> > _But_ I see a lot of '//' style comments in current kernel code.
> >
> > Which is wrong? The documentation or the code, or neither? And why?
>
> The code. And because it's not always reviewed but silently pushed.
>
> > Another question is about NULL. AFAIK, in user space, using NULL is
> > better than directly using 0 in C. In kernel, I know it used its own
> > NULL, which may be defined as ((void*)0), but it's _still_ different
> > from raw zero.
>
> In what way?

The following code is picked from drivers/kvm/kvm_main.c:

static struct kvm_vcpu *vcpu_load(struct kvm *kvm, int vcpu_slot)
{
	struct kvm_vcpu *vcpu = &kvm->vcpus[vcpu_slot];

	mutex_lock(&vcpu->mutex);
	if (unlikely(!vcpu->vmcs)) {
		mutex_unlock(&vcpu->mutex);
		return 0;
	}
	return kvm_arch_ops->vcpu_load(vcpu);
}

Obviously, it used 0 rather than NULL when returning a pointer to
indicate an error. Should we fix such issue?

> >So can I say using NULL is better than 0 in kernel?
>
> On what basis? Do you even know what NULL is defined as in
> (C, not C++) userspace? Think about it.

I think it's more clear to indicate we are using a pointer rather than
an integer when we use NULL in kernel. But in userspace, using NULL is
for portability of the program, although most (*just* most, NOT all) of
NULL's definitions are ((void*)0). ;-)


Re: RSDL for 2.6.21-rc3- 0.29

2007-03-11 Thread Gene Heskett
On Sunday 11 March 2007, Con Kolivas wrote:
>On Sunday 11 March 2007 15:03, Matt Mackall wrote:
>> On Sat, Mar 10, 2007 at 10:01:32PM -0600, Matt Mackall wrote:
>> > On Sun, Mar 11, 2007 at 01:28:22PM +1100, Con Kolivas wrote:
>> > > Ok I don't think there's any actual accounting problem here per se
>> > > (although I did just recently post a bugfix for rsdl however I
>> > > think that's unrelated). What I think is going on in the ccache
>> > > testcase is that all the work is being offloaded to kernel threads
>> > > reading/writing to/from the filesystem and the make is not getting
>> > > any actual cpu time.
>> >
>> > I don't see significant system time while this is happening.
>>
>> Also, it's running pretty much entirely out of page cache so there
>> wouldn't be a whole lot for kernel threads to do.
>
>Well I can't reproduce that behaviour here at all whether from disk or
> the pagecache with ccache, so I'm not entirely sure what's different at
> your end. However both you and the other person reporting bad behaviour
> were using ATI drivers. That's about the only commonality? I wonder if
> they do need to yield... somewhat instead of not at all.

I hate to say it Con, but this one seems to have broken the amanda-tar 
symbiosis.

I haven't tried a plain 21-rc3, so the problem may exist there, and in 
fact it did for 21-rc1, but I don't recall if it was true for -rc2.  But 
I will have a plain 21-rc3 running by tomorrow night's amanda run to test.

What happens is that when amanda tells tar to do a level 1 or 2, tar still 
thinks it's doing a level 0.  The net result is that the tape is filled 
completely and amanda does an EOT exit in about 10 of my 42 dle's.  This 
is tar-1.15-1 for fedora core 6.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
While it may be true that a watched pot never boils, the one you don't
keep an eye on can make an awful mess of your stove.
-- Edward Stevenson


Re: Style Question

2007-03-11 Thread Jan Engelhardt

On Mar 12 2007 13:37, Cong WANG wrote:
>
> The following code is picked from drivers/kvm/kvm_main.c:
>
> static struct kvm_vcpu *vcpu_load(struct kvm *kvm, int vcpu_slot)
> {
> 	struct kvm_vcpu *vcpu = &kvm->vcpus[vcpu_slot];
>
> 	mutex_lock(&vcpu->mutex);
> 	if (unlikely(!vcpu->vmcs)) {
> 		mutex_unlock(&vcpu->mutex);
> 		return 0;
> 	}
> 	return kvm_arch_ops->vcpu_load(vcpu);
> }
>
> Obviously, it used 0 rather than NULL when returning a pointer to
> indicate an error. Should we fix such issue?

Indeed. If it was for me, something like that should throw a compile error.

>>[...]
> I think it's more clear to indicate we are using a pointer rather than
> an integer when we use NULL in kernel. But in userspace, using NULL is
> for portability of the program, although most (*just* most, NOT all) of
> NULL's definitions are ((void*)0). ;-)

NULL has the same bit pattern as the number zero. (I'm not saying the bit
pattern is all zeroes. And I am not even sure if NULL ought to have the same
pattern as zero.) So C++ could use (void *)0, if it would let itself :p


>
>

Jan
-- 


Re: RSDL for 2.6.21-rc3- 0.29

2007-03-11 Thread Con Kolivas
Hi Gene.

On Monday 12 March 2007 16:38, Gene Heskett wrote:
> I hate to say it Con, but this one seems to have broken the amanda-tar
> symbiosis.
>
> I haven't tried a plain 21-rc3, so the problem may exist there, and in
> fact it did for 21-rc1, but I don't recall if it was true for -rc2.  But
> I will have a plain 21-rc3 running by tomorrow night's amanda run to test.
>
> What happens is that when amanda tells tar to do a level 1 or 2, tar still
> thinks it's doing a level 0.  The net result is that the tape is filled
> completely and amanda does an EOT exit in about 10 of my 42 dle's.  This
> is tar-1.15-1 for fedora core 6.

I'm sorry but I have to say I have no idea what any of this means. I gather 
you're making an association between some application combination failing and 
RSDL cpu scheduler. Unfortunately the details of what the problem is, or how 
the cpu scheduler is responsible, escape me :(

-- 
-ck


Re: Style Question

2007-03-11 Thread Nicholas Miell
On Mon, 2007-03-12 at 06:40 +0100, Jan Engelhardt wrote:
> On Mar 12 2007 13:37, Cong WANG wrote:
> >
> > The following code is picked from drivers/kvm/kvm_main.c:
> >
> > static struct kvm_vcpu *vcpu_load(struct kvm *kvm, int vcpu_slot)
> > {
> > 	struct kvm_vcpu *vcpu = &kvm->vcpus[vcpu_slot];
> >
> > 	mutex_lock(&vcpu->mutex);
> > 	if (unlikely(!vcpu->vmcs)) {
> > 		mutex_unlock(&vcpu->mutex);
> > 		return 0;
> > 	}
> > 	return kvm_arch_ops->vcpu_load(vcpu);
> > }
> >
> > Obviously, it used 0 rather than NULL when returning a pointer to
> > indicate an error. Should we fix such issue?
> 
> Indeed. If it was for me, something like that should throw a compile error.
> 
> >>[...]
> > I think it's more clear to indicate we are using a pointer rather than
> > an integer when we use NULL in kernel. But in userspace, using NULL is
> > for portability of the program, although most (*just* most, NOT all) of
> > NULL's definitions are ((void*)0). ;-)
> 
> NULL has the same bit pattern as the number zero. (I'm not saying the bit
> pattern is all zeroes. And I am not even sure if NULL ought to have the same
> pattern as zero.) So C++ could use (void *)0, if it would let itself :p

Not necessarily. You can use 0 at the source level, but the compiler has
to convert it to the actual NULL pointer bit pattern, whatever it may
be.

In C++, NULL is typically defined to 0 (with no void* cast) by most
compilers because 0 (and only 0) can be implicitly converted to a null
pointer of any pointer type without a cast.

GCC introduced the __null extension so that NULL still works correctly
in C++ when passed to a varargs function on 64-bit platforms.

(This just works in C because C makes NULL ((void*)0), which is thus the
right size. In C++, the 0 ends up being an int instead of a pointer when
passed to a varargs function, and things tend to blow up when they read
the garbage high bits. Of course, nobody else does this, so you still
have to use (void*)NULL to be portable.)
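
To illustrate the varargs point, here is a minimal sketch of a function that walks its arguments until it reaches a null-pointer sentinel. The name count_strings is hypothetical. The explicit cast on the sentinel at the call site is what keeps the call portable, since a bare 0 would be passed as an int.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts string arguments up to a null-pointer sentinel. */
int count_strings(const char *first, ...)
{
	va_list ap;
	const char *s;
	int n = 0;

	va_start(ap, first);
	/* Each va_arg() reads a full pointer, so the sentinel must be pointer-sized. */
	for (s = first; s != NULL; s = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return n;
}
```

A portable call looks like `count_strings("a", "b", (const char *)NULL)`. Writing the sentinel as a plain `0` would compile, but on a platform where int is narrower than a pointer the callee could read garbage in the upper bits.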

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: PROBLEM: "Make nenuconfig" does not save parameters.

2007-03-11 Thread Cyrill Gorcunov

On 3/11/07, Sam Ravnborg <[EMAIL PROTECTED]> wrote:
[..snip..]
| > To make the conversion we should consider renaming from
| > current "Load alternate" to "Open config file..."
| > and likewise "Save alternate" to "Save config file as..."
| >
| > Comments?
| >
| >Sam
[..snip...]

I think that is excellent. (Actually I can't test it now but the idea
is just perfect)


Re: Style Question

2007-03-11 Thread Randy.Dunlap
On Mon, 12 Mar 2007, Jan Engelhardt wrote:

>
> On Mar 12 2007 13:37, Cong WANG wrote:
> >
> > The following code is picked from drivers/kvm/kvm_main.c:
> >
> > static struct kvm_vcpu *vcpu_load(struct kvm *kvm, int vcpu_slot)
> > {
> > 	struct kvm_vcpu *vcpu = &kvm->vcpus[vcpu_slot];
> >
> > 	mutex_lock(&vcpu->mutex);
> > 	if (unlikely(!vcpu->vmcs)) {
> > 		mutex_unlock(&vcpu->mutex);
> > 		return 0;
> > 	}
> > 	return kvm_arch_ops->vcpu_load(vcpu);
> > }
> >
> > Obviously, it used 0 rather than NULL when returning a pointer to
> > indicate an error. Should we fix such issue?
>
> Indeed. If it was for me, something like that should throw a compile error.

At least it does throw a sparse warning, and yes, it should
be fixed.

> >>[...]
> > I think it's more clear to indicate we are using a pointer rather than
> > an integer when we use NULL in kernel. But in userspace, using NULL is
> > for portability of the program, although most (*just* most, NOT all) of
> > NULL's definitions are ((void*)0). ;-)
>
> NULL has the same bit pattern as the number zero. (I'm not saying the bit
> pattern is all zeroes. And I am not even sure if NULL ought to have the same
> pattern as zero.) So C++ could use (void *)0, if it would let itself :p
>
>
> >
> >
>
> Jan
>

-- 
~Randy


Re: RSDL for 2.6.21-rc3- 0.29

2007-03-11 Thread Gene Heskett
On Monday 12 March 2007, Con Kolivas wrote:
>Hi Gene.
>
>On Monday 12 March 2007 16:38, Gene Heskett wrote:
>> I hate to say it Con, but this one seems to have broken the amanda-tar
>> symbiosis.
>>
>> I haven't tried a plain 21-rc3, so the problem may exist there, and in
>> fact it did for 21-rc1, but I don't recall if it was true for -rc2. 
>> But I will have a plain 21-rc3 running by tomorrow night's amanda run
>> to test.
>>
>> What happens is that when amanda tells tar to do a level 1 or 2, tar
>> still thinks it's doing a level 0.  The net result is that the tape is
>> filled completely and amanda does an EOT exit in about 10 of my 42
>> dle's.  This is tar-1.15-1 for fedora core 6.
>
>I'm sorry but I have to say I have no idea what any of this means. I
> gather you're making an association between some application
> combination failing and RSDL cpu scheduler. Unfortunately the details
> of what the problem is, or how the cpu scheduler is responsible, escape
> me :(

I have another backup running right now, after building a plain 
2.6.21-rc3, and rebooting just now for the test.  I don't think it's the 
scheduler itself, but is something post 2.6.20 that is messing with tar's 
mind and making it think the files it just read to do the estimate phase 
are all new, so even a level 2 is in effect a level 0.  I'll have an 
answer in about an hour, but it's also 2:36am here and I'm headed for the 
rack to get some zzz's.  So I'll report in the morning as to whether or 
not this backup ran as it was supposed to.  I have a feeling it's not 
going to though.


-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
"When it comes to humility, I'm the greatest."
-- Bullwinkle Moose



[PATCH]Replace 0 with NULL when returning a pointer

2007-03-11 Thread Cong WANG

Use NULL to indicate we are returning a pointer rather than an integer
and to eliminate some sparse warnings.

Signed-off-by: Cong WANG <[EMAIL PROTECTED]>

---
--- drivers/kvm/kvm_main.c.orig 2007-03-11 21:41:23.0 +0800
+++ drivers/kvm/kvm_main.c  2007-03-12 14:26:17.0 +0800
@@ -205,7 +205,7 @@ static struct kvm_vcpu *vcpu_load(struct
 	mutex_lock(&vcpu->mutex);
 	if (unlikely(!vcpu->vmcs)) {
 		mutex_unlock(&vcpu->mutex);
-		return 0;
+		return NULL;
 	}
 	return kvm_arch_ops->vcpu_load(vcpu);
 }
@@ -799,7 +799,7 @@ struct kvm_memory_slot *gfn_to_memslot(s
 		    && gfn < memslot->base_gfn + memslot->npages)
 			return memslot;
 	}
-	return 0;
+	return NULL;
 }
 EXPORT_SYMBOL_GPL(gfn_to_memslot);


[PATCH]Replace 0 with NULL when returning a pointer

2007-03-11 Thread Cong WANG

Use NULL to indicate we are returning a pointer rather than an integer
and to eliminate some sparse warnings.

Signed-off-by: Cong WANG <[EMAIL PROTECTED]>
---
--- drivers/kvm/vmx.c.orig  2007-03-11 21:41:03.0 +0800
+++ drivers/kvm/vmx.c   2007-03-12 14:25:11.0 +0800
@@ -98,7 +98,7 @@ static struct vmx_msr_entry *find_msr_en
 	for (i = 0; i < vcpu->nmsrs; ++i)
 		if (vcpu->guest_msrs[i].index == msr)
 			return &vcpu->guest_msrs[i];
-	return 0;
+	return NULL;
 }

static void vmcs_clear(struct vmcs *vmcs)
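
For anyone reading the patch without the kernel tree at hand, the pattern being cleaned up is a lookup that returns a pointer on success and a null pointer on failure. The following is a stand-alone sketch of that pattern; struct entry and find_entry are hypothetical names loosely modelled on the lookup in the hunk above, not the driver's actual definitions.

```c
#include <stddef.h>

/* Hypothetical table entry, loosely modelled on the MSR lookup in the patch. */
struct entry {
	unsigned int index;
	unsigned long long data;
};

/* Returns a pointer to the matching entry, or NULL if there is none. */
struct entry *find_entry(struct entry *table, size_t n, unsigned int index)
{
	size_t i;

	for (i = 0; i < n; ++i)
		if (table[i].index == index)
			return &table[i];
	return NULL;	/* a pointer-typed "not found", not the integer 0 */
}
```

Both `return 0;` and `return NULL;` compile to the same thing here. The difference is only that NULL tells the reader (and sparse) that a pointer is meant.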


Re: [patch 3/8] per backing_dev dirty and writeback page accounting

2007-03-11 Thread David Chinner
On Tue, Mar 06, 2007 at 07:04:46PM +0100, Miklos Szeredi wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> 
> [[EMAIL PROTECTED]: bugfix]
> 
> Miklos Szeredi <[EMAIL PROTECTED]>:
> 
> Changes:
>  - updated to apply after clear_page_dirty_for_io() race fix
> 
> This is needed for
> 
>  - balance_dirty_pages() deadlock fix
>  - fuse dirty page accounting
> 
> I have no idea how serious the scalability problems with this are.  If
> they are serious, different solutions can probably be found for the
> above, but this is certainly the simplest.

Atomic operations to a single per-backing device from all CPUs at once?
That's a pretty serious scalability issue and it will cause a major
performance regression for XFS.

I'd call this a showstopper right now - maybe you need to look at
something like the ZVC code that Christoph Lameter wrote, perhaps?
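
A rough model of the alternative being suggested: instead of every CPU hitting one shared counter, each CPU accumulates a small local delta and only folds it into the shared total once it crosses a threshold. This is only a sketch of the general idea behind differential counters, not the kernel's ZVC implementation; the type, the function names, NR_CPUS_MODEL and FOLD_THRESHOLD are all made up, and the model is single-threaded.

```c
#define NR_CPUS_MODEL	4
#define FOLD_THRESHOLD	8

/* Hypothetical sharded counter: one shared total plus per-CPU deltas. */
struct sharded_counter {
	long global;			/* shared; would need an atomic update in real code */
	long local[NR_CPUS_MODEL];	/* private to each CPU; cheap to update */
};

/* Adds delta on behalf of one CPU, touching the shared total only rarely. */
void counter_add(struct sharded_counter *c, int cpu, long delta)
{
	long v = c->local[cpu] + delta;

	if (v > FOLD_THRESHOLD || v < -FOLD_THRESHOLD) {
		c->global += v;		/* the only write to shared state */
		v = 0;
	}
	c->local[cpu] = v;
}

/* Fast read: may lag by up to FOLD_THRESHOLD per CPU. */
long counter_read_approx(const struct sharded_counter *c)
{
	return c->global;
}

/* Slow read: sums the outstanding per-CPU deltas as well. */
long counter_read_exact(const struct sharded_counter *c)
{
	long sum = c->global;
	int i;

	for (i = 0; i < NR_CPUS_MODEL; i++)
		sum += c->local[i];
	return sum;
}
```

The trade-off is that the fast read is only approximate, so any code that compares the per-device count against a limit has to tolerate that error or pay for the exact sum.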

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2

2007-03-11 Thread Radoslaw Szkodzinski

On 3/11/07, Gene Heskett <[EMAIL PROTECTED]> wrote:
> On Sunday 11 March 2007, Mike Galbraith wrote:
>
> Just to comment, I've been running one of the patches between 20-ck1 and
> this latest one, which is building as I type, but I also run gkrellm
> here, version 2.2.9.
>
> Since I have been running this middle of this series patch, something is
> killing gkrellm about once a day, and there is nothing in the logs to
> indicate a problem.  I see a blink out of the corner of my eye, and it's
> gone.  And it always starts right back up from a kmenu click.
>
> No idea if anyone else is experiencing this or not.
>
> --
> Cheers, Gene

I've had such an issue with 0.20 or something. Sometimes, the
xfce4-panel would disappear (die) when I displayed its menu.
Very rare issue.

Doesn't happen with 0.28 anyway. :-) Which looks really good, though
I'll update to 0.30.


Re: [PATCH] two more device ids for dm9601 usbnet driver

2007-03-11 Thread Peter Korsgaard
> "Jon" == Jon Dowland <[EMAIL PROTECTED]> writes:

Hi,

 Jon> This patch for the linux-usb-devel tree adds two more
 Jon> product ids to the dm9601 driver. These ids were found on
 Jon> rebadged dm9601 devices in the wild.

 Jon> Signed-off-by: Jon Dowland <[EMAIL PROTECTED]>

Acked-by: Peter Korsgaard <[EMAIL PROTECTED]>

-- 
Bye, Peter Korsgaard


Re: 2.6.21-rc2-mm2: drivers/net/wireless/libertas/debugfs.c addr bogosity

2007-03-11 Thread Tony Breeds
On Fri, Mar 09, 2007 at 09:14:29AM -0800, Randy Dunlap wrote:
 
> Good to use FIELD_SIZEOF(),

Thanks.

> but in general, we prefer to use it
> directly, not in yet another wrapper.

I left the item_{size,addr} in place as it seemed to make the item[]
more compact.

I'm not certain using the FIELD_SIZEOF() macro directly is a win.

From: Tony Breeds <[EMAIL PROTECTED]>

Cleanup drivers/net/wireless/libertas/debugfs.c to use standard kernel macros 
and functions.

Signed-off-by: Tony Breeds <[EMAIL PROTECTED]>

---
only compile tested on x86

 drivers/net/wireless/libertas/debugfs.c |   56 +++
 1 files changed, 12 insertions(+), 44 deletions(-)

diff --git a/drivers/net/wireless/libertas/debugfs.c b/drivers/net/wireless/libertas/debugfs.c
index 3ad1e03..8b0e3ec 100644
--- a/drivers/net/wireless/libertas/debugfs.c
+++ b/drivers/net/wireless/libertas/debugfs.c
@@ -1771,58 +1771,26 @@ void libertas_debugfs_remove_one(wlan_private *priv)
 }
 
 /* debug entry */
-
-#define item_size(n) (sizeof ((wlan_adapter *)0)->n)
-#define item_addr(n) ((u32) &((wlan_adapter *)0)->n)
-
 struct debug_data {
char name[32];
u32 size;
u32 addr;
 };
 
-/* To debug any member of wlan_adapter, simply add one line here.
- */
+/* To debug any member of wlan_adapter, simply add a record here. */
 static struct debug_data items[] = {
-   {"intcounter", item_size(intcounter), item_addr(intcounter)},
-   {"psmode", item_size(psmode), item_addr(psmode)},
-   {"psstate", item_size(psstate), item_addr(psstate)},
+   { .name = "intcounter",
+ .size = FIELD_SIZEOF(wlan_adapter, intcounter),
+ .addr = offsetof(wlan_adapter, intcounter) },
+   { .name = "psmode",
+ .size = FIELD_SIZEOF(wlan_adapter, psmode),
+ .addr = offsetof(wlan_adapter, psmode) },
+   { .name = "psstate",
+ .size = FIELD_SIZEOF(wlan_adapter, psstate),
+ .addr = offsetof(wlan_adapter, psstate) },
 };
 
-static int num_of_items = sizeof(items) / sizeof(items[0]);
-
-/**
- *  @brief convert string to number
- *
- *  @param s  pointer to numbered string
- *  @return   converted number from string s
- */
-static int string_to_number(char *s)
-{
-   int r = 0;
-   int base = 0;
-
-   if ((strncmp(s, "0x", 2) == 0) || (strncmp(s, "0X", 2) == 0))
-   base = 16;
-   else
-   base = 10;
-
-   if (base == 16)
-   s += 2;
-
-   for (s = s; *s != 0; s++) {
-   if ((*s >= 48) && (*s <= 57))
-   r = (r * base) + (*s - 48);
-   else if ((*s >= 65) && (*s <= 70))
-   r = (r * base) + (*s - 55);
-   else if ((*s >= 97) && (*s <= 102))
-   r = (r * base) + (*s - 87);
-   else
-   break;
-   }
-
-   return r;
-}
+static int num_of_items = ARRAY_SIZE(items);
 
 /**
  *  @brief proc read function
@@ -1912,7 +1880,7 @@ static int wlan_debugfs_write(struct file *f, const char __user *buf,
if (!p2)
break;
p2++;
-   r = string_to_number(p2);
+   r = simple_strtoul(p2, NULL, 0);
if (d[i].size == 1)
*((u8 *) d[i].addr) = (u8) r;
else if (d[i].size == 2)
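
One behavioural difference in the last hunk is worth illustrating. The sketch below is a userspace approximation: strtoul() with base 0 stands in for simple_strtoul(p2, NULL, 0), and old_string_to_number() is a cleaned-up copy of the helper removed above. The name parse_number() is hypothetical. Under these assumptions, both parsers agree on decimal and "0x" input, but a leading "0" is read as octal by the base-0 parser and as decimal by the old helper.

```c
#include <stdlib.h>  /* strtoul */
#include <string.h>  /* strncmp */

/* Userspace analogue of simple_strtoul(s, NULL, 0): base 0 means the base
 * is chosen from the prefix ("0x" hex, leading "0" octal, else decimal). */
static unsigned long parse_number(const char *s)
{
	return strtoul(s, NULL, 0);
}

/* Copy of the removed helper, for comparison only. */
static int old_string_to_number(const char *s)
{
	int r = 0;
	int base = 10;

	if (strncmp(s, "0x", 2) == 0 || strncmp(s, "0X", 2) == 0) {
		base = 16;
		s += 2;
	}

	for (; *s != 0; s++) {
		if (*s >= '0' && *s <= '9')
			r = (r * base) + (*s - '0');
		else if (*s >= 'A' && *s <= 'F')
			r = (r * base) + (*s - 'A' + 10);
		else if (*s >= 'a' && *s <= 'f')
			r = (r * base) + (*s - 'a' + 10);
		else
			break;
	}

	return r;
}
```

For example, "0x1f" parses to 31 with either function, while "010" gives 8 from parse_number() and 10 from old_string_to_number().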


Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git patches] libata fixes

2007-03-11 Thread Tejun Heo
Hello, Linus.

Linus Torvalds wrote:
> On Sun, 11 Mar 2007, Paul Rolland wrote:
>> My machine is having two problems: the one you are describing above,
>> which is due to a SIL controller being connected to one port of the ICH7
>> (at least, it seems to be), and probing it times out, but nothing is
>> connected to it.
> 
> Ok, so that's just a message irritation, not actually bothersome 
> otherwise?

It involves a long timeout, so it's bothersome.  This is caused by a
Silicon Image 4726/3726 storage processor (a SATA Port Multiplier with
extra features) attached to one of the ICH ports.

If the first downstream port in the PMP is empty and the PMP gets reset
in a non-PMP way, it identifies itself as a "Config Disk" of quite small
size.  It's probably used to configure the extra features using standard
ATA read/write commands.  Anyway, this "Config Disk" is a bit peculiar:
it doesn't work very well with the current ATA reset sequence and gets
identified only after a few failures, which causes the long timeout.

I keep forgetting about this.  I'll ask SIMG how to deal with this.  For
the time being, connecting a device to the PMP port should remove the
timeouts.

>> The second problem is a Jmicron363 controller that is failing to detect
>> the DVD-RW that is connected, unless I use the irqpoll option as Tejun has
>> suggested.
> 
> .. and this one has never worked without irqpoll?
> 
>> But, as you suggest it, I'm adding pci=nomsi to the command line
>> rebooting... no change for this part of the problem.
>>
>> OK, the /proc/interrupt for this config, and the dmesg attached.
>>
>> 3 [23:22] [EMAIL PROTECTED]:~> cat /proc/interrupts 
>>CPU0   CPU1   
>>   0: 297549  0   IO-APIC-edge  timer
>>   1:  7  0   IO-APIC-edge  i8042
>>   4: 13  0   IO-APIC-edge  serial
>>   6:  5  0   IO-APIC-edge  floppy
>>   8:  1  0   IO-APIC-edge  rtc
>>   9:  0  0   IO-APIC-fasteoi   acpi
>>  12:126  0   IO-APIC-edge  i8042
>>  14:   8313  0   IO-APIC-edge  libata
>>  15:  0  0   IO-APIC-edge  libata
>>  16:  0  0   IO-APIC-fasteoi   eth1, libata
> 
> So it's the irq16 one that is the Jmicron controller and just isn't 
> getting any interrupts?
> 
> Since all the other interrupts work (and MSI worked for other 
> controllers), I don't think it's interrupt-routing related. Especially as 
> MSI shouldn't even care about things like that.
> 
> And since it all works when "irqpoll" is used, that implies that the 
> *only* thing that is broken is literally irq delivery.
> 
> Is there possibly some jmicron-specific "enable interrupts" bit? 

(cc'ing Justin of JMicron.  Hello, please correct me if I'm wrong.)

Not that I know of.  The PATA portion of JMB controllers is a bog-standard
PCI BMDMA ATA device, where ATA_NIEN is the way to turn the IRQ on and off.
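
To illustrate the mechanism, here is a minimal sketch of toggling nIEN in a device control value. The bit positions match the kernel's <linux/ata.h>; the helper ata_ctl_set_irq() is hypothetical, and the sketch only computes the register value, leaving out the actual port write a driver would perform.

```c
#include <stdint.h>

/* Device control register bits, as in the kernel's <linux/ata.h>. */
#define ATA_NIEN (1 << 1)  /* nIEN: when set, the device does not assert INTRQ */
#define ATA_SRST (1 << 2)  /* software reset */

/* Hypothetical helper: return the device control value with the interrupt
 * enabled (nIEN cleared) or disabled (nIEN set); other bits are preserved. */
static uint8_t ata_ctl_set_irq(uint8_t ctl, int enable)
{
	if (enable)
		return ctl & (uint8_t)~ATA_NIEN;
	return ctl | ATA_NIEN;
}
```

Starting from a control value of 0x08, disabling the IRQ yields 0x0a and enabling it again yields 0x08.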

Thanks.

-- 
tejun


Re: [git patches] libata fixes

2007-03-11 Thread Tejun Heo
Of course I forgot to CC.  :-)  Quoting whole message for Justin.

Tejun Heo wrote:
> Hello, Linus.
> 
> Linus Torvalds wrote:
>> On Sun, 11 Mar 2007, Paul Rolland wrote:
>>> My machine is having two problems: the one you are describing above,
>>> which is due to a SIL controller being connected to one port of the ICH7
>>> (at least, it seems to be), and probing it times out, but nothing is
>>> connected to it.
>> Ok, so that's just a message irritation, not actually bothersome 
>> otherwise?
> 
> It involves a long timeout, so it's bothersome.  This is caused by a
> Silicon Image 4726/3726 storage processor (a SATA Port Multiplier with
> extra features) attached to one of the ICH ports.
> 
> If the first downstream port in the PMP is empty and the PMP gets reset
> in a non-PMP way, it identifies itself as a "Config Disk" of quite small
> size.  It's probably used to configure the extra features using standard
> ATA read/write commands.  Anyway, this "Config Disk" is a bit peculiar:
> it doesn't work very well with the current ATA reset sequence and gets
> identified only after a few failures, which causes the long timeout.
> 
> I keep forgetting about this.  I'll ask SIMG how to deal with this.  For
> the time being, connecting a device to the PMP port should remove the
> timeouts.
> 
>>> The second problem is a Jmicron363 controller that is failing to detect
>>> the DVD-RW that is connected, unless I use the irqpoll option as Tejun has
>>> suggested.
>> .. and this one has never worked without irqpoll?
>>
>>> But, as you suggest it, I'm adding pci=nomsi to the command line
>>> rebooting... no change for this part of the problem.
>>>
>>> OK, the /proc/interrupt for this config, and the dmesg attached.
>>>
>>> 3 [23:22] [EMAIL PROTECTED]:~> cat /proc/interrupts 
>>>CPU0   CPU1   
>>>   0: 297549  0   IO-APIC-edge  timer
>>>   1:  7  0   IO-APIC-edge  i8042
>>>   4: 13  0   IO-APIC-edge  serial
>>>   6:  5  0   IO-APIC-edge  floppy
>>>   8:  1  0   IO-APIC-edge  rtc
>>>   9:  0  0   IO-APIC-fasteoi   acpi
>>>  12:126  0   IO-APIC-edge  i8042
>>>  14:   8313  0   IO-APIC-edge  libata
>>>  15:  0  0   IO-APIC-edge  libata
>>>  16:  0  0   IO-APIC-fasteoi   eth1, libata
>> So it's the irq16 one that is the Jmicron controller and just isn't 
>> getting any interrupts?
>>
>> Since all the other interrupts work (and MSI worked for other 
>> controllers), I don't think it's interrupt-routing related. Especially as 
>> MSI shouldn't even care about things like that.
>>
>> And since it all works when "irqpoll" is used, that implies that the 
>> *only* thing that is broken is literally irq delivery.
>>
>> Is there possibly some jmicron-specific "enable interrupts" bit? 
> 
> (cc'ing Justin of JMicron.  Hello, please correct me if I'm wrong.)
> 
> Not that I know of.  The PATA portion of JMB controllers is a bog-standard
> PCI BMDMA ATA device, where ATA_NIEN is the way to turn the IRQ on and off.
> 
> Thanks.
> 


-- 
tejun

