[patch 6/9] signalfd/timerfd v3 - timerfd core ...
This patch introduces a new system call for timer events delivered
through file descriptors. This allows timer events to be used with
standard POSIX poll(2), select(2) and read(2). As a consequence of
supporting the Linux f_op->poll subsystem, they can be used with
epoll(2) too.
The system call is defined as:

int timerfd(int ufd, int clockid, int tmrtype, const struct timespec *utmr);

The "ufd" parameter allows for re-use (re-programming) of an existing
timerfd without going through the close/open cycle (same as signalfd).
If "ufd" is -1, a new file descriptor will be created, otherwise the
existing "ufd" will be re-programmed.
The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME.
The "tmrtype" parameter specifies the timer type. The following
values are supported:

TFD_TIMER_REL
	The time specified in the "utmr" parameter is a relative time
	from NOW.

TFD_TIMER_ABS
	The time specified in the "utmr" parameter is an absolute time.

TFD_TIMER_SEQ
	The time specified in the "utmr" parameter is an interval at
	which a continuous clock rate will be generated.

The function returns the new (or same, in case "ufd" is a valid timerfd
descriptor) file descriptor, or -1 in case of error.
As stated before, the timerfd file descriptor supports poll(2), select(2)
and epoll(2). When a timer event has happened on the timerfd, a POLLIN
mask will be returned.
The read(2) call can be used, and it will return a u32 variable holding
the number of "ticks" that happened on the interface since the last call
to read(2). The read(2) call supports the O_NONBLOCK flag too, and EAGAIN
will be returned if no ticks happened.
A quick test program shows timerfd working correctly on my amd64 box:

http://www.xmailserver.org/timerfd-test.c

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.20.ep2/fs/timerfd.c
===
--- /dev/null 1970-01-01 00:00:00.0 +
+++ linux-2.6.20.ep2/fs/timerfd.c 2007-03-11 14:32:47.0 -0700
@@ -0,0 +1,295 @@
+/*
+ * fs/timerfd.c
+ *
+ * Copyright (C) 2007 Davide Libenzi
+ *
+ *
+ * Thanks to Thomas Gleixner for code review and useful comments.
+ *
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+
+
+struct timerfd_ctx {
+	struct hrtimer tmr;
+	int clockid;
+	enum hrtimer_mode htmode;
+	ktime_t texp, tintv;
+	int tmrtype;
+	spinlock_t lock;
+	wait_queue_head_t wqh;
+	unsigned long ticks;
+};
+
+
+static int timerfd_tmrproc(struct hrtimer *htmr);
+static int timerfd_setup(struct timerfd_ctx *ctx, int clockid, int tmrtype,
+			 const struct itimerspec *ktmr);
+static void timerfd_cleanup(struct timerfd_ctx *ctx);
+static int timerfd_close(struct inode *inode, struct file *file);
+static unsigned int timerfd_poll(struct file *file, poll_table *wait);
+static ssize_t timerfd_read(struct file *file, char __user *buf, size_t count,
+			    loff_t *ppos);
+
+
+
+static const struct file_operations timerfd_fops = {
+	.release	= timerfd_close,
+	.poll		= timerfd_poll,
+	.read		= timerfd_read,
+};
+static struct kmem_cache *timerfd_ctx_cachep;
+
+
+
+static int timerfd_tmrproc(struct hrtimer *htmr)
+{
+	struct timerfd_ctx *ctx = container_of(htmr, struct timerfd_ctx, tmr);
+	int rval = HRTIMER_NORESTART;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ctx->lock, flags);
+	ctx->ticks++;
+	wake_up_locked(&ctx->wqh);
+	if (ctx->tmrtype == TFD_TIMER_SEQ) {
+		hrtimer_forward(htmr, htmr->base->softirq_time, ctx->tintv);
+		rval = HRTIMER_RESTART;
+	}
+	spin_unlock_irqrestore(&ctx->lock, flags);
+
+	return rval;
+}
+
+
+static int timerfd_setup(struct timerfd_ctx *ctx, int clockid, int tmrtype,
+			 const struct itimerspec *ktmr)
+{
+	enum hrtimer_mode htmode;
+	ktime_t texp, tintv;
+
+	if (clockid != CLOCK_MONOTONIC &&
+	    clockid != CLOCK_REALTIME)
+		return -EINVAL;
+	switch (tmrtype) {
+	case TFD_TIMER_SEQ:
+		if (!timespec_valid(&ktmr->it_interval))
+			return -EINVAL;
+		tintv = timespec_to_ktime(ktmr->it_interval);
+	case TFD_TIMER_ABS:
+		if (!timespec_valid(&ktmr->it_value))
+			return -EINVAL;
+		htmode = HRTIMER_ABS;
+		texp = timespec_to_ktime(ktmr->it_value);
+		break;
+	case TFD_TIMER_REL:
+		if (!timespec_valid(&ktmr->it_interval))
+			return -EINVAL;
+		texp = timespec_to_ktime(ktmr->it_interval);
+		tintv = ktime_set(0, 0);
+		htmode = HRTIMER_REL;
+
[patch 2/9] signalfd/timerfd v3 - signalfd core ...
This patch series implements the new signalfd() system call.
I took part of the original Linus code (and you know how
badly it can be broken :), and I added even more breakage ;)
Signals are fetched from the same signal queue used by the process,
so signalfd will compete with standard kernel delivery in dequeue_signal().
If you want to reliably fetch signals on the signalfd file, you need to
block them with sigprocmask(SIG_BLOCK).
This seems to be working fine on my Dual Opteron machine. I made a quick
test program for it:

http://www.xmailserver.org/signafd-test.c

The signalfd() system call implements signal delivery into a file
descriptor receiver. The signalfd file descriptor is created with the
following API:

int signalfd(int ufd, const sigset_t *mask, size_t masksize);

The "ufd" parameter allows changing an existing signalfd sigmask without
going through the close/create cycle (Linus' idea). Use "ufd" == -1 if you
want a brand new signalfd file.
The "mask" parameter specifies the set of signals that we are interested
in. The "masksize" parameter is the size of "mask".
The signalfd fd supports the poll(2) and read(2) system calls. The poll(2)
will return POLLIN when signals are available to be dequeued. As a direct
consequence of supporting the Linux poll subsystem, the signalfd fd can be
used together with epoll(2) too.
The read(2) system call will return a "struct signalfd_siginfo" structure
in the userspace supplied buffer. The return value is the number of bytes
copied into the supplied buffer, or -1 in case of error. The read(2) call
can also return 0, in case the sighand structure to which the signalfd
was attached has been orphaned. The O_NONBLOCK flag is also supported, and
read(2) will return -EAGAIN in case no signal is available.
The format of the struct signalfd_siginfo is shown below; the valid fields
depend on the (->code & __SI_MASK) value, in the same way they would for
a struct siginfo:

struct signalfd_siginfo {
	__u32 signo;	/* si_signo */
	__s32 err;	/* si_errno */
	__s32 code;	/* si_code */
	__u32 pid;	/* si_pid */
	__u32 uid;	/* si_uid */
	__s32 fd;	/* si_fd */
	__u32 tid;	/* si_tid */
	__u32 band;	/* si_band */
	__u32 overrun;	/* si_overrun */
	__u32 trapno;	/* si_trapno */
	__s32 status;	/* si_status */
	__s32 svint;	/* si_int */
	__u64 svptr;	/* si_ptr */
	__u64 utime;	/* si_utime */
	__u64 stime;	/* si_stime */
	__u64 addr;	/* si_addr */
};

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.20.ep2/fs/signalfd.c
===
--- /dev/null 1970-01-01 00:00:00.0 +
+++ linux-2.6.20.ep2/fs/signalfd.c 2007-03-11 14:28:37.0 -0700
@@ -0,0 +1,381 @@
+/*
+ * fs/signalfd.c
+ *
+ * Copyright (C) 2003 Linus Torvalds
+ *
+ * Mon Mar 5, 2007: Davide Libenzi
+ * Changed ->read() to return a siginfo structure instead of signal number.
+ * Fixed locking in ->poll().
+ * Added sighand-detach notification.
+ * Added fd re-use in sys_signalfd() syscall.
+ * Now using anonymous inode source.
+ * Thanks to Oleg Nesterov for useful code review and suggestions.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+
+
+struct signalfd_ctx {
+	struct list_head lnk;
+	wait_queue_head_t wqh;
+	sigset_t sigmask;
+	struct task_struct *tsk;
+};
+
+
+
+static struct sighand_struct *signalfd_get_sighand(struct signalfd_ctx *ctx,
+						   unsigned long *flags);
+static void signalfd_put_sighand(struct signalfd_ctx *ctx,
+				 struct sighand_struct *sighand,
+				 unsigned long *flags);
+static void signalfd_cleanup(struct signalfd_ctx *ctx);
+static int signalfd_close(struct inode *inode, struct file *file);
+static unsigned int signalfd_poll(struct file *file, poll_table *wait);
+static int signalfd_copyinfo(struct signalfd_siginfo __user *uinfo,
+			     siginfo_t const *kinfo);
+static ssize_t signalfd_read(struct file *file, char __user *buf, size_t count,
+			     loff_t *ppos);
+
+
+
+static const struct file_operations signalfd_fops = {
+	.release	= signalfd_close,
+	.poll		= signalfd_poll,
+	.read		= signalfd_read,
+};
+static struct kmem_cache *signalfd_ctx_cachep;
+
+
+
+static struct sighand_struct *signalfd_get_sighand(struct signalfd_ctx *ctx,
+						   unsigned long *flags)
+{
+	struct sighand_struct *sighand;
+
+	rcu_read_lock();
+	sighand = lock_task_sighand(ctx->tsk, flags);
+	rcu_read_unlock();
+
+	if (sighand && list_empty(&ctx->lnk)) {
+
[patch 9/9] signalfd/timerfd v3 - timerfd compat code ...
This patch implements the necessary compat code for the timerfd system call.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.20.ep2/fs/compat.c
===
--- linux-2.6.20.ep2.orig/fs/compat.c 2007-03-11 14:28:48.0 -0700
+++ linux-2.6.20.ep2/fs/compat.c 2007-03-11 14:35:22.0 -0700
@@ -2257,3 +2257,23 @@
 	return sys_signalfd(ufd, ksigmask, sizeof(sigset_t));
 }
+
+asmlinkage long compat_sys_timerfd(int ufd, int clockid, int tmrtype,
+				   const struct compat_itimerspec __user *utmr)
+{
+	long res;
+	struct itimerspec t;
+	struct itimerspec __user *ut;
+
+	res = -EFAULT;
+	if (get_compat_itimerspec(&t, utmr))
+		goto err_exit;
+	ut = compat_alloc_user_space(sizeof(*ut));
+	if (copy_to_user(ut, &t, sizeof(t)) )
+		goto err_exit;
+
+	res = sys_timerfd(ufd, clockid, tmrtype, ut);
+err_exit:
+	return res;
+}
+
Index: linux-2.6.20.ep2/include/linux/compat.h
===
--- linux-2.6.20.ep2.orig/include/linux/compat.h 2007-03-11 14:39:53.0 -0700
+++ linux-2.6.20.ep2/include/linux/compat.h 2007-03-11 14:45:07.0 -0700
@@ -225,6 +225,11 @@
 	return lhs->tv_nsec - rhs->tv_nsec;
 }
+extern int get_compat_itimerspec(struct itimerspec *dst,
+				 const struct compat_itimerspec __user *src);
+extern int put_compat_itimerspec(struct compat_itimerspec __user *dst,
+				 const struct itimerspec *src);
+
 asmlinkage long compat_sys_adjtimex(struct compat_timex __user *utp);
 extern int compat_printk(const char *fmt, ...);
Index: linux-2.6.20.ep2/kernel/compat.c
===
--- linux-2.6.20.ep2.orig/kernel/compat.c 2007-03-11 14:39:18.0 -0700
+++ linux-2.6.20.ep2/kernel/compat.c 2007-03-11 14:45:13.0 -0700
@@ -475,8 +475,8 @@
 	return min_length;
 }
-static int get_compat_itimerspec(struct itimerspec *dst,
-				 struct compat_itimerspec __user *src)
+int get_compat_itimerspec(struct itimerspec *dst,
+			  const struct compat_itimerspec __user *src)
 {
 	if (get_compat_timespec(&dst->it_interval, &src->it_interval) ||
 	    get_compat_timespec(&dst->it_value, &src->it_value))
@@ -484,8 +484,8 @@
 	return 0;
 }
-static int put_compat_itimerspec(struct compat_itimerspec __user *dst,
-				 struct itimerspec *src)
+int put_compat_itimerspec(struct compat_itimerspec __user *dst,
+			  const struct itimerspec *src)
 {
 	if (put_compat_timespec(&src->it_interval, &dst->it_interval) ||
 	    put_compat_timespec(&src->it_value, &dst->it_value))
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] fix read past end of array in md/linear.c
On Thursday March 8, [EMAIL PROTECTED] wrote:
> On Thu, Mar 08, 2007 at 12:52:04PM -0800, Andy Isaacson wrote:
> > Index: linus/drivers/md/linear.c
> > ===
> > --- linus.orig/drivers/md/linear.c 2007-03-02 11:35:55.0 -0800
> > +++ linus/drivers/md/linear.c 2007-03-07 13:10:30.0 -0800
> > @@ -188,7 +188,7 @@
> >  	for (i=0; i < cnt-1 ; i++) {
> >  		sector_t sz = 0;
> >  		int j;
> > -		for (j=i; i
> > +		for (j=i; j
> >  			sz += conf->disks[j].size;
> >  		if (sz >= min_spacing && sz < conf->hash_spacing)
> >  			conf->hash_spacing = sz;
>
> Forgot to add:
>
> Signed-off-by: Andrew Isaacson <[EMAIL PROTECTED]>

And
  Acked-by: NeilBrown <[EMAIL PROTECTED]>

Thanks!

I would have replied earlier but I wanted to make sure I understood
exactly what the possible consequences of this bug were.. and they are
quite benign.
The worst possible outcome is going so far off the end of the array
that you hit un-mapped memory and Oops.
If that doesn't happen, then the next worst option is that the hash
table is sized poorly and you spend a few more cycles than needed
choosing the target device for the request (we still always choose the
right device).

Thanks,
NeilBrown
[patch 8/9] signalfd/timerfd v3 - timerfd wire up x86_64 arch ...
This patch wires the timerfd system call to the x86_64 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.20.ep2/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.20.ep2.orig/arch/x86_64/ia32/ia32entry.S 2007-03-11 14:28:46.0 -0700
+++ linux-2.6.20.ep2/arch/x86_64/ia32/ia32entry.S 2007-03-11 14:33:56.0 -0700
@@ -720,4 +720,5 @@
 	.quad sys_getcpu
 	.quad sys_epoll_pwait
 	.quad sys_signalfd		/* 320 */
+	.quad sys_timerfd
 ia32_syscall_end:
Index: linux-2.6.20.ep2/include/asm-x86_64/unistd.h
===
--- linux-2.6.20.ep2.orig/include/asm-x86_64/unistd.h 2007-03-11 14:28:46.0 -0700
+++ linux-2.6.20.ep2/include/asm-x86_64/unistd.h 2007-03-11 14:33:56.0 -0700
@@ -621,8 +621,10 @@
 __SYSCALL(__NR_move_pages, sys_move_pages)
 #define __NR_signalfd		280
 __SYSCALL(__NR_signalfd, sys_signalfd)
+#define __NR_timerfd		281
+__SYSCALL(__NR_timerfd, sys_timerfd)
-#define __NR_syscall_max __NR_signalfd
+#define __NR_syscall_max __NR_timerfd
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
Re: [patch 6/9] signalfd/timerfd v3 - timerfd core ...
On Sun, 11 Mar 2007, Davide Libenzi wrote:

> This patch introduces a new system call for timer events delivered
> through file descriptors. This allows timer events to be used with
> standard POSIX poll(2), select(2) and read(2). As a consequence of
> supporting the Linux f_op->poll subsystem, they can be used with
> epoll(2) too.
> The system call is defined as:
>
> int timerfd(int ufd, int clockid, int tmrtype, const struct timespec *utmr);
>
> The "ufd" parameter allows for re-use (re-programming) of an existing
> timerfd w/out going through the close/open cycle (same as signalfd).
> If "ufd" is -1, a new file descriptor will be created, otherwise the
> existing "ufd" will be re-programmed.
> The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME.
> The "tmrtype" parameter allows to specify the timer type. The following
> values are supported:
>
> TFD_TIMER_REL
> 	The time specified in the "utmr" parameter is a relative time
> 	from NOW.
>
> TFD_TIMER_ABS
> 	The timer specified in the "utmr" parameter is an absolute time.
>
> TFD_TIMER_SEQ
> 	The time specified in the "utmr" parameter is an interval at
> 	which a continuous clock rate will be generated.
>

Duh! Forgot to update the documentation. Now timerfd() gets an itimerspec.
For TFD_TIMER_REL only the it_interval is valid, and it's the relative
time. For TFD_TIMER_ABS, only the it_value is valid, and that's the
absolute expiry time. For TFD_TIMER_SEQ, it_value tells when the first
tick should be generated, and it_interval tells the period of the
following ticks.

- Davide
Re: Style Question
On Mar 11 2007 18:01, Kyle Moffett wrote:
> On Mar 11, 2007, at 16:41:51, Daniel Hazelton wrote:
>> On Sunday 11 March 2007 16:35:50 Jan Engelhardt wrote:
>> > On Mar 11 2007 22:15, Cong WANG wrote:
>> > > So can I say using NULL is better than 0 in kernel?
>> >
>> > On what basis? Do you even know what NULL is defined as in (C, not
>> > C++) userspace? Think about it.
>>
>> IIRC, the glibc and GCC headers define NULL as (void*)0 :)
>
> On the other hand when __cplusplus is defined they define it to the
> "__null" builtin, which GCC uses to give type conversion errors for
> "int foo = NULL" but not "char *foo = NULL". A "((void *)0)"
> definition gives C++ type errors for both due to the broken C++
> void pointer conversion problems.

I think that the primary reason they use __null is so that you can
actually do

	class foo *ptr = NULL;

because

	class foo *ptr = (void *)0;

would throw an error or at least a warning (implicit cast from void* to
class foo*).

Jan
--
[PATCH] BUILD_BUG_ON_ZERO -> BUILD_BUG_OR_ZERO
BUILD_BUG_ON_ZERO is named perfectly wrong, and BUILD_BUG_ON_RETURN_ZERO
is too long. Flip three bits, and the name is much more suitable.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r 6fb745a5bb51 include/linux/compiler-gcc.h
--- a/include/linux/compiler-gcc.h	Mon Mar 12 09:12:20 2007 +1100
+++ b/include/linux/compiler-gcc.h	Mon Mar 12 09:51:18 2007 +1100
@@ -24,7 +24,7 @@
 /* &a[0] degrades to a pointer: a different type from an array */
 #define __must_be_array(a) \
-  BUILD_BUG_ON_ZERO(__builtin_types_compatible_p(typeof(a), typeof(&a[0])))
+  BUILD_BUG_OR_ZERO(__builtin_types_compatible_p(typeof(a), typeof(&a[0])))
 #define inline		inline		__attribute__((always_inline))
 #define __inline__	__inline__	__attribute__((always_inline))
diff -r 6fb745a5bb51 include/linux/kernel.h
--- a/include/linux/kernel.h	Mon Mar 12 09:12:20 2007 +1100
+++ b/include/linux/kernel.h	Mon Mar 12 09:51:25 2007 +1100
@@ -341,7 +341,7 @@ struct sysinfo {
    result (of value 0 and type size_t), so the expression can be used
    e.g. in a structure initializer (or where-ever else comma expressions
    aren't permitted). */
-#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
+#define BUILD_BUG_OR_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
 /* Trap pasters of __FUNCTION__ at compile-time */
 #define __FUNCTION__ (__func__)
diff -r 6fb745a5bb51 include/linux/moduleparam.h
--- a/include/linux/moduleparam.h	Mon Mar 12 09:12:20 2007 +1100
+++ b/include/linux/moduleparam.h	Mon Mar 12 09:51:42 2007 +1100
@@ -65,7 +65,7 @@ struct kparam_array
 #define __module_param_call(prefix, name, set, get, arg, perm)	\
 	/* Default value instead of permissions? */			\
 	static int __param_perm_check_##name __attribute__((unused)) =	\
-	BUILD_BUG_ON_ZERO((perm) < 0 || (perm) > 0777 || ((perm) & 2));	\
+	BUILD_BUG_OR_ZERO((perm) < 0 || (perm) > 0777 || ((perm) & 2));	\
 	static char __param_str_##name[] = prefix #name;		\
 	static struct kernel_param const __param_##name			\
 	__attribute_used__						\
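To make the "bug OR zero" reading of the macro in the patch above concrete, here is a small standalone sketch. The macro body is copied from the patch; DEMO_PERM_CHECK, struct demo_param and demo_param are hypothetical names invented for illustration, loosely modelled on the __module_param_call permission check.

```c
#include <stddef.h>

/* Copied from the patch: breaks the build if e is true, else yields 0. */
#define BUILD_BUG_OR_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)

/* Hypothetical check in the style of __module_param_call. */
#define DEMO_PERM_CHECK(perm) \
	BUILD_BUG_OR_ZERO((perm) < 0 || (perm) > 0777 || ((perm) & 2))

struct demo_param {
	size_t perm_check;	/* always 0 when the build succeeds */
	int perm;
};

/* Usable inside a static initializer because it is a constant expression. */
const struct demo_param demo_param = {
	.perm_check = DEMO_PERM_CHECK(0644),
	.perm = 0644,
};

/*
 * A world-writable permission would declare char[-1] and stop the build:
 *
 * const struct demo_param bad = { .perm_check = DEMO_PERM_CHECK(0666) };
 */
```

When the condition is false the array has size 1, so the expression evaluates to sizeof(char[1]) - 1, i.e. zero of type size_t; when it is true the array size is -1 and the compiler rejects it.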
Re: [PATCH] MMC: Clean up low voltage range handling
Philip Langdale wrote:
> Clean up the handling of low voltage MMC cards.
>
>
> The latest MMC and SD specs both agree that the low
> voltage range is defined as 1.65-1.95V and is signified
> by bit 7 in the OCR. An old Sandisk spec implied that
> bits 7-0 represented voltages below 2.0V in 1V increments,
> and the code was accordingly written with that expectation.
>
>

We must not have the same specs. My simplified SD 2.0 physical spec
defines everything below bit 15 as reserved.

> This change switches the code to conform to the specs and
> fixes the SDHCI driver. It also removes the explicit
> defines for the host vdd and updates the SDHCI driver
> to convert the bit number back to the mask value
> for comparisons. Having only a single set of defines
> ensures there's nothing to get out of sync.
>
>

Although this is a nice change, it confuses things to have two changes
in one commit. Could you split them up and base it on my "for-andrew"
branch?

Rgds

--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
[PATCH] [RSDL] sched: rsdl accounting fixes
Andrew, the following patch can be rolled into the
sched-implement-rsdl-cpu-scheduler.patch file or added separately if
that's easier.

All the oopses and bitmap errors of previous versions of rsdl were fixed
by v0.29 so I think RSDL is ready for another round in -mm.

Thanks.
---
Higher priority tasks should always preempt lower priority tasks if they
are queued higher than their static priority as non-rt tasks. Fix it.

The deadline mechanism can be triggered before tasks' quota ever gets
added to the runqueue priority level's quota. Add 1 to the quota in
anticipation of this.

The deadline mechanism should only be triggered if the quota is overrun
instead of as soon as the quota is expired allowing some aliasing errors
in scheduler_tick accounting. Fix that.

Signed-off-by: Con Kolivas <[EMAIL PROTECTED]>

---
 kernel/sched.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

Index: linux-2.6.21-rc3-mm2/kernel/sched.c
===
--- linux-2.6.21-rc3-mm2.orig/kernel/sched.c 2007-03-12 08:47:43.0 +1100
+++ linux-2.6.21-rc3-mm2/kernel/sched.c 2007-03-12 09:10:33.0 +1100
@@ -96,10 +96,9 @@ unsigned long long __attribute__((weak))
  * provided it is not a realtime comparison.
  */
 #define TASK_PREEMPTS_CURR(p, curr) \
-	(((p)->prio < (curr)->prio) || (((p)->prio == (curr)->prio) && \
+	(((p)->prio < (curr)->prio) || (!rt_task(p) && \
 		((p)->static_prio < (curr)->static_prio && \
-			((curr)->static_prio > (curr)->prio)) && \
-			!rt_task(p)))
+			((curr)->static_prio > (curr)->prio))))
 /*
  * This is the time all tasks within the same priority round robin.
@@ -3323,7 +3322,7 @@ static inline void major_prio_rotation(s
  */
 static inline void rotate_runqueue_priority(struct rq *rq)
 {
-	int new_prio_level, remaining_quota;
+	int new_prio_level;
 	struct prio_array *array;
 	/*
@@ -3334,7 +3333,6 @@ static inline void rotate_runqueue_prior
 	if (unlikely(sched_find_first_bit(rq->dyn_bitmap) < rq->prio_level))
 		return;
-	remaining_quota = rq_quota(rq, rq->prio_level);
 	array = rq->active;
 	if (rq->prio_level > MAX_PRIO - 2) {
 		/* Major rotation required */
@@ -3368,10 +3366,11 @@ static inline void rotate_runqueue_prior
 	}
 	rq->prio_level = new_prio_level;
 	/*
-	 * While we usually rotate with the rq quota being 0, it is possible
-	 * to be negative so we subtract any deficit from the new level.
+	 * As we are merging to a prio_level that may not have anything in
+	 * its quota we add 1 to ensure the tasks get to run in schedule() to
+	 * add their quota to it.
 	 */
-	rq_quota(rq, new_prio_level) += remaining_quota;
+	rq_quota(rq, new_prio_level) += 1;
 }
 static void task_running_tick(struct rq *rq, struct task_struct *p)
@@ -3397,12 +3396,11 @@ static void task_running_tick(struct rq
 	if (!--p->time_slice)
 		task_expired_entitlement(rq, p);
 	/*
-	 * The rq quota can become negative due to a task being queued in
-	 * scheduler without any quota left at that priority level. It is
-	 * cheaper to allow it to run till this scheduler tick and then
-	 * subtract it from the quota of the merged queues.
+	 * We only employ the deadline mechanism if we run over the quota.
+	 * It allows aliasing problems around the scheduler_tick to be
+	 * less harmful.
 	 */
-	if (!rt_task(p) && --rq_quota(rq, rq->prio_level) <= 0) {
+	if (!rt_task(p) && --rq_quota(rq, rq->prio_level) < 0) {
 		if (unlikely(p->first_time_slice))
 			p->first_time_slice = 0;
 		rotate_runqueue_priority(rq);

--
-ck
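The effect of the TASK_PREEMPTS_CURR change in the patch above is easier to see with the two predicates written out as plain functions. This is only a sketch of how the macro reads: struct demo_task, demo_rt_task(), preempts_old() and preempts_new() are hypothetical names, and the MAX_RT_PRIO value of 100 (lower numbers meaning higher priority) is an assumption, not something stated in the patch.

```c
#include <stdbool.h>

#define MAX_RT_PRIO 100	/* assumption: prio below this means realtime */

/* Hypothetical cut-down task: lower numeric value = higher priority. */
struct demo_task {
	int prio;		/* level the task is currently queued at */
	int static_prio;	/* nice-derived priority */
};

bool demo_rt_task(const struct demo_task *p)
{
	return p->prio < MAX_RT_PRIO;
}

/* Old predicate: static_prio only breaks ties at equal ->prio. */
bool preempts_old(const struct demo_task *p, const struct demo_task *curr)
{
	return p->prio < curr->prio ||
	       (p->prio == curr->prio &&
		(p->static_prio < curr->static_prio &&
		 curr->static_prio > curr->prio) &&
		!demo_rt_task(p));
}

/* New predicate: the equal-prio requirement is gone for non-rt tasks. */
bool preempts_new(const struct demo_task *p, const struct demo_task *curr)
{
	return p->prio < curr->prio ||
	       (!demo_rt_task(p) &&
		(p->static_prio < curr->static_prio &&
		 curr->static_prio > curr->prio));
}
```

With curr queued at 105 but holding a static priority of 120, and p at 110/110, the old predicate says no preemption while the new one says yes, which matches the "queued higher than their static priority" wording of the changelog.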
Re: [patch 6/9] signalfd/timerfd v3 - timerfd core ...
On Sun, 2007-03-11 at 16:13 -0700, Davide Libenzi wrote:
> On Sun, 11 Mar 2007, Davide Libenzi wrote:
>
> > This patch introduces a new system call for timer events delivered
> > through file descriptors. This allows timer events to be used with
> > standard POSIX poll(2), select(2) and read(2). As a consequence of
> > supporting the Linux f_op->poll subsystem, they can be used with
> > epoll(2) too.
> > The system call is defined as:
> >
> > int timerfd(int ufd, int clockid, int tmrtype, const struct timespec *utmr);
> >
> > The "ufd" parameter allows for re-use (re-programming) of an existing
> > timerfd w/out going through the close/open cycle (same as signalfd).
> > If "ufd" is -1, a new file descriptor will be created, otherwise the
> > existing "ufd" will be re-programmed.
> > The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME.
> > The "tmrtype" parameter allows to specify the timer type. The following
> > values are supported:
> >
> > TFD_TIMER_REL
> > 	The time specified in the "utmr" parameter is a relative time
> > 	from NOW.
> >
> > TFD_TIMER_ABS
> > 	The timer specified in the "utmr" parameter is an absolute time.
> >
> > TFD_TIMER_SEQ
> > 	The time specified in the "utmr" parameter is an interval at
> > 	which a continuous clock rate will be generated.
> >
>
> Duh! Forgot to update the documentation. Now timerfd() gets an itimerspec.
> For TFD_TIMER_REL only the it_interval is valid, and it's the relative
> time. For TFD_TIMER_ABS, only the it_value is valid, and that the expiry
> absolute time. For TFD_TIMER_SEQ, it_value tells when the first tick
> should be generated, and it_interval tells the period of the following
> ticks.
>

You should probably make it behave like the other things that use
itimerspec, just to avoid confusion -- i.e. timers are relative by
default, there's a flag that makes them absolute, they expire when
it_value specifies, and repeat every it_interval nanoseconds if
it_interval is non-zero.

i.e.

int timerfd(int ufd, int clockid, int flags, const struct timespec
*utmr);

with TFD_TIMER_ABS in flags making the timer absolute instead of
relative (and no TFD_TIMER_REL or TFD_TIMER_SEQ at all).

--
Nicholas Miell <[EMAIL PROTECTED]>
Re: [patch 6/9] signalfd/timerfd v3 - timerfd core ...
On Sun, 2007-03-11 at 16:50 -0700, Nicholas Miell wrote:
> You should probably make it behave like the other things that use
> itimerspec, just to avoid confusion -- i.e. timers are relative by
> default, there's a flag that makes them absolute, they expire when
> it_value specifies, and repeat every it_interval nanoseconds if
> it_interval is non-zero.
>
> i.e.
>
> int timerfd(int ufd, int clockid, int flags, const struct timespec
> *utmr);
>
> with TFD_TIMER_ABS in flags making the timer absolute instead of
> relative (and no TFD_TIMER_REL or TFD_TIMER_SEQ at all).
>

Sorry, that should be

int timerfd(int ufd, int clockid, int flags, const struct itimerspec
*utmr);

and TFD_TIMER_ABSTIME.

--
Nicholas Miell <[EMAIL PROTECTED]>
RSDL v0.30 cpu scheduler for mainline kernels
There are updated patches for 2.6.20, 2.6.20.2, 2.6.21-rc3 and
2.6.21-rc3-mm2 to bring RSDL up to version 0.30 for download here:

Full patches:

http://ck.kolivas.org/patches/staircase-deadline/2.6.20-sched-rsdl-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.20.2-rsdl-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-sched-rsdl-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-mm2-rsdl-0.30.patch

incrementals:

http://ck.kolivas.org/patches/staircase-deadline/2.6.20/2.6.20.2-rsdl-0.29-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.20.2/2.6.20.2-rsdl-0.29-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3/2.6.21-rc3-rsdl-0.29-0.30.patch
http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-mm2/2.6.21-rc3-mm2-rsdl-0.29-0.30.patch

--
-ck
[PATCH] cosmetic adaption of drivers/ide/Kconfig concerning SATA
Hello,

since Serial ATA has its own menu entry now, I guess we can change the
description of the deprecated SATA driver as well, since the new S-ATA
subsystem is not configured through a SCSI low-level driver anymore.

The following patch is against 2.6.21-rc3:

--- linux-2.6.20.orig/drivers/ide/Kconfig 2007-03-12 01:34:38.0 +0100
+++ linux-2.6.20/drivers/ide/Kconfig 2007-03-12 01:47:10.0 +0100
@@ -103,7 +103,7 @@
 	---help---
 	  There are two drivers for Serial ATA controllers.
-	  The main driver, "libata", exists inside the SCSI subsystem
+	  The main driver, "libata", exists in the "Serial ATA subsystem"
 	  and supports most modern SATA controllers.
 	  The IDE driver (which you are currently configuring) supports

Since I am not subscribed to the list, I'd find it great if I were
personally CC'ed. :-)

Best regards

Patrick
Re: [PATCH] MMC: Clean up low voltage range handling
Pierre Ossman wrote:
>
> We must not have the same specs. My simplified SD 2.0 physical spec
> defines everything below bit 15 as reserved.

I was a little unclear. Both specs define bit 7 as the low-voltage range
but only the MMC spec defines the actual voltage. As such, there is no
complete definition of a low voltage SD card. That's why I added the
sanity check in the actual code.

> Although this is a nice change, it confuses things to have two changes
> in one commit. Could you split them up and base it on my "for-andrew"
> branch?

Yeah, I thought you'd think that :-)

I'll post the two diffs shortly.

Thanks,

--phil
[PATCH] MMC: Consolidate voltage definitions
Consolidate the list of available voltages.

Up until now, host->vdd has used a different set of defines from the one used for the OCR voltage mask values. Having two sets of defines allows them to get out of sync, and the current sets are already inconsistent: one claims to describe ranges and the other specific voltages.

Only the SDHCI driver uses the host->vdd defines, and it is easily fixed to use the OCR defines.

Signed-off-by: Philip Langdale <[EMAIL PROTECTED]>

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 86d0957..2f34ae3 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -668,20 +668,17 @@ static void sdhci_set_power(struct sdhci
 	pwr = SDHCI_POWER_ON;
 
-	switch (power) {
-	case MMC_VDD_170:
-	case MMC_VDD_180:
-	case MMC_VDD_190:
+	switch (1 << power) {
+	case MMC_VDD_17_18:
+	case MMC_VDD_18_19:
 		pwr |= SDHCI_POWER_180;
 		break;
-	case MMC_VDD_290:
-	case MMC_VDD_300:
-	case MMC_VDD_310:
+	case MMC_VDD_29_30:
+	case MMC_VDD_30_31:
 		pwr |= SDHCI_POWER_300;
 		break;
-	case MMC_VDD_320:
-	case MMC_VDD_330:
-	case MMC_VDD_340:
+	case MMC_VDD_32_33:
+	case MMC_VDD_33_34:
 		pwr |= SDHCI_POWER_330;
 		break;
 	default:
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 43bf6a5..496f540 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -16,30 +16,7 @@ struct mmc_ios {
 	unsigned int	clock;			/* clock rate */
 	unsigned short	vdd;
 
-#define	MMC_VDD_150	0
-#define	MMC_VDD_155	1
-#define	MMC_VDD_160	2
-#define	MMC_VDD_165	3
-#define	MMC_VDD_170	4
-#define	MMC_VDD_180	5
-#define	MMC_VDD_190	6
-#define	MMC_VDD_200	7
-#define	MMC_VDD_210	8
-#define	MMC_VDD_220	9
-#define	MMC_VDD_230	10
-#define	MMC_VDD_240	11
-#define	MMC_VDD_250	12
-#define	MMC_VDD_260	13
-#define	MMC_VDD_270	14
-#define	MMC_VDD_280	15
-#define	MMC_VDD_290	16
-#define	MMC_VDD_300	17
-#define	MMC_VDD_310	18
-#define	MMC_VDD_320	19
-#define	MMC_VDD_330	20
-#define	MMC_VDD_340	21
-#define	MMC_VDD_350	22
-#define	MMC_VDD_360	23
+/* vdd stores the bit number of the selected voltage range from below. */
 
 	unsigned char	bus_mode;		/* command output mode */
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
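The effect of switching on (1 << power) can be sketched in user space. This is only an illustration under stated assumptions: the EX_* names are hypothetical, and the mask values for the 2.9 V and higher ranges are extrapolated from the one-bit-per-0.1 V pattern of the OCR defines rather than copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OCR range masks: one bit per range.
 * Values above bit 8 are extrapolated, not quoted from host.h. */
#define EX_VDD_17_18 0x00000020u /* bit 5 */
#define EX_VDD_18_19 0x00000040u /* bit 6 */
#define EX_VDD_29_30 0x00020000u /* bit 17 */
#define EX_VDD_30_31 0x00040000u /* bit 18 */
#define EX_VDD_32_33 0x00100000u /* bit 20 */
#define EX_VDD_33_34 0x00200000u /* bit 21 */

enum ex_power {
	EX_POWER_NONE = 0,
	EX_POWER_180 = 180,
	EX_POWER_300 = 300,
	EX_POWER_330 = 330
};

/* vdd_bit is a bit number, as the new comment on mmc_ios.vdd says.
 * Shifting 1 by it yields the mask, so one set of defines serves
 * both the OCR mask and the selected voltage. */
static enum ex_power ex_power_for_vdd(unsigned int vdd_bit)
{
	assert(vdd_bit < 32);

	switch (UINT32_C(1) << vdd_bit) {
	case EX_VDD_17_18:
	case EX_VDD_18_19:
		return EX_POWER_180;
	case EX_VDD_29_30:
	case EX_VDD_30_31:
		return EX_POWER_300;
	case EX_VDD_32_33:
	case EX_VDD_33_34:
		return EX_POWER_330;
	default:
		return EX_POWER_NONE;
	}
}
```

Calling ex_power_for_vdd(5) selects the 1.8 V setting in this sketch, and a bit number with no matching case falls through to the default, as in the driver.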
Re: [patch 6/9] signalfd/timerfd v3 - timerfd core ...
On Sun, 11 Mar 2007, Nicholas Miell wrote:

> You should probably make it behave like the other things that use
> itimerspec, just to avoid confusion -- i.e. timers are relative by
> default, there's a flag that makes them absolute, they expire when
> it_value specifies, and repeat every it_interval nanoseconds if
> it_interval is non-zero.
>
> i.e.
>
> int timerfd(int ufd, int clockid, int flags, const struct timespec
> *utmr);
>
> with TFD_TIMER_ABS in flags making the timer absolute instead of
> relative (and no TFD_TIMER_REL or TFD_TIMER_SEQ at all).

Sounds sane to me. Will do...

- Davide
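The itimerspec semantics proposed above can be sketched as a small expiry calculation. This is a user-space illustration, not the timerfd implementation: EX_TFD_TIMER_ABS and its value are hypothetical, and treating a zero it_value as "disarmed" is an assumption borrowed from timer_settime().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define EX_TFD_TIMER_ABS 0x1 /* hypothetical flag value */

static int64_t ex_to_ns(struct timespec ts)
{
	return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Time in ns of the n-th expiry (n == 0 is the first one), or -1 if
 * the timer never fires that many times. Relative by default, absolute
 * only when the flag is set, periodic only when it_interval is non-zero. */
static int64_t ex_nth_expiry_ns(int64_t now_ns, int flags,
				const struct itimerspec *its, unsigned int n)
{
	int64_t value, interval, first;

	assert(its != NULL);
	value = ex_to_ns(its->it_value);
	interval = ex_to_ns(its->it_interval);

	if (value == 0)
		return -1;	/* assumption: zero it_value means disarmed */
	first = (flags & EX_TFD_TIMER_ABS) ? value : now_ns + value;
	if (n == 0)
		return first;
	if (interval == 0)
		return -1;	/* one-shot timer */
	return first + (int64_t)n * interval;
}
```

With a single flags argument there is no separate "sequence" timer type: a periodic timer is just one whose it_interval is non-zero.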
[PATCH] MMC: Fix handling of low-voltage cards
Fix handling of low voltage MMC cards.

The latest MMC and SD specs both agree that support for low-voltage operations is indicated by bit 7 in the OCR. The MMC spec states that the low voltage range is 1.65-1.95V while the SD spec leaves the actual voltage range undefined - meaning that there is still no such thing as a low voltage SD card.

However, an old Sandisk spec implied that bits 7-0 represented voltages below 2.0V in 0.1V or 0.05V increments, and the code was accordingly written with that expectation. This confusion meant that host drivers attempting to support the typical low voltage (1.8V) would set the wrong bits in the host OCR mask (usually bits 5 and/or 6), resulting in the low voltage mode never being used.

This change corrects the low voltage range and adds sanity checks on the reserved bits (0-6) and for SD cards that claim to support low-voltage operations.

Signed-off-by: Philip Langdale <[EMAIL PROTECTED]>

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index c87ce56..74ebd97 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -317,6 +317,24 @@ static u32 mmc_select_voltage(struct mmc
 {
 	int bit;
 
+	/*
+	 * Sanity check the voltages that the card claims to
+	 * support.
+	 */
+	if (ocr & 0x7F) {
+		printk("%s: card claims to support voltages below "
+		       "the defined range. These will be ignored.\n",
+		       mmc_hostname(host));
+		ocr &= ~0x7F;
+	}
+
+	if (host->mode == MMC_MODE_SD && (ocr & MMC_VDD_165_195)) {
+		printk("%s: SD card claims to support the incompletely "
+		       "defined 'low voltage range'. This will be ignored.\n",
+		       mmc_hostname(host));
+		ocr &= ~MMC_VDD_165_195;
+	}
+
 	ocr &= host->ocr_avail;
 
 	bit = ffs(ocr);
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 2f34ae3..a80c043 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -669,8 +669,7 @@ static void sdhci_set_power(struct sdhci
 	pwr = SDHCI_POWER_ON;
 
 	switch (1 << power) {
-	case MMC_VDD_17_18:
-	case MMC_VDD_18_19:
+	case MMC_VDD_165_195:
 		pwr |= SDHCI_POWER_180;
 		break;
 	case MMC_VDD_29_30:
@@ -1290,7 +1289,7 @@ static int __devinit sdhci_probe_slot(st
 	if (caps & SDHCI_CAN_VDD_300)
 		mmc->ocr_avail |= MMC_VDD_29_30|MMC_VDD_30_31;
 	if (caps & SDHCI_CAN_VDD_180)
-		mmc->ocr_avail |= MMC_VDD_17_18|MMC_VDD_18_19;
+		mmc->ocr_avail |= MMC_VDD_165_195;
 
 	if (mmc->ocr_avail == 0) {
 		printk(KERN_ERR "%s: Hardware doesn't report any "
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 496f540..2aac62a 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -65,14 +65,7 @@ struct mmc_host {
 	unsigned int		f_max;
 	u32			ocr_avail;
 
-#define MMC_VDD_145_150	0x0001	/* VDD voltage 1.45 - 1.50 */
-#define MMC_VDD_150_155	0x0002	/* VDD voltage 1.50 - 1.55 */
-#define MMC_VDD_155_160	0x0004	/* VDD voltage 1.55 - 1.60 */
-#define MMC_VDD_160_165	0x0008	/* VDD voltage 1.60 - 1.65 */
-#define MMC_VDD_165_170	0x0010	/* VDD voltage 1.65 - 1.70 */
-#define MMC_VDD_17_18	0x0020	/* VDD voltage 1.7 - 1.8 */
-#define MMC_VDD_18_19	0x0040	/* VDD voltage 1.8 - 1.9 */
-#define MMC_VDD_19_20	0x0080	/* VDD voltage 1.9 - 2.0 */
+#define MMC_VDD_165_195	0x0080	/* VDD voltage 1.65 - 1.95 */
 #define MMC_VDD_20_21	0x0100	/* VDD voltage 2.0 ~ 2.1 */
 #define MMC_VDD_21_22	0x0200	/* VDD voltage 2.1 ~ 2.2 */
 #define MMC_VDD_22_23	0x0400	/* VDD voltage 2.2 ~ 2.3 */
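The two sanity checks and the subsequent masking against the host's capabilities can be sketched as a stand-alone function. This is a simplified user-space model: the EX_* names and ex_select_vdd_bit() are hypothetical, the messages are omitted, and only the lowest common bit is returned, which is not claimed to be everything mmc_select_voltage() does.

```c
#include <assert.h>
#include <stdint.h>

#define EX_VDD_165_195  0x00000080u /* bit 7, the low voltage range */
#define EX_RESERVED_LOW 0x0000007Fu /* bits 0-6, reserved */

enum ex_card_mode { EX_MODE_MMC, EX_MODE_SD };

/* Drop bits the card should not be claiming. */
static uint32_t ex_sanitize_ocr(uint32_t ocr, enum ex_card_mode mode)
{
	assert((EX_RESERVED_LOW & EX_VDD_165_195) == 0);

	ocr &= ~EX_RESERVED_LOW;	/* below the defined range */
	if (mode == EX_MODE_SD)
		ocr &= ~EX_VDD_165_195;	/* undefined for SD cards */
	return ocr;
}

/* Lowest voltage bit supported by both card and host, or -1 if none. */
static int ex_select_vdd_bit(uint32_t card_ocr, enum ex_card_mode mode,
			     uint32_t host_ocr_avail)
{
	uint32_t ocr = ex_sanitize_ocr(card_ocr, mode) & host_ocr_avail;
	int bit;

	for (bit = 0; bit < 32; bit++)
		if (ocr & (UINT32_C(1) << bit))
			return bit;
	return -1;
}
```

A host that advertises bit 7 plus the 3.2-3.4 V bits would, in this model, run an MMC card at the low voltage range but fall back to bit 20 for an SD card making the same claim.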
Re: _proxy_pda still makes linking modules fail
Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> writes:
>
> I've heard that it now builds with gcc-4.2.0 snapshots. This is strange:
> if the problem has been fixed for gcc-4.2.0, why doesn't it work for
> gcc-4.1.2? arch/i386/kernel/vmlinux.lds.S does contain _proxy_pda = 0;

Hmm, it probably needs an EXPORT_SYMBOL. The previous change only fixed the in-kernel build.

Does it work with this patch?

-Andi

Export _proxy_pda for gcc 4.2

The symbol is not actually used, but the compiler unfortunately generates an (unused) reference to it. This can happen even in modules. So export it.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

Index: linux/arch/i386/kernel/i386_ksyms.c
===
--- linux.orig/arch/i386/kernel/i386_ksyms.c
+++ linux/arch/i386/kernel/i386_ksyms.c
@@ -28,3 +28,5 @@ EXPORT_SYMBOL(__read_lock_failed);
 #endif
 
 EXPORT_SYMBOL(csum_partial);
+
+EXPORT_SYMBOL(_proxy_pda);
Index: linux/arch/x86_64/kernel/x8664_ksyms.c
===
--- linux.orig/arch/x86_64/kernel/x8664_ksyms.c
+++ linux/arch/x86_64/kernel/x8664_ksyms.c
@@ -61,3 +61,4 @@ EXPORT_SYMBOL(empty_zero_page);
 EXPORT_SYMBOL(init_level4_pgt);
 EXPORT_SYMBOL(load_gs_index);
 
+EXPORT_SYMBOL(_proxy_pda);
Re: _proxy_pda still makes linking modules fail
Andi Kleen wrote:
> Hmm, it probably needs an EXPORT_SYMBOL. The previous change only
> fixed the in kernel build.
>
> Does it work with this patch?
>
> -Andi
>
> Export _proxy_pda for gcc 4.2
>

Gak. It seemed like such a good idea at the time.

Rusty's pda->per_cpu patch will deal with this once and for all; have you picked it up yet?

    J
Re: irda rmmod lockdep trace.
Hi Dave,

On Sat, Mar 10, 2007 at 07:43:26PM +0200, Samuel Ortiz wrote:
> Hi Dave,
>
> On Thu, Mar 08, 2007 at 05:54:36PM -0500, Dave Jones wrote:
> > modprobe irda ; rmmod irda in 2.6.21rc3 gets me the spew below..
> Well it seems that we call __irias_delete_object() from hashbin_delete(). Then
> __irias_delete_object() calls itself hashbin_delete() again. We're trying to
> get the lock recursively.

Looking at the code more carefully, this seems to be a false positive: iriap_cleanup and __irias_delete_object are taking 2 different locks from 2 different hashbin instances. The locks belong to the same lock class but they are hierarchically different.

We need to tell the validator about it, and the following patch does that. Comments are welcome, as I'm planning to push it to netdev soon:

 include/net/irda/irqueue.h |    4 +++-
 net/irda/irias_object.c    |    3 ++-
 net/irda/irqueue.c         |   13 +++++++++----
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/net/irda/irqueue.h b/include/net/irda/irqueue.h
index 335b0ac..ce9fa7c 100644
--- a/include/net/irda/irqueue.h
+++ b/include/net/irda/irqueue.h
@@ -77,7 +77,8 @@ typedef struct hashbin_t {
 } hashbin_t;
 
 hashbin_t *hashbin_new(int type);
-int      hashbin_delete(hashbin_t* hashbin, FREE_FUNC func);
+int      hashbin_delete_nested(hashbin_t* hashbin, FREE_FUNC func,
+			       u8 nested_depth);
 int      hashbin_clear(hashbin_t* hashbin, FREE_FUNC free_func);
 void     hashbin_insert(hashbin_t* hashbin, irda_queue_t* entry, long hashv,
 			const char* name);
@@ -92,5 +93,6 @@ irda_queue_t *hashbin_get_first(hashbin_t *hashbin);
 irda_queue_t *hashbin_get_next(hashbin_t *hashbin);
 
 #define HASHBIN_GET_SIZE(hashbin) hashbin->hb_size
+#define hashbin_delete(hashbin, func) hashbin_delete_nested(hashbin, func, 0)
 
 #endif
diff --git a/net/irda/iriap.c b/net/irda/iriap.c
diff --git a/net/irda/irias_object.c b/net/irda/irias_object.c
index 4adaae2..4238d23 100644
--- a/net/irda/irias_object.c
+++ b/net/irda/irias_object.c
@@ -142,7 +142,8 @@ void __irias_delete_object(struct ias_object *obj)
 
 	kfree(obj->name);
 
-	hashbin_delete(obj->attribs, (FREE_FUNC) __irias_delete_attrib);
+	hashbin_delete_nested(obj->attribs, (FREE_FUNC) __irias_delete_attrib,
+			      SINGLE_DEPTH_NESTING);
 
 	obj->magic = ~IAS_OBJECT_MAGIC;
 
diff --git a/net/irda/irqueue.c b/net/irda/irqueue.c
index 9266233..c669a86 100644
--- a/net/irda/irqueue.c
+++ b/net/irda/irqueue.c
@@ -378,13 +378,14 @@ EXPORT_SYMBOL(hashbin_new);
 
 
 /*
- * Function hashbin_delete (hashbin, free_func)
+ * Function hashbin_delete_nested (hashbin, free_func, nested_lock)
  *
  *    Destroy hashbin, the free_func can be a user supplied special routine
  *    for deallocating this structure if it's complex. If not the user can
  *    just supply kfree, which should take care of the job.
  */
-int hashbin_delete( hashbin_t* hashbin, FREE_FUNC free_func)
+int hashbin_delete_nested( hashbin_t* hashbin, FREE_FUNC free_func,
+			   u8 nested_depth)
 {
 	irda_queue_t* queue;
 	unsigned long flags = 0;
@@ -395,7 +396,11 @@ int hashbin_delete( hashbin_t* hashbin, FREE_FUNC free_func)
 
 	/* Synchronize */
 	if ( hashbin->hb_type & HB_LOCK ) {
-		spin_lock_irqsave(&hashbin->hb_spinlock, flags);
+		if (nested_depth > 0)
+			spin_lock_irqsave_nested(&hashbin->hb_spinlock, flags,
+						 nested_depth);
+		else
+			spin_lock_irqsave(&hashbin->hb_spinlock, flags);
 	}
 
 	/*
@@ -428,7 +433,7 @@ int hashbin_delete( hashbin_t* hashbin, FREE_FUNC free_func)
 
 	return 0;
 }
-EXPORT_SYMBOL(hashbin_delete);
+EXPORT_SYMBOL(hashbin_delete_nested);
 
 /* HASHBIN LIST OPERATIONS */

> I'll try to fix that soon, thanks for the report.
>
> Cheers,
> Samuel.
> > > > Dave > > > > NET: Registered protocol family 23 > > NET: Unregistered protocol family 23 > > > > = > > [ INFO: possible recursive locking detected ] > > 2.6.20-1.2966.fc7 #1 > > - > > rmmod/16712 is trying to acquire lock: > > (>hb_spinlock){}, at: [] > > hashbin_delete+0x29/0x94 [irda] > > > > but task is already holding lock: > > (>hb_spinlock){}, at: [] > > hashbin_delete+0x29/0x94 [irda] > > > > other info that might help us debug this: > > 1 lock held by rmmod/16712: > > #0: (>hb_spinlock){}, at: [] > > hashbin_delete+0x29/0x94 [irda] > > > > stack backtrace: > > > > Call Trace: > > [] __lock_acquire+0x151/0xbc4 > > [] :irda:__irias_delete_attrib+0x0/0x31 > > [] lock_acquire+0x4c/0x65 > > [] :irda:hashbin_delete+0x29/0x94 > > [] _spin_lock_irqsave+0x2c/0x3c > > [] :irda:hashbin_delete+0x29/0x94 > > []
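Why the nesting annotation quiets the report can be illustrated with a toy user-space model. This is emphatically not how lockdep is implemented; it only assumes that the validator distinguishes held locks by a (class, subclass) pair, which is the idea behind the _nested lock variants. Every ex_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_MAX_HELD 8
#define EX_HASHBIN_CLASS 1		/* all hashbin locks share one class */
#define EX_SINGLE_DEPTH_NESTING 1	/* mirrors the kernel constant's role */

struct ex_held { int lock_class; int subclass; };

static struct ex_held ex_stack[EX_MAX_HELD];
static size_t ex_nheld;

/* Returns false where a validator would report possible recursion:
 * the same class is already held with the same subclass. */
static bool ex_acquire(int lock_class, int subclass)
{
	size_t i;

	for (i = 0; i < ex_nheld; i++)
		if (ex_stack[i].lock_class == lock_class &&
		    ex_stack[i].subclass == subclass)
			return false;
	assert(ex_nheld < EX_MAX_HELD);
	ex_stack[ex_nheld].lock_class = lock_class;
	ex_stack[ex_nheld].subclass = subclass;
	ex_nheld++;
	return true;
}

static void ex_release(void)
{
	assert(ex_nheld > 0);
	ex_nheld--;
}

/* A hashbin whose entries may own an inner hashbin (like obj->attribs). */
struct ex_hashbin { struct ex_hashbin *child; };

/* Delete hb while holding its lock at subclass depth; the inner hashbin
 * is deleted at child_depth. Returns the number of complaints. */
static int ex_delete(struct ex_hashbin *hb, int depth, int child_depth)
{
	int complaints;

	if (hb == NULL)
		return 0;
	if (!ex_acquire(EX_HASHBIN_CLASS, depth))
		return 1;
	complaints = ex_delete(hb->child, child_depth, child_depth);
	ex_release();
	return complaints;
}
```

In this model, deleting the inner hashbin at depth 0 trips the check even though the two locks are different objects, while passing EX_SINGLE_DEPTH_NESTING for the inner one does not.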
Re: [RFC][PATCH 2/7] RSS controller core
On Sun, Mar 11, 2007 at 06:04:28PM +0300, Pavel Emelianov wrote: > Herbert Poetzl wrote: > > On Sun, Mar 11, 2007 at 12:08:16PM +0300, Pavel Emelianov wrote: > >> Herbert Poetzl wrote: > >>> On Tue, Mar 06, 2007 at 02:00:36PM -0800, Andrew Morton wrote: > On Tue, 06 Mar 2007 17:55:29 +0300 > Pavel Emelianov <[EMAIL PROTECTED]> wrote: > > > +struct rss_container { > > + struct res_counter res; > > + struct list_head page_list; > > + struct container_subsys_state css; > > +}; > > + > > +struct page_container { > > + struct page *page; > > + struct rss_container *cnt; > > + struct list_head list; > > +}; > ah. This looks good. I'll find a hunk of time to go through this > work and through Paul's patches. It'd be good to get both patchsets > lined up in -mm within a couple of weeks. But.. > >>> doesn't look so good for me, mainly becaus of the > >>> additional per page data and per page processing > >>> > >>> on 4GB memory, with 100 guests, 50% shared for each > >>> guest, this basically means ~1mio pages, 500k shared > >>> and 1500k x sizeof(page_container) entries, which > >>> roughly boils down to ~25MB of wasted memory ... > >>> > >>> increase the amount of shared pages and it starts > >>> getting worse, but maybe I'm missing something here > >> You are. Each page has only one page_container associated > >> with it despite the number of containers it is shared > >> between. > >> > We need to decide whether we want to do per-container memory > limitation via these data structures, or whether we do it via > a physical scan of some software zone, possibly based on Mel's > patches. > >>> why not do simple page accounting (as done currently > >>> in Linux) and use that for the limits, without > >>> keeping the reference from container to page? > >> As I've already answered in my previous letter simple > >> limiting w/o per-container reclamation and per-container > >> oom killer isn't a good memory management. It doesn't allow > >> to handle resource shortage gracefully. 
> > > > per container OOM killer does not require any container > > page reference, you know _what_ tasks belong to the > > container, and you know their _badness_ from the normal > > OOM calculations, so doing them for a container is really > > straight forward without having any page 'tagging' > > That's true. If you look at the patches you'll > find out that no code in oom killer uses page 'tag'. so what do we keep the context -> page reference then at all? > > for the reclamation part, please elaborate how that will > > differ in a (shared memory) guest from what the kernel > > currently does ... > > This is all described in the code and in the > discussions we had before. must have missed some of them, please can you point me to the relevant threads ... TIA, Herbert > > TIA, > > Herbert > > > >> This patchset provides more grace way to handle this, but > >> full memory management includes accounting of VMA-length > >> as well (returning ENOMEM from system call) but we've decided > >> to start with RSS. > >> > >>> best, > >>> Herbert > >>> > ___ > Containers mailing list > [EMAIL PROTECTED] > https://lists.osdl.org/mailman/listinfo/containers > >>> - > >>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > >>> the body of a message to [EMAIL PROTECTED] > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>> Please read the FAQ at http://www.tux.org/lkml/ > >>> > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 2/7] RSS controller core
On Sun, Mar 11, 2007 at 04:51:11AM -0800, Andrew Morton wrote: > > On Sun, 11 Mar 2007 15:26:41 +0300 Kirill Korotaev <[EMAIL PROTECTED]> > > wrote: > > Andrew Morton wrote: > > > On Tue, 06 Mar 2007 17:55:29 +0300 > > > Pavel Emelianov <[EMAIL PROTECTED]> wrote: > > > > > > > > >>+struct rss_container { > > >>+ struct res_counter res; > > >>+ struct list_head page_list; > > >>+ struct container_subsys_state css; > > >>+}; > > >>+ > > >>+struct page_container { > > >>+ struct page *page; > > >>+ struct rss_container *cnt; > > >>+ struct list_head list; > > >>+}; > > > > > > > > > ah. This looks good. I'll find a hunk of time to go through > > > this work and through Paul's patches. It'd be good to get both > > > patchsets lined up in -mm within a couple of weeks. But.. > > > > > > We need to decide whether we want to do per-container memory > > > limitation via these data structures, or whether we do it via > > > a physical scan of some software zone, possibly based on Mel's > > > patches. > > i.e. a separate memzone for each container? > > Yep. Straightforward machine partitioning. An attractive thing is that > it 100% reuses existing page reclaim, unaltered. > > > imho memzone approach is inconvinient for pages sharing and shares > > accounting. it also makes memory management more strict, forbids > > overcommiting per-container etc. > > umm, who said they were requirements? well, I guess all existing OS-Level virtualizations (Linux-VServer, OpenVZ, and FreeVPS) have stated more than one time that _sharing_ of resources is a central element, and one especially important resource to share is memory (RAM) ... if your aim is full partitioning, we do not need to bother with OS-Level isolation, we can simply use Paravirtualization and be done ... > > Maybe you have some ideas how we can decide on this? > > We need to work out what the requirements are before we can > settle on an implementation. 
Linux-VServer (and probably OpenVZ):

 - shared mappings of 'shared' files (binaries and libraries) to allow for reduced memory footprint when N identical guests are running

 - virtual 'physical' limit should not cause swap out when there are still pages left on the host system (but pages of over limit guests can be preferred for swapping)

 - accounting and limits have to be consistent and should roughly represent the actual used memory/swap (modulo optimizations, I can go into detail here, if necessary)

 - OOM handling on a per guest basis, i.e. some out of memory condition in guest A must not affect guest B

HTC,
Herbert

> Sigh. Who is running this show? Anyone?
>
> You can actually do a form of overcommittment by allowing multiple
> containers to share one or more of the zones. Whether that is
> sufficient or suitable I don't know. That depends on the requirements,
> and we haven't even discussed those, let alone agreed to them.
>
> ___
> Containers mailing list
> [EMAIL PROTECTED]
> https://lists.osdl.org/mailman/listinfo/containers
[PATCH] [REVISED] drivers/media/video/videocodec.c: check kmalloc() return value.
Description: Check the return value of kmalloc() in function videocodec_build_table(), in file drivers/media/video/videocodec.c.

Signed-off-by: Amit Choudhary <[EMAIL PROTECTED]>

diff --git a/drivers/media/video/videocodec.c b/drivers/media/video/videocodec.c
index 290e641..f2bbd7a 100644
--- a/drivers/media/video/videocodec.c
+++ b/drivers/media/video/videocodec.c
@@ -348,6 +348,9 @@ #define LINESIZE 100
 	kfree(videocodec_buf);
 	videocodec_buf = kmalloc(size, GFP_KERNEL);
 
+	if (!videocodec_buf)
+		return 0;
+
 	i = 0;
 	i += scnprintf(videocodec_buf + i, size - 1,
 		       "<S>lave or attached <M>aster name  type flags    magic    ");
Re: [ckrm-tech] [PATCH 0/2] resource control file system - aka containers on top of nsproxy!
Sam, responding to Herbert:
> > from my personal PoV the following would be fine:
> >
> >   spaces (for the various 'spaces')
> >     ...
> >   container (for resource accounting/limits)
> >     ...
>
> I like these a lot ...

Hmmm ... ok ...

Let me see if I understand this.

We have actors, known as threads, tasks or processes, which use things, which are instances of such classes of things as disk partitions, file systems, memory, cpus, and semaphores.

We assign names to these things, such as SysV id's to the semaphores, mount points to the file systems, pathnames to files and file descriptors to open files. These names provide handles that are typically more convenient and efficient to use, but alas less persistent, less ubiquitous, and needing of some dereferencing when used, to identify the underlying thing.

Any particular assignment of names to some of the things in a particular class forms one namespace (aka 'space', above). For each class of things, a given task is assigned one such namespace. Typically many related tasks (such as all those of a login session or a job) will be assigned the same set of namespaces, leading to various opportunities for optimizing the management of namespaces in the kernel.

This assignment of names to things is neither injective nor surjective nor even a complete map. For example, not all file systems are mounted, certainly not all possible mount points (all directories) serve as mount points, sometimes the same file system is mounted in multiple places, and sometimes more than one file system is mounted on the same mount point, one hiding the other.

In so far as the code managing this naming is concerned, the names are usually fairly arbitrary, except that there seems to be a tendency toward properly virtualizing these namespaces, presenting to a task the namespaces assigned it as if that was all there was, hiding the presence of alternative namespaces, and intentionally not providing a 'global view' that encompasses all namespaces of a given class.

This tendency culminates in the full blown virtual machines, such as Xen and KVM, which virtualize more or less all namespaces.

Because the essential semantics relating one namespace to another are rather weak (the namespaces for any given class of things are or can be pretty much independent of each other), there is a preference and a tradition to keep such sets of namespaces a simple flat space.

Conclusions regarding namespaces, aka spaces:

    A namespace provides a set of convenient handles for things of a particular class.

    For each class of things, every task gets one namespace (perhaps a Null or Default one.)

    Namespaces are partial virtualizations, the 'space of namespaces' is pretty flat, and the assignment of names in one namespace is pretty independent of the next.

===

That much covers what I understand (perhaps in error) of namespaces.

So what's this resource accounting/limits stuff?

I think this depends on adding one more category to our universe.

For the purposes of introducing yet more terms, I will call this new category a "metered class."

Each time we set about to manage some resource, we tend to construct some more elaborate "metered classes" out of the elemental classes of things (partitions, cpus, ...) listed above.

Examples of these more elaborate metered classes include percentages of a network's bandwidth, fractions of a node's memory (the fake numa patch), subsets of the system's cpus and nodes (cpusets), ...

These more elaborate metered classes each have fairly 'interesting' and specialized forms. Their semantics are closely adapted to the underlying class of things from which they are formed, and to the usually challenging, often conflicting, constraints on managing the usage of such a resource.

For example, the rules that apply to percentages of a network's bandwidth have little in common with the rules that apply to sets of subsets of a system's cpus and nodes.

We then attach tasks to these metered classes. Each task is assigned one metered instance from each metered class. For example, each task is assigned to a cpuset.

For metered classes that are visible across the system, we tend to name these classes, and then use those names when attaching tasks to them. See for example cpusets.

For metered classes that are only privately visible within the current context of a task, such as setrlimit, set_mempolicy and mbind, we tend to implicitly attach each task to its current metered class and provide it explicit means to manipulate the individual attributes of that metered class by direct system calls.

Conclusions regarding metered classes, aka containers:

    Unlike namespaces, metered classes have rich and varied semantics, sometimes elaborate inheritance and transfer rules, and frequently non-flat topologies.

    Depending on the scope of visibility of a metered class, it may or may not have much of a formal name space.
[PATCH] [REVISED] drivers/media/video/stv680.c: check kmalloc() return value.
Description: Check the return value of kmalloc() in function stv680_start_stream(), in file drivers/media/video/stv680.c.

Signed-off-by: Amit Choudhary <[EMAIL PROTECTED]>

diff --git a/drivers/media/video/stv680.c b/drivers/media/video/stv680.c
index 6d1ef1e..f35c664 100644
--- a/drivers/media/video/stv680.c
+++ b/drivers/media/video/stv680.c
@@ -687,7 +687,11 @@ static int stv680_start_stream (struct u
 		stv680->sbuf[i].data = kmalloc (stv680->rawbufsize, GFP_KERNEL);
 		if (stv680->sbuf[i].data == NULL) {
 			PDEBUG (0, "STV(e): Could not kmalloc raw data buffer %i", i);
-			return -1;
+			for (i = i - 1; i >= 0; i--) {
+				kfree(stv680->sbuf[i].data);
+				stv680->sbuf[i].data = NULL;
+			}
+			return -ENOMEM;
 		}
 	}
 
@@ -698,15 +702,25 @@ static int stv680_start_stream (struct u
 		stv680->scratch[i].data = kmalloc (stv680->rawbufsize, GFP_KERNEL);
 		if (stv680->scratch[i].data == NULL) {
 			PDEBUG (0, "STV(e): Could not kmalloc raw scratch buffer %i", i);
-			return -1;
+			for (i = i - 1; i >= 0; i--) {
+				kfree(stv680->scratch[i].data);
+				stv680->scratch[i].data = NULL;
+			}
+			goto nomem_sbuf;
 		}
 		stv680->scratch[i].state = BUFFER_UNUSED;
 	}
 
 	for (i = 0; i < STV680_NUMSBUF; i++) {
 		urb = usb_alloc_urb (0, GFP_KERNEL);
-		if (!urb)
-			return -ENOMEM;
+		if (!urb) {
+			for (i = i - 1; i >= 0; i--) {
+				usb_kill_urb(stv680->urb[i]);
+				usb_free_urb(stv680->urb[i]);
+				stv680->urb[i] = NULL;
+			}
+			goto nomem_scratch;
+		}
 
 		/* sbuf is urb->transfer_buffer, later gets memcpyed to scratch */
 		usb_fill_bulk_urb (urb, stv680->udev,
@@ -721,6 +735,18 @@ static int stv680_start_stream (struct u
 
 	stv680->framecount = 0;
 	return 0;
+
+ nomem_scratch:
+	for (i=0; i<STV680_NUMSCRATCH; i++) {
+		kfree(stv680->scratch[i].data);
+		stv680->scratch[i].data = NULL;
+	}
+ nomem_sbuf:
+	for (i=0; i<STV680_NUMSBUF; i++) {
+		kfree(stv680->sbuf[i].data);
+		stv680->sbuf[i].data = NULL;
+	}
+	return -ENOMEM;
 }
 
 static int stv680_stop_stream (struct usb_stv *stv680)
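The unwinding pattern the patch introduces (later failures jump to labels that release earlier allocations) can be sketched in user space. This is a simplified variant, not the driver code: malloc/free stand in for kmalloc/kfree, the ex_* names and counts are hypothetical, the fault-injection hook exists only so the failure path can be exercised, and the sketch assumes all slots start out NULL so that the labels alone can do the cleanup.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define EX_NUMSBUF 4		/* hypothetical buffer counts */
#define EX_NUMSCRATCH 2

struct ex_dev {
	void *sbuf[EX_NUMSBUF];
	void *scratch[EX_NUMSCRATCH];
};

static int ex_fail_at = -1;	/* index of the allocation to fail, -1: none */
static int ex_alloc_calls;

static void *ex_alloc(size_t size)
{
	if (ex_fail_at >= 0 && ex_alloc_calls++ == ex_fail_at)
		return NULL;
	return malloc(size);
}

static void ex_free_all(void **bufs, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		free(bufs[i]);	/* free(NULL) is a no-op */
		bufs[i] = NULL;
	}
}

/* Allocate every buffer or none; slots in *dev must start out NULL. */
static int ex_start_stream(struct ex_dev *dev, size_t bufsize)
{
	int i;

	assert(dev != NULL);
	for (i = 0; i < EX_NUMSBUF; i++) {
		dev->sbuf[i] = ex_alloc(bufsize);
		if (dev->sbuf[i] == NULL)
			goto nomem_sbuf;
	}
	for (i = 0; i < EX_NUMSCRATCH; i++) {
		dev->scratch[i] = ex_alloc(bufsize);
		if (dev->scratch[i] == NULL)
			goto nomem_scratch;
	}
	return 0;

nomem_scratch:
	ex_free_all(dev->scratch, EX_NUMSCRATCH);
nomem_sbuf:
	ex_free_all(dev->sbuf, EX_NUMSBUF);
	return -ENOMEM;
}
```

Ordering the labels in reverse order of allocation lets each failure point fall through exactly the cleanup it needs.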
Re: [RFC][PATCH 1/7] Resource counters
On Sun, Mar 11, 2007 at 01:00:15PM -0600, Eric W. Biederman wrote:
> Herbert Poetzl <[EMAIL PROTECTED]> writes:
>
> >
> > Linux-VServer does the accounting with atomic counters,
> > so that works quite fine, just do the checks at the
> > beginning of whatever resource allocation and the
> > accounting once the resource is acquired ...
>
> Atomic operations versus locks is only a granularity thing.
> You still need the cache line which is the cost on SMP.
>
> Are you using atomic_add_return or atomic_add_unless or
> are you performing you actions in two separate steps
> which is racy? What I have seen indicates you are using
> a racy two separate operation form.

yes, this is the current implementation which is more than sufficient, but I'm aware of the potential issues here, and I have an experimental patch sitting here which removes this race with the following change:

 - doesn't store the accounted value but limit - accounted (i.e. the free resource)
 - uses atomic_add_return()
 - when negative, an error is returned and the resource amount is added back

changes to the limit have to adjust the 'current' value too, but that is again simple and atomic

best,
Herbert

PS: atomic_add_unless() didn't exist back then (at least I think so) but that might be an option too ...

> >> If we'll remove failcnt this would look like
> >>    while (atomic_cmpxchg(...))
> >> which is also not that good.
> >>
> >> Moreover - in RSS accounting patches I perform page list
> >> manipulations under this lock, so this also saves one atomic op.
> >
> > it still hasn't been shown that this kind of RSS limit
> > doesn't add big time overhead to normal operations
> > (inside and outside of such a resource container)
> >
> > note that the 'usual' memory accounting is much more
> > lightweight and serves similar purposes ...
>
> Perhaps
>
> Eric
> ___
> Containers mailing list
> [EMAIL PROTECTED]
> https://lists.osdl.org/mailman/listinfo/containers
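The three-step change described above can be sketched with C11 atomics in user space. This is an illustration of the idea only, not the Linux-VServer patch: atomic_fetch_sub() stands in for the kernel's atomic_add_return(), and the ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stores the free amount (limit - accounted), not the accounted value. */
struct ex_counter {
	atomic_long avail;
};

static void ex_counter_init(struct ex_counter *c, long limit)
{
	atomic_init(&c->avail, limit);
}

/* Charge in one atomic step; if the result went negative, the request
 * did not fit, so the amount is added back and an error is reported. */
static bool ex_charge(struct ex_counter *c, long amount)
{
	long left;

	assert(amount >= 0);
	left = atomic_fetch_sub(&c->avail, amount) - amount;
	if (left < 0) {
		atomic_fetch_add(&c->avail, amount);
		return false;
	}
	return true;
}

static void ex_uncharge(struct ex_counter *c, long amount)
{
	atomic_fetch_add(&c->avail, amount);
}

/* A limit change is applied to the free amount as a delta. */
static void ex_adjust_limit(struct ex_counter *c, long delta)
{
	atomic_fetch_add(&c->avail, delta);
}
```

One property of this scheme worth keeping in mind: between the subtraction and the add-back the value is briefly too low, so a concurrent smaller request may be refused even though it would have fit.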
Re: Style Question
On Mar 11, 2007, at 19:16:59, Jan Engelhardt wrote:
> On Mar 11 2007 18:01, Kyle Moffett wrote:
>> On the other hand when __cplusplus is defined they define it to the
>> "__null" builtin, which GCC uses to give type conversion errors for
>> "int foo = NULL" but not "char *foo = NULL". A "((void *)0)"
>> definition gives C++ type errors for both due to the broken C++
>> void pointer conversion problems.
>
> I think that the primary reason they use __null is so that you can
> actually do
>
>     class foo *ptr = NULL;
>
> because
>
>     class foo *ptr = (void *)0;
>
> would throw an error or at least a warning (implicit cast from void*
> to class foo*).

Isn't that what I said? :-D

Cheers,
Kyle Moffett
Re: Style Question
On Mar 11 2007 21:27, Kyle Moffett wrote:
> On Mar 11, 2007, at 19:16:59, Jan Engelhardt wrote:
>> On Mar 11 2007 18:01, Kyle Moffett wrote:
>> > On the other hand when __cplusplus is defined they define it to the
>> > "__null" builtin, which GCC uses to give type conversion errors for
>> > "int foo = NULL" but not "char *foo = NULL".
>> I think that the primary reason they use __null is so that you can
>> actually do[...]
>
> Isn't that what I said? :-D

Ya. Though I was picking at

|"__null" builtin, which GCC uses to give type conversion errors for
|"int foo = NULL"

since C's (void *)0 would also barf when being assigned to int. So it's not a genuine __null feature ;-)

Jan
--
Re: Style Question
On Mar 11, 2007, at 21:32:00, Jan Engelhardt wrote:
> On Mar 11 2007 21:27, Kyle Moffett wrote:
>> On Mar 11, 2007, at 19:16:59, Jan Engelhardt wrote:
>>> On Mar 11 2007 18:01, Kyle Moffett wrote:
>>>> On the other hand when __cplusplus is defined they define it to
>>>> the "__null" builtin, which GCC uses to give type conversion
>>>> errors for "int foo = NULL" but not "char *foo = NULL".
>>> I think that the primary reason they use __null is so that you can
>>> actually do[...]
>>
>> Isn't that what I said? :-D
>
> Ya. Though I was picking at
>
> "__null" builtin, which GCC uses to give type conversion errors for
> "int foo = NULL"
>
> since C's (void *)0 would also barf when being assigned to int. So
> it's not a genuine __null feature ;-)

You chopped my sentence in half! :-D What I *really* said was:

    ...give type conversion errors for 'int foo = NULL' but not 'char *foo = NULL'.

The pseudo-standard "#define NULL (0)" that the C++ standards ask for does *NOT* give an error for "int foo = NULL;", and in C++ the C-standard "#define NULL ((void *)0)" *does* give an error for "char *foo = NULL;"

Ergo I think I was correct when I said "GCC uses [__null] to give type conversion errors for <first> but not <second>"

Cheers,
Kyle Moffett
Re: [PATCH] Add nForce MCP61 support to i2c-nforce2
Jean Delvare wrote:
> Hi Petr,
>
> On Sat, 10 Mar 2007 09:00:03 +0100, Petr Vandrovec wrote:
>> Hello,
>>    patch below adds support for nVidia's SMBus adapter present on
>> Gateway's GT5414E motherboard (ECS's MCP61 PM-AM). Patch is for
>> current Linus's git tree.
>
> We already have a patch doing exactly this in -mm:
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc3/2.6.21-rc3-mm2/broken-out/jdelvare-i2c-i2c-nforce2-add-mcp61-mcp65-support.patch

Thanks.

>> 00:01.1 SMBus: nVidia Corporation MCP61 SMBus (rev a2)
>>         Subsystem: Elitegroup Computer Systems Unknown device 2601
>>         Flags: 66MHz, fast devsel, IRQ 10
>>         I/O ports at fc00 [size=64]
>>         I/O ports at 1c00 [size=64]
>>         I/O ports at f400 [size=64]
>>         Capabilities: [44] Power Management version 2
>
> BTW, note how the MCP61 has not 2 but 3 64-byte I/O areas declared. The
> previous chips used BAR 4 and 5, this new one additionally uses BAR 0.
> Without documentation it's hard to be sure this is a 3rd SMBus channel,
> but it sure looks so. Maybe you'll want to hack the i2c-nforce2 driver
> a bit to confirm or infirm this theory.

I had the same idea as you, so I tried to modify the driver to use BAR0 as well, and (1) i2cdump then said that nobody is there and (2) the dump of range fc00 was quite different from ranges 1c00 and f400. So for my hardware I'm sure that BAR0 is of no use to me - if it is a 3rd channel then either it uses a different interface from nforce2, or nothing is connected to it.

Petr
Re: [PATCH v2] Bitbanging i2c bus driver using the GPIO API
On Sat, 2007-03-10 at 14:13 +0100, Haavard Skinnemoen wrote: > This is a very simple bitbanging i2c bus driver utilizing the new > arch-neutral GPIO API. Useful for chips that don't have a built-in > i2c controller, additional i2c busses, or testing purposes. > Sorry for missing this hot discussion. Your idea is exactly what I want. So many arch-specific GPIO-based I2C adapter implementations will benefit from this. > To use, include something similar to the following in the > board-specific setup code: > > #include > > static struct i2c_gpio_platform_data i2c_gpio_data = { > .sda_pin= GPIO_PIN_FOO, > .scl_pin= GPIO_PIN_BAR, > }; Is this usage right, given that 3 flags are added to this structure as below: struct i2c_gpio_platform_data { unsigned int sda_pin; unsigned int scl_pin; unsigned int sda_is_open_drain:1; unsigned int scl_is_open_drain:1; unsigned int scl_is_output_only:1; }; > static struct platform_device i2c_gpio_device = { > .name = "i2c-gpio", > .id = 0, > .dev= { > .platform_data = &i2c_gpio_data, > }, > }; > > Register this platform_device, set up the i2c pins as GPIO if > required and you're ready to go. > > Signed-off-by: Haavard Skinnemoen <[EMAIL PROTECTED]> > --- > This patch is different from the first patch in the following ways: > * Handles pins set up as open drain (aka multidrive) by toggling > the output value instead of the direction > * Handles output-only SCL pins the same way, and also does not > install a getscl() callback for such pins > * Does not add anything to include/linux/i2c-ids.h > * Sets the output value explicitly after changing the direction to > output. > * Plugs a memory leak in remove() -- algo_data wasn't freed. > * Prints out the pin IDs in decimal, with an extra note when clock > stretching isn't supported > > This version has been compile-tested only. I'll give it a spin when I > get back to work on monday. > > Dave, does this address your concerns? 
> > Haavard Thanks a lot, I will drop our GPIO based I2C driver and try this one on our platform. > + if (!pdata->scl_is_output_only) > + bit_data->getscl = i2c_gpio_getscl, > + > + bit_data->getsda= i2c_gpio_getsda, > + bit_data->udelay= 5,/* 100 kHz */ > + bit_data->timeout = HZ / 10, /* 100 ms */ Can we add these udelay/timeout values to struct i2c_gpio_platform_data, and let users choose them according to their specific requirements? We use Kconfig to do this, but Jean and David don't like the idea, -:( Regards, -Bryan Wu
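The suggestion above (carrying udelay/timeout in the platform data instead of hard-coding them) could be sketched roughly as follows. This is a standalone userspace illustration, not part of the posted patch: the `udelay` and `timeout_ms` fields, the `i2c_gpio_timing` struct and the `i2c_gpio_pick_timing()` helper are all hypothetical names. The fallback values (5 us, 100 ms) mirror the constants in the quoted hunk; treating a zero field as "use the default" is an assumption.

```c
#include <assert.h>

/* Platform data as quoted above, plus two HYPOTHETICAL timing fields.
 * Neither field exists in the posted patch. */
struct i2c_gpio_platform_data {
	unsigned int sda_pin;
	unsigned int scl_pin;
	unsigned int udelay;      /* assumed: bit delay in us, 0 = use default */
	unsigned int timeout_ms;  /* assumed: timeout in ms, 0 = use default */
	unsigned int sda_is_open_drain:1;
	unsigned int scl_is_open_drain:1;
	unsigned int scl_is_output_only:1;
};

/* Hypothetical result type: the timing the driver would end up using. */
struct i2c_gpio_timing {
	unsigned int udelay;
	unsigned int timeout_ms;
};

/* Defaults taken from the quoted hunk: udelay = 5 (100 kHz), HZ / 10 (100 ms). */
#define I2C_GPIO_DEFAULT_UDELAY     5
#define I2C_GPIO_DEFAULT_TIMEOUT_MS 100

/* Use the board's values when given, otherwise fall back to the defaults. */
static struct i2c_gpio_timing
i2c_gpio_pick_timing(const struct i2c_gpio_platform_data *pdata)
{
	struct i2c_gpio_timing t;

	assert(pdata != 0);
	t.udelay = pdata->udelay ? pdata->udelay : I2C_GPIO_DEFAULT_UDELAY;
	t.timeout_ms = pdata->timeout_ms ? pdata->timeout_ms
					 : I2C_GPIO_DEFAULT_TIMEOUT_MS;
	return t;
}
```

With this shape, existing board files that only set `.sda_pin` and `.scl_pin` would keep today's behaviour, since the zero-initialized fields select the defaults.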
Re: i.MX/MX1 SDHC fix/workaround of SD card recognition problems
On Monday 12 March 2007 00:36, you wrote: > Pavel Pisa wrote: > > The SDHC controllers cannot process shorter transfers. > > They has to be handled as longer ones, but it such case CRC > > error is evaluated. There was a case in the code still, > > where this error is not ignored as it should to be process > > these transfers. > > > > Signed-off-by: Pavel Pisa <[EMAIL PROTECTED]> > > Thanks, applied. Is this something critical that should be in 2.6.21? > > Rgds Hello Pierre, this should go to 2.6.21, I have hold this for some months and I have discussed it in the thread "Re: CRC Errors with SD cards in 4bits mode (on i.MXl)" You have been CCed. This is not solution for seen data CRC problem, but solves problems with recognition of cards which has been timing sensitive sometimes. I have sent it into Russell's patch queue with my others MX1 fixes I have intended to be included in 2.6.21. It was probably mistake for this one, because it should go through your tree. If you send it to mainline yourself, I would discard patch from patch daemon. We have spoken about MX1 SDHC maintainership. I am attaching my subscription. I am not sure about mailing list field there. Do you suggest this one, ALKML or other? Best wishes Pavel Pisa -- Subject: i.MX/MX1 SDHC maintainer I am reporting to responsibility for i.MX MMC driver bugs and coordination of the fighting against problems of this hardware beast. 
Signed-off-by: Pavel Pisa <[EMAIL PROTECTED]> MAINTAINERS |7 +++ 1 file changed, 7 insertions(+) Index: linux-2.6.21-rc1/MAINTAINERS === --- linux-2.6.21-rc1.orig/MAINTAINERS +++ linux-2.6.21-rc1/MAINTAINERS @@ -1713,6 +1713,13 @@ M: [EMAIL PROTECTED] L: [EMAIL PROTECTED] (subscribers-only) S: Maintained +IMX MMC/SD HOST CONTROLLER INTERFACE DRIVER +P: Pavel Pisa +M: [EMAIL PROTECTED] +L: [EMAIL PROTECTED] +W: http://mmc.drzeus.cx/wiki/Controllers/Freescale/SDHC +S: Maintained + INFINIBAND SUBSYSTEM P: Roland Dreier M: [EMAIL PROTECTED]
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
> > This is a bug actually in the megaraid. Aha, I'll track it. > > And this is a direct command submission path: it already passed both > online check gates in this path *after* the device was offlined, so > adding a third won't fix this. Yeah, I have noticed that. However, according to the logs the device had already gone offline, so why could commands still be sent to it? Isn't the sequence of printks suspicious? > single disk, so the I/O was definitely bound for sda? Secondly, can you > reproduce with a modern (2.6.20) kernel. Your trace strongly suggests > that the device came back online for some reason and then the megaraid > driver died. It's hard to update the kernel because this is a production system, and we cannot debug on the box :( I don't know if you noticed, but the logs come from diskdump; could this be caused by diskdump? Thanks, Joe
Re: [patch 2/3] fs: introduce perform_write aop
On Sat, Mar 10, 2007 at 09:25:41AM +, Christoph Hellwig wrote: > On Fri, Mar 09, 2007 at 03:33:01PM -0800, Mark Fasheh wrote: > > ->kernel_write() as opposed to genericizing ->perform_write() would be fine > > with me. Just so long as we get rid of ->prepare_write and ->commit_write in > > that other kernel code doesn't call them directly. That interface just > > doesn't work for Ocfs2. > > It doesn't work for any filesystem that needs slightly fancy locking. > That and the reason that's an interface that doesn't fit into our > layering is why I want to get rid of it. Note that fops->kernel_write > might in fact use ->perform_write with an actor as Nick suggested. > I'm not quite sure how it'll look like - I'd rather take care of the > buffered write path first and then handle this issue once the first > changes have stabilized. > > > Right now I've got Ocfs2 implementing it's own lowest-level buffered write > > code - think generic_file_buffered_write() replacement for Ocfs2. With some > > duplicated code above that layer. What's nice is that I can abstract away > > the "copy data into some target pages" bits such that the majority of that > > code is re-usable for ocfs2's splice write operation. I'm not sure we could > > have that low a level of abstraction for anyhing above individual the file > > system though which also has to deal with non-kernel writes though. That's > > where a ->kernel_write() might come in handy. > > Why do you need your own low level buffered write functionality? As in > past times when filesystems want to come up I'd like to have a very > good exaplanation on why you think it's needed and whether we shouldn't > improve the generic buffered write code instead. Fair enough - I personally tried everything I could before coming to the conclusion that for the time being, Ocfs2 should have a seperate write path. As you know, I've been adding sparse file support for Ocfs2. 
Putting aside all the reasons to have real support for sparse files (as opposed to zeroing allocated regions), the tree code changes alone have gotten us 90% of the way to supporting unwritten extents (much like xfs does). Ocfs2 supports atomic data allocation units ('clusters', to use an overloaded term) which can range in size from 4k to 1 meg. This means that for allocating writes on page size < cluster size file systems, we have to zero pages adjacent to the one being written so that a re-read doesn't return dirty data. This alone requires page locking which we can't realistically achieve via ->prepare_write() and ->commit_write(). I believe NTFS has a similar restriction, which has led to their own file write. So, page locking was definitely the straw that broke the camel's back. Some other things which were awkward or slightly less critically broken than the page locking: Since ocfs2 has a rather large (compared to a local file system) context to build up during an allocating write, it became uncomfortable to pass that around ->prepare_write() and ->commit_write() without putting that context on our struct inode and protecting it with a lock. And since the existing interfaces were so rigid, it actually required a lot more context to be passed around than in my current code. There's also the cluster lock / page lock inversion which we have to deal with (it gets even worse if we fault in pages in the middle of the user copy for a write). Granted, we fixed a lot of that before merging, but allocating in write means taking even more cluster locks and I don't really feel comfortable nesting so many of those within the page locks. Finally, we get to the optimization problem - writing stuff one page at a time. To be fair, my current stuff doesn't do a very good job of optimizing the amount of data written in a given pass, but the groundwork is there to easily write at least one cluster's worth of user data at a time. 
My priority has been mostly to stabilize it as opposed to performance tuning. So, quite possibly, I overstated what Ocfs2 was doing earlier - we still make use of as much generic code as we can. The O_DIRECT path for instance wasn't touched. Ocfs2 still makes use of block_commit_write(), the standard jbd mechanisms for ordered data mode, and though we got rid of block_prepare_write() (for zeroing reasons), what we do is a much simpler version. By the way, the code in question can be found in the sparse_files branch of ocfs2.git: http://git.kernel.org/?p=linux/kernel/git/mfasheh/ocfs2.git;a=log;h=sparse_files Your review has been extremely useful in the past, so I welcome any comments you might have. Though it's getting close to being put in ALL (for a spin in -mm), it's definitely a work in progress branch. There's 3 patches to generic code which I need to push out for review (it's pretty much just exporting symbols which we'd need in any case). Also, some of the bug fixes and feature adjustments need to get folded back into their respective patches. > This codepath is so nasty that any
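The page-zeroing requirement described in the message above (an allocating write on a file system where page size < cluster size has to touch every page in the newly allocated cluster) comes down to some simple index arithmetic. The following is a minimal userspace sketch under stated assumptions, not Ocfs2 code: `struct page_range` and `cluster_page_range()` are hypothetical names, and the sketch ignores everything the real write path has to handle (locking, partially populated clusters, holes).

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive range of page indexes; hypothetical helper type. */
struct page_range {
	uint64_t first;
	uint64_t last;
};

/*
 * Given a byte offset being written, return the pages that share its
 * allocation cluster. On an allocating write, every page in this range
 * other than the ones receiving user data would need to be zeroed.
 * Assumes cluster_size is a non-zero multiple of page_size.
 */
static struct page_range
cluster_page_range(uint64_t pos, uint32_t cluster_size, uint32_t page_size)
{
	struct page_range r;
	uint64_t cluster_start;

	assert(page_size != 0);
	assert(cluster_size >= page_size);
	assert(cluster_size % page_size == 0);

	cluster_start = pos - (pos % cluster_size);
	r.first = cluster_start / page_size;
	r.last = (cluster_start + cluster_size - 1) / page_size;
	return r;
}
```

For a 64k cluster and 4k pages, a write at byte offset 70000 falls into the second cluster, so the range covers 16 pages. All of those pages have to be locked together, which is the part the message says cannot realistically be done through ->prepare_write() and ->commit_write().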
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
> The 2.6.9 base is very old in mainline terms. Are you sure the bug hasn't > been fixed in mainline by other means? I cannot confirm whether it has been fixed in the latest kernel; the server is a production system, so it's hard to debug it and try to reproduce.
Re: [PATCH] [scsi]: Add offline state checking while dispatch a scsi cmd
> On Mon, 12 Mar 2007 10:52:22 +0800 Joe Jin <[EMAIL PROTECTED]> wrote: > > The 2.6.9 base is very old in mainline terms. Are you sure the bug hasn't > > been fixed in mainline by other means? > > I cannot confirm if it have fixed in latest kernel, the server is a > production system, it's hard to debug it and try reproduce. Well. That makes it hard to run tests, but perhaps it can be determined from code review..
RE: [git patches] libata fixes
On Sun, 11 Mar 2007, Paul Rolland wrote: > > My machine is having two problems : the one you are describing above, > which is due to a SIL controler being connected to one port of the ICH7 > (at least, it seems to), and probing it goes timeout, but nothing is > connected on it. Ok, so that's just a message irritation, not actually bothersome otherwise? > The second problem is a Jmicron363 controler that is failing to detect > the DVD-RW that is connected, unless I use the irqpoll option as Tejun has > suggested. .. and this one has never worked without irqpoll? > But, as you suggest it, I'm adding pci=nomsi to the command line > rebooting... no change for this part of the problem. > > OK, the /proc/interrupt for this config, and the dmesg attached. > > 3 [23:22] [EMAIL PROTECTED]:~> cat /proc/interrupts >CPU0 CPU1 > 0: 297549 0 IO-APIC-edge timer > 1: 7 0 IO-APIC-edge i8042 > 4: 13 0 IO-APIC-edge serial > 6: 5 0 IO-APIC-edge floppy > 8: 1 0 IO-APIC-edge rtc > 9: 0 0 IO-APIC-fasteoi acpi > 12:126 0 IO-APIC-edge i8042 > 14: 8313 0 IO-APIC-edge libata > 15: 0 0 IO-APIC-edge libata > 16: 0 0 IO-APIC-fasteoi eth1, libata So it's the irq16 one that is the Jmicron controller and just isn't getting any interrupts? Since all the other interrupts work (and MSI worked for other controllers), I don't think it's interrupt-routing related. Especially as MSI shouldn't even care about things like that. And since it all works when "irqpoll" is used, that implies that the *only* thing that is broken is literally irq delivery. Is there possibly some jmicron-specific "enable interrupts" bit? > PS : I'd like to try 2.6.21-rc3, but it seems that this is breaking my > config : disk naming is no more the same, and I end up with a panic > Warning: unable to open an initial console > though i've been compiling with the same .config I was using for 2.6.21-rc2 Gaah. Can you get a log through serial console or netconsole to see what changed? 
Linus
Re: SATA resume slowness, e1000 MSI warning
> Quoting Eric W. Biederman <[EMAIL PROTECTED]>: > Subject: Re: SATA resume slowness, e1000 MSI warning > > "Michael S. Tsirkin" <[EMAIL PROTECTED]> writes: > > > OK I guess. I gather we assume writing read-only registers has no side > > effects? > > Are there rumors circulating wrt to these? > > I haven't heard anything about that, and if we are writing the same value back > it should be pretty safe. > > I have heard it asserted that at least one version of the pci spec > only required 32bit accesses to be supported by the hardware. One of > these days I will have to look that and see if it is true. Maybe. But surely before the PCI-X days. > I do know > it can be weird for hardware developers to support multiple kinds of > decode. Is this the only place where Linux uses pci_read_config_word/pci_read_config_dword? I think such hardware will be pretty much DOA on all OS-es. Why don't we wait and see whether someone reports a broken config? > As I recall for pci and pci-x at the hardware level the only > difference in between 32bit transactions and smaller ones is the state > of the byte-enable lines. True, and same holds for PCI-Express. So let's assume hardware implements RO correctly but ignores the BE bits - nothing bad happens then, right? -- MST
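The byte-enable point in the exchange above can be illustrated with a toy model. This is only a sketch under stated assumptions: the function names are hypothetical, the register is modelled as a plain 32-bit value with no read-only bits, and the mask is active-high for readability (on a real PCI bus the C/BE# lines are active low). It shows that a 16-bit write, and a 32-bit read-modify-write that writes the untouched bytes back unchanged, leave the register in the same state.

```c
#include <assert.h>
#include <stdint.h>

/* Apply a write to a 32-bit register, honoring a 4-bit byte-enable mask.
 * Bit i of 'be' set means byte lane i is written (toy convention). */
static uint32_t cfg_write_be(uint32_t reg, uint32_t data, unsigned int be)
{
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < 4; i++)
		if (be & (1u << i))
			mask |= (uint32_t)0xff << (8 * i);
	return (reg & ~mask) | (data & mask);
}

/* A 16-bit write at byte offset 0 or 2: only two lanes are enabled. */
static uint32_t cfg_write_word(uint32_t reg, unsigned int off, uint16_t val)
{
	assert(off == 0 || off == 2);
	return cfg_write_be(reg, (uint32_t)val << (8 * off), 0x3u << off);
}

/* The same update done as a full 32-bit read-modify-write: the other two
 * bytes are written back with the value that was just read. */
static uint32_t cfg_write_word_rmw(uint32_t reg, unsigned int off, uint16_t val)
{
	uint32_t merged;

	assert(off == 0 || off == 2);
	merged = (reg & ~((uint32_t)0xffff << (8 * off))) |
		 ((uint32_t)val << (8 * off));
	return cfg_write_be(reg, merged, 0xfu);
}
```

In this model the two paths always agree, which matches the "writing the same value back should be pretty safe" reasoning; it says nothing about registers with write side effects, which is the case the thread is actually worried about.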
Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler
Con Kolivas wrote: > On Monday 12 March 2007 08:52, Con Kolivas wrote: > > And thank you! I think I know what's going on now. I think each rotation > > is followed by another rotation before the higher priority task is > > getting a look in in schedule() to even get quota and add it to the > > runqueue quota. I'll try a simple change to see if that helps. Patch > > coming up shortly. > > Can you try the following patch and see if it helps. There's also one > minor preemption logic fix in there that I'm planning on including. > Thanks! Applied on top of v0.28 mainline, and there is no difference. What's it look like on your machine? Thanks! -- Al
Re: [PATCH 1/2] xfs: use xfs_get_buf_noaddr for iclogs
On Wed, Mar 07, 2007 at 11:13:14AM +0100, Christoph Hellwig wrote: > xfs_buf_get_noaddr. There's a subtle change because > xfs_buf_get_empty returns the buffer locked, but xfs_buf_get_noaddr > returns it unlocked. From my auditing and testing nothing in the > log I/O code cares about this distinction, but I'd be happy if > someone could try to prove this independently. Looks safe to me - we initialise all the fields in the xfs_buf_t when we allocate out of the slab, so it doesn't really matter what state the buffer is in when we free it. OTOH, all other buffers are supposed to be locked when under I/O. This change makes a special case for the log buffers, and I'd prefer not to have to remember that this behaviour changed for log buffers at some point in time. I suggest adding the XFS_BUF_PSEMA() line marked with a bare '+' below: > - iclog->hic_data = (xlog_in_core_2_t *) > - kmem_zalloc(iclogsize, KM_SLEEP | KM_LARGE); > - > iclog->ic_prev = prev_iclog; > prev_iclog = iclog; > + > + bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp); > + XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); > + XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); > + XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); + XFS_BUF_PSEMA(bp, PRIBIO); > + iclog->ic_bp = bp; > + iclog->hic_data = bp->b_addr; > + > log->l_iclog_bak[i] = (xfs_caddr_t)&(iclog->ic_header); > > head = &iclog->ic_header; That locks the buffer here, so we don't change any semantics of the code at all. 
> @@ -1216,11 +1221,6 @@ > INT_SET(head->h_fmt, ARCH_CONVERT, XLOG_FMT); > memcpy(&head->h_fs_uuid, &mp->m_sb.sb_uuid, sizeof(uuid_t)); > > - bp = xfs_buf_get_empty(log->l_iclog_size, mp->m_logdev_targp); > - XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); > - XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); > - XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); > - iclog->ic_bp = bp; > > iclog->ic_size = XFS_BUF_SIZE(bp) - log->l_iclog_hsize; > iclog->ic_state = XLOG_STATE_ACTIVE; > @@ -1229,7 +1229,6 @@ > iclog->ic_datap = (char *)iclog->hic_data + log->l_iclog_hsize; > > ASSERT(XFS_BUF_ISBUSY(iclog->ic_bp)); > - ASSERT(XFS_BUF_VALUSEMA(iclog->ic_bp) <= 0); And this assert can then stay... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
Re: Null pointer in autofs4 (_spin_lock) in 2.6.21-rc2
On Sun, 11 Mar 2007, Thomas Renninger wrote: > On Thu, 2007-03-08 at 19:39 +0900, Ian Kent wrote: > > On Thu, 2007-03-08 at 11:12 +0100, Thomas Renninger wrote: > > > On Thu, 2007-03-08 at 01:28 -0800, Andrew Morton wrote: > > > > > On Thu, 08 Mar 2007 09:57:56 +0100 Thomas Renninger <[EMAIL > > > > > PROTECTED]> wrote: > > > > > I saw this happening several times on 2.6.21-rc2. > > > > > Tell me how I can help... > > > > > Some nfs partitions are mounted via nfs using autofs. > > > > > It takes some hours to run into this: > > > > > > > > > > Unable to handle kernel NULL pointer dereference at 0008 > > > > > RIP: > > > > > [] _spin_lock+0x0/0xf > > > > > PGD 1dde23067 PUD 1d3060067 PMD 0 > > > > > Oops: 0002 [1] SMP > > > > > CPU 3 > > > > > Modules linked in: autofs4 nfs lockd nfs_acl sunrpc asus_acpi > > > > > af_packet > > > > > tg3 ipv6 button battery ac ext2 mbcache loop dm_mod floppy parport_pc > > > > > lp > > > > > parport reiserfs pata_amd edd fan thermal sg processor sata_sil libata > > > > > amd74xx sd_mod scsi_mod ide_disk ide_core > > > > > Pid: 11373, comm: touch Not tainted 2.6.21-rc2-default #6 > > > > > RIP: 0010:[] [] > > > > > _spin_lock+0x0/0xf > > > > > RSP: 0018:8101c50a5a50 EFLAGS: 00010202 > > > > > RAX: 8100eb8916f8 RBX: 81010007dcd8 RCX: 8100ea45b280 > > > > > RDX: 10e58c2e RSI: 810163bf9e50 RDI: 0008 > > > > > RBP: 810163bf9e50 R08: 8101c50a4000 R09: 8101c50a5ea8 > > > > > R10: 81010003fca8 R11: 802299ad R12: > > > > > R13: 8100eb891680 R14: 0005 R15: 8101c50a5b48 > > > > > FS: 2b8ae744bf20() GS:81010016a7c0() > > > > > knlGS:b7bd88d0 > > > > > CS: 0010 DS: ES: CR0: 8005003b > > > > > CR2: 0008 CR3: 0001b925f000 CR4: 06e0 > > > > > Process touch (pid: 11373, threadinfo 8101c50a4000, task > > > > > 8101b78bd100) > > > > > Stack: 882d5f38 8101c50a5ea8 8100ec8df4b0 > > > > > 00d0 > > > > > 8100eb8916f8 810163bf9efc 10e58c2eea45b220 8100ea45b220 > > > > > 810163bf9e50 8100ea45b220 8100ec8df4b0 8100ec8df568 > > > > > Call Trace: > > > > > [] 
:autofs4:autofs4_lookup+0xcb/0x311 > > > > > [] do_lookup+0xc4/0x1ae > > > > > [] __link_path_walk+0x8ec/0xd9d > > > > > [] :sunrpc:rpcauth_lookup_credcache+0x12e/0x24a > > > > > [] link_path_walk+0x58/0xe0 > > > > > [] __strncpy_from_user+0x17/0x41 > > > > > [] __link_path_walk+0x5c9/0xd9d > > > > > [] link_path_walk+0x58/0xe0 > > > > > [] __strncpy_from_user+0x17/0x41 > > > > > [] do_path_lookup+0x1b6/0x217 > > > > > [] __path_lookup_intent_open+0x56/0x97 > > > > > [] open_namei+0xa9/0x64c > > > > > [] do_page_fault+0x45e/0x7ad > > > > > [] do_filp_open+0x1c/0x38 > > > > > [] __strncpy_from_user+0x17/0x41 > > > > > [] do_sys_open+0x44/0xc1 > > > > > [] system_call+0x7e/0x83 > > > > > > > > > > > > > > > Code: f0 ff 0f 79 09 f3 90 83 3f 00 7e f9 eb f2 c3 f0 81 2f 00 00 > > > > > RIP [] _spin_lock+0x0/0xf > > > > > RSP > > > > > CR2: 0008 > > > > > > > > I assume 2.6.20 is OK? > > > Can't say for sure, I expect yes. > > > Set up with 2.6.20 now and let it run for a day or two. > > > Maybe someone has worked in that area and has an idea meanwhile... > > > > Do we have any idea on what was being opened here? > > Might be useful to see the autofs maps if possible. > I sent that stuff to Ian... > > However, I couldn't run into that with 2.6.20 and also not with > *2.6.21-rc3* (yet). Maybe it already got fixed? > Machine still running, I'll report back if this should happen again. I suspect the problem is still present but maybe a bit hard to trigger. I'm not convinced this is needed but it is the only thing that looks at all suspicious so if (when) you see this again could you give the patch below a try please. 
Ian --- --- linux-2.6.21-rc3/fs/autofs4/root.c.sbi-check 2007-03-12 13:29:42.0 +0900 +++ linux-2.6.21-rc3/fs/autofs4/root.c 2007-03-12 13:30:04.0 +0900 @@ -503,6 +503,9 @@ static struct dentry *autofs4_lookup_unh const unsigned char *str = name->name; struct list_head *p, *head; + if (!sbi) + return NULL; + spin_lock(&dcache_lock); spin_lock(&sbi->rehash_lock); head = &sbi->rehash_list;
Re: [PATCH] kthread_should_stop_check_freeze (was: Re: [PATCH -mm 3/7] Freezer: Remove PF_NOFREEZE from rcutorture thread)
On Sun, Mar 11, 2007 at 06:49:08PM +0100, Rafael J. Wysocki wrote: > On Saturday, 3 March 2007 18:32, Oleg Nesterov wrote: > > On 03/02, Paul E. McKenney wrote: > > > > > > On Sat, Mar 03, 2007 at 02:33:37AM +0300, Oleg Nesterov wrote: > > > > On 03/02, Paul E. McKenney wrote: > > > > > > > > > > One way to embed try_to_freeze() into kthread_should_stop() might be > > > > > as follows: > > > > > > > > > > int kthread_should_stop(void) > > > > > { > > > > > if (kthread_stop_info.k == current) > > > > > return 1; > > > > > try_to_freeze(); > > > > > return 0; > > > > > } > > > > > > > > I think this is dangerous. For example, worker_thread() will probably > > > > need some special actions after return from refrigerator. Also, a kernel > > > > thread may check kthread_should_stop() in the place where > > > > try_to_freeze() > > > > is not safe. > > > > > > > > Perhaps we should introduce a new helper which does this. > > > > > > Good point -- the return value from try_to_freeze() is lost if one uses > > > the above approach. About one third of the calls to try_to_freeze() > > > in 2.6.20 pay attention to the return value. > > > > > > One approach would be to have a kthread_should_stop_nofreeze() for those > > > cases, and let the default be to try to freeze. > > > > I personally think we should do the opposite, add > > kthread_should_stop_check_freeze() > > or something. kthread_should_stop() is like signal_pending(), we can use > > it under spin_lock (and it is probably used this way by some out-of-tree > > driver). The new helper is obviously "might_sleep()". > > Something like this, perhaps: Looks good to me! The other kthread_should_stop() calls in rcutorture.c should also become kthread_should_top_check_freeze(). Acked-by: Paul E. 
McKenney <[EMAIL PROTECTED]> > include/linux/kthread.h |1 + > kernel/kthread.c| 16 > kernel/rcutorture.c |5 ++--- > 3 files changed, 19 insertions(+), 3 deletions(-) > > Index: linux-2.6.21-rc3-mm2/kernel/kthread.c > === > --- linux-2.6.21-rc3-mm2.orig/kernel/kthread.c2007-03-08 > 21:58:48.0 +0100 > +++ linux-2.6.21-rc3-mm2/kernel/kthread.c 2007-03-11 18:32:59.0 > +0100 > @@ -13,6 +13,7 @@ > #include > #include > #include > +#include > #include > > /* > @@ -60,6 +61,21 @@ int kthread_should_stop(void) > } > EXPORT_SYMBOL(kthread_should_stop); > > +/** > + * kthread_should_stop_check_freeze - check if the thread should return now > and > + * if not, check if there is a freezing request pending for it. > + */ > +int kthread_should_stop_check_freeze(void) > +{ > + might_sleep(); > + if (kthread_stop_info.k == current) > + return 1; > + > + try_to_freeze(); > + return 0; > +} > +EXPORT_SYMBOL(kthread_should_stop_check_freeze); > + > static void kthread_exit_files(void) > { > struct fs_struct *fs; > Index: linux-2.6.21-rc3-mm2/include/linux/kthread.h > === > --- linux-2.6.21-rc3-mm2.orig/include/linux/kthread.h 2007-02-04 > 19:44:54.0 +0100 > +++ linux-2.6.21-rc3-mm2/include/linux/kthread.h 2007-03-11 > 18:37:10.0 +0100 > @@ -29,5 +29,6 @@ struct task_struct *kthread_create(int ( > void kthread_bind(struct task_struct *k, unsigned int cpu); > int kthread_stop(struct task_struct *k); > int kthread_should_stop(void); > +int kthread_should_stop_check_freeze(void); > > #endif /* _LINUX_KTHREAD_H */ > Index: linux-2.6.21-rc3-mm2/kernel/rcutorture.c > === > --- linux-2.6.21-rc3-mm2.orig/kernel/rcutorture.c 2007-03-11 > 11:39:06.0 +0100 > +++ linux-2.6.21-rc3-mm2/kernel/rcutorture.c 2007-03-11 18:45:00.0 > +0100 > @@ -540,10 +540,9 @@ rcu_torture_writer(void *arg) > } > rcu_torture_current_version++; > oldbatch = cur_ops->completed(); > - try_to_freeze(); > - } while (!kthread_should_stop() && !fullstop); > + } while (!kthread_should_stop_check_freeze() && !fullstop); > 
> VERBOSE_PRINTK_STRING("rcu_torture_writer task stopping"); > - while (!kthread_should_stop()) > + while (!kthread_should_stop_check_freeze()) > schedule_timeout_uninterruptible(1); > return 0; > }
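The semantics of the proposed helper can be tried out in userspace with a small mock. Everything prefixed `mock_` below is a hypothetical stand-in, not kernel API: the flags replace kthread_stop_info and the freezer state, and "freezing" just clears the pending flag and counts the event. The sketch only demonstrates the ordering the patch establishes: a stop request wins, otherwise a pending freeze is serviced and the loop continues.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state. */
static bool mock_stop_requested;   /* plays the role of kthread_stop() */
static bool mock_freeze_pending;   /* plays the role of a freeze request */
static int mock_times_frozen;      /* how often the mock "refrigerator" ran */

/* Mock of try_to_freeze(): returns 1 if the thread was frozen. */
static int mock_try_to_freeze(void)
{
	if (!mock_freeze_pending)
		return 0;
	mock_freeze_pending = false;   /* pretend we were frozen, then thawed */
	mock_times_frozen++;
	return 1;
}

/* Mock of the proposed helper: stop check first, freeze check second. */
static int mock_kthread_should_stop_check_freeze(void)
{
	if (mock_stop_requested)
		return 1;
	mock_try_to_freeze();
	return 0;
}

/* Loop shaped like rcu_torture_writer() after the patch; the stop request
 * is injected on a chosen iteration so the mock terminates. */
static int mock_worker(int stop_after)
{
	int iterations = 0;

	assert(stop_after > 0);
	do {
		iterations++;
		if (iterations == stop_after)
			mock_stop_requested = true;
	} while (!mock_kthread_should_stop_check_freeze());
	return iterations;
}
```

Note that the mock drops the might_sleep() annotation from the real patch, which is the reason the thread prefers a separate helper over changing kthread_should_stop() itself: the new helper may sleep, so it cannot be called under a spinlock.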
Re: RSDL v0.30 cpu scheduler for ... 2.6.18.8 kernel
On Monday 12 March 2007 19:17, Vincent Fortier wrote: > > There are updated patches for 2.6.20, 2.6.20.2, 2.6.21-rc3 and > > 2.6.21-rc3-mm2 to bring RSDL up to version 0.30 for download here: > > > > Full patches: > > > > http://ck.kolivas.org/patches/staircase-deadline/2.6.20-sched-rsdl-0.30.p > >at ch > > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.2-rsdl-0.30.patch > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-sched-rsdl-0. > >30 .patch > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-mm2-rsdl-0.30 > >.p atch > > > > incrementals: > > > > http://ck.kolivas.org/patches/staircase-deadline/2.6.20/2.6.20.2-rsdl-0.2 > >9- 0.30.patch > > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.2/2.6.20.2-rsdl-0 > >.2 9-0.30.patch > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3/2.6.21-rc3-rs > >dl -0.29-0.30.patch > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc3-mm2/2.6.21-rc > >3- mm2-rsdl-0.29-0.30.patch > And here are the backported RSDL 0.30 patches in case any of you would > still be running an older 2.6.18.8 kernel ... Thanks, your efforts are appreciated as it would take me quite a while to do a variety of backports that people are already requesting. > Just for info, verison 0.30 seems around 2 seconds faster than 0.26-0.29 > versions at boot time. I used to have around 2-3 seconds of difference > between a vanilla and a rsdl patched kernel. Now it looks more like 5 > seconds faster! Wow.. nice work CK! > > 2.6.18.8 vanilla kernel: > [ 68.514248] ACPI: Power Button (CM) [PWRB] > 2.6.18.8-rsdl-0.30: > [ 63.739337] ACPI: Power Button (CM) [PWRB] Indeed there's almost 5 seconds difference there. To be honest, the boot time speedups are an unexpected bonus, but everyone seems to be reporting them on all flavours so perhaps all those timeout related driver setups are inadvertently benefiting. 
> - vin Thanks -- -ck
Re: 2.6.21-rc3-mm1
On Sun, Mar 11, 2007 at 06:02:31PM +0100, Michal Piotrowski wrote: > On 10/03/07, Paul E. McKenney <[EMAIL PROTECTED]> wrote: > >On Fri, Mar 09, 2007 at 06:18:51PM -0800, Andrew Morton wrote: > >> > On Thu, 08 Mar 2007 21:50:29 +0100 Michal Piotrowski > ><[EMAIL PROTECTED]> wrote: > >> > Andrew Morton wrote: > >> > > Temporarily at > >> > > > >> > > http://userweb.kernel.org/~akpm/2.6.21-rc3-mm1/ > >> > > > >> > > Will appear later at > >> > > > >> > > > >ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc3/2.6.21-rc3-mm1/ > >> > > > >> > > >> > cpu_hotplug (AutoTest) hangs at this > >> > > >> > = > >> > [ INFO: possible recursive locking detected ] > >> > 2.6.21-rc3-mm1 #2 > >> > - > >> > sh/7213 is trying to acquire lock: > >> > (sched_hotcpu_mutex){--..}, at: [] mutex_lock+0x1c/0x1f > >> > > >> > but task is already holding lock: > >> > (sched_hotcpu_mutex){--..}, at: [] mutex_lock+0x1c/0x1f > >> > > >> > other info that might help us debug this: > >> > 4 locks held by sh/7213: > >> > #0: (cpu_add_remove_lock){--..}, at: [] > >mutex_lock+0x1c/0x1f > >> > #1: (sched_hotcpu_mutex){--..}, at: [] mutex_lock+0x1c/0x1f > >> > #2: (cache_chain_mutex){--..}, at: [] mutex_lock+0x1c/0x1f > >> > #3: (workqueue_mutex){--..}, at: [] mutex_lock+0x1c/0x1f > >> > >> That's pretty useless, isn't it? We need to know the mutex_lock() caller > >> here. 
> >> > >> > stack backtrace > >> > [] show_trace_log_lvl+0x1a/0x2f > >> > [] show_trace+0x12/0x14 > >> > [] dump_stack+0x16/0x18 > >> > [] __lock_acquire+0x1aa/0xceb > >> > [] lock_acquire+0x79/0x93 > >> > [] __mutex_lock_slowpath+0x107/0x349 > >> > [] mutex_lock+0x1c/0x1f > >> > [] sched_getaffinity+0x14/0x91 > >> > [] __synchronize_sched+0x11/0x5f > >> > [] detach_destroy_domains+0x2c/0x30 > >> > [] update_sched_domains+0x27/0x3a > >> > [] notifier_call_chain+0x2b/0x4a > >> > [] __raw_notifier_call_chain+0x19/0x1e > >> > [] _cpu_down+0x70/0x282 > >> > [] cpu_down+0x26/0x38 > >> > [] store_online+0x27/0x5a > >> > [] sysdev_store+0x20/0x25 > >> > [] sysfs_write_file+0xc1/0xe9 > >> > [] vfs_write+0xd1/0x15a > >> > [] sys_write+0x3d/0x72 > >> > [] syscall_call+0x7/0xb > >> > > >> > l *0xc033883a > >> > 0xc033883a is in mutex_lock > >(/mnt/md0/devel/linux-mm/kernel/mutex.c:92). > >> > 87 /* > >> > 88 * The locking fastpath is the 1->0 transition from > >> > 89 * 'unlocked' into 'locked' state. > >> > 90 */ > >> > 91 __mutex_fastpath_lock(>count, > >__mutex_lock_slowpath); > >> > 92 } > >> > 93 > >> > 94 EXPORT_SYMBOL(mutex_lock); > >> > 95 > >> > 96 static void fastcall noinline __sched > >> > > >> > I didn't test other -mm's with this test. > >> > > >> > > >http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3-mm1/console.log > >> > > >http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3-mm1/mm-config > >> > >> I can't immediately spot the bug. Probably it's caused by rcu-preempt's > >> changes to synchronize_sched(): that function now does a heap more than > >it > >> used to, including taking sched_hotcpu_muex. > >> > >> So, what to do about this. 
Paul, I'm thinking that I should drop > >> rcu-preempt for now - I don't think we ended up being able to identify > >any > >> particular benefit which it brings to current mainline, and I suspect > >that > >> things will become simpler if/when we start using the process freezer for > >> CPU hotplug. > > > >It certainly makes sense for Michal to try backing out rcu-preempt using > >your broken-out list of patches. If that makes the problem go away, > > Problem is caused by rcu-preempt.patch. OK, clearly we need to fix this. You might be right about the freezer code having to go in first, Andrew -- will see! Thanx, Paul > >then I would certainly have a hard time arguing with you. We are working > >on getting measurements showing benefit of rcu-preempt, but aren't there > >yet. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler
On Monday 12 March 2007 15:42, Al Boldi wrote: > Con Kolivas wrote: > > On Monday 12 March 2007 08:52, Con Kolivas wrote: > > > And thank you! I think I know what's going on now. I think each > > > rotation is followed by another rotation before the higher priority > > > task is getting a look in in schedule() to even get quota and add it to > > > the runqueue quota. I'll try a simple change to see if that helps. > > > Patch coming up shortly. > > > > Can you try the following patch and see if it helps. There's also one > > minor preemption logic fix in there that I'm planning on including. > > Thanks! > > Applied on top of v0.28 mainline, and there is no difference. > > What's it look like on your machine? The higher priority one always get 6-7ms whereas the lower priority one runs 6-7ms and then one larger perfectly bound expiration amount. Basically exactly as I'd expect. The higher priority task gets precisely RR_INTERVAL maximum latency whereas the lower priority task gets RR_INTERVAL min and full expiration (according to the virtual deadline) as a maximum. That's exactly how I intend it to work. Yes I realise that the max latency ends up being longer intermittently on the niced task but that's -in my opinion- perfectly fine as a compromise to ensure the nice 0 one always gets low latency. 
Eg: nice 0 vs nice 10

nice 0:
pid 6288, prio 0, out for 7 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms
pid 6288, prio 0, out for 6 ms

nice 10:
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 66 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms
pid 6290, prio 10, out for 6 ms

exactly as I'd expect. If you want fixed latencies _of niced tasks_ in the
presence of less niced tasks you will not get them with this scheduler. What
you will get, though, is a perfectly bound relationship knowing exactly what
the maximum latency will ever be.

Thanks for the test case. It's interesting and nice that it confirms this
scheduler works as I expect it to.

--
-ck
Re: Style Question
2007/3/12, Jan Engelhardt <[EMAIL PROTECTED]>:
> On Mar 11 2007 22:15, Cong WANG wrote:
> >
> > I have a question about coding style in linux kernel. In
> > Documentation/CodingStyle, it is said that "Linux style for comments is
> > the C89 "/* ... */" style. Don't use C99-style "// ..." comments."
> > _But_ I see a lot of '//' style comments in current kernel code.
> >
> > Which is wrong? The documentation or the code, or neither? And why?
>
> The code. And because it's not always reviewed but silently pushed.
>
> > Another question is about NULL. AFAIK, in user space, using NULL is
> > better than directly using 0 in C. In kernel, I know it used its own
> > NULL, which may be defined as ((void*)0), but it's _still_ different
> > from raw zero.
>
> In what way?

The following code is picked from drivers/kvm/kvm_main.c:

static struct kvm_vcpu *vcpu_load(struct kvm *kvm, int vcpu_slot)
{
	struct kvm_vcpu *vcpu = &kvm->vcpus[vcpu_slot];

	mutex_lock(&vcpu->mutex);
	if (unlikely(!vcpu->vmcs)) {
		mutex_unlock(&vcpu->mutex);
		return 0;
	}
	return kvm_arch_ops->vcpu_load(vcpu);
}

Obviously, it used 0 rather than NULL when returning a pointer to
indicate an error. Should we fix such issue?

> >So can I say using NULL is better than 0 in kernel?
>
> On what basis? Do you even know what NULL is defined as in (C, not C++)
> userspace? Think about it.

I think it's more clear to indicate we are using a pointer rather than
an integer when we use NULL in kernel. But in userspace, using NULL is
for portability of the program, although most (*just* most, NOT all) of
NULL's definition is ((void*)0). ;-)
Re: RSDL for 2.6.21-rc3- 0.29
On Sunday 11 March 2007, Con Kolivas wrote: >On Sunday 11 March 2007 15:03, Matt Mackall wrote: >> On Sat, Mar 10, 2007 at 10:01:32PM -0600, Matt Mackall wrote: >> > On Sun, Mar 11, 2007 at 01:28:22PM +1100, Con Kolivas wrote: >> > > Ok I don't think there's any actual accounting problem here per se >> > > (although I did just recently post a bugfix for rsdl however I >> > > think that's unrelated). What I think is going on in the ccache >> > > testcase is that all the work is being offloaded to kernel threads >> > > reading/writing to/from the filesystem and the make is not getting >> > > any actual cpu time. >> > >> > I don't see significant system time while this is happening. >> >> Also, it's running pretty much entirely out of page cache so there >> wouldn't be a whole lot for kernel threads to do. > >Well I can't reproduce that behaviour here at all whether from disk or > the pagecache with ccache, so I'm not entirely sure what's different at > your end. However both you and the other person reporting bad behaviour > were using ATI drivers. That's about the only commonality? I wonder if > they do need to yield... somewhat instead of not at all. I hate to say it Con, but this one seems to have broken the amanda-tar symbiosis. I haven't tried a plain 21-rc3, so the problem may exist there, and in fact it did for 21-rc1, but I don't recall if it was true for -rc2. But I will have a plain 21-rc3 running by tomorrow nights amanda run to test. What happens is that when amanda tells tar to do a level 1 or 2, tar still thinks its doing a level 0. The net result is that the tape is filled completely and amanda does an EOT exit in about 10 of my 42 dle's. This is tar-1.15-1 for fedora core 6. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) While it may be true that a watched pot never boils, the one you don't keep an eye on can make an awful mess of your stove. 
-- Edward Stevenson
Re: Style Question
On Mar 12 2007 13:37, Cong WANG wrote:
>
> The following code is picked from drivers/kvm/kvm_main.c:
>
> static struct kvm_vcpu *vcpu_load(struct kvm *kvm, int vcpu_slot)
> {
> 	struct kvm_vcpu *vcpu = &kvm->vcpus[vcpu_slot];
>
> 	mutex_lock(&vcpu->mutex);
> 	if (unlikely(!vcpu->vmcs)) {
> 		mutex_unlock(&vcpu->mutex);
> 		return 0;
> 	}
> 	return kvm_arch_ops->vcpu_load(vcpu);
> }
>
> Obviously, it used 0 rather than NULL when returning a pointer to
> indicate an error. Should we fix such issue?

Indeed. If it was for me, something like that should throw a compile error.

>>[...]
> I think it's more clear to indicate we are using a pointer rather than
> an integer when we use NULL in kernel. But in userspace, using NULL is
> for portability of the program, although most (*just* most, NOT all) of
> NULL's definition is ((void*)0). ;-)

NULL has the same bit pattern as the number zero. (I'm not saying the bit
pattern is all zeroes. And I am not even sure if NULL ought to have the same
pattern as zero.) So C++ could use (void *)0, if it would let itself :p

Jan
--
Re: RSDL for 2.6.21-rc3- 0.29
Hi Gene.

On Monday 12 March 2007 16:38, Gene Heskett wrote:
> I hate to say it Con, but this one seems to have broken the amanda-tar
> symbiosis.
>
> I haven't tried a plain 21-rc3, so the problem may exist there, and in
> fact it did for 21-rc1, but I don't recall if it was true for -rc2. But
> I will have a plain 21-rc3 running by tomorrow nights amanda run to test.
>
> What happens is that when amanda tells tar to do a level 1 or 2, tar still
> thinks its doing a level 0. The net result is that the tape is filled
> completely and amanda does an EOT exit in about 10 of my 42 dle's. This
> is tar-1.15-1 for fedora core 6.

I'm sorry but I have to say I have no idea what any of this means. I gather
you're making an association between some application combination failing and
RSDL cpu scheduler. Unfortunately the details of what the problem is, or how
the cpu scheduler is responsible, escape me :(

--
-ck
Re: Style Question
On Mon, 2007-03-12 at 06:40 +0100, Jan Engelhardt wrote:
> On Mar 12 2007 13:37, Cong WANG wrote:
> >
> > The following code is picked from drivers/kvm/kvm_main.c:
> >
> > static struct kvm_vcpu *vcpu_load(struct kvm *kvm, int vcpu_slot)
> > {
> > 	struct kvm_vcpu *vcpu = &kvm->vcpus[vcpu_slot];
> >
> > 	mutex_lock(&vcpu->mutex);
> > 	if (unlikely(!vcpu->vmcs)) {
> > 		mutex_unlock(&vcpu->mutex);
> > 		return 0;
> > 	}
> > 	return kvm_arch_ops->vcpu_load(vcpu);
> > }
> >
> > Obviously, it used 0 rather than NULL when returning a pointer to
> > indicate an error. Should we fix such issue?
>
> Indeed. If it was for me, something like that should throw a compile error.
>
> >>[...]
> > I think it's more clear to indicate we are using a pointer rather than
> > an integer when we use NULL in kernel. But in userspace, using NULL is
> > for portability of the program, although most (*just* most, NOT all) of
> > NULL's definition is ((void*)0). ;-)
>
> NULL has the same bit pattern as the number zero. (I'm not saying the bit
> pattern is all zeroes. And I am not even sure if NULL ought to have the same
> pattern as zero.) So C++ could use (void *)0, if it would let itself :p

Not necessarily. You can use 0 at the source level, but the compiler has
to convert it to the actual NULL pointer bit pattern, whatever it may
be.

In C++, NULL is typically defined as 0 (with no void* cast) by most
compilers because 0 (and only 0) can be implicitly converted to a null
pointer of any pointer type without a cast.

GCC introduced the __null extension so that NULL still works correctly
in C++ when passed to a varargs function on 64-bit platforms.

(This just works in C because C makes NULL ((void*)0), which is thus the
right size. In C++, the 0 ends up being an int instead of a pointer when
passed to a varargs function, and things tend to blow up when they read
the garbage high bits. Of course, nobody else does this, so you still
have to use (void*)NULL to be portable.)
-- Nicholas Miell <[EMAIL PROTECTED]>
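The varargs point above can be illustrated with a small userspace sketch. It is not taken from the thread: concat_strings() is a hypothetical helper written only for this illustration. The callee reads every variadic argument as a pointer, so the terminating sentinel has to be passed as a pointer too; a bare 0 would be passed as an int, which is the failure mode described for 64-bit platforms.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: concatenate the strings that follow 'cap' into
 * 'dst' (truncating to fit), stopping at the first null pointer.
 * Returns the length of the result.
 */
static size_t concat_strings(char *dst, size_t cap, ...)
{
	va_list ap;
	const char *s;
	size_t len = 0;

	assert(dst != NULL && cap > 0);

	va_start(ap, cap);
	/* Every argument, including the sentinel, is read as a pointer. */
	while ((s = va_arg(ap, const char *)) != NULL) {
		size_t room = cap - 1 - len;
		size_t n = strlen(s);

		if (n > room)
			n = room;
		memcpy(dst + len, s, n);
		len += n;
	}
	va_end(ap);

	dst[len] = '\0';
	return len;
}
```

A caller would end the list with an explicitly cast sentinel, for example concat_strings(buf, sizeof(buf), "foo", "bar", (const char *)NULL), so that the argument has pointer width no matter how NULL happens to be defined.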
Re: PROBLEM: "Make nenuconfig" does not save parameters.
On 3/11/07, Sam Ravnborg <[EMAIL PROTECTED]> wrote:
[..snip..]
| > To make the conversion we should consider renaming from
| > current "Load alternate" to "Open config file..."
| > and likewise "Save alternate" to "Save config file as..."
| >
| > Comments?
| >
| >Sam
[..snip...]

I think that is excellent. (Actually I can't test it now but the idea
is just perfect)
Re: Style Question
On Mon, 12 Mar 2007, Jan Engelhardt wrote:
>
> On Mar 12 2007 13:37, Cong WANG wrote:
> >
> > The following code is picked from drivers/kvm/kvm_main.c:
> >
> > static struct kvm_vcpu *vcpu_load(struct kvm *kvm, int vcpu_slot)
> > {
> > 	struct kvm_vcpu *vcpu = &kvm->vcpus[vcpu_slot];
> >
> > 	mutex_lock(&vcpu->mutex);
> > 	if (unlikely(!vcpu->vmcs)) {
> > 		mutex_unlock(&vcpu->mutex);
> > 		return 0;
> > 	}
> > 	return kvm_arch_ops->vcpu_load(vcpu);
> > }
> >
> > Obviously, it used 0 rather than NULL when returning a pointer to
> > indicate an error. Should we fix such issue?
>
> Indeed. If it was for me, something like that should throw a compile error.

At least it does throw a sparse warning, and yes, it should be fixed.

> >>[...]
> > I think it's more clear to indicate we are using a pointer rather than
> > an integer when we use NULL in kernel. But in userspace, using NULL is
> > for portability of the program, although most (*just* most, NOT all) of
> > NULL's definition is ((void*)0). ;-)
>
> NULL has the same bit pattern as the number zero. (I'm not saying the bit
> pattern is all zeroes. And I am not even sure if NULL ought to have the same
> pattern as zero.) So C++ could use (void *)0, if it would let itself :p
>
> Jan
> --

~Randy
Re: RSDL for 2.6.21-rc3- 0.29
On Monday 12 March 2007, Con Kolivas wrote: >Hi Gene. > >On Monday 12 March 2007 16:38, Gene Heskett wrote: >> I hate to say it Con, but this one seems to have broken the amanda-tar >> symbiosis. >> >> I haven't tried a plain 21-rc3, so the problem may exist there, and in >> fact it did for 21-rc1, but I don't recall if it was true for -rc2. >> But I will have a plain 21-rc3 running by tomorrow nights amanda run >> to test. >> >> What happens is that when amanda tells tar to do a level 1 or 2, tar >> still thinks its doing a level 0. The net result is that the tape is >> filled completely and amanda does an EOT exit in about 10 of my 42 >> dle's. This is tar-1.15-1 for fedora core 6. > >I'm sorry but I have to say I have no idea what any of this means. I > gather you're making an association between some application > combination failing and RSDL cpu scheduler. Unfortunately the details > of what the problem is, or how the cpu scheduler is responsible, escape > me :( I have another backup running right now, after building a plain 2.6.21-rc3, and rebooting just now for the test. I don't think its the scheduler itself, but is something post 2.6.20 that is messing with tars mind and making it think the files it just read to do the estimate phase, are all new, so even a level 2 is in effect a level 0. I'll have an answer in about an hour, but its also 2:36am here and I'm headed for the rack to get some zzz's. So I'll report in the morning as to whether or not this backup ran as it was supposed to. I have a feeling its not going to though. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) "When it comes to humility, I'm the greatest." 
-- Bullwinkle Moose
[PATCH]Replace 0 with NULL when returning a pointer
Use NULL to indicate we are returning a pointer rather than an integer
and to eliminate some sparse warnings.

Signed-off-by: Cong WANG <[EMAIL PROTECTED]>

---
--- drivers/kvm/kvm_main.c.orig	2007-03-11 21:41:23.0 +0800
+++ drivers/kvm/kvm_main.c	2007-03-12 14:26:17.0 +0800
@@ -205,7 +205,7 @@ static struct kvm_vcpu *vcpu_load(struct
 	mutex_lock(&vcpu->mutex);
 	if (unlikely(!vcpu->vmcs)) {
 		mutex_unlock(&vcpu->mutex);
-		return 0;
+		return NULL;
 	}
 	return kvm_arch_ops->vcpu_load(vcpu);
 }
@@ -799,7 +799,7 @@ struct kvm_memory_slot *gfn_to_memslot(s
 		    && gfn < memslot->base_gfn + memslot->npages)
 			return memslot;
 	}
-	return 0;
+	return NULL;
 }
 EXPORT_SYMBOL_GPL(gfn_to_memslot);
[PATCH]Replace 0 with NULL when returning a pointer
Use NULL to indicate we are returning a pointer rather than an integer
and to eliminate some sparse warnings.

Signed-off-by: Cong WANG <[EMAIL PROTECTED]>

---
--- drivers/kvm/vmx.c.orig	2007-03-11 21:41:03.0 +0800
+++ drivers/kvm/vmx.c	2007-03-12 14:25:11.0 +0800
@@ -98,7 +98,7 @@ static struct vmx_msr_entry *find_msr_en
 	for (i = 0; i < vcpu->nmsrs; ++i)
 		if (vcpu->guest_msrs[i].index == msr)
 			return &vcpu->guest_msrs[i];
-	return 0;
+	return NULL;
 }
 
 static void vmcs_clear(struct vmcs *vmcs)
Re: [patch 3/8] per backing_dev dirty and writeback page accounting
On Tue, Mar 06, 2007 at 07:04:46PM +0100, Miklos Szeredi wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
>
> [EMAIL PROTECTED]: bugfix]
>
> Miklos Szeredi <[EMAIL PROTECTED]>:
>
> Changes:
> - updated to apply after clear_page_dirty_for_io() race fix
>
> This is needed for
>
> - balance_dirty_pages() deadlock fix
> - fuse dirty page accounting
>
> I have no idea how serious the scalability problems with this are. If
> they are serious, different solutions can probably be found for the
> above, but this is certainly the simplest.

Atomic operations to a single per-backing device from all CPUs at once?
That's a pretty serious scalability issue and it will cause a major
performance regression for XFS.

I'd call this a showstopper right now - maybe you need to look at
something like the ZVC code that Christoph Lameter wrote, perhaps?

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
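For readers unfamiliar with the alternative being suggested, the general idea of a differential counter can be sketched in userspace C. This is NOT the kernel's ZVC implementation; the structure, names, slot count and threshold below are all assumptions made for illustration. Each CPU (here simply a slot index) accumulates a private delta and touches the shared atomic only when that delta crosses a threshold, so most updates stay off the contended cache line.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_SLOTS	4	/* stand-in for the number of CPUs (assumed) */
#define FOLD_THRESHOLD	32	/* fold into the global count past this (assumed) */

struct diff_counter {
	atomic_long global;	/* shared, rarely written */
	long local[NR_SLOTS];	/* each slot written only by its owner */
};

static void counter_init(struct diff_counter *c)
{
	int i;

	atomic_init(&c->global, 0);
	for (i = 0; i < NR_SLOTS; i++)
		c->local[i] = 0;
}

/* Cheap path: usually touches only the caller's own slot. */
static void counter_add(struct diff_counter *c, int slot, long delta)
{
	long v;

	assert(slot >= 0 && slot < NR_SLOTS);

	v = c->local[slot] + delta;
	if (v > FOLD_THRESHOLD || v < -FOLD_THRESHOLD) {
		atomic_fetch_add(&c->global, v);
		v = 0;
	}
	c->local[slot] = v;
}

/* Fast read: may lag by up to NR_SLOTS * FOLD_THRESHOLD. */
static long counter_read_approx(struct diff_counter *c)
{
	return atomic_load(&c->global);
}

/* Slow read: also sums the per-slot deltas (only exact when writers are idle). */
static long counter_read_exact(struct diff_counter *c)
{
	long sum = atomic_load(&c->global);
	int i;

	for (i = 0; i < NR_SLOTS; i++)
		sum += c->local[i];
	return sum;
}
```

The trade-off is precision for scalability: readers that can tolerate a bounded error use the approximate value, and only the rare caller that needs an exact figure pays for walking every slot.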
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
On 3/11/07, Gene Heskett <[EMAIL PROTECTED]> wrote:
> On Sunday 11 March 2007, Mike Galbraith wrote:
>
> Just to comment, I've been running one of the patches between 20-ck1 and
> this latest one, which is building as I type, but I also run gkrellm
> here, version 2.2.9.
>
> Since I have been running this middle of this series patch, something is
> killing gkrellm about once a day, and there is nothing in the logs to
> indicate a problem. I see a blink out of the corner of my eye, and its
> gone. And it always starts right back up from a kmenu click.
>
> No idea if anyone else is experiencing this or not.
>
> --
> Cheers, Gene

I've had such an issue with 0.20 or something. Sometimes, the
xfce4-panel would disappear (die) when I displayed its menu. Very rare
issue. Doesn't happen with 0.28 anyway. :-) Which looks really good,
though I'll update to 0.30.
Re: [PATCH] two more device ids for dm9601 usbnet driver
> "Jon" == Jon Dowland <[EMAIL PROTECTED]> writes: Hi, Jon> This patch for the linux-usb-devel tree adds two more Jon> product ids to the dm9601 driver. These ids were found on Jon> rebadged dm9601 devices in the wild. Jon> Signed-off-by: Jon Dowland <[EMAIL PROTECTED]> Acked-by: Peter Korsgaard <[EMAIL PROTECTED]> -- Bye, Peter Korsgaard - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.21-rc2-mm2: drivers/net/wireless/libertas/debugfs.c addr bogosity
On Fri, Mar 09, 2007 at 09:14:29AM -0800, Randy Dunlap wrote: > Good to use FIELD_SIZEOF(), Thanks. > but in general, we prefer to use it > directly, not in yet another wrapper. I left the item_{size,addr} in place as it seemed to make the item[] more compact. I'm not certain using the FIELD_SIZEOF() macro directly is a win. From: Tony Breeds <[EMAIL PROTECTED]> Cleanup drivers/net/wireless/libertas/debugfs.c to use standard kernel macros and functions. Signed-off-by: Tony Breeds <[EMAIL PROTECTED]> --- only compile tested on x86 drivers/net/wireless/libertas/debugfs.c | 56 +++ 1 files changed, 12 insertions(+), 44 deletions(-) diff --git a/drivers/net/wireless/libertas/debugfs.c b/drivers/net/wireless/libertas/debugfs.c index 3ad1e03..8b0e3ec 100644 --- a/drivers/net/wireless/libertas/debugfs.c +++ b/drivers/net/wireless/libertas/debugfs.c @@ -1771,58 +1771,26 @@ void libertas_debugfs_remove_one(wlan_private *priv) } /* debug entry */ - -#define item_size(n) (sizeof ((wlan_adapter *)0)->n) -#define item_addr(n) ((u32) &((wlan_adapter *)0)->n) - struct debug_data { char name[32]; u32 size; u32 addr; }; -/* To debug any member of wlan_adapter, simply add one line here. - */ +/* To debug any member of wlan_adapter, simply add a record here. 
*/ static struct debug_data items[] = { - {"intcounter", item_size(intcounter), item_addr(intcounter)}, - {"psmode", item_size(psmode), item_addr(psmode)}, - {"psstate", item_size(psstate), item_addr(psstate)}, + { .name = "intcounter", + .size = FIELD_SIZEOF(wlan_adapter, intcounter), + .addr = offsetof(wlan_adapter, intcounter) }, + { .name = "psmode", + .size = FIELD_SIZEOF(wlan_adapter, psmode), + .addr = offsetof(wlan_adapter, psmode) }, + { .name = "psstate", + .size = FIELD_SIZEOF(wlan_adapter, psstate), + .addr = offsetof(wlan_adapter, psstate) }, }; -static int num_of_items = sizeof(items) / sizeof(items[0]); - -/** - * @brief convert string to number - * - * @param s pointer to numbered string - * @return converted number from string s - */ -static int string_to_number(char *s) -{ - int r = 0; - int base = 0; - - if ((strncmp(s, "0x", 2) == 0) || (strncmp(s, "0X", 2) == 0)) - base = 16; - else - base = 10; - - if (base == 16) - s += 2; - - for (s = s; *s != 0; s++) { - if ((*s >= 48) && (*s <= 57)) - r = (r * base) + (*s - 48); - else if ((*s >= 65) && (*s <= 70)) - r = (r * base) + (*s - 55); - else if ((*s >= 97) && (*s <= 102)) - r = (r * base) + (*s - 87); - else - break; - } - - return r; -} +static int num_of_items = ARRAY_SIZE(items); /** * @brief proc read function @@ -1912,7 +1880,7 @@ static int wlan_debugfs_write(struct file *f, const char __user *buf, if (!p2) break; p2++; - r = string_to_number(p2); + r = simple_strtoul(p2, NULL, 0); if (d[i].size == 1) *((u8 *) d[i].addr) = (u8) r; else if (d[i].size == 2) Yours Tony linux.conf.auhttp://linux.conf.au/ || http://lca2008.linux.org.au/ Jan 28 - Feb 02 2008 The Australian Linux Technical Conference! - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
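The table-driven approach in the patch above can be tried out in userspace with a minimal sketch. Everything here is an assumption for illustration: struct adapter is a hypothetical stand-in for wlan_adapter, FIELD_SIZEOF and ARRAY_SIZE are redefined locally to mirror the kernel macros, and strtoul() plays the role of simple_strtoul(). One behavioural detail worth noting: with base 0, both functions accept decimal, 0x-prefixed hex and leading-0 octal, whereas the removed string_to_number() only handled decimal and hex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for wlan_adapter; field names follow the patch. */
struct adapter {
	uint32_t intcounter;
	uint16_t psmode;
	uint8_t psstate;
};

/* Local equivalents of the kernel helpers used in the patch. */
#define FIELD_SIZEOF(t, f)	(sizeof(((t *)0)->f))
#define ARRAY_SIZE(a)		(sizeof(a) / sizeof((a)[0]))

struct debug_item {
	const char *name;
	size_t size;
	size_t offset;
};

/* To expose another member, add one record here. */
static const struct debug_item items[] = {
	{ .name = "intcounter",
	  .size = FIELD_SIZEOF(struct adapter, intcounter),
	  .offset = offsetof(struct adapter, intcounter) },
	{ .name = "psmode",
	  .size = FIELD_SIZEOF(struct adapter, psmode),
	  .offset = offsetof(struct adapter, psmode) },
	{ .name = "psstate",
	  .size = FIELD_SIZEOF(struct adapter, psstate),
	  .offset = offsetof(struct adapter, psstate) },
};

/* Parse 'text' and store it into the named field; 0 on success, -1 if unknown. */
static int set_field(struct adapter *a, const char *name, const char *text)
{
	unsigned long v;
	size_t i;

	assert(a != NULL && name != NULL && text != NULL);

	v = strtoul(text, NULL, 0);	/* base 0: decimal, 0x hex, 0 octal */
	for (i = 0; i < ARRAY_SIZE(items); i++) {
		unsigned char *p;

		if (strcmp(items[i].name, name) != 0)
			continue;
		p = (unsigned char *)a + items[i].offset;
		if (items[i].size == 1) {
			uint8_t b = (uint8_t)v;
			memcpy(p, &b, sizeof(b));
		} else if (items[i].size == 2) {
			uint16_t h = (uint16_t)v;
			memcpy(p, &h, sizeof(h));
		} else {
			uint32_t w = (uint32_t)v;
			memcpy(p, &w, sizeof(w));
		}
		return 0;
	}
	return -1;
}
```

Storing offsets rather than casting a null-based address, as the old item_addr() macro did, keeps the table independent of any particular instance; the base pointer is added only at the point of use.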
Re: [git patches] libata fixes
Hello, Linus. Linus Torvalds wrote: > On Sun, 11 Mar 2007, Paul Rolland wrote: >> My machine is having two problems : the one you are describing above, >> which is due to a SIL controler being connected to one port of the ICH7 >> (at least, it seems to), and probing it goes timeout, but nothing is >> connected on it. > > Ok, so that's just a message irritation, not actually bothersome > otherwise? It involves a long timeout, so it's bothersome. This is caused by Silicon Image 4726/3726 storage processor (SATA Port Multiplier with extra features) attached to one of the ICH ports. If the first downstream port in the PMP is empty and it gets reset in non-PMP way, it identifies itself as "Config Disk" of quite small size. It's probably used to configure the extra features using standard ATA RW commands. Anyways, this "Config Disk" is a bit peculiar and doesn't work very well with the current ATA reset sequence and gets identified only after a few failures thus causing long timeout. I keep forgetting about this. I'll ask SIMG how to deal with this. For the time being, connecting a device to the PMP port should remove the timeouts. >> The second problem is a Jmicron363 controler that is failing to detect >> the DVD-RW that is connected, unless I use the irqpoll option as Tejun has >> suggested. > > .. and this one has never worked without irqpoll? > >> But, as you suggest it, I'm adding pci=nomsi to the command line >> rebooting... no change for this part of the problem. >> >> OK, the /proc/interrupt for this config, and the dmesg attached. 
>> >> 3 [23:22] [EMAIL PROTECTED]:~> cat /proc/interrupts >>CPU0 CPU1 >> 0: 297549 0 IO-APIC-edge timer >> 1: 7 0 IO-APIC-edge i8042 >> 4: 13 0 IO-APIC-edge serial >> 6: 5 0 IO-APIC-edge floppy >> 8: 1 0 IO-APIC-edge rtc >> 9: 0 0 IO-APIC-fasteoi acpi >> 12:126 0 IO-APIC-edge i8042 >> 14: 8313 0 IO-APIC-edge libata >> 15: 0 0 IO-APIC-edge libata >> 16: 0 0 IO-APIC-fasteoi eth1, libata > > So it's the irq16 one that is the Jmicron controller and just isn't > getting any interrupts? > > Since all the other interrupts work (and MSI worked for other > controllers), I don't think it's interrupt-routing related. Especially as > MSI shouldn't even care about things like that. > > And since it all works when "irqpoll" is used, that implies that the > *only* thing that is broken is literally irq delivery. > > Is there possibly some jmicron-specific "enable interrupts" bit? (cc'ing Justin of JMicron. Hello, please correct me if I'm wrong.) Not that I know of. The PATA portion of JMB controllers is bog standard PCI BMDMA ATA device where ATA_NIEN is the way to turn IRQ on and off. Thanks. -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git patches] libata fixes
Of course I forgot to CC. :-) Quoting whole message for Justin. Tejun Heo wrote: > Hello, Linus. > > Linus Torvalds wrote: >> On Sun, 11 Mar 2007, Paul Rolland wrote: >>> My machine is having two problems : the one you are describing above, >>> which is due to a SIL controler being connected to one port of the ICH7 >>> (at least, it seems to), and probing it goes timeout, but nothing is >>> connected on it. >> Ok, so that's just a message irritation, not actually bothersome >> otherwise? > > It involves a long timeout, so it's bothersome. This is caused by > Silicon Image 4726/3726 storage processor (SATA Port Multiplier with > extra features) attached to one of the ICH ports. > > If the first downstream port in the PMP is empty and it gets reset in > non-PMP way, it identifies itself as "Config Disk" of quite small size. > It's probably used to configure the extra features using standard ATA > RW commands. Anyways, this "Config Disk" is a bit peculiar and doesn't > work very well with the current ATA reset sequence and gets identified > only after a few failures thus causing long timeout. > > I keep forgetting about this. I'll ask SIMG how to deal with this. For > the time being, connecting a device to the PMP port should remove the > timeouts. > >>> The second problem is a Jmicron363 controler that is failing to detect >>> the DVD-RW that is connected, unless I use the irqpoll option as Tejun has >>> suggested. >> .. and this one has never worked without irqpoll? >> >>> But, as you suggest it, I'm adding pci=nomsi to the command line >>> rebooting... no change for this part of the problem. >>> >>> OK, the /proc/interrupt for this config, and the dmesg attached. 
>>> >>> 3 [23:22] [EMAIL PROTECTED]:~> cat /proc/interrupts >>>CPU0 CPU1 >>> 0: 297549 0 IO-APIC-edge timer >>> 1: 7 0 IO-APIC-edge i8042 >>> 4: 13 0 IO-APIC-edge serial >>> 6: 5 0 IO-APIC-edge floppy >>> 8: 1 0 IO-APIC-edge rtc >>> 9: 0 0 IO-APIC-fasteoi acpi >>> 12:126 0 IO-APIC-edge i8042 >>> 14: 8313 0 IO-APIC-edge libata >>> 15: 0 0 IO-APIC-edge libata >>> 16: 0 0 IO-APIC-fasteoi eth1, libata >> So it's the irq16 one that is the Jmicron controller and just isn't >> getting any interrupts? >> >> Since all the other interrupts work (and MSI worked for other >> controllers), I don't think it's interrupt-routing related. Especially as >> MSI shouldn't even care about things like that. >> >> And since it all works when "irqpoll" is used, that implies that the >> *only* thing that is broken is literally irq delivery. >> >> Is there possibly some jmicron-specific "enable interrupts" bit? > > (cc'ing Justin of JMicron. Hello, please correct me if I'm wrong.) > > Not that I know of. The PATA portion of JMB controllers is bog standard > PCI BMDMA ATA device where ATA_NIEN is the way to turn IRQ on and off. > > Thanks. > -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/