Re: [patch] CFS scheduler, -v19
Well, I am now running 2.6.22 (without CFS) and have now seen it once (within a month...): exactly the same message from konqueror was produced. So I think it's a general problem in konqueror that was hidden and is somehow triggered much more often with CFS. I just wonder why nobody else has this problem.

Markus

PS: I am currently building 2.6.23.1...

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v19
> could you send me your debug-info:
>
>   http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh
>
> just run that script on 2.6.23-rc2 system and send me the file it
> produces. I've got a theory about what might be going on, and this
> debug-info could help prove/disprove it.

Done by private mail on Friday. I also tried the very latest linux-2.6.git (Friday as well) with sched-ingo-combo.patch (as suggested in a private reply). Nothing has fixed it so far.

If I can do anything...

Markus
Re: [patch] CFS scheduler, -v19
* Markus <[EMAIL PROTECTED]> wrote:

> Well, I am back now, but the problem still exists in 2.6.23-rc2. And
> as there is nothing more I can do that's it for now.

could you send me your debug-info:

  http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

just run that script on 2.6.23-rc2 system and send me the file it produces. I've got a theory about what might be going on, and this debug-info could help prove/disprove it.

    Ingo
Re: [patch] CFS scheduler, -v19
Well, I am back now, but the problem still exists in 2.6.23-rc2. And as there is nothing more I can do, that's it for now.

Markus
Re: [patch] CFS scheduler, -v19
> hm, Markus indicated that he tried the v2.6.21.6-cfsv19 patch, and that
> does not include the time.c change. Markus - does your kernel include
> the code below? (if yes, please revert it via patch -p1 -R )

As already said, 2.6.22.1-cfs-v19 includes the patch and 2.6.21.6-cfs-v19 does not include it. But both suffer from the problem. I have now reverted the patch on 2.6.22.1-cfs-v19 but it does not help.

2.6.22-git15 is not working either... so the problem did not magically disappear like the processes.

Is there anything further I can do to track it down? (I go on vacation on Monday for two weeks, but just send suggestions; I'll try them when I am back, or answer the mails when I don't need to build a new kernel...)

Markus
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Bill Davidsen <[EMAIL PROTECTED]> wrote:
>
>>> Does the patch below help?
>>
>> Spectacularly no! With this patch the "glitch1" script with multiple
>> scrolling windows has all xterms and glxgears stop totally dead for
>> ~200ms once per second. I didn't properly test anything else after
>> that.
>
> Bill, could you try the patch below - does it fix the automount problem,
> without introducing new problems?

Okay, as noted off-list, after I exported the xtime_seconds it now builds and works. However, there are a *lot* of "section mismatches" which are not reassuring.

Boots, runs, glitch1 test runs reasonably smoothly. automount has not used significant CPU yet, but I don't know what triggers it, the bad behavior did not happen immediately without the patch. However, it looks very hopeful.

Warnings attached to save you the trouble...

--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

Script started on Thu 19 Jul 2007 05:29:08 PM EDT
Common profile 1.13 lastmod 2006-01-04 22:43:25-05
No common directory available
Session time 17:29:08 on 07/19/07
posidon:davidsen> time nice -10 make -j4 -s; sleep 2; exit
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  CHK     include/linux/compile.h
  CHK     include/linux/compile.h
  UPD     include/linux/compile.h
  CHK     include/linux/version.h
  Building modules, stage 2.
WARNING: vmlinux(.text+0xc1001183): Section mismatch: reference to .init.text:start_kernel (between 'is386' and 'check_x87')
WARNING: vmlinux(.text+0xc1213fb4): Section mismatch: reference to .init.text: (between 'rest_init' and 'kthreadd_setup')
WARNING: vmlinux(.text+0xc1218786): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
WARNING: vmlinux(.text+0xc1218792): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
WARNING: vmlinux(.text+0xc121879e): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
WARNING: vmlinux(.text+0xc12187aa): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
WARNING: vmlinux(.text+0xc1214071): Section mismatch: reference to .init.text:__alloc_bootmem_node (between 'alloc_node_mem_map' and 'zone_wait_table_init')
WARNING: vmlinux(.text+0xc1214117): Section mismatch: reference to .init.text:__alloc_bootmem_node (between 'zone_wait_table_init' and 'schedule')
WARNING: vmlinux(.text+0xc10fbaae): Section mismatch: reference to .init.text:__alloc_bootmem (between 'vgacon_startup' and 'vgacon_scrolldelta')
WARNING: vmlinux(.text+0xc1218eda): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
Root device is (253, 0)
Setup is 11240 bytes (padded to 11264 bytes).
System is 1915 kB
Kernel: arch/i386/boot/bzImage is ready  (#3)

real    4m11.024s
user    2m5.121s
sys     0m30.952s
exit

Script done on Thu 19 Jul 2007 05:33:35 PM EDT
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> Bill Davidsen wrote:
> >Ingo Molnar wrote:
> >>* Bill Davidsen <[EMAIL PROTECTED]> wrote:
> >>
> >>>>Does the patch below help?
> >
> >Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as
> >I recreate it.
>
> Applied to 2.6.22-git9, building now.

ok, that's fine too.

    Ingo
Re: [patch] CFS scheduler, -v19
Bill Davidsen wrote:
> Ingo Molnar wrote:
>> * Bill Davidsen <[EMAIL PROTECTED]> wrote:
>>
>>>> Does the patch below help?
>
> Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as
> I recreate it.

Applied to 2.6.22-git9, building now.
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> Ingo Molnar wrote:
> >* Bill Davidsen <[EMAIL PROTECTED]> wrote:
> >
> >>>Does the patch below help?
>
> Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as
> I recreate it.

the patch below is merged against 2.6.22.1-cfs-v19 - does it solve the autofs problem (without any other bad side-effects)?

    Ingo

--->
Subject: time: introduce xtime_seconds
From: Ingo Molnar <[EMAIL PROTECTED]>

introduce the xtime_seconds optimization. This is a read-mostly low-resolution time source available to sys_time() and kernel-internal use. This variable is kept up to date atomically, and it's monotonically increased, every time some time interface constructs an xtime-alike time result that overflows the seconds value. (it's updated from the timer interrupt as well)

this way high-resolution time results update their seconds component at the same time sys_time() does it:

 118485883289000
 11848588320
 118485883292000
 11848588320
 118485883296000
 11848588320
 118485883299000
 11848588320
 118485883303000
 11848588330
 118485883306000
 11848588330
 118485883309000
 11848588330

[ these are nsec time results from alternating calls to sys_time() and sys_gettimeofday(), recorded at the seconds boundary. ]

instead of the previous (non-coherent) behavior:

 118484895087000
 11848489500
 11848489509
 11848489500
 118484895094000
 11848489500
 118484895097000
 11848489500
 118484895101000
 11848489500
 118484895105000
 11848489500
 118484895108000
 11848489500
 118484895111000
 11848489500
 118484895115000

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 include/linux/time.h      |   13 +++--
 kernel/time.c             |   25 ++---
 kernel/time/timekeeping.c |   26 +++---
 3 files changed, 40 insertions(+), 24 deletions(-)

Index: linux-cfs-2.6.22.q/include/linux/time.h
===================================================================
--- linux-cfs-2.6.22.q.orig/include/linux/time.h
+++ linux-cfs-2.6.22.q/include/linux/time.h
@@ -91,19 +91,28 @@ static inline struct timespec timespec_s
 extern struct timespec xtime;
 extern struct timespec wall_to_monotonic;
 extern seqlock_t xtime_lock __attribute__((weak));
+extern unsigned long xtime_seconds;

 extern unsigned long read_persistent_clock(void);
 void timekeeping_init(void);

+extern void __update_xtime_seconds(unsigned long new_xtime_seconds);
+
+static inline void update_xtime_seconds(unsigned long new_xtime_seconds)
+{
+	if (unlikely((long)(new_xtime_seconds - xtime_seconds) > 0))
+		__update_xtime_seconds(new_xtime_seconds);
+}
+
 static inline unsigned long get_seconds(void)
 {
-	return xtime.tv_sec;
+	return xtime_seconds;
 }

 struct timespec current_kernel_time(void);

 #define CURRENT_TIME		(current_kernel_time())
-#define CURRENT_TIME_SEC	((struct timespec) { xtime.tv_sec, 0 })
+#define CURRENT_TIME_SEC	((struct timespec) { xtime_seconds, 0 })

 extern void do_gettimeofday(struct timeval *tv);
 extern int do_settimeofday(struct timespec *tv);
Index: linux-cfs-2.6.22.q/kernel/time.c
===================================================================
--- linux-cfs-2.6.22.q.orig/kernel/time.c
+++ linux-cfs-2.6.22.q/kernel/time.c
@@ -58,11 +58,10 @@ EXPORT_SYMBOL(sys_tz);
 asmlinkage long sys_time(time_t __user * tloc)
 {
 	/*
-	 * We read xtime.tv_sec atomically - it's updated
-	 * atomically by update_wall_time(), so no need to
-	 * even read-lock the xtime seqlock:
+	 * We read xtime_seconds atomically - it's updated
+	 * atomically by update_xtime_seconds():
 	 */
-	time_t i = xtime.tv_sec;
+	time_t i = xtime_seconds;

 	smp_rmb(); /* sys_time() results are coherent */

@@ -226,11 +225,11 @@ inline struct timespec current_kernel_ti

 	do {
 		seq = read_seqbegin(&xtime_lock);
-
+
 		now = xtime;
 	} while (read_seqretry(&xtime_lock, seq));

-	return now;
+	return now;
 }

 EXPORT_SYMBOL(current_kernel_time);
@@ -377,19 +376,7 @@ void do_gettimeofday (struct timeval *tv
 	tv->tv_sec = sec;
 	tv->tv_usec = usec;

-	/*
-	 * Make sure xtime.tv_sec [returned by sys_time()] always
-	 * follows the gettimeofday() result precisely. This
-	 * condition is extremely unlikely, it can hit at most
-	 * once per second:
-	 */
-	if (unlikely(xtime.tv_sec != tv->tv_sec)) {
-		unsigned long flags;
-
-		write_seqlock_irqsave(&xtime_lock);
-		update_wall_time();
-		write_seqlock_irqrestore(&xtime_lock);
-	}
+	update_xtime_seconds(sec);
 }
 EXPORT_SYMBOL(do_gettimeofday);
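The forward-only check in update_xtime_seconds() above relies on a signed difference, so that the cached seconds value only ever moves forward and the comparison still works if the unsigned counter wraps. A minimal user-space sketch of that idea follows; cached_seconds, seconds_after() and update_cached_seconds() are hypothetical names used for illustration only, not kernel symbols, and the sketch ignores the locking and memory ordering that the real kernel code has to deal with:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's xtime_seconds. */
static unsigned long cached_seconds;

/*
 * True if a is later than b. The unsigned subtraction wraps modulo
 * 2^N, and the cast to long (two's complement on the usual compilers)
 * turns "slightly behind" into a negative number, so the test keeps
 * working across a counter wraparound.
 */
static bool seconds_after(unsigned long a, unsigned long b)
{
	return (long)(a - b) > 0;
}

/*
 * Move the cached value forward only. Returns true if it was updated,
 * false if new_seconds was equal to or older than the cached value.
 */
static bool update_cached_seconds(unsigned long new_seconds)
{
	if (!seconds_after(new_seconds, cached_seconds))
		return false;
	cached_seconds = new_seconds;
	return true;
}
```

A caller that has just computed a high-resolution timestamp would pass its seconds component to update_cached_seconds(); stale values are simply ignored, which is what keeps the low-resolution reader from going backwards.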
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> Ingo Molnar wrote:
> >* Bill Davidsen <[EMAIL PROTECTED]> wrote:
> >
> >>>Does the patch below help?
>
> Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as
> I recreate it.

hm, it's against recent -git. don't waste your time on 2.6.21.6-cfsv19, it will likely not apply - give me a few minutes to create a patch for you against 2.6.22.1-cfsv19, ok?

    Ingo
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Bill Davidsen <[EMAIL PROTECTED]> wrote:
>
>>> Does the patch below help?

Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as I recreate it.

>> Spectacularly no! With this patch the "glitch1" script with multiple
>> scrolling windows has all xterms and glxgears stop totally dead for
>> ~200ms once per second. I didn't properly test anything else after
>> that.
>
> Bill, could you try the patch below - does it fix the automount problem,
> without introducing new problems?
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> > Does the patch below help?
>
> Spectacularly no! With this patch the "glitch1" script with multiple
> scrolling windows has all xterms and glxgears stop totally dead for
> ~200ms once per second. I didn't properly test anything else after
> that.

Bill, could you try the patch below - does it fix the automount problem, without introducing new problems?

    Ingo

--->
Subject: time: introduce xtime_seconds
From: Ingo Molnar <[EMAIL PROTECTED]>

introduce the xtime_seconds optimization. This is a read-mostly low-resolution time source available to sys_time() and kernel-internal use. This variable is kept up to date atomically, and it's monotonically increased, every time some time interface constructs an xtime-alike time result that overflows the seconds value. (it's updated from the timer interrupt as well)

this way high-resolution time results update their seconds component at the same time sys_time() does it:

 118485883289000
 11848588320
 118485883292000
 11848588320
 118485883296000
 11848588320
 118485883299000
 11848588320
 118485883303000
 11848588330
 118485883306000
 11848588330
 118485883309000
 11848588330

[ these are nsec time results from alternating calls to sys_time() and sys_gettimeofday(), recorded at the seconds boundary. ]

instead of the previous (non-coherent) behavior:

 118484895087000
 11848489500
 11848489509
 11848489500
 118484895094000
 11848489500
 118484895097000
 11848489500
 118484895101000
 11848489500
 118484895105000
 11848489500
 118484895108000
 11848489500
 118484895111000
 11848489500
 118484895115000

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 include/linux/time.h      |   13 +++--
 kernel/time.c             |   25 ++---
 kernel/time/timekeeping.c |   28
 3 files changed, 41 insertions(+), 25 deletions(-)

Index: linux/include/linux/time.h
===================================================================
--- linux.orig/include/linux/time.h
+++ linux/include/linux/time.h
@@ -91,19 +91,28 @@ static inline struct timespec timespec_s
 extern struct timespec xtime;
 extern struct timespec wall_to_monotonic;
 extern seqlock_t xtime_lock __attribute__((weak));
+extern unsigned long xtime_seconds;

 extern unsigned long read_persistent_clock(void);
 void timekeeping_init(void);

+extern void __update_xtime_seconds(unsigned long new_xtime_seconds);
+
+static inline void update_xtime_seconds(unsigned long new_xtime_seconds)
+{
+	if (unlikely((long)(new_xtime_seconds - xtime_seconds) > 0))
+		__update_xtime_seconds(new_xtime_seconds);
+}
+
 static inline unsigned long get_seconds(void)
 {
-	return xtime.tv_sec;
+	return xtime_seconds;
 }

 struct timespec current_kernel_time(void);

 #define CURRENT_TIME		(current_kernel_time())
-#define CURRENT_TIME_SEC	((struct timespec) { xtime.tv_sec, 0 })
+#define CURRENT_TIME_SEC	((struct timespec) { xtime_seconds, 0 })

 extern void do_gettimeofday(struct timeval *tv);
 extern int do_settimeofday(struct timespec *tv);
Index: linux/kernel/time.c
===================================================================
--- linux.orig/kernel/time.c
+++ linux/kernel/time.c
@@ -58,11 +58,10 @@ EXPORT_SYMBOL(sys_tz);
 asmlinkage long sys_time(time_t __user * tloc)
 {
 	/*
-	 * We read xtime.tv_sec atomically - it's updated
-	 * atomically by update_wall_time(), so no need to
-	 * even read-lock the xtime seqlock:
+	 * We read xtime_seconds atomically - it's updated
+	 * atomically by update_xtime_seconds():
 	 */
-	time_t i = xtime.tv_sec;
+	time_t i = xtime_seconds;

 	smp_rmb(); /* sys_time() results are coherent */

@@ -226,11 +225,11 @@ inline struct timespec current_kernel_ti

 	do {
 		seq = read_seqbegin(&xtime_lock);
-
+
 		now = xtime;
 	} while (read_seqretry(&xtime_lock, seq));

-	return now;
+	return now;
 }

 EXPORT_SYMBOL(current_kernel_time);
@@ -377,19 +376,7 @@ void do_gettimeofday (struct timeval *tv
 	tv->tv_sec = sec;
 	tv->tv_usec = usec;

-	/*
-	 * Make sure xtime.tv_sec [returned by sys_time()] always
-	 * follows the gettimeofday() result precisely. This
-	 * condition is extremely unlikely, it can hit at most
-	 * once per second:
-	 */
-	if (unlikely(xtime.tv_sec != tv->tv_sec)) {
-		unsigned long flags;
-
-		write_seqlock_irqsave(&xtime_lock, flags);
-		update_wall_time();
-		write_sequnlock_irqrestore(&xtime_lock, flags);
-	}
+	update_xtime_seconds(sec);
 }
 EXPORT_SYMBOL(do_gettimeofday);

Index:
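The changelog above describes sampling sys_time() and sys_gettimeofday() alternately around a seconds boundary to see whether the two agree on the current second. A rough user-space sketch of such a check, assuming a POSIX system, might look like the following; count_incoherent_samples() is a hypothetical helper written for illustration and is not the tool that produced the numbers in the changelog:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/*
 * Alternate gettimeofday() and time() calls and count how often time()
 * reports a second that is older than the one gettimeofday() returned
 * just before it. Returns the number of such samples, or -1 if
 * gettimeofday() fails.
 */
static long count_incoherent_samples(long iterations)
{
	long incoherent = 0;

	for (long i = 0; i < iterations; i++) {
		struct timeval tv;

		if (gettimeofday(&tv, NULL) != 0)
			return -1;

		/* time() is called second, so it should never be behind. */
		time_t t = time(NULL);
		if (t < tv.tv_sec)
			incoherent++;
	}
	return incoherent;
}
```

A non-zero count means the low-resolution source lagged the high-resolution one at least once during the run. The loop has to straddle a seconds boundary to have any chance of seeing this, so a meaningful run needs enough iterations to last longer than one second, and a clock that is stepped backwards during the run would also show up in the count.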
Re: [patch] CFS scheduler, -v19
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> > ah! It passes in a low-res time source into a high-res time
> > interface (pthread_cond_timedwait()). Could you change the
> > time(NULL) + 1 to time(NULL) + 2, or change it to:
> >
> >   gettimeofday(&wait, NULL);
> >   wait.tv_sec++;
>
> This is wrong. It's wrong for two reasons:
>
>  - it really shouldn't be needed. I don't think "time()" has to be
>    *exactly* in sync, but I don't think it can be off by a third of a
>    second or whatever (as the "30% CPU load" would seem to imply)
>
>  - gettimeofday works on a timeval, pthread_cond_timedwait() works on a
>    timespec.

ah, i didn't notice that automount mixed up timespec with timeval! That is nasty and the tv_nsec field (which really is ts_usec to pthread_cond_timedwait()) must stay cleared - or rather, to avoid bugs of this type, a timespec variable should be used for all this.

> So if it actually makes a difference, it makes a difference for the
> *wrong* reason: the time is still totally nonsensical in the tv_nsec
> field (because it actually got filled in with msecs!), but now the
> tv_sec field is in sync, so it hides the bug.
>
> Anyway, hopefully the patch below might help. But we probably should make
> this whole thing a much more generic routine (ie we have our internal
> "getnstimeofday()" that still is missing the second-overflow logic, and
> that is quite possibly the one that triggers the "30% off" behaviour).

yeah, i'll generalize it, but our internal getnstimeofday() used on most architectures is using __get_realtime_clock_ns(), and the patch you attached already adds the second-overflow logic to it.

there are two versions of getnstimeofday(), a TIME_INTERPOLATION one and a !TIME_INTERPOLATION one. TIME_INTERPOLATION is only used on ia64 at the moment - and that one indeed does not have the second overflow logic.

> Ingo, I'd suggest:
>  - get rid of "timespec_add_ns()", or at least make it return a return
>    value for when it overflows.
>  - make all the people who overflow into tv_sec call a "fix_up_seconds()"
>    thing that does the xtime overflow handling.

ok, i'll do something clean.

    Ingo
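To make the timeval/timespec mix-up discussed above concrete: gettimeofday() fills in seconds plus microseconds, while pthread_cond_timedwait() expects an absolute deadline in seconds plus nanoseconds, so the sub-second field has to be scaled rather than copied. A minimal sketch of building such a deadline follows. The helper names timeval_to_timespec() and deadline_from_now() are made up for illustration, and this is not the fix that went into automount:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Convert a timeval (seconds + microseconds) to a timespec (seconds + nanoseconds). */
static struct timespec timeval_to_timespec(struct timeval tv)
{
	struct timespec ts;

	ts.tv_sec = tv.tv_sec;
	ts.tv_nsec = (long)tv.tv_usec * 1000L;	/* usec -> nsec */
	return ts;
}

/*
 * Build an absolute wall-clock deadline `seconds` from now.
 * Returns 0 on success, -1 if gettimeofday() fails.
 */
static int deadline_from_now(time_t seconds, struct timespec *deadline)
{
	struct timeval now;

	if (gettimeofday(&now, NULL) != 0)
		return -1;

	*deadline = timeval_to_timespec(now);
	deadline->tv_sec += seconds;
	return 0;
}
```

The resulting timespec can be passed as the third argument of pthread_cond_timedwait(), which for a condition variable with default attributes measures the deadline against the system realtime clock. Because both the seconds and the sub-second part come from the same clock reading, the wait lasts the requested time instead of expiring at the next tick of a low-resolution seconds counter.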
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> > Does the patch below help?
>
> Spectacularly no! With this patch the "glitch1" script with multiple
> scrolling windows has all xterms and glxgears stop totally dead for
> ~200ms once per second. I didn't properly test anything else after
> that. Since the automount issue doesn't seem to start until something
> kicks it off, I didn't see it but that doesn't mean it's fixed.

thanks. Andrew also just reported that it broke his laptop and i'm working on a proper version.

    Ingo
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> > Does the patch below help?
>
> Spectacularly no! With this patch the glitch1 script with multiple
> scrolling windows has all xterms and glxgears stop totally dead for
> ~200ms once per second. I didn't properly test anything else after
> that.

Bill, could you try the patch below - does it fix the automount problem,
without introducing new problems?

	Ingo

---
Subject: time: introduce xtime_seconds
From: Ingo Molnar <[EMAIL PROTECTED]>

introduce the xtime_seconds optimization. This is a read-mostly
low-resolution time source available to sys_time() and kernel-internal
use. This variable is kept up to date atomically, and it's monotonically
increased, every time some time interface constructs an xtime-alike time
result that overflows the seconds value. (it's updated from the timer
interrupt as well)

this way high-resolution time results update their seconds component at
the same time sys_time() does it:

  118485883289000
  11848588320
  118485883292000
  11848588320
  118485883296000
  11848588320
  118485883299000
  11848588320
  118485883303000
  11848588330
  118485883306000
  11848588330
  118485883309000
  11848588330

[ these are nsec time results from alternating calls to sys_time() and
  sys_gettimeofday(), recorded at the seconds boundary. ]

instead of the previous (non-coherent) behavior:

  118484895087000
  11848489500
  11848489509
  11848489500
  118484895094000
  11848489500
  118484895097000
  11848489500
  118484895101000
  11848489500
  118484895105000
  11848489500
  118484895108000
  11848489500
  118484895111000
  11848489500
  118484895115000

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 include/linux/time.h      | 13 +++--
 kernel/time.c             | 25 ++---
 kernel/time/timekeeping.c | 28
 3 files changed, 41 insertions(+), 25 deletions(-)

Index: linux/include/linux/time.h
===================================================================
--- linux.orig/include/linux/time.h
+++ linux/include/linux/time.h
@@ -91,19 +91,28 @@ static inline struct timespec timespec_s
 extern struct timespec xtime;
 extern struct timespec wall_to_monotonic;
 extern seqlock_t xtime_lock __attribute__((weak));
+extern unsigned long xtime_seconds;

 extern unsigned long read_persistent_clock(void);
 void timekeeping_init(void);

+extern void __update_xtime_seconds(unsigned long new_xtime_seconds);
+
+static inline void update_xtime_seconds(unsigned long new_xtime_seconds)
+{
+	if (unlikely((long)(new_xtime_seconds - xtime_seconds) > 0))
+		__update_xtime_seconds(new_xtime_seconds);
+}
+
 static inline unsigned long get_seconds(void)
 {
-	return xtime.tv_sec;
+	return xtime_seconds;
 }

 struct timespec current_kernel_time(void);

 #define CURRENT_TIME (current_kernel_time())
-#define CURRENT_TIME_SEC ((struct timespec) { xtime.tv_sec, 0 })
+#define CURRENT_TIME_SEC ((struct timespec) { xtime_seconds, 0 })

 extern void do_gettimeofday(struct timeval *tv);
 extern int do_settimeofday(struct timespec *tv);
Index: linux/kernel/time.c
===================================================================
--- linux.orig/kernel/time.c
+++ linux/kernel/time.c
@@ -58,11 +58,10 @@ EXPORT_SYMBOL(sys_tz);
 asmlinkage long sys_time(time_t __user * tloc)
 {
 	/*
-	 * We read xtime.tv_sec atomically - it's updated
-	 * atomically by update_wall_time(), so no need to
-	 * even read-lock the xtime seqlock:
+	 * We read xtime_seconds atomically - it's updated
+	 * atomically by update_xtime_seconds():
 	 */
-	time_t i = xtime.tv_sec;
+	time_t i = xtime_seconds;

 	smp_rmb(); /* sys_time() results are coherent */

@@ -226,11 +225,11 @@ inline struct timespec current_kernel_ti

 	do {
 		seq = read_seqbegin(&xtime_lock);
-
+
 		now = xtime;
 	} while (read_seqretry(&xtime_lock, seq));

-	return now;
+	return now;
 }
 EXPORT_SYMBOL(current_kernel_time);

@@ -377,19 +376,7 @@ void do_gettimeofday (struct timeval *tv
 	tv->tv_sec = sec;
 	tv->tv_usec = usec;

-	/*
-	 * Make sure xtime.tv_sec [returned by sys_time()] always
-	 * follows the gettimeofday() result precisely. This
-	 * condition is extremely unlikely, it can hit at most
-	 * once per second:
-	 */
-	if (unlikely(xtime.tv_sec != tv->tv_sec)) {
-		unsigned long flags;
-
-		write_seqlock_irqsave(&xtime_lock, flags);
-		update_wall_time();
-		write_sequnlock_irqrestore(&xtime_lock, flags);
-	}
+	update_xtime_seconds(sec);
 }
 EXPORT_SYMBOL(do_gettimeofday);

Index:
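The forward-only update in update_xtime_seconds() relies on a signed difference, so that the comparison still behaves if the counter ever wraps. A minimal user-space sketch of that comparison follows. The names cached_seconds, seconds_after() and update_cached_seconds() are hypothetical stand-ins, not kernel symbols, and the sketch makes no attempt at the atomicity the patch description talks about. It also assumes the usual two's-complement behaviour when casting unsigned long to long.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's xtime_seconds. */
static unsigned long cached_seconds;

/*
 * True when candidate is ahead of current. The subtraction is done in
 * unsigned arithmetic and then reinterpreted as signed, so the result
 * stays correct across a wrap of the counter.
 */
static bool seconds_after(unsigned long candidate, unsigned long current)
{
	return (long)(candidate - current) > 0;
}

/* Move the cached value forward only; returns true if it changed. */
static bool update_cached_seconds(unsigned long candidate)
{
	if (!seconds_after(candidate, cached_seconds))
		return false;
	cached_seconds = candidate;
	return true;
}
```

A caller would pass the seconds value it has just computed; stale or equal values leave the cached value untouched.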
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Bill Davidsen <[EMAIL PROTECTED]> wrote:
>
> > > Does the patch below help?

Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as I
recreate it.

> > Spectacularly no! With this patch the glitch1 script with multiple
> > scrolling windows has all xterms and glxgears stop totally dead for
> > ~200ms once per second. I didn't properly test anything else after
> > that.
>
> Bill, could you try the patch below - does it fix the automount
> problem, without introducing new problems?
>
> 	Ingo
>
> ---
> Subject: time: introduce xtime_seconds
> From: Ingo Molnar <[EMAIL PROTECTED]>
>
> [...]
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> Ingo Molnar wrote:
> > * Bill Davidsen <[EMAIL PROTECTED]> wrote:
> >
> > > > Does the patch below help?
>
> Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as
> I recreate it.

hm, it's against recent -git. dont waste your time on 2.6.21.6-cfsv19, it
will likely not apply - give me a few minutes to create a patch for you
against 2.6.22.1-cfsv19, ok?

	Ingo
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> Ingo Molnar wrote:
> > * Bill Davidsen <[EMAIL PROTECTED]> wrote:
> >
> > > > Does the patch below help?
>
> Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as
> I recreate it.

the patch below is merged against 2.6.22.1-cfs-v19 - does it solve the
autofs problem (without any other bad side-effects)?

	Ingo

---
Subject: time: introduce xtime_seconds
From: Ingo Molnar <[EMAIL PROTECTED]>

introduce the xtime_seconds optimization. This is a read-mostly
low-resolution time source available to sys_time() and kernel-internal
use. This variable is kept up to date atomically, and it's monotonically
increased, every time some time interface constructs an xtime-alike time
result that overflows the seconds value. (it's updated from the timer
interrupt as well)

this way high-resolution time results update their seconds component at
the same time sys_time() does it:

  118485883289000
  11848588320
  118485883292000
  11848588320
  118485883296000
  11848588320
  118485883299000
  11848588320
  118485883303000
  11848588330
  118485883306000
  11848588330
  118485883309000
  11848588330

[ these are nsec time results from alternating calls to sys_time() and
  sys_gettimeofday(), recorded at the seconds boundary. ]

instead of the previous (non-coherent) behavior:

  118484895087000
  11848489500
  11848489509
  11848489500
  118484895094000
  11848489500
  118484895097000
  11848489500
  118484895101000
  11848489500
  118484895105000
  11848489500
  118484895108000
  11848489500
  118484895111000
  11848489500
  118484895115000

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 include/linux/time.h      | 13 +++--
 kernel/time.c             | 25 ++---
 kernel/time/timekeeping.c | 26 +++---
 3 files changed, 40 insertions(+), 24 deletions(-)

Index: linux-cfs-2.6.22.q/include/linux/time.h
===================================================================
--- linux-cfs-2.6.22.q.orig/include/linux/time.h
+++ linux-cfs-2.6.22.q/include/linux/time.h
@@ -91,19 +91,28 @@ static inline struct timespec timespec_s
 extern struct timespec xtime;
 extern struct timespec wall_to_monotonic;
 extern seqlock_t xtime_lock __attribute__((weak));
+extern unsigned long xtime_seconds;

 extern unsigned long read_persistent_clock(void);
 void timekeeping_init(void);

+extern void __update_xtime_seconds(unsigned long new_xtime_seconds);
+
+static inline void update_xtime_seconds(unsigned long new_xtime_seconds)
+{
+	if (unlikely((long)(new_xtime_seconds - xtime_seconds) > 0))
+		__update_xtime_seconds(new_xtime_seconds);
+}
+
 static inline unsigned long get_seconds(void)
 {
-	return xtime.tv_sec;
+	return xtime_seconds;
 }

 struct timespec current_kernel_time(void);

 #define CURRENT_TIME (current_kernel_time())
-#define CURRENT_TIME_SEC ((struct timespec) { xtime.tv_sec, 0 })
+#define CURRENT_TIME_SEC ((struct timespec) { xtime_seconds, 0 })

 extern void do_gettimeofday(struct timeval *tv);
 extern int do_settimeofday(struct timespec *tv);
Index: linux-cfs-2.6.22.q/kernel/time.c
===================================================================
--- linux-cfs-2.6.22.q.orig/kernel/time.c
+++ linux-cfs-2.6.22.q/kernel/time.c
@@ -58,11 +58,10 @@ EXPORT_SYMBOL(sys_tz);
 asmlinkage long sys_time(time_t __user * tloc)
 {
 	/*
-	 * We read xtime.tv_sec atomically - it's updated
-	 * atomically by update_wall_time(), so no need to
-	 * even read-lock the xtime seqlock:
+	 * We read xtime_seconds atomically - it's updated
+	 * atomically by update_xtime_seconds():
 	 */
-	time_t i = xtime.tv_sec;
+	time_t i = xtime_seconds;

 	smp_rmb(); /* sys_time() results are coherent */

@@ -226,11 +225,11 @@ inline struct timespec current_kernel_ti

 	do {
 		seq = read_seqbegin(&xtime_lock);
-
+
 		now = xtime;
 	} while (read_seqretry(&xtime_lock, seq));

-	return now;
+	return now;
 }
 EXPORT_SYMBOL(current_kernel_time);

@@ -377,19 +376,7 @@ void do_gettimeofday (struct timeval *tv
 	tv->tv_sec = sec;
 	tv->tv_usec = usec;

-	/*
-	 * Make sure xtime.tv_sec [returned by sys_time()] always
-	 * follows the gettimeofday() result precisely. This
-	 * condition is extremely unlikely, it can hit at most
-	 * once per second:
-	 */
-	if (unlikely(xtime.tv_sec != tv->tv_sec)) {
-		unsigned long flags;
-
-		write_seqlock_irqsave(&xtime_lock);
-		update_wall_time();
-		write_seqlock_irqrestore(&xtime_lock);
-	}
+	update_xtime_seconds(sec);
 }
 EXPORT_SYMBOL(do_gettimeofday);
Re: [patch] CFS scheduler, -v19
Bill Davidsen wrote:
> Ingo Molnar wrote:
> > * Bill Davidsen <[EMAIL PROTECTED]> wrote:
> >
> > > > Does the patch below help?
>
> Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon as
> I recreate it.

Applied to 2.6.22-git9, building now.
Re: [patch] CFS scheduler, -v19
* Bill Davidsen <[EMAIL PROTECTED]> wrote:

> Bill Davidsen wrote:
> > Ingo Molnar wrote:
> > > * Bill Davidsen <[EMAIL PROTECTED]> wrote:
> > >
> > > > > Does the patch below help?
> >
> > Doesn't seem to apply against 2.6.22.1, I'm trying 2.6.22.6 as soon
> > as I recreate it.
>
> Applied to 2.6.22-git9, building now.

ok, that's fine too.

	Ingo
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Bill Davidsen <[EMAIL PROTECTED]> wrote:
>
> > > Does the patch below help?
> >
> > Spectacularly no! With this patch the glitch1 script with multiple
> > scrolling windows has all xterms and glxgears stop totally dead for
> > ~200ms once per second. I didn't properly test anything else after
> > that.
>
> Bill, could you try the patch below - does it fix the automount
> problem, without introducing new problems?

Okay, as noted off-list, after I exported the xtime_seconds it now builds
and works. However, there are a *lot* of section mismatches which are not
reassuring.

Boots, runs, glitch1 test runs reasonably smoothly. automount has not
used significant CPU yet, but I don't know what triggers it, the bad
behavior did not happen immediately without the patch. However, it looks
very hopeful.

Warnings attached to save you the trouble...

--
Bill Davidsen <[EMAIL PROTECTED]>
  We have more to fear from the bungling of the incompetent than from
  the machinations of the wicked. - from Slashdot

Script started on Thu 19 Jul 2007 05:29:08 PM EDT
Common profile 1.13 lastmod 2006-01-04 22:43:25-05
No common directory available
Session time 17:29:08 on 07/19/07
posidon:davidsen time nice -10 make -j4 -s; sleep 2; exit
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  CHK     include/linux/compile.h
  CHK     include/linux/compile.h
  UPD     include/linux/compile.h
  CHK     include/linux/version.h
  Building modules, stage 2.
WARNING: vmlinux(.text+0xc1001183): Section mismatch: reference to .init.text:start_kernel (between 'is386' and 'check_x87')
WARNING: vmlinux(.text+0xc1213fb4): Section mismatch: reference to .init.text: (between 'rest_init' and 'kthreadd_setup')
WARNING: vmlinux(.text+0xc1218786): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
WARNING: vmlinux(.text+0xc1218792): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
WARNING: vmlinux(.text+0xc121879e): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
WARNING: vmlinux(.text+0xc12187aa): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
WARNING: vmlinux(.text+0xc1214071): Section mismatch: reference to .init.text:__alloc_bootmem_node (between 'alloc_node_mem_map' and 'zone_wait_table_init')
WARNING: vmlinux(.text+0xc1214117): Section mismatch: reference to .init.text:__alloc_bootmem_node (between 'zone_wait_table_init' and 'schedule')
WARNING: vmlinux(.text+0xc10fbaae): Section mismatch: reference to .init.text:__alloc_bootmem (between 'vgacon_startup' and 'vgacon_scrolldelta')
WARNING: vmlinux(.text+0xc1218eda): Section mismatch: reference to .init.text: (between 'iret_exc' and '_etext')
Root device is (253, 0)
Setup is 11240 bytes (padded to 11264 bytes).
System is 1915 kB
Kernel: arch/i386/boot/bzImage is ready (#3)

real    4m11.024s
user    2m5.121s
sys     0m30.952s
exit

Script done on Thu 19 Jul 2007 05:33:35 PM EDT
Re: [patch] CFS scheduler, -v19
Linus Torvalds wrote:
> On Tue, 17 Jul 2007, Ingo Molnar wrote:
> > * Ian Kent <[EMAIL PROTECTED]> wrote:
> >
> > > In several places I have code similar to:
> > >
> > >     wait.tv_sec = time(NULL) + 1;
> > >     wait.tv_nsec = 0;
>
> Ok, that definitely should work.
>
> Does the patch below help?

Spectacularly no! With this patch the "glitch1" script with multiple
scrolling windows has all xterms and glxgears stop totally dead for
~200ms once per second. I didn't properly test anything else after that.

Since the automount issue doesn't seem to start until something kicks it
off, I didn't see it but that doesn't mean it's fixed.

> > ah! It passes in a low-res time source into a high-res time interface
> > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > time(NULL) + 2, or change it to:
> >
> >     gettimeofday(&wait, NULL);
> >     wait.tv_sec++;
>
> This is wrong. It's wrong for two reasons:
>
>  - it really shouldn't be needed. I don't think "time()" has to be
>    *exactly* in sync, but I don't think it can be off by a third of a
>    second or whatever (as the "30% CPU load" would seem to imply)
>
>  - gettimeofday works on a timeval, pthread_cond_timedwait() works on a
>    timespec.
>
> So if it actually makes a difference, it makes a difference for the
> *wrong* reason: the time is still totally nonsensical in the tv_nsec
> field (because it actually got filled in with msecs!), but now the
> tv_sec field is in sync, so it hides the bug.
>
> Anyway, hopefully the patch below might help. But we probably should
> make this whole thing a much more generic routine (ie we have our
> internal "getnstimeofday()" that still is missing the second-overflow
> logic, and that is quite possibly the one that triggers the "30% off"
> behaviour).

Hope that info helps.

> Ingo, I'd suggest:
>  - get rid of "timespec_add_ns()", or at least make it return a return
>    value for when it overflows.
>  - make all the people who overflow into tv_sec call a
>    "fix_up_seconds()" thing that does the xtime overflow handling.
>
> 		Linus
Re: [patch] CFS scheduler, -v19
On Wed, 2007-07-18 at 09:03 -0700, Linus Torvalds wrote:
>
> On Tue, 17 Jul 2007, Ingo Molnar wrote:
> >
> > * Ian Kent <[EMAIL PROTECTED]> wrote:
> > >
> > > In several places I have code similar to:
> > >
> > >     wait.tv_sec = time(NULL) + 1;
> > >     wait.tv_nsec = 0;
>
> Ok, that definitely should work.
>
> Does the patch below help?
>
> > ah! It passes in a low-res time source into a high-res time interface
> > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > time(NULL) + 2, or change it to:
> >
> >     gettimeofday(&wait, NULL);
> >     wait.tv_sec++;
>
> This is wrong. It's wrong for two reasons:
>
>  - it really shouldn't be needed. I don't think "time()" has to be
>    *exactly* in sync, but I don't think it can be off by a third of a
>    second or whatever (as the "30% CPU load" would seem to imply)
>
>  - gettimeofday works on a timeval, pthread_cond_timedwait() works on a
>    timespec.
>
> So if it actually makes a difference, it makes a difference for the
> *wrong* reason: the time is still totally nonsensical in the tv_nsec
> field (because it actually got filled in with msecs!), but now the
> tv_sec field is in sync, so it hides the bug.

Oh ya .. I thought it wouldn't hurt to add the fraction of the current
second for correctness and actually put things like:

    gettimeofday(&now, NULL);
    wait.tv_sec = now.tv_sec + 1;
    wait.tv_nsec = now.tv_usec * 1000;

in autofs.

Ian
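The conversion Ian describes can be wrapped in a small helper, which also keeps the timeval/timespec distinction Linus points out explicit. This is only a sketch: deadline_from_timeval() is a hypothetical name, not anything in autofs, and it assumes the caller passes a timeval whose tv_usec is already in the range 0..999999, as gettimeofday() returns it.

```c
#include <sys/time.h>
#include <time.h>

/*
 * Build an absolute deadline delay_sec seconds after *now, keeping the
 * sub-second part. A timeval carries microseconds, a timespec carries
 * nanoseconds, hence the multiplication by 1000.
 */
static struct timespec deadline_from_timeval(const struct timeval *now,
					     time_t delay_sec)
{
	struct timespec deadline;

	deadline.tv_sec = now->tv_sec + delay_sec;
	deadline.tv_nsec = (long)now->tv_usec * 1000L;
	return deadline;
}
```

Because tv_usec stays below one million, tv_nsec stays below one billion and no carry into tv_sec is needed here.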
Re: [patch] CFS scheduler, -v19
On Wed, 18 Jul 2007, Ingo Molnar wrote:
>
> Linus, Thomas, what do you think, should we keep the time.c change?

No, not if it's off by the second field. That 30% CPU usage indicates
that there's some nasty bug there somewhere, and that's just not worth
it.

If time() cannot get the second field right, it's bogus.

I'm ok with us not *guaranteeing* monotonicity of the second field when
you compare gettimeofday() with time(), but the 30% thing implies that
it's much worse than that, and that "time()" will likely report the
previous second (when compared to hrtimers) roughly a quarter of the
time. And that isn't acceptable.

So either it should be fixed, or reverted.

		Linus
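One rough way to see how often time() reports an earlier second than gettimeofday() is to sample both back to back from user space. The sketch below is an assumption about how such a check could look, not the test that produced the numbers earlier in this thread; count_lagging_samples() is a hypothetical name, and the result depends entirely on the kernel and C library in use.

```c
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/*
 * Count samples where time() reports an earlier second than a
 * gettimeofday() call made immediately before it.
 */
static size_t count_lagging_samples(size_t samples)
{
	size_t lagging = 0;

	for (size_t i = 0; i < samples; i++) {
		struct timeval tv;

		if (gettimeofday(&tv, NULL) != 0)
			continue;
		if (time(NULL) < tv.tv_sec)
			lagging++;
	}
	return lagging;
}
```

A count of zero over a run that spans several second boundaries would suggest the two interfaces agree on the seconds field; a large count would match the "previous second" behaviour described above.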
Re: [patch] CFS scheduler, -v19
On Tue, 17 Jul 2007, Ingo Molnar wrote:
>
> * Ian Kent <[EMAIL PROTECTED]> wrote:
> >
> > In several places I have code similar to:
> >
> >     wait.tv_sec = time(NULL) + 1;
> >     wait.tv_nsec = 0;

Ok, that definitely should work.

Does the patch below help?

> ah! It passes in a low-res time source into a high-res time interface
> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> time(NULL) + 2, or change it to:
>
>     gettimeofday(&wait, NULL);
>     wait.tv_sec++;

This is wrong. It's wrong for two reasons:

 - it really shouldn't be needed. I don't think "time()" has to be
   *exactly* in sync, but I don't think it can be off by a third of a
   second or whatever (as the "30% CPU load" would seem to imply)

 - gettimeofday works on a timeval, pthread_cond_timedwait() works on a
   timespec.

So if it actually makes a difference, it makes a difference for the
*wrong* reason: the time is still totally nonsensical in the tv_nsec
field (because it actually got filled in with msecs!), but now the
tv_sec field is in sync, so it hides the bug.

Anyway, hopefully the patch below might help. But we probably should make
this whole thing a much more generic routine (ie we have our internal
"getnstimeofday()" that still is missing the second-overflow logic, and
that is quite possibly the one that triggers the "30% off" behaviour).

Ingo, I'd suggest:
 - get rid of "timespec_add_ns()", or at least make it return a return
   value for when it overflows.
 - make all the people who overflow into tv_sec call a
   "fix_up_seconds()" thing that does the xtime overflow handling.

		Linus

---
Subject: time: make sure sys_gettimeofday() and sys_time() are in sync
From: Ingo Molnar <[EMAIL PROTECTED]>

make sure sys_gettimeofday() and sys_time() results are coherent.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/time/timekeeping.c | 13 +
 1 file changed, 13 insertions(+)

Index: linux/kernel/time/timekeeping.c
===================================================================
--- linux.orig/kernel/time/timekeeping.c
+++ linux/kernel/time/timekeeping.c
@@ -92,6 +92,19 @@ static inline void __get_realtime_clock_
 	} while (read_seqretry(&xtime_lock, seq));

 	timespec_add_ns(ts, nsecs);
+	/*
+	 * Make sure xtime.tv_sec [returned by sys_time()] always
+	 * follows the gettimeofday() result precisely. This
+	 * condition is extremely unlikely, it can hit at most
+	 * once per second:
+	 */
+	if (unlikely(xtime.tv_sec != ts->tv_sec)) {
+		unsigned long flags;
+
+		write_seqlock_irqsave(&xtime_lock, flags);
+		update_wall_time();
+		write_sequnlock_irqrestore(&xtime_lock, flags);
+	}
 }

 /**
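Linus's first suggestion, an add routine that reports when it overflows into the seconds field, could look roughly like the user-space sketch below. This is not the kernel's timespec_add_ns(); timespec_add_ns_carry() is a hypothetical name, and the sketch assumes ts->tv_nsec starts out normalised in the range 0..999999999.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Add ns nanoseconds to *ts, renormalise tv_nsec, and return how many
 * whole seconds were carried into tv_sec. A non-zero return tells the
 * caller that the seconds field changed and any cached seconds value
 * may need fixing up.
 */
static unsigned long timespec_add_ns_carry(struct timespec *ts, uint64_t ns)
{
	uint64_t total = (uint64_t)ts->tv_nsec + ns;
	unsigned long carried = (unsigned long)(total / NSEC_PER_SEC);

	ts->tv_sec += (time_t)carried;
	ts->tv_nsec = (long)(total % NSEC_PER_SEC);
	return carried;
}
```

A caller could then run its seconds fix-up only when the return value is non-zero, which keeps the common no-overflow path free of extra work.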
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
> > > > ah! It passes in a low-res time source into a high-res time
> > > > interface (pthread_cond_timedwait()). Could you change the
> > > > time(NULL) + 1 to time(NULL) + 2, or change it to:
> > > >
> > > >     gettimeofday(&wait, NULL);
> > > >     wait.tv_sec++;
> > > >
> > > > does this solve the spinning?
> >
> > Yes, adding in the offset within the current second appears to
> > resolve the issue. Thanks Ingo.
> >
> > > > i'm wondering how widespread this is. If automount is the only
> > > > app doing this then _maybe_ we could get away with it by changing
> > > > automount?
> >
> > I don't think the change is unreasonable since I wasn't using an
> > accurate time in the condition wait, so that's a coding mistake on my
> > part which I will fix.
>
> thanks Ian for taking care of this and for fixing it!
>
> Linus, Thomas, what do you think, should we keep the time.c change?
> Automount is one app affected so far, and it's a borderline case: the
> increased (30%) CPU usage is annoying, but it does not prevent the
> system from working per se, and an upgrade to a fixed/enhanced
> automount version resolves it.
>
> The temptation of using a really (and trivially) scalable
> low-resolution time-source (which is _easily_ vsyscall-able, on any
> platform) for DBMS use is really large, to me at least. Should i
> perhaps add a boot/config option that enables/disables this
> optimization, to allow distros finer grained control about this? And
> we've also got to wait whether there's any other app affected.

Allow it to be selected by the "features" so that admins can evaluate
the implications without a reboot? That would be a convenient interface
if you could provide it.

--
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: [patch] CFS scheduler, -v19
* Ian Kent <[EMAIL PROTECTED]> wrote:

> > > ah! It passes in a low-res time source into a high-res time
> > > interface (pthread_cond_timedwait()). Could you change the
> > > time(NULL) + 1 to time(NULL) + 2, or change it to:
> > >
> > >     gettimeofday(&wait, NULL);
> > >     wait.tv_sec++;
> > >
> > > does this solve the spinning?
>
> Yes, adding in the offset within the current second appears to resolve
> the issue. Thanks Ingo.
>
> > > i'm wondering how widespread this is. If automount is the only app
> > > doing this then _maybe_ we could get away with it by changing
> > > automount?
>
> I don't think the change is unreasonable since I wasn't using an
> accurate time in the condition wait, so that's a coding mistake on my
> part which I will fix.

thanks Ian for taking care of this and for fixing it!

Linus, Thomas, what do you think, should we keep the time.c change?
Automount is one app affected so far, and it's a borderline case: the
increased (30%) CPU usage is annoying, but it does not prevent the system
from working per se, and an upgrade to a fixed/enhanced automount version
resolves it.

The temptation of using a really (and trivially) scalable low-resolution
time-source (which is _easily_ vsyscall-able, on any platform) for DBMS
use is really large, to me at least. Should i perhaps add a boot/config
option that enables/disables this optimization, to allow distros finer
grained control about this? And we've also got to wait whether there's
any other app affected.

	Ingo
Re: [patch] CFS scheduler, -v19
On Tue, 2007-07-17 at 21:24 -0400, Bill Davidsen wrote:
> Ingo Molnar wrote:
> > * Ian Kent <[EMAIL PROTECTED]> wrote:
> >
> >>> ah! It passes in a low-res time source into a high-res time interface
> >>> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> >>> time(NULL) + 2, or change it to:
> >>>
> >>>     gettimeofday(&wait, NULL);
> >>>     wait.tv_sec++;
> >>>
> >> OK, I'm with you, hi-res timer.
> >> But even so, how is the time in the past after adding a second.
> >>
> >> Is it because I'm not setting tv_nsec when it's close to a second
> >> boundary, and hence your recommendation above?
> >>
> >
> > yeah, it looks a bit suspicious: you create a +1 second timeout out of a
> > 1 second resolution timesource. I dont yet understand the failure mode
> > though that results in that looping and in the 30% CPU time use - do you
> > understand it perhaps? (and automount is still functional while this is
> > happening, correct?)
> >
>
> Can't say, I have automount running because I get it by default, but I
> have nothing using it on my test machine. Why is it looping so fast when
> there are no mount points defined? If the config changes there's no
> requirement to notice right away, is there?

There are two threads where this mistake is made.

One is used to trigger expire events for all automounted filesystems,
which happen all the time since I need to run the expire to check if
anything is mounted and whether it needs to be umounted. The alarm
handler sleeps on a condition until the alarm list is not empty and then
sleeps on a condition until the next alarm in the list expires or an
alarm is added to the list, in which case it then checks the list again.
Since the autofs timeout granularity is one second this is a problem and
will be fixed. This isn't the source of the problem that's been reported.

The second is the state queue handler, which runs tasks such as expires,
map re-reads, shutdowns etc. for all automounted filesystems. While the
check interval could be longer, it causes autofs to be sluggish in
situations such as shutdowns where there are a largish number of mounts
present and I need to cancel such things as expires and the like. It's
possible I could improve this but, in fact, once the timespec is set
correctly as Ingo suggests it works fine and uses very little resource.

Ian
RE: [patch] CFS scheduler, -v19
On Tue, 2007-07-17 at 14:16 -0700, David Schwartz wrote:
> > * Ian Kent <[EMAIL PROTECTED]> wrote:
> >
> > > Yes it does and I have two reported bugs so far.
> > >
> > > In several places I have code similar to:
> > >
> > >     wait.tv_sec = time(NULL) + 1;
> > >     wait.tv_nsec = 0;
> > >
> > >     signaled = 0;
> > >     while (!signaled) {
> > >             status = pthread_cond_timedwait(&cond, &mutex, &wait);
> > >             if (status) {
> > >                     if (status == ETIMEDOUT)
> > >                             break;
> > >                     fatal(status);
> > >             }
> > >     }
> >
> > ah! It passes in a low-res time source into a high-res time interface
> > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > time(NULL) + 2, or change it to:
> >
> >     gettimeofday(&wait, NULL);
> >     wait.tv_sec++;
> >
> > does this solve the spinning?

Yes, adding in the offset within the current second appears to resolve
the issue. Thanks Ingo.

> > i'm wondering how widespread this is. If automount is the only app doing
> > this then _maybe_ we could get away with it by changing automount?

I don't think the change is unreasonable since I wasn't using an
accurate time in the condition wait, so that's a coding mistake on my
part which I will fix.

Ian
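To make the failure mode concrete: a deadline of time(NULL) + 1 with tv_nsec = 0 leaves somewhere between zero and one second of wait if time() agrees with the high-resolution clock, and it is already in the past if time() is one second behind. The arithmetic can be sketched as below. ns_until() and coarse_deadline() are hypothetical helpers, and reading an already expired deadline as the cause of the spinning is an interpretation of this thread rather than something confirmed in it.

```c
#include <stdint.h>
#include <time.h>

/*
 * Nanoseconds from *now until *deadline. Zero or negative means the
 * deadline has already passed, so a timed wait would return at once.
 */
static int64_t ns_until(const struct timespec *deadline,
			const struct timespec *now)
{
	return (int64_t)(deadline->tv_sec - now->tv_sec) * 1000000000LL
	     + (int64_t)(deadline->tv_nsec - now->tv_nsec);
}

/* The pattern from the message: whole seconds from a coarse clock, plus one. */
static struct timespec coarse_deadline(time_t coarse_seconds)
{
	struct timespec deadline = { .tv_sec = coarse_seconds + 1, .tv_nsec = 0 };

	return deadline;
}
```

With the high-resolution clock at 1000.3 s, coarse_deadline(1000) leaves 0.7 s to wait, while coarse_deadline(999), a coarse clock that is one second stale, is 0.3 s in the past.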
RE: [patch] CFS scheduler, -v19
On Tue, 2007-07-17 at 14:16 -0700, David Schwartz wrote: * Ian Kent [EMAIL PROTECTED] wrote: Yes it does and I have two reported bugs so far. In several places I have code similar to: wait.tv_sec = time(NULL) + 1; wait.tv_nsec = 0; signaled = 0; while (!signaled) { status = pthread_cond_timedwait(cond, mutex, wait); if (status) { if (status == ETIMEDOUT) break; fatal(status); } } ah! It passes in a low-res time source into a high-res time interface (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to time(NULL) + 2, or change it to: gettimeofday(wait, NULL); wait.tv_sec++; does this solve the spinning? Yes, adding in the offset within the current second appears to resolve the issue. Thanks Ingo. i'm wondering how widespread this is. If automount is the only app doing this then _maybe_ we could get away with it by changing automount? I don't think the change is unreasonable since I wasn't using an accurate time in the condition wait, so that's a coding mistake on my part which I will fix. Ian - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v19
On Tue, 2007-07-17 at 21:24 -0400, Bill Davidsen wrote: Ingo Molnar wrote: * Ian Kent [EMAIL PROTECTED] wrote: ah! It passes in a low-res time source into a high-res time interface (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to time(NULL) + 2, or change it to: gettimeofday(wait, NULL); wait.tv_sec++; OK, I'm with you, hi-res timer. But even so, how is the time in the past after adding a second. Is it because I'm not setting tv_nsec when it's close to a second boundary, and hence your recommendation above? yeah, it looks a bit suspicious: you create a +1 second timeout out of a 1 second resolution timesource. I dont yet understand the failure mode though that results in that looping and in the 30% CPU time use - do you understand it perhaps? (and automount is still functional while this is happening, correct?) Can't say, I have automount running because I get it by default, but I have nothing using at on my test machine. Why is it looping so fast when there are no mount points defined? If the config changes there's no requirement to notice right away, is there? There are two threads where this mistake is made. One is used to trigger expire events for all automounted filesystems which happen all the time since I need to run the expire to check if anything is mounted and whether it needs to be umounted. The alarm handler sleeps on a condition until the alarm list in not empty and then sleeps on a condition until the next alarm in the list expires or an alarm is added to the list, in which case it then checks the list again. Since the autofs timeout granularity is one second this is a problem and will be fixed. This isn't the source of the problem that's been reported. The second is the state queue handler which runs tasks such as expires, map re-reads, shutdowns etc. for all automounted filesystems. 
While the check interval could be longer, it causes autofs to be sluggish in situations such as shutdowns where there are a largish number of mounts present and I need to cancel such things as expires and the like. It's possible I could improve this but, in fact, once the timespec is set correctly as Ingo suggests it works fine and uses very little resource.

Ian
Re: [patch] CFS scheduler, -v19
* Ian Kent <[EMAIL PROTECTED]> wrote:

> > ah! It passes in a low-res time source into a high-res time interface
> > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > time(NULL) + 2, or change it to:
> >
> >         gettimeofday(&wait, NULL);
> >         wait.tv_sec++;
> >
> > does this solve the spinning?
>
> Yes, adding in the offset within the current second appears to resolve
> the issue. Thanks Ingo.
>
> > i'm wondering how widespread this is. If automount is the only app
> > doing this then _maybe_ we could get away with it by changing
> > automount?
>
> I don't think the change is unreasonable since I wasn't using an
> accurate time in the condition wait, so that's a coding mistake on my
> part which I will fix.

thanks Ian for taking care of this and for fixing it! Linus, Thomas, what do you think, should we keep the time.c change? Automount is one app affected so far, and it's a borderline case: the increased (30%) CPU usage is annoying, but it does not prevent the system from working per se, and an upgrade to a fixed/enhanced automount version resolves it.

The temptation of using a really (and trivially) scalable low-resolution time-source (which is _easily_ vsyscall-able, on any platform) for DBMS use is really large, to me at least. Should i perhaps add a boot/config option that enables/disables this optimization, to allow distros finer grained control about this? And we've also got to wait whether there's any other app affected.

	Ingo
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
> > > ah! It passes in a low-res time source into a high-res time interface
> > > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > > time(NULL) + 2, or change it to:
> > >
> > >         gettimeofday(&wait, NULL);
> > >         wait.tv_sec++;
> > >
> > > does this solve the spinning?
> >
> > Yes, adding in the offset within the current second appears to resolve
> > the issue. Thanks Ingo.
> >
> > > i'm wondering how widespread this is. If automount is the only app
> > > doing this then _maybe_ we could get away with it by changing
> > > automount?
> >
> > I don't think the change is unreasonable since I wasn't using an
> > accurate time in the condition wait, so that's a coding mistake on my
> > part which I will fix.
>
> thanks Ian for taking care of this and for fixing it! Linus, Thomas,
> what do you think, should we keep the time.c change? Automount is one
> app affected so far, and it's a borderline case: the increased (30%) CPU
> usage is annoying, but it does not prevent the system from working per
> se, and an upgrade to a fixed/enhanced automount version resolves it.
>
> The temptation of using a really (and trivially) scalable low-resolution
> time-source (which is _easily_ vsyscall-able, on any platform) for DBMS
> use is really large, to me at least. Should i perhaps add a boot/config
> option that enables/disables this optimization, to allow distros finer
> grained control about this? And we've also got to wait whether there's
> any other app affected.

Allow it to be selected by the features so that admins can evaluate the implications without a reboot? That would be a convenient interface if you could provide it.

-- 
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: [patch] CFS scheduler, -v19
On Tue, 17 Jul 2007, Ingo Molnar wrote:
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
> > In several places I have code similar to:
> >
> > wait.tv_sec = time(NULL) + 1;
> > wait.tv_nsec = 0;

Ok, that definitely should work.

Does the patch below help?

> ah! It passes in a low-res time source into a high-res time interface
> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> time(NULL) + 2, or change it to:
>
>         gettimeofday(&wait, NULL);
>         wait.tv_sec++;

This is wrong. It's wrong for two reasons:

 - it really shouldn't be needed. I don't think time() has to be *exactly* in sync, but I don't think it can be off by a third of a second or whatever (as the 30% CPU load would seem to imply)

 - gettimeofday works on a timeval, pthread_cond_timedwait() works on a timespec. So if it actually makes a difference, it makes a difference for the *wrong* reason: the time is still totally nonsensical in the tv_nsec field (because it actually got filled in with msecs!), but now the tv_sec field is in sync, so it hides the bug.

Anyway, hopefully the patch below might help. But we probably should make this whole thing a much more generic routine (ie we have our internal getnstimeofday() that still is missing the second-overflow logic, and that is quite possibly the one that triggers the 30% off behaviour).

Ingo, I'd suggest:
 - get rid of timespec_add_ns(), or at least make it return a return value for when it overflows.
 - make all the people who overflow into tv_sec call a fix_up_seconds() thing that does the xtime overflow handling.

		Linus

---
Subject: time: make sure sys_gettimeofday() and sys_time() are in sync
From: Ingo Molnar <[EMAIL PROTECTED]>

make sure sys_gettimeofday() and sys_time() results are coherent.
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/time/timekeeping.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

Index: linux/kernel/time/timekeeping.c
===================================================================
--- linux.orig/kernel/time/timekeeping.c
+++ linux/kernel/time/timekeeping.c
@@ -92,6 +92,19 @@ static inline void __get_realtime_clock_
         } while (read_seqretry(&xtime_lock, seq));
 
         timespec_add_ns(ts, nsecs);
+        /*
+         * Make sure xtime.tv_sec [returned by sys_time()] always
+         * follows the gettimeofday() result precisely. This
+         * condition is extremely unlikely, it can hit at most
+         * once per second:
+         */
+        if (unlikely(xtime.tv_sec != ts->tv_sec)) {
+                unsigned long flags;
+
+                write_seqlock_irqsave(&xtime_lock, flags);
+                update_wall_time();
+                write_sequnlock_irqrestore(&xtime_lock, flags);
+        }
 }
 
 /**
Re: [patch] CFS scheduler, -v19
On Wed, 18 Jul 2007, Ingo Molnar wrote:
>
> Linus, Thomas, what do you think, should we keep the time.c change?

No, not if it's off by the second field. That 30% CPU usage indicates that there's some nasty bug there somewhere, and that's just not worth it.

If time() cannot get the second field right, it's bogus. I'm ok with us not *guaranteeing* monotonicity of the second field when you compare gettimeofday() with time(), but the 30% thing implies that it's much worse than that, and that time() will likely report the previous second (when compared to hrtimers) roughly a quarter of the time. And that isn't acceptable.

So either it should be fixed, or reverted.

		Linus
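The step from "30% CPU" to "time() reports the previous second part of the time" can be illustrated with a small model. This is only a sketch, not kernel code: it assumes time() reports the whole second of the real time minus a fixed lag, and works out how long a caller that sets its deadline to time(NULL) + 1 with tv_nsec = 0 would actually sleep. The names coarse_seconds(), sleep_ns() and the lag_ns parameter are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Model of a lagging time(): the whole second of (real time - lag). */
static int64_t coarse_seconds(int64_t real_ns, int64_t lag_ns)
{
    assert(lag_ns >= 0 && real_ns >= lag_ns);
    return (real_ns - lag_ns) / NSEC_PER_SEC;
}

/*
 * How long a caller really sleeps when it uses the absolute deadline
 * { time(NULL) + 1, 0 }. A result <= 0 means the deadline is already in
 * the past, so pthread_cond_timedwait() would return ETIMEDOUT at once.
 */
static int64_t sleep_ns(int64_t real_ns, int64_t lag_ns)
{
    int64_t deadline_ns = (coarse_seconds(real_ns, lag_ns) + 1) * NSEC_PER_SEC;
    return deadline_ns - real_ns;
}
```

In this model, with a 300 ms lag, a call made 100 ms after a second boundary gets a deadline that is already 100 ms old, so the wait returns immediately and the caller loops until time() catches up. That is roughly lag/1s of every second spent spinning. With no lag the result is always positive, which matches Linus's remark that the original automount code "definitely should work".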
Re: [patch] CFS scheduler, -v19
On Wed, 2007-07-18 at 09:03 -0700, Linus Torvalds wrote:
> On Tue, 17 Jul 2007, Ingo Molnar wrote:
> > * Ian Kent <[EMAIL PROTECTED]> wrote:
> >
> > > In several places I have code similar to:
> > >
> > > wait.tv_sec = time(NULL) + 1;
> > > wait.tv_nsec = 0;
>
> Ok, that definitely should work.
>
> Does the patch below help?
>
> > ah! It passes in a low-res time source into a high-res time interface
> > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > time(NULL) + 2, or change it to:
> >
> >         gettimeofday(&wait, NULL);
> >         wait.tv_sec++;
>
> This is wrong. It's wrong for two reasons:
>
>  - it really shouldn't be needed. I don't think time() has to be
>    *exactly* in sync, but I don't think it can be off by a third of a
>    second or whatever (as the 30% CPU load would seem to imply)
>
>  - gettimeofday works on a timeval, pthread_cond_timedwait() works on a
>    timespec. So if it actually makes a difference, it makes a difference
>    for the *wrong* reason: the time is still totally nonsensical in the
>    tv_nsec field (because it actually got filled in with msecs!), but
>    now the tv_sec field is in sync, so it hides the bug.

Oh ya .. I thought it wouldn't hurt to add the fraction of the current second for correctness and actually put things like:

        gettimeofday(&now, NULL);
        wait.tv_sec = now.tv_sec + 1;
        wait.tv_nsec = now.tv_usec * 1000;

in autofs.

Ian
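The unit mismatch discussed above (gettimeofday() fills a struct timeval with microseconds, pthread_cond_timedwait() reads a struct timespec with nanoseconds) comes down to one multiplication. A minimal sketch of the conversion Ian describes follows; the helper name timeval_to_timespec() is made up here and is not taken from autofs.

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

/*
 * Convert a timeval (seconds + microseconds) into a timespec
 * (seconds + nanoseconds). Passing a timeval where a timespec is
 * expected would make the microsecond count be read as nanoseconds.
 */
static struct timespec timeval_to_timespec(struct timeval tv)
{
    struct timespec ts;

    assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000);
    ts.tv_sec  = tv.tv_sec;
    ts.tv_nsec = (long)tv.tv_usec * 1000L;   /* usec -> nsec */
    return ts;
}
```

A caller would fill a struct timeval with gettimeofday(), convert it, and then add the timeout to tv_sec, which is what the three lines in the message above do inline.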
Re: [patch] CFS scheduler, -v19
Linus Torvalds wrote:
> On Tue, 17 Jul 2007, Ingo Molnar wrote:
> > * Ian Kent <[EMAIL PROTECTED]> wrote:
> >
> > > In several places I have code similar to:
> > >
> > > wait.tv_sec = time(NULL) + 1;
> > > wait.tv_nsec = 0;
>
> Ok, that definitely should work.
>
> Does the patch below help?

Spectacularly no! With this patch the glitch1 script with multiple scrolling windows has all xterms and glxgears stop totally dead for ~200ms once per second. I didn't properly test anything else after that.

Since the automount issue doesn't seem to start until something kicks it off, I didn't see it but that doesn't mean it's fixed.

> > ah! It passes in a low-res time source into a high-res time interface
> > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > time(NULL) + 2, or change it to:
> >
> >         gettimeofday(&wait, NULL);
> >         wait.tv_sec++;
>
> This is wrong. It's wrong for two reasons:
>
>  - it really shouldn't be needed. I don't think time() has to be
>    *exactly* in sync, but I don't think it can be off by a third of a
>    second or whatever (as the 30% CPU load would seem to imply)
>
>  - gettimeofday works on a timeval, pthread_cond_timedwait() works on a
>    timespec. So if it actually makes a difference, it makes a difference
>    for the *wrong* reason: the time is still totally nonsensical in the
>    tv_nsec field (because it actually got filled in with msecs!), but
>    now the tv_sec field is in sync, so it hides the bug.
>
> Anyway, hopefully the patch below might help. But we probably should
> make this whole thing a much more generic routine (ie we have our
> internal getnstimeofday() that still is missing the second-overflow
> logic, and that is quite possibly the one that triggers the 30% off
> behaviour).

Hope that info helps.

> Ingo, I'd suggest:
>  - get rid of timespec_add_ns(), or at least make it return a return
>    value for when it overflows.
>  - make all the people who overflow into tv_sec call a fix_up_seconds()
>    thing that does the xtime overflow handling.
>
> Linus
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
> > > ah! It passes in a low-res time source into a high-res time interface
> > > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > > time(NULL) + 2, or change it to:
> > >
> > >         gettimeofday(&wait, NULL);
> > >         wait.tv_sec++;
> >
> > OK, I'm with you, hi-res timer.
> > But even so, how is the time in the past after adding a second.
> >
> > Is it because I'm not setting tv_nsec when it's close to a second
> > boundary, and hence your recommendation above?
>
> yeah, it looks a bit suspicious: you create a +1 second timeout out of a
> 1 second resolution timesource. I don't yet understand the failure mode
> though that results in that looping and in the 30% CPU time use - do you
> understand it perhaps? (and automount is still functional while this is
> happening, correct?)

Can't say, I have automount running because I get it by default, but I have nothing using it on my test machine. Why is it looping so fast when there are no mount points defined? If the config changes there's no requirement to notice right away, is there?

-- 
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: [patch] CFS scheduler, -v19
> hm, Markus indicated that he tried the v2.6.21.6-cfsv19 patch, and that
> does not include the time.c change. Markus - does your kernel include
> the code below? (if yes, please revert it via patch -p1 -R )

Well, the 2.6.22.1-cfs-v19 does include it, but the 2.6.21.6-cfs-v19 does not have that patch applied. But both show this problem.

Markus
Re: [patch] CFS scheduler, -v19
Hi Ingo,

sorry for the long delay, I've spent a week doing non-kernel work.

On Tue, Jul 10, 2007 at 12:39:50AM +0200, Ingo Molnar wrote:
>
> * Willy Tarreau <[EMAIL PROTECTED]> wrote:
>
> > > The biggest user-visible change in -v19 is reworked sleeper
> > > fairness: it's similar in behavior to -v18 but works more
> > > consistently across nice levels. Fork-happy workloads (like kernel
> > > builds) should behave better as well. There are also a handful of
> > > speedups: unsigned math, 32-bit speedups, O(1) task pickup,
> > > debloating and other micro-optimizations.
> >
> > Interestingly, I also noticed the possibility of O(1) task pickup when
> > playing with v18, but did not detect any noticeable improvement with
> > it. Of course, it depends on the workload and I probably didn't
> > perform the most relevant tests.
>
> yeah - it's a small tweak. CFS is O(31) in sleep/wakeup so it's now all
> a big O(1) family again :)

Yes, that's what I tried to explain to a guy once: what I like with log(N) algos is that even with N very large, log(N) is always small, and it's sometimes faster to perform log(N) fast operations than 1 slow operation. That's also why I don't care about balanced trees: my unbalanced trees may hold 32 levels for 32 carefully chosen values, while balanced trees will have 5 levels (worst difference between both). If I can insert and delete a node 6 times faster, I always win. And quite frankly, I'm not interested in the 32-entry case in a tree :-)

> > V19 works very well here on 2.6.20.14. I could start 32k busy loops at
> > nice +19 (I exhausted the 32k pids limit), and could still perform
> > normal operations. I noticed that 'vmstat' scans all the pid entries
> > under /proc, which takes ages to collect data before displaying a
> > line. Obviously, the system sometimes shows some short latencies, but
> > not much more than what you get from an SSH through a remote DSL
> > connection.
>
> great! I did not try to push it this far, yet.
Well, I borrowed two 1GB sticks because I discovered that one of my 512MB had one defect bit. It was finally an opportunity for me to push the test this far.

> > Here's a vmstat 1 output :
> >
> >     r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
> > 32437  0  0      0 809724    488   6196   0   0     1     0  135     0  24  72   4
> > 32436  0  0      0 811336    488   6196   0   0     0     0  717     0  78  22   0
>
> crazy :-)

indeed :-)

> > Amusingly, I started mpg123 during this test and it skipped quite a
> > bit. After setting all tasks to SCHED_IDLE, it did not skip anymore.
> > All this seems to behave like one could expect.
>
> yeah. It behaves better than i expected in fact - 32K tasks is pushing
> things quite a bit. (we've got a 32K PID limit for example)

Yes, and in fact, I suspect that we still have an O(N) or O(N^2) pid allocation algo somewhere (I did not look at the code), because forking was very very slow when reaching those numbers. I'll possibly check this when I have some spare time, because it reminds me of a trivial source port ring allocator I wrote a few years ago which was O(1). With 32k pids, it will only require 64kB RAM for the whole system, and we may even optimize it to spread CPUs entry points in order to nearly always avoid lock contention.

> > I also started 30k processes distributed in 130 groups of 234 chained
> > by pipes in which one byte is passed. I get an average of 8000 in the
> > run queue. The context switch rate is very low and sometimes even null
> > in this test, maybe some of them are starving, I really do not know :
> >
> >     r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
> >  7752  0  1      0 656892    244   4196   0   0     0     0  725     0  16  84   0
>
> hm, could you profile this? We could have some bottleneck somewhere
> (likely not in the scheduler) with that many tasks being runnable. [
> With CFS you can actually run a profiler under this workload ;-) ]

I may probably try some time later (not this week-end, I have some 2.4 to work on).
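Willy does not show his ring allocator, so the following is only a guess at the general shape of such a thing: a circular queue of free ids where both allocation and release are O(1). All names (id_ring, ring_init, ring_alloc, ring_free) are hypothetical. The 64kB figure in the message is consistent with 32768 two-byte entries, which is what this sketch assumes; it has no locking and ignores the per-CPU entry points he mentions.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 32768u          /* 32k ids x 2 bytes each = 64 kB */

struct id_ring {
    uint16_t slot[RING_SIZE];     /* free ids, in hand-out order */
    uint32_t head;                /* index of the next id to hand out */
    uint32_t tail;                /* index where the next freed id goes */
    uint32_t count;               /* number of free ids in the ring */
};

/* Fill the ring with every id once; only done at start-up. */
static void ring_init(struct id_ring *r)
{
    for (uint32_t i = 0; i < RING_SIZE; i++)
        r->slot[i] = (uint16_t)i;
    r->head = 0;
    r->tail = 0;
    r->count = RING_SIZE;
}

/* O(1): take the oldest free id, or return -1 if none is left. */
static int ring_alloc(struct id_ring *r)
{
    int id;

    if (r->count == 0)
        return -1;
    id = r->slot[r->head];
    r->head = (r->head + 1) % RING_SIZE;
    r->count--;
    return id;
}

/* O(1): put a released id at the back of the ring. */
static void ring_free(struct id_ring *r, int id)
{
    assert(id >= 0 && (uint32_t)id < RING_SIZE && r->count < RING_SIZE);
    r->slot[r->tail] = (uint16_t)id;
    r->tail = (r->tail + 1) % RING_SIZE;
    r->count++;
}
```

One property of this layout is that a freed id goes to the back of the queue, so it is reused as late as possible, which is usually what one wants for both source ports and pids.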
> > In my tree, I have replaced the rbtree with the ebtree we talked
> > about, but it did not bring any performance boost because, even though
> > insert() and delete() are faster, the scheduler is already quite good
> > at avoiding them as much as possible, mostly relying on rb_next()
> > which has the same cost in both trees. All in all, the only variations
> > I noticed were caused by cacheline alignment when I tried to reorder
> > fields in the eb_node. So I will stop my experimentations here since I
> > don't see any more room for improvement.
>
> well, just a little bit of improvement would be nice to have too :)

Yes but I prefer to merge it where it really brings something (I'll have a look at epoll, I noticed epollctl() was 30% slower under 2.6 with an rbtree than it is under 2.4 with a hash). Then people will tell me "you're
RE: [patch] CFS scheduler, -v19
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
> > Yes it does and I have two reported bugs so far.
> >
> > In several places I have code similar to:
> >
> > wait.tv_sec = time(NULL) + 1;
> > wait.tv_nsec = 0;
> >
> > signaled = 0;
> > while (!signaled) {
> >         status = pthread_cond_timedwait(&cond, &mutex, &wait);
> >         if (status) {
> >                 if (status == ETIMEDOUT)
> >                         break;
> >                 fatal(status);
> >         }
> > }
>
> ah! It passes in a low-res time source into a high-res time interface
> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> time(NULL) + 2, or change it to:
>
>         gettimeofday(&wait, NULL);
>         wait.tv_sec++;
>
> does this solve the spinning?
>
> i'm wondering how widespread this is. If automount is the only app doing
> this then _maybe_ we could get away with it by changing automount?

This code is horribly broken. Don't change the kernel because this code is broken.

First it adds a second, but then it subtracts up to a second. Just before the second boundary, this code can burn CPU like crazy, with each wait being just a few nanoseconds.

What is the intent of this code? Is it to wait "up to a second, possibly for no time at all" or is it to wait "for at least a second"? If so, why are you zeroing the nanosecond count?

DS
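David's point that the code "adds a second, but then subtracts up to a second" can be made concrete. The sketch below compares the deadline built from whole seconds with one that keeps the sub-second part of the current time. It is an illustration under assumptions, not the autofs fix: deadline_after() and ns_until() are made-up helper names, and the caller is assumed to obtain `now` from clock_gettime(CLOCK_REALTIME, ...), the clock a default condition variable measures against.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Return `now` advanced by `delay_ns` nanoseconds, with tv_nsec kept in
 * [0, NSEC_PER_SEC). Keeping the fraction of the current second is what
 * guarantees the deadline is a full `delay_ns` away.
 */
static struct timespec deadline_after(struct timespec now, long delay_ns)
{
    struct timespec d = now;

    assert(delay_ns >= 0);
    d.tv_sec  += delay_ns / NSEC_PER_SEC;
    d.tv_nsec += delay_ns % NSEC_PER_SEC;
    if (d.tv_nsec >= NSEC_PER_SEC) {      /* carry into the seconds field */
        d.tv_sec++;
        d.tv_nsec -= NSEC_PER_SEC;
    }
    return d;
}

/* Nanoseconds from `now` to `deadline`; negative if it already passed. */
static long long ns_until(struct timespec now, struct timespec deadline)
{
    return (long long)(deadline.tv_sec - now.tv_sec) * NSEC_PER_SEC
         + (deadline.tv_nsec - now.tv_nsec);
}
```

Taking a `now` that is 1 microsecond before a second boundary, the { tv_sec + 1, 0 } deadline is only 1000 ns away, while deadline_after(now, NSEC_PER_SEC) is a full second away. The result can be passed to pthread_cond_timedwait() as the absolute timeout.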
Re: [patch] CFS scheduler, -v19
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> But why does that happen? And why would the scheduler have *anything*
> to do with this? No idea. Maybe timing. Maybe the time.c changes.
> Dunno.

hm, Markus indicated that he tried the v2.6.21.6-cfsv19 patch, and that does not include the time.c change. Markus - does your kernel include the code below? (if yes, please revert it via patch -p1 -R )

	Ingo

Index: linux/kernel/time.c
===================================================================
--- linux.orig/kernel/time.c
+++ linux/kernel/time.c
@@ -57,14 +57,17 @@ EXPORT_SYMBOL(sys_tz);
  */
 asmlinkage long sys_time(time_t __user * tloc)
 {
-        time_t i;
-        struct timeval tv;
+        /*
+         * We read xtime.tv_sec atomically - it's updated
+         * atomically by update_wall_time(), so no need to
+         * even read-lock the xtime seqlock:
+         */
+        time_t i = xtime.tv_sec;
 
-        do_gettimeofday(&tv);
-        i = tv.tv_sec;
+        smp_rmb(); /* sys_time() results are coherent */
 
         if (tloc) {
-                if (put_user(i,tloc))
+                if (put_user(i, tloc))
                         i = -EFAULT;
         }
         return i;
@@ -373,6 +376,20 @@ void do_gettimeofday (struct timeval *tv
 
         tv->tv_sec = sec;
         tv->tv_usec = usec;
+
+        /*
+         * Make sure xtime.tv_sec [returned by sys_time()] always
+         * follows the gettimeofday() result precisely. This
+         * condition is extremely unlikely, it can hit at most
+         * once per second:
+         */
+        if (unlikely(xtime.tv_sec != tv->tv_sec)) {
+                unsigned long flags;
+
+                write_seqlock_irqsave(&xtime_lock);
+                update_wall_time();
+                write_seqlock_irqrestore(&xtime_lock);
+        }
 }
 
 EXPORT_SYMBOL(do_gettimeofday);
Re: [patch] CFS scheduler, -v19
On Tue, 17 Jul 2007, Ingo Molnar wrote:
>
> i think the problem starts here:
>
> 11902 1184699865.141939 read(3, "", 32) = 0 <0.07>

Well, it's preceded by a poll() that says that it has a POLLHUP event, so that socket would seem to have simply been closed from the other end.

There's also a huge amount of select() calls showing the same thing (except since it's just the input side, you cannot tell that it's POLLHUP).

Don't ask me *why*, though.

It's preceded by

	..
	11902 1184699848.615201 read(3, 0x7fffb5b9c8b0, 32) = -1 EAGAIN (Resource temporarily unavailable) <0.09>
	11902 1184699848.615252 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, -1) = 1 <0.009307>
	11902 1184699848.624614 read(3, "\1 \303!\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0b\340T\0\0\0\0\0", 32) = 32 <0.11>

.. got data ..

	11902 1184699848.624710 ioctl(3, FIONREAD, [0]) = 0 <0.09>
	11902 1184699848.624762 ioctl(3, FIONREAD, [0]) = 0 <0.48>

.. ok, nothing more..

	11902 1184699848.624866 select(10, [3 4 5 7 9], [], [], NULL) = 1 (in [3]) <16.495008>
	11902 1184699865.119950 ioctl(3, FIONREAD, [0]) = 0 <0.06>

16+ seconds pass, now it's marked as readable, but returns zero bytes of data: the other end closed it.

Tons of unnecessary and stupid sequences of:

	11902 1184699865.119988 select(10, [3 4 5 7 9], [], [], NULL) = 1 (in [3]) <0.07>
	11902 1184699865.120031 ioctl(3, FIONREAD, [0]) = 0 <0.05>

.. and then finally:

	...
	11902 1184699865.141809 poll([{fd=3, events=POLLIN, revents=POLLIN|POLLHUP}], 1, 0) = 1 <0.05>
	11902 1184699865.141838 ioctl(3, FIONREAD, [0]) = 0 <0.05>
	11902 1184699865.141939 read(3, "", 32) = 0 <0.07>

ie now konqueror noticed that it was *really* closed, and read the EOF.

But why does that happen? And why would the scheduler have *anything* to do with this? No idea. Maybe timing. Maybe the time.c changes. Dunno.

		Linus
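The pattern in the trace (select() reports fd 3 readable, FIONREAD says 0 bytes, repeat) is what a closed peer looks like to code that never calls read(). As a hedged illustration only, and not a reconstruction of what Qt or Xlib do, the sketch below classifies a descriptor with one non-blocking poll() followed by a read(): a read() that returns 0 on a descriptor reported readable or hung up means end of file. The enum and the function check_fd() are made-up names.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

enum fd_state { FD_HAS_DATA, FD_EOF, FD_IDLE, FD_ERROR };

/*
 * Poll `fd` once without blocking and classify it. On FD_HAS_DATA,
 * *nread holds the number of bytes placed in buf; on FD_EOF it is 0.
 */
static enum fd_state check_fd(int fd, char *buf, size_t len, ssize_t *nread)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int rc = poll(&p, 1, 0);

    if (rc < 0)
        return FD_ERROR;
    if (rc == 0)
        return FD_IDLE;                    /* nothing to read right now */
    if (p.revents & (POLLIN | POLLHUP)) {
        *nread = read(fd, buf, len);
        if (*nread > 0)
            return FD_HAS_DATA;
        if (*nread == 0)
            return FD_EOF;                 /* peer closed: stop polling */
        return (errno == EAGAIN || errno == EINTR) ? FD_IDLE : FD_ERROR;
    }
    return FD_ERROR;                       /* POLLERR or POLLNVAL */
}
```

A caller that gets FD_EOF should drop the descriptor from its select()/poll() set; leaving it in is what produces the tight select()/FIONREAD loop seen in the trace before the final read().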
Re: [patch] CFS scheduler, -v19
> 9173 1184675906.194424 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff341af5c0) = -1 ENOTTY (Inappropriate ioctl for device) <0.06>
> 9173 1184675906.194463 ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff341af5c0) = -1 ENOTTY (Inappropriate ioctl for device) <0.04>
>
> ? Are those -ENOTTY results normal?

Yes, I see it on any kernel.

> 9173 1184675906.155015 write(2, "In file kernel/qpixmap_x11.cpp, "..., 56) = 56 <0.06>
> 9173 1184675906.155052 write(2, "QImage::convertDepth: Image is a"..., 44) = 44 <0.04>
> 9173 1184675906.155169 gettimeofday({1184675906, 155179}, NULL) = 0 <0.06>
> 9173 1184675906.155249 write(11, "close(6f1c2f7):about:konqueror\n", 31) = 31 <0.32>
>
> i think konqueror tried to say something here about an image problem?

Well, yes:

In file kernel/qpixmap_x11.cpp, line 633: Out of memory
QImage::convertDepth: Image is a null image
In file kernel/qpixmap_x11.cpp, line 633: Out of memory
QImage::convertDepth: Image is a null image
In file kernel/qpixmap_x11.cpp, line 633: Out of memory
QImage::convertDepth: Image is a null image
In file kernel/qpixmap_x11.cpp, line 633: Out of memory
QImage::convertDepth: Image is a null image
konqueror: Fatal IO error: client killed

And no, my 2 GB of RAM are not full:

$ free -m
             total       used       free     shared    buffers     cached
Mem:          2012       1077        935          0         22        441
-/+ buffers/cache:        612       1400
Swap:         2070          0       2070

> could you perhaps upload the strace to some webpage so that others can
> take a look too?

hm, I don't have any webspace...

> it might also be good to add "-s 1000" to the strace command, so that we
> can see the full messages that konqueror tried to log to some other
> task, i.e.:
>
>    strace -s 1000 -ttt -TTT -o trace.log -f
>
> and perhaps try to do a 'comparison' trace.normal.log as well, with
> konqueror having no problems.
I now made some new strace logs:
- konq crash 251K
- Konq without crash on cfs 302K
- konq without crash on non-cfs 248K

Markus
Re: [patch] CFS scheduler, -v19
* Ian Kent <[EMAIL PROTECTED]> wrote:

> > ah! It passes in a low-res time source into a high-res time interface
> > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > time(NULL) + 2, or change it to:
> >
> >         gettimeofday(&wait, NULL);
> >         wait.tv_sec++;
>
> OK, I'm with you, hi-res timer.
> But even so, how is the time in the past after adding a second.
>
> Is it because I'm not setting tv_nsec when it's close to a second
> boundary, and hence your recommendation above?

yeah, it looks a bit suspicious: you create a +1 second timeout out of a 1 second resolution timesource. I don't yet understand the failure mode though that results in that looping and in the 30% CPU time use - do you understand it perhaps? (and automount is still functional while this is happening, correct?)

	Ingo
Re: [patch] CFS scheduler, -v19
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> i think it fails here due to some IO error:
>
> 9173 1184675906.674610 write(2, "konqueror: Fatal IO error: clien"..., 41) = 41 <0.07>

oh, and i missed the obvious request: could you start konqueror from a terminal and see what it prints when it goes down?

	Ingo
Re: [patch] CFS scheduler, -v19
* Markus <[EMAIL PROTECTED]> wrote:

> > > Nothing is printed for a disappeared app for me.
> > >
> > > Is there anything more I can try?
> >
> > sure - could you start one of those apps via:
> >
> >    strace -ttt -TTT -o trace.log -f
> >
> > and wait for it to "disappear"? Then compress the trace.log via
> > bzip2 -9 (it's probably going to be a really large file) and send me
> > it?
> private mail, as well (187K)

i think it fails here due to some IO error:

 9173 1184675906.674610 write(2, "konqueror: Fatal IO error: clien"..., 41) = 41 <0.07>

could this be due to:

 9173 1184675906.194424 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff341af5c0) = -1 ENOTTY (Inappropriate ioctl for device) <0.06>
 9173 1184675906.194463 ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff341af5c0) = -1 ENOTTY (Inappropriate ioctl for device) <0.04>

? Are those -ENOTTY results normal?

or perhaps the problem started a lot earlier, at:

 9173 1184675906.155015 write(2, "In file kernel/qpixmap_x11.cpp, "..., 56) = 56 <0.06>
 9173 1184675906.155052 write(2, "QImage::convertDepth: Image is a"..., 44) = 44 <0.04>
 9173 1184675906.155169 gettimeofday({1184675906, 155179}, NULL) = 0 <0.06>
 9173 1184675906.155249 write(11, "close(6f1c2f7):about:konqueror\n", 31) = 31 <0.32>

i think konqueror tried to say something here about an image problem?

could you perhaps upload the strace to some webpage so that others can take a look too?

it might also be good to add "-s 1000" to the strace command, so that we can see the full messages that konqueror tried to log to some other task, i.e.:

   strace -s 1000 -ttt -TTT -o trace.log -f

and perhaps try to do a 'comparison' trace.normal.log as well, with konqueror having no problems. Also a KDE expert's advice would be useful here too i guess ...

	Ingo
Re: [patch] CFS scheduler, -v19
On 07/17/2007 03:45 AM, Ingo Molnar wrote:
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
>> Yes it does and I have two reported bugs so far.
>>
>> In several places I have code similar to:
>>
>> wait.tv_sec = time(NULL) + 1;
>> wait.tv_nsec = 0;
>>
>> signaled = 0;
>> while (!signaled) {
>>         status = pthread_cond_timedwait(&cond, &mutex, &wait);
>>         if (status) {
>>                 if (status == ETIMEDOUT)
>>                         break;
>>                 fatal(status);
>>         }
>> }
>
> ah! It passes in a low-res time source into a high-res time interface
> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> time(NULL) + 2, or change it to:
>
>         gettimeofday(&wait, NULL);
>         wait.tv_sec++;
>
> does this solve the spinning?
>
> i'm wondering how widespread this is. If automount is the only app doing
> this then _maybe_ we could get away with it by changing automount?

Odds are there's at least one other app doing that somewhere.

Would reverting the CFS changes to time.c fix this problem? That optimization just got merged in 2.6.22 mainline...
Re: [patch] CFS scheduler, -v19
> could you please send me the cfs-debug-info output nevertheless?

private mail (4,9K)

> > Nothing is printed for a disappeared app for me.
> >
> > Is there anything more I can try?
>
> sure - could you start one of those apps via:
>
>    strace -ttt -TTT -o trace.log -f
>
> and wait for it to "disappear"? Then compress the trace.log via bzip2 -9
> (it's probably going to be a really large file) and send me it?

private mail, as well (187K)

When attachments are allowed, I can resend them on the list as well (or just ask me...)

To answer a private mail: I do not use any kernel module that's not part of the official kernel! And of course nothing proprietary.

# cat /proc/sys/kernel/tainted
0

I used gcc-4.1.2 (glibc-2.5-r4) to build the kernels. (It's an amd64 system, quite stable so far.)

Programs that "disappeared" are mostly graphical, because others I have not noticed so far... also [1] might be caused by this...
amarok, kdesktop, whole X, konqueror, konsole but also gtk-apps

Markus

[1] http://lkml.org/lkml/2007/07/14/64
Re: [patch] CFS scheduler, -v19
On Tue, 2007-07-17 at 09:45 +0200, Ingo Molnar wrote:
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
> > Yes it does and I have two reported bugs so far.
> >
> > In several places I have code similar to:
> >
> > wait.tv_sec = time(NULL) + 1;
> > wait.tv_nsec = 0;
> >
> > signaled = 0;
> > while (!signaled) {
> >         status = pthread_cond_timedwait(&cond, &mutex, &wait);
> >         if (status) {
> >                 if (status == ETIMEDOUT)
> >                         break;
> >                 fatal(status);
> >         }
> > }
>
> ah! It passes in a low-res time source into a high-res time interface
> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> time(NULL) + 2, or change it to:
>
>         gettimeofday(&wait, NULL);
>         wait.tv_sec++;

OK, I'm with you, hi-res timer.
But even so, how is the time in the past after adding a second.

Is it because I'm not setting tv_nsec when it's close to a second
boundary, and hence your recommendation above?

>
> does this solve the spinning?

I don't have a system to test this on so I'll try to get one of the
people that logged the problem to test a patch.

>
> i'm wondering how widespread this is. If automount is the only app doing
> this then _maybe_ we could get away with it by changing automount?

I'm happy to change automount but that could cause odd version specific
problems for people updating their kernel on an older installed base.

Aaah .. and they'll all blame me!! ;)

Ian
Re: [patch] CFS scheduler, -v19
* Ian Kent <[EMAIL PROTECTED]> wrote:

> Yes it does and I have two reported bugs so far.
>
> In several places I have code similar to:
>
> wait.tv_sec = time(NULL) + 1;
> wait.tv_nsec = 0;
>
> signaled = 0;
> while (!signaled) {
>         status = pthread_cond_timedwait(&cond, &mutex, &wait);
>         if (status) {
>                 if (status == ETIMEDOUT)
>                         break;
>                 fatal(status);
>         }
> }

ah! It passes in a low-res time source into a high-res time interface
(pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
time(NULL) + 2, or change it to:

        gettimeofday(&wait, NULL);
        wait.tv_sec++;

does this solve the spinning?

i'm wondering how widespread this is. If automount is the only app doing
this then _maybe_ we could get away with it by changing automount?

	Ingo
Re: [patch] CFS scheduler, -v19
* Markus <[EMAIL PROTECTED]> wrote:

> The dmesg output is not differing in any interesting point (just some
> numbers, like raid-benchmark, some irqs or usb-numbers...)

could you please send me the cfs-debug-info output nevertheless?

> Nothing is printed for a disappeared app for me.
>
> Is there anything more I can try?

sure - could you start one of those apps via:

   strace -ttt -TTT -o trace.log -f <app>

and wait for it to "disappear"? Then compress the trace.log via bzip2 -9
(it's probably going to be a really large file) and send me it?

	Ingo
Re: [patch] CFS scheduler, -v19
* Markus <[EMAIL PROTECTED]> wrote:

> > > Nothing is printed for a disappeared app for me.
> > >
> > > Is there anything more I can try?
> >
> > sure - could you start one of those apps via:
> >
> >    strace -ttt -TTT -o trace.log -f <app>
> >
> > and wait for it to "disappear"? Then compress the trace.log via bzip2 -9
> > (it's probably going to be a really large file) and send me it?
>
> private mail, as well (187K)

i think it fails here due to some IO error:

 9173 1184675906.674610 write(2, konqueror: Fatal IO error: clien..., 41) = 41 0.07

could this be due to:

 9173 1184675906.194424 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff341af5c0) = -1 ENOTTY (Inappropriate ioctl for device) 0.06
 9173 1184675906.194463 ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff341af5c0) = -1 ENOTTY (Inappropriate ioctl for device) 0.04

? Are those -ENOTTY results normal?

or perhaps the problem started a lot earlier, at:

 9173 1184675906.155015 write(2, In file kernel/qpixmap_x11.cpp, ..., 56) = 56 0.06
 9173 1184675906.155052 write(2, QImage::convertDepth: Image is a..., 44) = 44 0.04
 9173 1184675906.155169 gettimeofday({1184675906, 155179}, NULL) = 0 0.06
 9173 1184675906.155249 write(11, close(6f1c2f7):about:konqueror\n, 31) = 31 0.32

i think konqueror tried to say something here about an image problem?

could you perhaps upload the strace to some webpage so that others can
take a look too?

it might also be good to add -s 1000 to the strace command, so that we
can see the full messages that konqueror tried to log to some other
task, i.e.:

   strace -s 1000 -ttt -TTT -o trace.log -f <app>

and perhaps try to do a 'comparison' trace.normal.log as well, with
konqueror having no problems.

Also a KDE expert's advice would be useful here too i guess ...

	Ingo
Re: [patch] CFS scheduler, -v19
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> i think it fails here due to some IO error:
>
>  9173 1184675906.674610 write(2, konqueror: Fatal IO error: clien..., 41) = 41 0.07

oh, and i missed the obvious request: could you start konqueror from a
terminal and see what it prints when it goes down?

	Ingo
Re: [patch] CFS scheduler, -v19
* Ian Kent <[EMAIL PROTECTED]> wrote:

> > ah! It passes in a low-res time source into a high-res time interface
> > (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> > time(NULL) + 2, or change it to:
> >
> >         gettimeofday(&wait, NULL);
> >         wait.tv_sec++;
>
> OK, I'm with you, hi-res timer.
> But even so, how is the time in the past after adding a second.
>
> Is it because I'm not setting tv_nsec when it's close to a second
> boundary, and hence your recommendation above?

yeah, it looks a bit suspicious: you create a +1 second timeout out of a
1 second resolution timesource. I dont yet understand the failure mode
though that results in that looping and in the 30% CPU time use - do you
understand it perhaps? (and automount is still functional while this is
happening, correct?)

	Ingo
Re: [patch] CFS scheduler, -v19
> 9173 1184675906.194424 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff341af5c0) = -1 ENOTTY (Inappropriate ioctl for device) 0.06
> 9173 1184675906.194463 ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff341af5c0) = -1 ENOTTY (Inappropriate ioctl for device) 0.04
>
> ? Are those -ENOTTY results normal?

Yes, I see it on any kernel.

> 9173 1184675906.155015 write(2, In file kernel/qpixmap_x11.cpp, ..., 56) = 56 0.06
> 9173 1184675906.155052 write(2, QImage::convertDepth: Image is a..., 44) = 44 0.04
> 9173 1184675906.155169 gettimeofday({1184675906, 155179}, NULL) = 0 0.06
> 9173 1184675906.155249 write(11, close(6f1c2f7):about:konqueror\n, 31) = 31 0.32
>
> i think konqueror tried to say something here about an image problem?

Well, yes:
In file kernel/qpixmap_x11.cpp, line 633: Out of memory
QImage::convertDepth: Image is a null image
In file kernel/qpixmap_x11.cpp, line 633: Out of memory
QImage::convertDepth: Image is a null image
In file kernel/qpixmap_x11.cpp, line 633: Out of memory
QImage::convertDepth: Image is a null image
In file kernel/qpixmap_x11.cpp, line 633: Out of memory
QImage::convertDepth: Image is a null image
konqueror: Fatal IO error: client killed

And no, my 2 GB of RAM are not full:
$ free -m
             total       used       free     shared    buffers     cached
Mem:          2012       1077        935          0         22        441
-/+ buffers/cache:        612       1400
Swap:         2070          0       2070

> could you perhaps upload the strace to some webpage so that others can
> take a look too?

hm, I dont have any webspace...

> it might also be good to add -s 1000 to the strace command, so that we
> can see the full messages that konqueror tried to log to some other
> task, i.e.:
>
>    strace -s 1000 -ttt -TTT -o trace.log -f <app>
>
> and perhaps try to do a 'comparison' trace.normal.log as well, with
> konqueror having no problems.

I now made some new strace logs:
- konq crash 251K
- Konq without crash on cfs 302K
- konq without crash on non-cfs 248K

Markus
Re: [patch] CFS scheduler, -v19
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> > But why does that happen? And why would the scheduler have
> > *anything* to do with this?
>
> No idea. Maybe timing. Maybe the time.c changes. Dunno.

hm, Markus indicated that he tried the v2.6.21.6-cfsv19 patch, and that
does not include the time.c change. Markus - does your kernel include
the code below? (if yes, please revert it via patch -p1 -R )

	Ingo

Index: linux/kernel/time.c
===================================================================
--- linux.orig/kernel/time.c
+++ linux/kernel/time.c
@@ -57,14 +57,17 @@ EXPORT_SYMBOL(sys_tz);
  */
 asmlinkage long sys_time(time_t __user * tloc)
 {
-	time_t i;
-	struct timeval tv;
+	/*
+	 * We read xtime.tv_sec atomically - it's updated
+	 * atomically by update_wall_time(), so no need to
+	 * even read-lock the xtime seqlock:
+	 */
+	time_t i = xtime.tv_sec;
 
-	do_gettimeofday(&tv);
-	i = tv.tv_sec;
+	smp_rmb(); /* sys_time() results are coherent */
 
 	if (tloc) {
-		if (put_user(i,tloc))
+		if (put_user(i, tloc))
 			i = -EFAULT;
 	}
 	return i;
@@ -373,6 +376,20 @@ void do_gettimeofday (struct timeval *tv
 
 	tv->tv_sec = sec;
 	tv->tv_usec = usec;
+
+	/*
+	 * Make sure xtime.tv_sec [returned by sys_time()] always
+	 * follows the gettimeofday() result precisely. This
+	 * condition is extremely unlikely, it can hit at most
+	 * once per second:
+	 */
+	if (unlikely(xtime.tv_sec != tv->tv_sec)) {
+		unsigned long flags;
+
+		write_seqlock_irqsave(&xtime_lock);
+		update_wall_time();
+		write_seqlock_irqrestore(&xtime_lock);
+	}
 }
 
 EXPORT_SYMBOL(do_gettimeofday);
RE: [patch] CFS scheduler, -v19
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
> > Yes it does and I have two reported bugs so far.
> >
> > In several places I have code similar to:
> >
> > wait.tv_sec = time(NULL) + 1;
> > wait.tv_nsec = 0;
> >
> > signaled = 0;
> > while (!signaled) {
> >         status = pthread_cond_timedwait(&cond, &mutex, &wait);
> >         if (status) {
> >                 if (status == ETIMEDOUT)
> >                         break;
> >                 fatal(status);
> >         }
> > }
>
> ah! It passes in a low-res time source into a high-res time interface
> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> time(NULL) + 2, or change it to:
>
>         gettimeofday(&wait, NULL);
>         wait.tv_sec++;
>
> does this solve the spinning?
>
> i'm wondering how widespread this is. If automount is the only app doing
> this then _maybe_ we could get away with it by changing automount?

This code is horribly broken. Don't change the kernel because this code
is broken.

First it adds a second, but then it subtracts up to a second. Just
before the second boundary, this code can burn CPU like crazy, with each
wait being just a few nanoseconds.

What is the intent of this code? Is it to wait up to a second, possibly
for no time at all, or is it to wait for at least a second? If so, why
are you zeroing the nanosecond count?

DS
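The arithmetic behind this can be checked with a small sketch. The helper names truncated_deadline() and ns_until() are hypothetical and only model the deadline computation from the quoted loop; they are not taken from automount or from any library.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical model of the quoted pattern: take a coarse
 * whole-second reading, add one, and force the nanoseconds to zero.
 */
static struct timespec truncated_deadline(time_t coarse_seconds)
{
    struct timespec d;

    d.tv_sec = coarse_seconds + 1;
    d.tv_nsec = 0;
    return d;
}

/*
 * Nanoseconds from `now` until `deadline`. A negative result means
 * the deadline has already passed, so a timed wait would return
 * ETIMEDOUT immediately.
 */
static long long ns_until(struct timespec deadline, struct timespec now)
{
    assert(now.tv_nsec >= 0 && now.tv_nsec < 1000000000L);
    return (long long)(deadline.tv_sec - now.tv_sec) * 1000000000LL
         + (long long)(deadline.tv_nsec - now.tv_nsec);
}
```

With a high-resolution "now" of {1184593935, 999999000} and a coarse reading of 1184593935, ns_until() gives 1000, a wait of one microsecond. With the values from the strace output quoted elsewhere in this thread (clock_gettime() returning {1184593936, 130925919} while time() returns 1184593935), it gives -130925919, so the deadline is already about 131 ms in the past.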
Re: [patch] CFS scheduler, -v19
Hi Ingo, sorry for the long delay, I've spent a week doing non-kernel work. On Tue, Jul 10, 2007 at 12:39:50AM +0200, Ingo Molnar wrote: * Willy Tarreau [EMAIL PROTECTED] wrote: The biggest user-visible change in -v19 is reworked sleeper fairness: it's similar in behavior to -v18 but works more consistently across nice levels. Fork-happy workloads (like kernel builds) should behave better as well. There are also a handful of speedups: unsigned math, 32-bit speedups, O(1) task pickup, debloating and other micro-optimizations. Interestingly, I also noticed the possibility of O(1) task pickup when playing with v18, but did not detect any noticeable improvement with it. Of course, it depends on the workload and I probably didn't perform the most relevant tests. yeah - it's a small tweak. CFS is O(31) in sleep/wakeup so it's now all a big O(1) family again :) Yes, that's what I tried to explain to a guy once : what I like with log(N) algos is that even with N very large, log(N) is always small, and it's sometimes faster to perform log(N) fast operations than 1 slow operation. That's also why I don't care about balanced trees : my unbalanced trees may hold 32 levels for 32 carefully chosen values, while balanced trees will have 5 levels (worst difference between both). If I can insert and delete a node 6 times faster, I always win. And quite frankly, I'm not interested at the 32 entries case in a tree :-) V19 works very well here on 2.6.20.14. I could start 32k busy loops at nice +19 (I exhausted the 32k pids limit), and could still perform normal operations. I noticed that 'vmstat' scans all the pid entries under /proc, which takes ages to collect data before displaying a line. Obviously, the system sometimes shows some short latencies, but not much more than what you get from and SSH through a remote DSL connection. great! I did not try to push it this far, yet. Well, I borrowed two 1GB sticks because I discovered that one of my 512MB had one defect bit. 
It was finally an opportunity for me to push the test this far. Here's a vmstat 1 output : r b w swpd free buff cache si sobibo incs us sy id 32437 0 0 0 809724488 619600 1 0 135 0 24 72 4 32436 0 0 0 811336488 619600 0 0 717 0 78 22 0 crazy :-) indeed :-) Amusingly, I started mpg123 during this test and it skipped quite a bit. After setting all tasks to SCHED_IDLE, it did not skip anymore. All this seems to behave like one could expect. yeah. It behaves better than i expected in fact - 32K tasks is pushing things quite a bit. (we've got a 32K PID limit for example) Yes, and in fact, I suspect that we still have an O(N) or O(N^2) pid allocation algo somewhere (I did not look at the code), because forking was very very slow when reaching those numbers. I'll possibly check this when I have some spare time, because it reminds me a trivial source port ring allocator I wrote a few years ago which was O(1). With 32k pids, it will only require 64kB RAM for the whole system, and we may even optimize it to spread CPUs entry points in order to nearly always avoid lock contention. I also started 30k processes distributed in 130 groups of 234 chained by pipes in which one byte is passed. I get an average of 8000 in the run queue. The context switch rate is very low and sometimes even null in this test, maybe some of them are starving, I really do not know : r b w swpd free buff cache si sobibo incs us sy id 7752 0 1 0 656892244 419600 0 0 725 0 16 84 0 hm, could you profile this? We could have some bottleneck somewhere (likely not in the scheduler) with that many tasks being runnable. [ With CFS you can actually run a profiler under this workload ;-) ] I may probably try some time later (not this week-end, I have some 2.4 to work on). 
In my tree, I have replaced the rbtree with the ebtree we talked about, but it did not bring any performance boost because, eventhough insert() and delete() are faster, the scheduler is already quite good at avoiding them as much as possible, mostly relying on rb_next() which has the same cost in both trees. All in all, the only variations I noticed were caused by cacheline alignment when I tried to reorder fields in the eb_node. So I will stop my experimentations here since I don't see any more room for improvement. well, just a little bit of improvement would be nice to have too :) Yes but I prefer to merge it where it really bring something (I'll have a look at epoll, I noticed epollctl() was 30% slower under 2.6 with an rbtree as it is under 2.4 with a hash). Then people will tell me you're completely dumb, you could have improved it that way! and then, once it's optimized to be always faster than the
Re: [patch] CFS scheduler, -v19
> hm, Markus indicated that he tried the v2.6.21.6-cfsv19 patch, and that
> does not include the time.c change. Markus - does your kernel include
> the code below? (if yes, please revert it via patch -p1 -R )

Well, the 2.6.22.1-cfs-v19 does include it, but the 2.6.21.6-cfs-v19
does not have that patch applied. But both show this problem.

Markus
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Ian Kent <[EMAIL PROTECTED]> wrote:
>
>>> ah! It passes in a low-res time source into a high-res time interface
>>> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
>>> time(NULL) + 2, or change it to:
>>>
>>>         gettimeofday(&wait, NULL);
>>>         wait.tv_sec++;
>> OK, I'm with you, hi-res timer.
>> But even so, how is the time in the past after adding a second.
>>
>> Is it because I'm not setting tv_nsec when it's close to a second
>> boundary, and hence your recommendation above?
>
> yeah, it looks a bit suspicious: you create a +1 second timeout out of a
> 1 second resolution timesource. I dont yet understand the failure mode
> though that results in that looping and in the 30% CPU time use - do you
> understand it perhaps? (and automount is still functional while this is
> happening, correct?)

Can't say, I have automount running because I get it by default, but I
have nothing using it on my test machine. Why is it looping so fast when
there are no mount points defined? If the config changes there's no
requirement to notice right away, is there?

-- 
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: [patch] CFS scheduler, -v19
On Mon, 2007-07-16 at 23:55 +0200, Ingo Molnar wrote:
> * Chuck Ebbert <[EMAIL PROTECTED]> wrote:
>
> > On 07/13/2007 05:19 PM, Bill Davidsen wrote:
> > >
> > > I should really go back to 2.6.21.6, 2.6.22 has many bizarre behaviors
> > > with FC6. Automount starts taking 30% of CPU (unused at the moment)
> >
> > Can you confirm whether CFS is involved, i.e. does it spin like that
> > even without the CFS patch applied?
>
> hmmm could you take out the kernel/time.c (sys_time()) changes from
> the CFS patch, does that solve the automount issue? If yes, could
> someone take a look at automount and check whether it makes use of
> time(2) and whether it combines it with finer grained time sources?

Yes it does and I have two reported bugs so far.

In several places I have code similar to:

wait.tv_sec = time(NULL) + 1;
wait.tv_nsec = 0;

signaled = 0;
while (!signaled) {
        status = pthread_cond_timedwait(&cond, &mutex, &wait);
        if (status) {
                if (status == ETIMEDOUT)
                        break;
                fatal(status);
        }
}

lead to automount spinning with strace output a bit like:

futex(0x80034b60, FUTEX_WAKE, 1) = 0
clock_gettime(CLOCK_REALTIME, {1184593936, 130925919}) = 0
time(NULL)= 1184593935
futex(0x80034b60, FUTEX_WAKE, 1) = 0
clock_gettime(CLOCK_REALTIME, {1184593936, 131160876}) = 0
time(NULL)= 1184593935
futex(0x80034b60, FUTEX_WAKE, 1) = 0
clock_gettime(CLOCK_REALTIME, {1184593936, 131377080}) = 0
time(NULL)= 1184593935
futex(0x80034b60, FUTEX_WAKE, 1) = 0
clock_gettime(CLOCK_REALTIME, {1184593936, 131593297}) = 0
time(NULL)= 1184593935
futex(0x80034b60, FUTEX_WAKE, 1) = 0
clock_gettime(CLOCK_REALTIME, {1184593936, 131871792}) = 0

There should be something like:

futex(0x557868c4, FUTEX_WAIT, 5321099, {0, 998091311}) = -1 ETIMEDOUT (Connection timed out)

in there I think.

Ian
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Chuck Ebbert <[EMAIL PROTECTED]> wrote:
>
>> On 07/13/2007 05:19 PM, Bill Davidsen wrote:
>>> I should really go back to 2.6.21.6, 2.6.22 has many bizarre behaviors
>>> with FC6. Automount starts taking 30% of CPU (unused at the moment)
>> Can you confirm whether CFS is involved, i.e. does it spin like that
>> even without the CFS patch applied?

I will try that, but not until Tuesday night. I've been here too long
today and have an out-of-state meeting tomorrow. I'll take a look after
dinner. Note that the latest 2.6.21 with cfs-v19 doesn't have any
problems of any nature, other than suspend to RAM not working, and I may
have the config wrong. Runs really well otherwise, but I'll test drive
2.6.22 w/o the patch.

> hmmm could you take out the kernel/time.c (sys_time()) changes from
> the CFS patch, does that solve the automount issue? If yes, could
> someone take a look at automount and check whether it makes use of
> time(2) and whether it combines it with finer grained time sources?

Will do.

-- 
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: [patch] CFS scheduler, -v19
* Chuck Ebbert <[EMAIL PROTECTED]> wrote:

> On 07/13/2007 05:19 PM, Bill Davidsen wrote:
> >
> > I should really go back to 2.6.21.6, 2.6.22 has many bizarre behaviors
> > with FC6. Automount starts taking 30% of CPU (unused at the moment)
>
> Can you confirm whether CFS is involved, i.e. does it spin like that
> even without the CFS patch applied?

hmmm could you take out the kernel/time.c (sys_time()) changes from
the CFS patch, does that solve the automount issue? If yes, could
someone take a look at automount and check whether it makes use of
time(2) and whether it combines it with finer grained time sources?

	Ingo
Re: [patch] CFS scheduler, -v19
On 07/13/2007 05:19 PM, Bill Davidsen wrote:
>
> I should really go back to 2.6.21.6, 2.6.22 has many bizarre behaviors
> with FC6. Automount starts taking 30% of CPU (unused at the moment)

Can you confirm whether CFS is involved, i.e. does it spin like that
even without the CFS patch applied?
Re: [patch] CFS scheduler, -v19
> > [...] The mouse is smooth, just when one app is being quit (dont
> > know why...) the mouse will be jerking for a few seconds...
> is the mouse jerky on any app quitting?
No.

> Or is your observation the following: _sometimes_ apps quit
> unexpectedly (their window just vanishes?), and _at the same time_,
> the mouse becomes jerky as well, for a few seconds?
Exactly.

> the mouse typically only becomes jerky when there's some really high
> load on the system - anything else would be a kernel bug. A jerky
> mouse on an unloaded system is definitely a sign of some sort of
> kernel bug (in or outside of the scheduler). An app vanishing
> unexpectedly might mean an OOM-kill - but that would show up in the
> syslog as well.
> Pretty weird.
Well, the system uses about 30% of the cpu (Cool'n'Quiet put it on the
lowest frequency).
I made a plain 2.6.22.1 and could use it for about 2 hours without any
problem.
Then I applied the cfs-v19 for that kernel, rebuilt from mrproper with
the saved config and booted. After a few minutes the first app
vanished... some more followed over time (I just surfed around a bit...)
The dmesg output is not differing in any interesting point (just some
numbers, like raid-benchmark, some irqs or usb-numbers...)
So it's obviously something within cfs... unfortunately...

> Can you make this regression trigger arbitrarily, so that we could
> debug it better? Apps exiting unexpectedly can be debugged via:
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc6/2.6.22-rc6-mm1/broken-out/vdso-print-fatal-signals.patch
>
> you can turn it on via the print-fatal-signals=1 boot option or via:
>
> echo 1 > /proc/sys/kernel/print-fatal-signals
>
> this feature will produce a small dump to the syslog about every app
> that exits unexpectedly. Note that this might not cover all types of
> "window suddenly vanishes" regressions.

Nothing is printed for a disappeared app for me.

Is there anything more I can try?

Markus
Re: [patch] CFS scheduler, -v19
On Monday 16 July 2007 05:17, Ingo Molnar wrote:
>
> * Ed Tomlinson <[EMAIL PROTECTED]> wrote:
>
> > I run a java application at nice 15. Its been a background
> > application here for as long as SD and CFS have been around. If I
> > have a compile running at nice 0, with v19 java gets so little cpu
> > that the wrapper that runs to monitor it is timing out waiting for
> > it to start. This is new in v19 - something in v19 is not meshing
> > well with my mix of applications...
>
> how much longer did the startup of the java app get relative to say v18?
>
> to debug this, could you check whether this problem goes away if you use
> nice 10 (or nice 5) instead of nice 15?

Ingo,

It may take a day or two before I get to test this. I have had to revert
to 2.6.21 - it seems that 22 triggers a stall here (21 can also trigger
this but it's harder)...

Thanks
Ed
Re: [patch] CFS scheduler, -v19
* Markus <[EMAIL PROTECTED]> wrote:

> [...] The mouse is smooth, just when one app is being quit (don't know
> why...) the mouse will be jerking for a few seconds...

is the mouse jerky on any app quitting? Or is your observation the following: _sometimes_ apps quit unexpectedly (their window just vanishes?), and _at the same time_, the mouse becomes jerky as well, for a few seconds?

the mouse typically only becomes jerky when there's some really high load on the system - anything else would be a kernel bug. A jerky mouse on an unloaded system is definitely a sign of some sort of kernel bug (in or outside of the scheduler). An app vanishing unexpectedly might mean an OOM-kill - but that would show up in the syslog as well. Pretty weird.

Can you make this regression trigger arbitrarily, so that we could debug it better? Apps exiting unexpectedly can be debugged via:

   http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc6/2.6.22-rc6-mm1/broken-out/vdso-print-fatal-signals.patch

you can turn it on via the print-fatal-signals=1 boot option or via:

   echo 1 > /proc/sys/kernel/print-fatal-signals

this feature will produce a small dump to the syslog about every app that exits unexpectedly. Note that this might not cover all types of "window suddenly vanishes" regressions.

	Ingo
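For anyone who prefers to flip the switch from a small test harness rather than a shell, the `echo 1 > ...` step above can be sketched in C. This is only an illustration: `write_sysctl` is a hypothetical helper name, not part of the patch, and writing under /proc/sys normally requires root.

```c
#include <stdio.h>

/* Hypothetical helper: write VALUE plus a newline to PATH, roughly what
 * `echo VALUE > PATH` does. Returns 0 on success, -1 on failure (for
 * example missing permissions, or a kernel without that sysctl). */
int write_sysctl(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fputs(value, f) >= 0 && fputc('\n', f) != EOF;

    /* fclose() flushes the buffer, so a failed write can surface here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling `write_sysctl("/proc/sys/kernel/print-fatal-signals", "1")` as root would then be the equivalent of the echo line, assuming the patch is applied and the file exists.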
Re: [patch] CFS scheduler, -v19
* Ed Tomlinson <[EMAIL PROTECTED]> wrote:

> I run a java application at nice 15. It's been a background
> application here for as long as SD and CFS have been around. If I
> have a compile running at nice 0, with v19 java gets so little cpu
> that the wrapper that runs to monitor it is timing out waiting for
> it to start. This is new in v19 - something in v19 is not meshing
> well with my mix of applications...

how much longer did the startup of the java app get relative to say v18?

to debug this, could you check whether this problem goes away if you use nice 10 (or nice 5) instead of nice 15?

	Ingo
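To get a feel for why nice 10 or nice 5 could behave very differently from nice 15, here is a rough back-of-the-envelope sketch. It assumes the load weights that later mainline kernels use for these nice levels (1024, 335, 110 and 36); the v19 patch may well have used different numbers, so treat the results as an illustration of weight-proportional sharing, not as a statement about v19. The function names are made up for this example.

```c
#include <stddef.h>

/* ASSUMPTION: weights for nice 0, 5, 10 and 15 as found in later
 * mainline kernels. The cfs-v19 patch may differ. */
struct nice_weight {
    int nice;
    unsigned long weight;
};

static const struct nice_weight weights[] = {
    { 0, 1024 }, { 5, 335 }, { 10, 110 }, { 15, 36 },
};

/* Look up the weight for a nice level; 0 means "not in this table". */
unsigned long weight_for_nice(int nice)
{
    for (size_t i = 0; i < sizeof weights / sizeof weights[0]; i++)
        if (weights[i].nice == nice)
            return weights[i].weight;
    return 0;
}

/* CPU share, in parts per thousand (rounded down), of one CPU-bound
 * task at `nice` competing on one CPU with one CPU-bound nice-0 task.
 * Returns -1 for nice levels missing from the table. */
long share_permille(int nice)
{
    unsigned long w = weight_for_nice(nice);
    unsigned long w0 = weight_for_nice(0);

    if (w == 0)
        return -1;
    return (long)(w * 1000 / (w + w0));
}
```

Under those assumed weights, a nice 15 task next to a nice 0 compile would get on the order of 3% of the CPU, against roughly 10% at nice 10 and 25% at nice 5, which is the kind of difference the suggested test would expose.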
Re: [patch] CFS scheduler, -v19
* Mike Galbraith <[EMAIL PROTECTED]> wrote:

> Sending a few seconds of logged /proc/sched_debug will also help get a
> picture of what's happening, and lovely would be a method to reproduce
> the problem locally.

also, by running this script:

   http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh

and sending us the file it produces we'll have most of the environmental information as well.

	Ingo
Re: [patch] CFS scheduler, -v19
On Sun, 2007-07-15 at 23:11 +0200, Markus wrote:
> > > [1] http://lkml.org/lkml/2007/07/14/60
> >
> > Hm. Tasks disappearing isn't your typical process scheduler problem
> > by any means, nor is an idle box exhibiting mouse "lurchiness". Is
> > there anything unusual in your logs?
>
> I know that it's not typical, but when my current kernel is stable and
> shows the same problem with the cfs-patch applied like the
> git-snapshot, I would say it's a cfs issue.

Yes, from your description, and with the now presented additional information that the git-snapshot exhibits the same symptoms, it sounds like cfs _may_ be implicated in some way. I can't imagine how at the moment.

In your original report, there are other patches involved, which are unknown variables. The git-snapshot contains very many changes other than cfs as well. I'd eliminate absolutely all unknowns as the first step.

> But I can build a plain 2.6.22 without cfs and one with it and compare
> dmesg's output, if that helps.

Yes. It would definitely be worthwhile to test a virgin stable kernel, and then add only cfs with identical config. Dmesg output may not turn up anything, but eliminating all other variables should either pin the tail on the donkey (cfs?) or vindicate it, and that's what needs to be nailed down solidly first.

	-Mike
Re: [patch] CFS scheduler, -v19
On 07/13/2007 05:19 PM, Bill Davidsen wrote:
> I should really go back to 2.6.21.6, 2.6.22 has many bizarre behaviors
> with FC6. Automount starts taking 30% of CPU (unused at the moment)

Can you confirm whether CFS is involved, i.e. does it spin like that even without the CFS patch applied?
Re: [patch] CFS scheduler, -v19
* Chuck Ebbert [EMAIL PROTECTED] wrote:

> On 07/13/2007 05:19 PM, Bill Davidsen wrote:
> > I should really go back to 2.6.21.6, 2.6.22 has many bizarre
> > behaviors with FC6. Automount starts taking 30% of CPU (unused at
> > the moment)
>
> Can you confirm whether CFS is involved, i.e. does it spin like that
> even without the CFS patch applied?

hmmm, could you take out the kernel/time.c (sys_time()) changes from the CFS patch? Does that solve the automount issue? If yes, could someone take a look at automount and check whether it makes use of time(2) and whether it combines it with finer grained time sources?

	Ingo
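One way to check the suspicion about mixed time sources on a given kernel is to compare time(2) directly against clock_gettime(CLOCK_REALTIME). The following is a minimal sketch; `time_lag_seconds` is a name invented for this example and is not taken from automount or the CFS patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical probe: how many whole seconds time(2) is behind
 * CLOCK_REALTIME at this moment. 0 means both agree on the current
 * second; a positive value means time(2) still reports an earlier
 * second than the finer grained clock. */
long time_lag_seconds(void)
{
    struct timespec fine;

    clock_gettime(CLOCK_REALTIME, &fine);
    time_t coarse = time(NULL);

    return (long)(fine.tv_sec - coarse);
}
```

Calling this in a loop and counting how often it returns a positive value would show whether a program that adds a fixed offset to time(NULL) can end up with a deadline that the finer grained clock already considers past.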
Re: [patch] CFS scheduler, -v19
Ingo Molnar wrote:
> * Chuck Ebbert [EMAIL PROTECTED] wrote:
> > On 07/13/2007 05:19 PM, Bill Davidsen wrote:
> > > I should really go back to 2.6.21.6, 2.6.22 has many bizarre
> > > behaviors with FC6. Automount starts taking 30% of CPU (unused at
> > > the moment)
> >
> > Can you confirm whether CFS is involved, i.e. does it spin like that
> > even without the CFS patch applied?

I will try that, but not until Tuesday night. I've been here too long today and have an out-of-state meeting tomorrow. I'll take a look after dinner. Note that the latest 2.6.21 with cfs-v19 doesn't have any problems of any nature, other than suspend to RAM not working, and I may have the config wrong. Runs really well otherwise, but I'll test drive 2.6.22 w/o the patch.

> hmmm could you take out the kernel/time.c (sys_time()) changes from the
> CFS patch, does that solve the automount issue? If yes, could someone
> take a look at automount and check whether it makes use of time(2) and
> whether it combines it with finer grained time sources?

Will do.

--
bill davidsen [EMAIL PROTECTED]
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
Re: [patch] CFS scheduler, -v19
On Mon, 2007-07-16 at 23:55 +0200, Ingo Molnar wrote:
> * Chuck Ebbert [EMAIL PROTECTED] wrote:
> > On 07/13/2007 05:19 PM, Bill Davidsen wrote:
> > > I should really go back to 2.6.21.6, 2.6.22 has many bizarre
> > > behaviors with FC6. Automount starts taking 30% of CPU (unused at
> > > the moment)
> >
> > Can you confirm whether CFS is involved, i.e. does it spin like that
> > even without the CFS patch applied?
>
> hmmm could you take out the kernel/time.c (sys_time()) changes from the
> CFS patch, does that solve the automount issue? If yes, could someone
> take a look at automount and check whether it makes use of time(2) and
> whether it combines it with finer grained time sources?

Yes it does and I have two reported bugs so far.

In several places I have code similar to:

    wait.tv_sec = time(NULL) + 1;
    wait.tv_nsec = 0;

    signaled = 0;
    while (!signaled) {
        status = pthread_cond_timedwait(cond, mutex, &wait);
        if (status) {
            if (status == ETIMEDOUT)
                break;
            fatal(status);
        }
    }

This leads to automount spinning, with strace output a bit like:

    futex(0x80034b60, FUTEX_WAKE, 1) = 0
    clock_gettime(CLOCK_REALTIME, {1184593936, 130925919}) = 0
    time(NULL) = 1184593935
    futex(0x80034b60, FUTEX_WAKE, 1) = 0
    clock_gettime(CLOCK_REALTIME, {1184593936, 131160876}) = 0
    time(NULL) = 1184593935
    futex(0x80034b60, FUTEX_WAKE, 1) = 0
    clock_gettime(CLOCK_REALTIME, {1184593936, 131377080}) = 0
    time(NULL) = 1184593935
    futex(0x80034b60, FUTEX_WAKE, 1) = 0
    clock_gettime(CLOCK_REALTIME, {1184593936, 131593297}) = 0
    time(NULL) = 1184593935
    futex(0x80034b60, FUTEX_WAKE, 1) = 0
    clock_gettime(CLOCK_REALTIME, {1184593936, 131871792}) = 0

There should be something like:

    futex(0x557868c4, FUTEX_WAIT, 5321099, {0, 998091311}) = -1 ETIMEDOUT (Connection timed out)

in there I think.

Ian
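In the strace output above, time(NULL) returns 1184593935 while clock_gettime() already reports 1184593936.13, so a deadline of time(NULL) + 1 lies in the past by the time the wait starts. One possible way for user space to avoid depending on the two sources agreeing is to build the deadline from the same clock that pthread_cond_timedwait() compares against. The sketch below assumes a condition variable with default attributes (which uses CLOCK_REALTIME); `wait_signaled` and its parameters are illustrative names, not automount's actual code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: wait on `cond` until *signaled becomes nonzero
 * or `timeout_ms` milliseconds have passed. The caller must hold
 * `mutex`. Returns 0 if signaled, ETIMEDOUT on timeout, or another
 * error number on failure. */
int wait_signaled(pthread_cond_t *cond, pthread_mutex_t *mutex,
                  const int *signaled, long timeout_ms)
{
    struct timespec deadline;

    /* Take "now" from CLOCK_REALTIME instead of time(2), so the
     * deadline and the wait are measured on the same clock. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;

    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    /* Loop to absorb spurious wakeups; the absolute deadline stays fixed. */
    while (!*signaled) {
        int status = pthread_cond_timedwait(cond, mutex, &deadline);
        if (status != 0)
            return status; /* ETIMEDOUT or a real error */
    }
    return 0;
}
```

Whether the right fix belongs in automount, in the kernel's sys_time() change, or in both is exactly what this thread is trying to establish; the sketch only shows the user-space side.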
Re: [patch] CFS scheduler, -v19
> > [1] http://lkml.org/lkml/2007/07/14/60
>
> Hm. Tasks disappearing isn't your typical process scheduler problem
> by any means, nor is an idle box exhibiting mouse "lurchiness". Is
> there anything unusual in your logs?

I know that it's not typical, but when my current kernel is stable and shows the same problem with the cfs-patch applied like the git-snapshot, I would say it's a cfs issue. There is nothing in the logs when a program dies, that's why I asked for a way to make the kernel more verbose. But I can build a plain 2.6.22 without cfs and one with it and compare dmesg's output, if that helps.

Markus
Re: [patch] CFS scheduler, -v19
On Sun, 2007-07-15 at 14:53 +0200, Markus wrote:
> [1] http://lkml.org/lkml/2007/07/14/60

Hm. Tasks disappearing isn't your typical process scheduler problem by any means, nor is an idle box exhibiting mouse "lurchiness". Is there anything unusual in your logs?

	-Mike
Re: [patch] CFS scheduler, -v19
> Sending a few seconds of logged /proc/sched_debug will also help get a
> picture of what's happening, and lovely would be a method to reproduce
> the problem locally.

Hi. Is there anything like sched_debug in 2.6.22-git5? I ask because I have a cfs problem as well [1].

Markus

[1] http://lkml.org/lkml/2007/07/14/60
Re: [patch] CFS scheduler, -v19
On Sat, 2007-07-14 at 13:19 -0400, Ed Tomlinson wrote:
> Hi,

Hi Ed,

> I run a java application at nice 15. It's been a background application
> here for as long as SD and CFS have been around. If I have a compile
> running at nice 0, with v19 java gets so little cpu that the wrapper
> that runs to monitor it is timing out waiting for it to start. This is
> new in v19 - something in v19 is not meshing well with my mix of
> applications...
>
> Kernel is gentoo 2.6.22-r1 + cfs v19
>
> How can I help to debug this?

I hear Ingo is off having a genuine long weekend, but in the meantime, you could try

   echo 30 > /proc/sys/kernel/sched_features

to eliminate the sleeper fairness changes in v19, since that was the biggest change. Sending a few seconds of logged /proc/sched_debug will also help get a picture of what's happening, and lovely would be a method to reproduce the problem locally.

	-Mike
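"A few seconds of logged /proc/sched_debug" can be collected with a small loop that copies the file once per second. A minimal sketch follows; `copy_file` and `log_snapshots` are names made up for this example, and the one-second interval is an arbitrary choice.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: copy all of `path` to `out`. Returns the number
 * of bytes copied, or -1 if the file cannot be opened (for example on a
 * kernel that has no /proc/sched_debug). */
long copy_file(const char *path, FILE *out)
{
    FILE *in = fopen(path, "r");
    if (!in)
        return -1;

    char buf[4096];
    size_t n;
    long total = 0;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        fwrite(buf, 1, n, out);
        total += (long)n;
    }
    fclose(in);
    return total;
}

/* Append `count` snapshots of `path` to `out`, one per second, each
 * preceded by a marker line. Returns 0 on success, -1 as soon as a
 * snapshot cannot be read. */
int log_snapshots(const char *path, FILE *out, int count)
{
    for (int i = 0; i < count; i++) {
        fprintf(out, "--- snapshot %d ---\n", i);
        if (copy_file(path, out) < 0)
            return -1;
        if (i + 1 < count)
            sleep(1);
    }
    return 0;
}
```

Something like `log_snapshots("/proc/sched_debug", stdout, 5)` with the output redirected to a file would produce a log suitable for mailing, on a kernel that provides the file.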