Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
On Sun, 2007-03-25 at 11:14 -0700, Linus Torvalds wrote:
> > Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
>
> This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I think
> only the original i386 had this bug ("feature").
>
> But I agree, it does seem to be broken on such machines (I assume you
> don't actually have one, but just tested by forcing it by hand ;)

Yes, it's a genuine i386 embedded system and AFAIK the same feature is
available on 486 clones. i386 and Co are still in use in the embedded
space.

	tglx

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> On Sun, 25 Mar 2007, Thomas Gleixner wrote:
> >
> > Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
>
> This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I
> think only the original i386 had this bug ("feature").
>
> But I agree, it does seem to be broken on such machines (I assume you
> don't actually have one, but just tested by forcing it by hand ;)

actually, AFAIK this is a genuine i386 box Thomas has (an embedded
board). Our hardware legacies and the resulting dependencies _really_
stick around for quite a long time :-/

	Ingo
Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
On Sun, 25 Mar 2007, Thomas Gleixner wrote:
>
> Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)

This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I think
only the original i386 had this bug ("feature").

But I agree, it does seem to be broken on such machines (I assume you
don't actually have one, but just tested by forcing it by hand ;)

> Now __copy_to_user_ll() takes the (boot_cpu_data.wp_works_ok == 0) path,
> which in turn calls
>
> down_read(current->mm->mmap_sem) - which might sleep
>
> and
>
> get_user_pages() - which has a cond_resched() inside.
>
> Not sure how to fix that.

I agree. Nasty.

But the thing is, it's actually much worse. We use "__put_user()" earlier
to try to fault it in writably, and that one is totally broken on a CPU
where wp_works_ok isn't set.

The whole notion that we should do this at access time is broken. We should
go back to doing it at "access_ok()", or we should just state that we
don't support original-i386 CPUs any more.

As it is, we don't do it right *anyway*, since we only do the tests
properly in __copy_to_user(), and totally miss them in __put_user() and
friends.

So it's buggy on i386 however you try to fix it. The only way to fix it
properly is to move the i386 fixup early, into "access_ok()", the way it
used to be.

		Linus
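The ordering argument in the message above (do the write fixup at check time, before any atomic section, rather than at access time) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name in it (model_cpu, model_state, model_access_ok_write, model_copy_inatomic, model_read_path, MODEL_USER_LIMIT) is hypothetical and none of it is kernel code or a proposed patch. It also ignores whether the mapping could change between the early check and the later copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; all names below are hypothetical. */
#define MODEL_USER_LIMIT 4096u          /* pretend size of the user range */

struct model_cpu {
    bool wp_works_ok;                   /* false models the original i386 */
};

struct model_state {
    int  preempt_count;                 /* non-zero means "atomic" */
    int  sleeps_while_atomic;           /* counts misplaced sleeping calls */
    bool range_prepared;                /* write fixup already done? */
};

/* Stands in for the down_read()/get_user_pages() work, which may sleep. */
static void model_write_fixup(struct model_state *s)
{
    if (s->preempt_count != 0)
        s->sleeps_while_atomic++;
    s->range_prepared = true;
}

/* Early check, run before any atomic section is entered. */
static bool model_access_ok_write(const struct model_cpu *cpu,
                                  struct model_state *s, size_t len)
{
    if (len > MODEL_USER_LIMIT)
        return false;
    if (!cpu->wp_works_ok)
        model_write_fixup(s);           /* sleeping is still allowed here */
    return true;
}

/* Atomic copy: relies on the early check and never sleeps itself. */
static bool model_copy_inatomic(const struct model_cpu *cpu,
                                const struct model_state *s)
{
    return cpu->wp_works_ok || s->range_prepared;
}

/* Returns the number of sleeps that happened in atomic context, or -1. */
int model_read_path(bool wp_works_ok)
{
    struct model_cpu cpu = { .wp_works_ok = wp_works_ok };
    struct model_state s = { 0, 0, false };

    if (!model_access_ok_write(&cpu, &s, 16))
        return -1;

    s.preempt_count++;                  /* models kmap_atomic() */
    bool ok = model_copy_inatomic(&cpu, &s);
    s.preempt_count--;                  /* models kunmap_atomic() */

    return ok ? s.sleeps_while_atomic : -1;
}
```

In this model, model_read_path() reports no sleeping call inside the atomic section for either value of wp_works_ok, because the only potentially sleeping step runs before preempt_count is raised.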
[BUG] __copy_to_user_inatomic broken on non Pentium machines
Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)

Last known working kernel: 2.6.18 (did not try 2.6.19 yet)

Enabling CONFIG_PREEMPT on latest mainline as well as 2.6.20 triggers

[ 14.15] BUG: sleeping function called from invalid context at /home/tglx/work/kernel/vanilla/linux-2.6.20/kernel/rwsem.c:20
[ 14.16] in_atomic():1, irqs_disabled():0
[ 14.16] no locks held by init/1.
[ 14.17] [<c0103346>] show_trace_log_lvl+0x1a/0x2f
[ 14.18] [<c0103441>] show_trace+0x12/0x14
[ 14.19] [<c0103cf5>] dump_stack+0x16/0x18
[ 14.19] [<c010aa62>] __might_sleep+0xc7/0xcd
[ 14.20] [<c01213a1>] down_read+0x18/0x47
[ 14.21] [<c01a01e4>] __copy_to_user_ll+0x5e/0x1b6
[ 14.22] [<c012cf85>] file_read_actor+0x10b/0x149
[ 14.23] [<c012d7b2>] do_generic_mapping_read+0x187/0x433
[ 14.24] [<c012f64b>] generic_file_aio_read+0x191/0x1ca
[ 14.24] [<c0141657>] do_sync_read+0xc2/0xff
[ 14.25] [<c0141eb6>] vfs_read+0x90/0x145
[ 14.26] [<c014227e>] sys_read+0x3f/0x63
[ 14.27] [<c0102fb0>] syscall_call+0x7/0xb
[ 14.27] ===

and

[ 22.66] BUG: scheduling while atomic: e2fsck/0x1001/272
[ 22.67] 1 lock held by e2fsck/272:
[ 22.68] #0: (&mm->mmap_sem){}, at: [<c01a01e4>] __copy_to_user_ll+0x5e/0x1b6
[ 22.69] [<c0103346>] show_trace_log_lvl+0x1a/0x2f
[ 22.70] [<c0103441>] show_trace+0x12/0x14
[ 22.71] [<c0103cf5>] dump_stack+0x16/0x18
[ 22.72] [<c024a189>] __sched_text_start+0x71/0x57f
[ 22.72] [<c010b49f>] __cond_resched+0x21/0x3b
[ 22.73] [<c024aca7>] cond_resched+0x26/0x31
[ 22.74] [<c0137ae5>] get_user_pages+0x1e1/0x23c
[ 22.75] [<c01a021e>] __copy_to_user_ll+0x98/0x1b6
[ 22.76] [<c012cf85>] file_read_actor+0x10b/0x149
[ 22.77] [<c012d7b2>] do_generic_mapping_read+0x187/0x433
[ 22.78] [<c012f64b>] generic_file_aio_read+0x191/0x1ca
[ 22.79] [<c0141657>] do_sync_read+0xc2/0xff
[ 22.79] [<c0141eb6>] vfs_read+0x90/0x145
[ 22.80] [<c014227e>] sys_read+0x3f/0x63
[ 22.81] [<c0102fb0>] syscall_call+0x7/0xb
[ 22.82] ===

which is not surprising.

int file_read_actor(read_descriptor_t *desc, struct page *page,
			unsigned long offset, unsigned long size)
{
	/*
	 * Faults on the destination of a read are common, so do it before
	 * taking the kmap.
	 */
	if (!fault_in_pages_writeable(desc->arg.buf, size)) {
		kaddr = kmap_atomic(page, KM_USER0);
>		left = __copy_to_user_inatomic(desc->arg.buf,
						kaddr + offset, size);

is called with preempt_count == 1, due to the kmap_atomic() above.

Now __copy_to_user_ll() takes the (boot_cpu_data.wp_works_ok == 0) path,
which in turn calls

down_read(current->mm->mmap_sem) - which might sleep

and

get_user_pages() - which has a cond_resched() inside.

Not sure how to fix that.

	tglx
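The failure mode described in the report can be reproduced in miniature with a userspace model: an atomic section raises a counter, and any call that might sleep checks that counter. This is a sketch under stated assumptions, not kernel code. The *_model names are hypothetical stand-ins that only mirror the kernel functions named in the report, and the copy itself is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only; the *_model names are hypothetical stand-ins. */
static int preempt_count;       /* models the task's preempt count */
static int sleep_violations;    /* sleeping calls made while atomic */

/* Models __might_sleep(): complain if called in atomic context. */
static void might_sleep_model(const char *who)
{
    if (preempt_count != 0) {
        sleep_violations++;
        fprintf(stderr, "model BUG: %s may sleep, preempt_count=%d\n",
                who, preempt_count);
    }
}

static void kmap_atomic_model(void)   { preempt_count++; }
static void kunmap_atomic_model(void) { preempt_count--; }

/* Both calls on the wp_works_ok == 0 path may sleep. */
static void down_read_model(void)      { might_sleep_model("down_read"); }
static void get_user_pages_model(void) { might_sleep_model("get_user_pages"); }

/* Models the branch in __copy_to_user_ll() described in the report. */
static void copy_to_user_ll_model(bool wp_works_ok)
{
    if (!wp_works_ok) {
        down_read_model();
        get_user_pages_model();
    }
    /* the actual copy is omitted in this model */
}

/* Models file_read_actor(): the copy runs inside the atomic kmap. */
int file_read_actor_model(bool wp_works_ok)
{
    sleep_violations = 0;
    kmap_atomic_model();
    copy_to_user_ll_model(wp_works_ok);
    kunmap_atomic_model();
    return sleep_violations;
}
```

With wp_works_ok set, the model records no violation; with it cleared, both down_read_model() and get_user_pages_model() are flagged, which corresponds to the two traces above.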