Re: [PATCH] Re: [SPARC32] NULL pointer dereference
From: Mark Fortescue <[EMAIL PROTECTED]>
Date: Tue, 31 Jul 2007 08:55:20 +0100 (BST)

> I have formulated a patch that prevents update_mmu_cache from doing
> anything if there is no context available. This appears to have no
> immediate, undesirable side effects.
>
> This worked better than the alternative of setting up a context to
> work with.
>
> Can you foresee any issues in doing this?
>
> If not, can you check+apply the attached (un-mangled) patch.

Thanks for tracking this down Mark.

The issue is that, when exec()'ing to userspace from a kernel thread,
we need activate_context() to be invoked before we try to touch
userspace at all. This new argument handling is invoking
get_user_pages() before that happens.

activate_context() happens via flush_old_exec(), but that occurs via
load_elf_binary() et al., which is long after the argument fetching
code in fs/exec.c that uses get_user_pages() has run.

(Mark, hint: activate_context() is defined to switch_mm() on sparc32,
which is sun4c_switch_mm(), which you thought was only invoked from
context switches :-))

Touching userspace before activate_context() is questionable at best,
in my opinion. But I can't come up with a good way to fix this right
now other than Mark's sparc patch, so I will apply it.

Thanks again Mark!
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
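
As an illustration of the ordering problem described above, here is a minimal user-space sketch in C. It is not kernel code: `struct exec_model` and every `*_model` function are hypothetical stand-ins, and the value of `NO_CONTEXT` is assumed for the model. It only counts how often user space is touched while no context is active.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_CONTEXT (-1)   /* assumed sentinel for "no MMU context yet" */

/* Hypothetical model of the state that matters during exec(). */
struct exec_model {
    int context;        /* NO_CONTEXT until activate_context_model() runs */
    int early_touches;  /* user-space accesses made with no context */
};

/* Any access to user space; records it if no context is active yet. */
static void touch_userspace_model(struct exec_model *m)
{
    assert(m);
    if (m->context == NO_CONTEXT)
        m->early_touches++;
}

static void activate_context_model(struct exec_model *m)
{
    m->context = 0;     /* any valid context number will do in the model */
}

/* Stands in for flush_old_exec(), which is where activation happens. */
static void flush_old_exec_model(struct exec_model *m)
{
    activate_context_model(m);
}

/* Stands in for load_elf_binary() et al. */
static void load_binary_model(struct exec_model *m)
{
    flush_old_exec_model(m);
    touch_userspace_model(m);   /* safe: a context is active by now */
}

/* Stands in for the argument copying that uses get_user_pages(). */
static void copy_args_model(struct exec_model *m)
{
    touch_userspace_model(m);
}

/* Runs one exec() from a kernel thread; returns the early-touch count. */
static int exec_model_run(bool args_first)
{
    struct exec_model m = { NO_CONTEXT, 0 };

    if (!args_first)
        activate_context_model(&m);   /* the ordering David would prefer */
    copy_args_model(&m);
    load_binary_model(&m);
    return m.early_touches;
}
```

Under these assumptions, `exec_model_run(true)` (arguments copied first, as in the new fs/exec.c flow) reports one early touch, while `exec_model_run(false)` reports none.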
[PATCH] Re: [SPARC32] NULL pointer dereference
Hi David,

I have formulated a patch that prevents update_mmu_cache from doing
anything if there is no context available. This appears to have no
immediate, undesirable side effects.

This worked better than the alternative of setting up a context to work
with.

Can you foresee any issues in doing this?

If not, can you check+apply the attached (un-mangled) patch.

diff -ruNpd linux-2.6/arch/sparc/mm/sun4c.c linux-test/arch/sparc/mm/sun4c.c
--- linux-2.6/arch/sparc/mm/sun4c.c	2007-07-30 03:19:15.0 +0100
+++ linux-test/arch/sparc/mm/sun4c.c	2007-07-31 08:28:13.0 +0100
@@ -1999,6 +2029,9 @@ void sun4c_update_mmu_cache(struct vm_ar
 	unsigned long flags;
 	int pseg;
 
+	if (vma->vm_mm->context == NO_CONTEXT)
+		return;
+
 	local_irq_save(flags);
 	address &= PAGE_MASK;
 	if ((pseg = sun4c_get_segmap(address)) == invalid_segment) {

Regards
	Mark Fortescue.

From: Mark Fortescue <[EMAIL PROTECTED]>

This deals with a sun4c issue caused by commit
b6a2fea39318e43fee84fa7b0b90d68bed92d2ba: mm: variable length argument
support. The new way the code works means that sun4c_update_mmu_cache
gets called before a context has been selected, which results in
invalid operation of the underlying mm code. Simply ignoring update
requests when there is no valid context solves the problem.

Signed-off-by: Mark Fortescue <[EMAIL PROTECTED]>
---
This worked better than the alternative of setting up a context to work
with. I definitely need to spend some time writing up the sun4c MMU and
how Linux code uses it.

diff -ruNpd -x '.[a-z]*' linux-2.6/arch/sparc/mm/sun4c.c linux-test/arch/sparc/mm/sun4c.c
--- linux-2.6/arch/sparc/mm/sun4c.c	2007-07-30 03:19:15.0 +0100
+++ linux-test/arch/sparc/mm/sun4c.c	2007-07-31 08:28:13.0 +0100
@@ -1999,6 +2029,9 @@ void sun4c_update_mmu_cache(struct vm_ar
 	unsigned long flags;
 	int pseg;
 
+	if (vma->vm_mm->context == NO_CONTEXT)
+		return;
+
 	local_irq_save(flags);
 	address &= PAGE_MASK;
 	if ((pseg = sun4c_get_segmap(address)) == invalid_segment) {
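
The effect of the added guard can be sketched outside the kernel. The following is a simplified model, not the sun4c code: `struct mm_model`, `struct vma_model`, `update_mmu_cache_model()` and the `segmap_updates` counter are hypothetical names, and `NO_CONTEXT` is given an assumed value. It shows only the early return when no context has been selected.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_CONTEXT (-1)   /* assumed sentinel value for the model */

/* Hypothetical stand-ins for the kernel structures. */
struct mm_model {
    int context;          /* NO_CONTEXT until a context is activated */
};

struct vma_model {
    struct mm_model *vm_mm;
};

/* Counts how often the modelled segment map was touched. */
static int segmap_updates;

/*
 * Modelled update hook: returns true if the MMU state was updated,
 * false if the request was ignored because no context is selected.
 */
static bool update_mmu_cache_model(struct vma_model *vma, unsigned long address)
{
    assert(vma && vma->vm_mm);
    (void)address;        /* the model does not track individual addresses */

    if (vma->vm_mm->context == NO_CONTEXT)
        return false;     /* same early return as the patch */

    segmap_updates++;
    return true;
}
```

In this model a request made while `context` is `NO_CONTEXT` leaves `segmap_updates` unchanged, and the same request succeeds once a context number has been assigned.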
Re: [SPARC32] NULL pointer dereference
From: Mark Fortescue <[EMAIL PROTECTED]>
Date: Tue, 31 Jul 2007 06:35:29 +0100 (BST)

> The original code did a job lot of pte stuff in install_arg_page. The
> new code seems to replace this using get_user_pages but I have not
> worked out how get_user_pages gets to the point at which it
> allocates pte's, i.e. maps the stack memory it is about to put the
> arguments into.

get_user_pages() essentially walks through the requested user address
space, faults in pages if necessary, and returns references to those
pages.

The logic of get_user_pages() you need to be concerned about is this
inner loop:

	while (!(page = follow_page(vma, start, foll_flags))) {
		int ret;
		ret = handle_mm_fault(mm, vma, start,
				foll_flags & FOLL_WRITE);
		...
	}

handle_mm_fault() does all the dirty work of a page fault, and is how
we get to update_mmu_cache(), the sun4c implementation of which is
where you see the crash.
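
To make the call chain concrete, here is a small user-space model of the loop David describes. It is a sketch under stated assumptions: `struct as_model`, `MODEL_PAGES` and all `*_model` functions are hypothetical, pages are reduced to a presence flag, and the real `get_user_pages()` does considerably more (VMA lookup, error handling, reference counting).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGES 8     /* size of the modelled address space, in pages */

/* Hypothetical model of an address space. */
struct as_model {
    bool present[MODEL_PAGES];  /* true once a page is mapped */
    int faults;                 /* number of modelled page faults */
    int mmu_updates;            /* number of modelled update_mmu_cache() calls */
};

/* Stands in for follow_page(): is the page already mapped? */
static bool follow_page_model(const struct as_model *as, int idx)
{
    return as->present[idx];
}

/* Stands in for the architecture hook that crashed on sun4c. */
static void update_mmu_cache_model(struct as_model *as)
{
    as->mmu_updates++;
}

/* Stands in for handle_mm_fault(): map the page, then notify the MMU code. */
static void handle_mm_fault_model(struct as_model *as, int idx)
{
    as->present[idx] = true;
    as->faults++;
    update_mmu_cache_model(as);
}

/* Walks [start, start + nr), faulting pages in as needed; returns the count. */
static int get_user_pages_model(struct as_model *as, int start, int nr)
{
    int got = 0;

    assert(as && start >= 0);
    for (int idx = start; idx < start + nr && idx < MODEL_PAGES; idx++) {
        while (!follow_page_model(as, idx))
            handle_mm_fault_model(as, idx);
        got++;
    }
    return got;
}
```

In the model, every page that is not yet present costs exactly one fault and one MMU update, which is the path by which argument copying reaches the architecture hook.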
Re: [SPARC32] NULL pointer dereference
Hi David,

> One possible issue is sequencing, perhaps the stack argument copy is
> occurring before the new context is setup properly on sun4c.

I think it is something related to this but too much has changed for me
to work out what is going on. At present, I don't have a good enough
understanding of the virtual memory system and how it interacts with
the sun4c mmu.

The original code did a job lot of pte stuff in install_arg_page. The
new code seems to replace this using get_user_pages but I have not
worked out how get_user_pages gets to the point at which it allocates
pte's, i.e. maps the stack memory it is about to put the arguments into.

> Another issue might be the new flush_cache_page() call in this new
> code in fs/exec.c, there are now cases where flush_cache_page() will
> be called on kernel addresses, and sun4c's implementation might not
> like that at all.

I commented out the flush_cache_page call made in the new code. This
had no effect on the problem. Other tests have shown it is breaking
earlier than this.

I am going to try to narrow down exactly where the pointer gets messed
up as this should help.

Regards
	Mark Fortescue.
Re: [SPARC32] NULL pointer dereference
On 7/29/07, Mark Fortescue <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> Unfortunately Sparc32 sun4c low level memory management appears to be
> incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
> mm: variable length argument support.

I feel like I ought to help out with this since it's my change which
broke things, but I don't have access to a Sparc32 box. Does anyone
have a remotely rebootable machine I can use?

Ollie
Re: [SPARC32] NULL pointer dereference
Hi David,

Thanks for the comments.

On Sun, 29 Jul 2007, David Miller wrote:

> From: Mark Fortescue <[EMAIL PROTECTED]>
> Date: Mon, 30 Jul 2007 03:18:42 +0100 (BST)
>
>> Unfortunately Sparc32 sun4c low level memory management appears to be
>> incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
>> mm: variable length argument support.
>>
>> For some reason, this commit corrupts the memory used by the low level
>> context/pte handling ring buffers in arch/sparc/mm/sun4c (in
>> add_ring_ordered, head->next becomes set to a NULL pointer).
>>
>> I had a quick look at http://www.linux-mm.org to see if there were any
>> diagrams that show what is going on in the memory management systems, to
>> see if there was something that I could use to help me work out what is
>> going on, but I could not see any.
>
> One possible issue is sequencing, perhaps the stack argument copy is
> occurring before the new context is setup properly on sun4c.

I will see if I can generate some debug code to check out this
possibility.

> Another issue might be the new flush_cache_page() call in this new
> code in fs/exec.c, there are now cases where flush_cache_page() will
> be called on kernel addresses, and sun4c's implementation might not
> like that at all.

I backed the commit out of my latest git pull (approx. 2am this
morning) and I end up with a working kernel, so this confirms that it
is something specific to this patch.

I will try adding in a flush_cache_page() at an appropriate point on
the pre-commit version of the code to see if that makes a mess of
things.

Regards
	Mark Fortescue.
Re: [SPARC32] NULL pointer dereference
From: Mark Fortescue <[EMAIL PROTECTED]>
Date: Mon, 30 Jul 2007 03:18:42 +0100 (BST)

> Unfortunately Sparc32 sun4c low level memory management appears to be
> incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
> mm: variable length argument support.
>
> For some reason, this commit corrupts the memory used by the low level
> context/pte handling ring buffers in arch/sparc/mm/sun4c (in
> add_ring_ordered, head->next becomes set to a NULL pointer).
>
> I had a quick look at http://www.linux-mm.org to see if there were any
> diagrams that show what is going on in the memory management systems, to
> see if there was something that I could use to help me work out what is
> going on, but I could not see any.

One possible issue is sequencing, perhaps the stack argument copy is
occurring before the new context is setup properly on sun4c.

Another issue might be the new flush_cache_page() call in this new
code in fs/exec.c, there are now cases where flush_cache_page() will
be called on kernel addresses, and sun4c's implementation might not
like that at all.
[SPARC32] NULL pointer dereference
Hi All,

Unfortunately Sparc32 sun4c low level memory management appears to be
incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
mm: variable length argument support.

For some reason, this commit corrupts the memory used by the low level
context/pte handling ring buffers in arch/sparc/mm/sun4c (in
add_ring_ordered, head->next becomes set to a NULL pointer).

I had a quick look at http://www.linux-mm.org to see if there were any
diagrams that show what is going on in the memory management systems, to
see if there was something that I could use to help me work out what is
going on, but I could not see any.

Can anyone help?

Regards
	Mark Fortescue.