Re: [PATCH] Re: [SPARC32] NULL pointer derefference

2007-07-31 Thread David Miller
From: Mark Fortescue <[EMAIL PROTECTED]>
Date: Tue, 31 Jul 2007 08:55:20 +0100 (BST)

> I have formulated a patch that prevents the update_mmu_cache from doing 
> enything if there is no context available. This apears to have no 
> immediate, undesirable side effects.
> 
> This worked better than the alternative of setting up a context to work with.
> 
> Can you for see any issues in doing this?
> 
> If not, can you check+apply the attached (un-mangled) patch.

Thanks for tracking this down Mark.

The issue is that, when exec()'ing to userspace from a kernel thread,
we need activate_context() to be invoked before we try to touch
userspace at all.  This new argument handling is invoking
get_user_pages() before that happens.

activate_context() happens via flush_old_exec(), but that occurs via
load_elf_binary() et al. which is long after the argument fetching
code runs in fs/exec.c that is using get_user_pages().

(Mark, hint: activate_context() is defined to switch_mm() on
 sparc32, which is sun4c_switch_mm() which you thought was only
 invoked from context switches :-))

Touching userspace before activate_context() is questionable at best,
in my opinion.  But I can't come up with a good way to fix this right
now other than Mark's sparc patch, so I will apply it.

Thanks again Mark!

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Re: [SPARC32] NULL pointer derefference

2007-07-31 Thread Mark Fortescue

Hi David,

I have formulated a patch that prevents the update_mmu_cache from doing 
enything if there is no context available. This apears to have no 
immediate, undesirable side effects.


This worked better than the alternative of setting up a context to work with.

Can you for see any issues in doing this?

If not, can you check+apply the attached (un-mangled) patch.

diff -ruNpd linux-2.6/arch/sparc/mm/sun4c.c linux-test/arch/sparc/mm/sun4c.c
--- linux-2.6/arch/sparc/mm/sun4c.c 2007-07-30 03:19:15.0 +0100
+++ linux-test/arch/sparc/mm/sun4c.c2007-07-31 08:28:13.0 +0100
@@ -1999,6 +2029,9 @@ void sun4c_update_mmu_cache(struct vm_ar
unsigned long flags;
int pseg;

+   if (vma->vm_mm->context == NO_CONTEXT)
+   return;
+
local_irq_save(flags);
address &= PAGE_MASK;
if ((pseg = sun4c_get_segmap(address)) == invalid_segment) {

Regards
Mark Fortescue.From: Mark Fortescue <[EMAIL PROTECTED]>

This deals with a sun4c issue caused by commit 
b6a2fea39318e43fee84fa7b0b90d68bed92d2ba:
mm: variable length argument support.
The new way the code works means that sun4c_update_mmu_cache gets called before 
a context
has been selected, which results in invalid operation of the underling mm code.

Simply ignoring update requests when there is no valid context solves the 
problem.

Signed-off-by Mark Fortescue <[EMAIL PROTECTED]>
---
This worked better than the alternative of setting up a context to work with.
I definatly need to spend some time writting up the sun4c MMU and how Linux 
code uses it.
diff -ruNpd -x '.[a-z]*' linux-2.6/arch/sparc/mm/sun4c.c 
linux-test/arch/sparc/mm/sun4c.c
--- linux-2.6/arch/sparc/mm/sun4c.c 2007-07-30 03:19:15.0 +0100
+++ linux-test/arch/sparc/mm/sun4c.c2007-07-31 08:28:13.0 +0100
@@ -1999,6 +2029,9 @@ void sun4c_update_mmu_cache(struct vm_ar
unsigned long flags;
int pseg;
 
+   if (vma->vm_mm->context == NO_CONTEXT)
+   return;
+
local_irq_save(flags);
address &= PAGE_MASK;
if ((pseg = sun4c_get_segmap(address)) == invalid_segment) {


Re: [SPARC32] NULL pointer derefference

2007-07-31 Thread David Miller
From: Mark Fortescue <[EMAIL PROTECTED]>
Date: Tue, 31 Jul 2007 06:35:29 +0100 (BST)

> The original code did a job lot of pte stuf in install_arg_page. The
> new code seems to replace this using get_user_pages but I have not
> worked out how get_user_pages gets to the point at which it
> allocated pte's i.e.  maps the stack memory it is about to put the
> arguments into.

get_user_pages() essentially walks through the requested user address
space, faults in pages if necessary, and returns references to those
pages.

The logic of get_user_pages() you need to be concerned about is
this inner loop:

while (!(page = follow_page(vma, start, foll_flags))) {
int ret;
ret = handle_mm_fault(mm, vma, start,
foll_flags & FOLL_WRITE);
 ...
}

handle_mm_fault() does all the dirty work of a page fault, and
is how we get to update_mmu_cache(), the sun4c implementation of
which is where you see the crash.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-31 Thread David Miller
From: Mark Fortescue [EMAIL PROTECTED]
Date: Tue, 31 Jul 2007 06:35:29 +0100 (BST)

 The original code did a job lot of pte stuf in install_arg_page. The
 new code seems to replace this using get_user_pages but I have not
 worked out how get_user_pages gets to the point at which it
 allocated pte's i.e.  maps the stack memory it is about to put the
 arguments into.

get_user_pages() essentially walks through the requested user address
space, faults in pages if necessary, and returns references to those
pages.

The logic of get_user_pages() you need to be concerned about is
this inner loop:

while (!(page = follow_page(vma, start, foll_flags))) {
int ret;
ret = handle_mm_fault(mm, vma, start,
foll_flags  FOLL_WRITE);
 ...
}

handle_mm_fault() does all the dirty work of a page fault, and
is how we get to update_mmu_cache(), the sun4c implementation of
which is where you see the crash.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Re: [SPARC32] NULL pointer derefference

2007-07-31 Thread Mark Fortescue

Hi David,

I have formulated a patch that prevents the update_mmu_cache from doing 
enything if there is no context available. This apears to have no 
immediate, undesirable side effects.


This worked better than the alternative of setting up a context to work with.

Can you for see any issues in doing this?

If not, can you check+apply the attached (un-mangled) patch.

diff -ruNpd linux-2.6/arch/sparc/mm/sun4c.c linux-test/arch/sparc/mm/sun4c.c
--- linux-2.6/arch/sparc/mm/sun4c.c 2007-07-30 03:19:15.0 +0100
+++ linux-test/arch/sparc/mm/sun4c.c2007-07-31 08:28:13.0 +0100
@@ -1999,6 +2029,9 @@ void sun4c_update_mmu_cache(struct vm_ar
unsigned long flags;
int pseg;

+   if (vma-vm_mm-context == NO_CONTEXT)
+   return;
+
local_irq_save(flags);
address = PAGE_MASK;
if ((pseg = sun4c_get_segmap(address)) == invalid_segment) {

Regards
Mark Fortescue.From: Mark Fortescue [EMAIL PROTECTED]

This deals with a sun4c issue caused by commit 
b6a2fea39318e43fee84fa7b0b90d68bed92d2ba:
mm: variable length argument support.
The new way the code works means that sun4c_update_mmu_cache gets called before 
a context
has been selected, which results in invalid operation of the underling mm code.

Simply ignoring update requests when there is no valid context solves the 
problem.

Signed-off-by Mark Fortescue [EMAIL PROTECTED]
---
This worked better than the alternative of setting up a context to work with.
I definatly need to spend some time writting up the sun4c MMU and how Linux 
code uses it.
diff -ruNpd -x '.[a-z]*' linux-2.6/arch/sparc/mm/sun4c.c 
linux-test/arch/sparc/mm/sun4c.c
--- linux-2.6/arch/sparc/mm/sun4c.c 2007-07-30 03:19:15.0 +0100
+++ linux-test/arch/sparc/mm/sun4c.c2007-07-31 08:28:13.0 +0100
@@ -1999,6 +2029,9 @@ void sun4c_update_mmu_cache(struct vm_ar
unsigned long flags;
int pseg;
 
+   if (vma-vm_mm-context == NO_CONTEXT)
+   return;
+
local_irq_save(flags);
address = PAGE_MASK;
if ((pseg = sun4c_get_segmap(address)) == invalid_segment) {


Re: [PATCH] Re: [SPARC32] NULL pointer derefference

2007-07-31 Thread David Miller
From: Mark Fortescue [EMAIL PROTECTED]
Date: Tue, 31 Jul 2007 08:55:20 +0100 (BST)

 I have formulated a patch that prevents the update_mmu_cache from doing 
 enything if there is no context available. This apears to have no 
 immediate, undesirable side effects.
 
 This worked better than the alternative of setting up a context to work with.
 
 Can you for see any issues in doing this?
 
 If not, can you check+apply the attached (un-mangled) patch.

Thanks for tracking this down Mark.

The issue is that, when exec()'ing to userspace from a kernel thread,
we need activate_context() to be invoked before we try to touch
userspace at all.  This new argument handling is invoking
get_user_pages() before that happens.

activate_context() happens via flush_old_exec(), but that occurs via
load_elf_binary() et al. which is long after the argument fetching
code runs in fs/exec.c that is using get_user_pages().

(Mark, hint: activate_context() is defined to switch_mm() on
 sparc32, which is sun4c_switch_mm() which you thought was only
 invoked from context switches :-))

Touching userspace before activate_context() is questionable at best,
in my opinion.  But I can't come up with a good way to fix this right
now other than Mark's sparc patch, so I will apply it.

Thanks again Mark!

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-30 Thread Mark Fortescue

Hi David,



One possible issue is sequencing, perhaps the stack argument copy
is occuring before the new context is setup properly on sun4c.



I think it is somthing related to this but too much has changed for me to 
work out what is going on. At present, I don't have a good enough 
understanding of the virtual memory system and how it interracts with the 
sun4c mmu.


The original code did a job lot of pte stuf in install_arg_page. The new 
code seems to replace this using get_user_pages but I have not worked out 
how get_user_pages gets to the point at which it allocated pte's i.e. 
maps the stack memory it is about to put the arguments into.



Another issue might be the new flush_cache_page() call in this
new code in fs/exec.c, there are now cases where flush_cache_page()
will be called on kernel addresses, and sun4c's implementation might
not like that at all.


I commented out the flush_cache_page callmade in the new code. This had no 
effect on the problem. Other tests have shown it is breaking earlier than 
this.


I am going to try to narrow down exactly where the pointer gets messed up 
as this should help.


Regards
Mark Fortescue.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-30 Thread Ollie Wild
On 7/29/07, Mark Fortescue <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> Unfortunatly Sparc32 sun4c low level memory management apears to be
> incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
> mm: variable length argument support.

I feel like I ought to help out with this since it's my change which
broke things, but I don't have access to a Sparc32 box.  Does anyone
have a remotely rebootable machine I can use?

Ollie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-30 Thread Mark Fortescue

Hi David,

Thanks for the comments.

On Sun, 29 Jul 2007, David Miller wrote:


From: Mark Fortescue <[EMAIL PROTECTED]>
Date: Mon, 30 Jul 2007 03:18:42 +0100 (BST)


Unfortunatly Sparc32 sun4c low level memory management apears to be
incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
mm: variable length argument support.

For some reason, this commit corrupts the memory used by the low level
context/pte handling ring buffers in arch/sparc/mm/sun4c (in
add_ring_ordered, head->next becomes set to a NULL pointer).

I had a quick look at http://www.linux-mm.org to see if there were any
diagrams that show what is going on in the memory management systems, to
see if there was something that I could use to help me work out what is
going on, but I could not see any.


One possible issue is sequencing, perhaps the stack argument copy
is occuring before the new context is setup properly on sun4c.



I will see if I can generate some debug code to check out this posibility.


Another issue might be the new flush_cache_page() call in this
new code in fs/exec.c, there are now cases where flush_cache_page()
will be called on kernel addresses, and sun4c's implementation might
not like that at all.



I backed the commit out of my latest git pull (app 2am this morning) and I 
end up with a working kernel so this confirms that is is somthing 
specific to this patch.


I will try adding in a flush_cache_page() at an appropriate point on the 
pre-commit version of the code to see if that makes a mess of things.


Regards
Mark Fortescue.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-30 Thread Mark Fortescue

Hi David,

Thanks for the comments.

On Sun, 29 Jul 2007, David Miller wrote:


From: Mark Fortescue [EMAIL PROTECTED]
Date: Mon, 30 Jul 2007 03:18:42 +0100 (BST)


Unfortunatly Sparc32 sun4c low level memory management apears to be
incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
mm: variable length argument support.

For some reason, this commit corrupts the memory used by the low level
context/pte handling ring buffers in arch/sparc/mm/sun4c (in
add_ring_ordered, head-next becomes set to a NULL pointer).

I had a quick look at http://www.linux-mm.org to see if there were any
diagrams that show what is going on in the memory management systems, to
see if there was something that I could use to help me work out what is
going on, but I could not see any.


One possible issue is sequencing, perhaps the stack argument copy
is occuring before the new context is setup properly on sun4c.



I will see if I can generate some debug code to check out this posibility.


Another issue might be the new flush_cache_page() call in this
new code in fs/exec.c, there are now cases where flush_cache_page()
will be called on kernel addresses, and sun4c's implementation might
not like that at all.



I backed the commit out of my latest git pull (app 2am this morning) and I 
end up with a working kernel so this confirms that is is somthing 
specific to this patch.


I will try adding in a flush_cache_page() at an appropriate point on the 
pre-commit version of the code to see if that makes a mess of things.


Regards
Mark Fortescue.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-30 Thread Ollie Wild
On 7/29/07, Mark Fortescue [EMAIL PROTECTED] wrote:
 Hi All,

 Unfortunatly Sparc32 sun4c low level memory management apears to be
 incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
 mm: variable length argument support.

I feel like I ought to help out with this since it's my change which
broke things, but I don't have access to a Sparc32 box.  Does anyone
have a remotely rebootable machine I can use?

Ollie
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-30 Thread Mark Fortescue

Hi David,



One possible issue is sequencing, perhaps the stack argument copy
is occuring before the new context is setup properly on sun4c.



I think it is somthing related to this but too much has changed for me to 
work out what is going on. At present, I don't have a good enough 
understanding of the virtual memory system and how it interracts with the 
sun4c mmu.


The original code did a job lot of pte stuf in install_arg_page. The new 
code seems to replace this using get_user_pages but I have not worked out 
how get_user_pages gets to the point at which it allocated pte's i.e. 
maps the stack memory it is about to put the arguments into.



Another issue might be the new flush_cache_page() call in this
new code in fs/exec.c, there are now cases where flush_cache_page()
will be called on kernel addresses, and sun4c's implementation might
not like that at all.


I commented out the flush_cache_page callmade in the new code. This had no 
effect on the problem. Other tests have shown it is breaking earlier than 
this.


I am going to try to narrow down exactly where the pointer gets messed up 
as this should help.


Regards
Mark Fortescue.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-29 Thread David Miller
From: Mark Fortescue <[EMAIL PROTECTED]>
Date: Mon, 30 Jul 2007 03:18:42 +0100 (BST)

> Unfortunatly Sparc32 sun4c low level memory management apears to be 
> incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
> mm: variable length argument support.
> 
> For some reason, this commit corrupts the memory used by the low level 
> context/pte handling ring buffers in arch/sparc/mm/sun4c (in 
> add_ring_ordered, head->next becomes set to a NULL pointer).
> 
> I had a quick look at http://www.linux-mm.org to see if there were any 
> diagrams that show what is going on in the memory management systems, to 
> see if there was something that I could use to help me work out what is 
> going on, but I could not see any.

One possible issue is sequencing, perhaps the stack argument copy
is occuring before the new context is setup properly on sun4c.

Another issue might be the new flush_cache_page() call in this
new code in fs/exec.c, there are now cases where flush_cache_page()
will be called on kernel addresses, and sun4c's implementation might
not like that at all.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[SPARC32] NULL pointer derefference

2007-07-29 Thread Mark Fortescue

Hi All,

Unfortunatly Sparc32 sun4c low level memory management apears to be 
incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba

mm: variable length argument support.

For some reason, this commit corrupts the memory used by the low level 
context/pte handling ring buffers in arch/sparc/mm/sun4c (in 
add_ring_ordered, head->next becomes set to a NULL pointer).


I had a quick look at http://www.linux-mm.org to see if there were any 
diagrams that show what is going on in the memory management systems, to 
see if there was something that I could use to help me work out what is 
going on, but I could not see any.


Can any one help?

Regards
Mark Fortescue.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[SPARC32] NULL pointer derefference

2007-07-29 Thread Mark Fortescue

Hi All,

Unfortunatly Sparc32 sun4c low level memory management apears to be 
incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba

mm: variable length argument support.

For some reason, this commit corrupts the memory used by the low level 
context/pte handling ring buffers in arch/sparc/mm/sun4c (in 
add_ring_ordered, head-next becomes set to a NULL pointer).


I had a quick look at http://www.linux-mm.org to see if there were any 
diagrams that show what is going on in the memory management systems, to 
see if there was something that I could use to help me work out what is 
going on, but I could not see any.


Can any one help?

Regards
Mark Fortescue.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SPARC32] NULL pointer derefference

2007-07-29 Thread David Miller
From: Mark Fortescue [EMAIL PROTECTED]
Date: Mon, 30 Jul 2007 03:18:42 +0100 (BST)

 Unfortunatly Sparc32 sun4c low level memory management apears to be 
 incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
 mm: variable length argument support.
 
 For some reason, this commit corrupts the memory used by the low level 
 context/pte handling ring buffers in arch/sparc/mm/sun4c (in 
 add_ring_ordered, head-next becomes set to a NULL pointer).
 
 I had a quick look at http://www.linux-mm.org to see if there were any 
 diagrams that show what is going on in the memory management systems, to 
 see if there was something that I could use to help me work out what is 
 going on, but I could not see any.

One possible issue is sequencing, perhaps the stack argument copy
is occuring before the new context is setup properly on sun4c.

Another issue might be the new flush_cache_page() call in this
new code in fs/exec.c, there are now cases where flush_cache_page()
will be called on kernel addresses, and sun4c's implementation might
not like that at all.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/