Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-18 Thread Don Lewis
On 18 Dec, Mateusz Guzik wrote:
> On Fri, Dec 18, 2015 at 09:44:10AM -0800, Don Lewis wrote:
>> On 18 Dec, Mateusz Guzik wrote:
>> > On Thu, Dec 17, 2015 at 02:33:46PM -0800, Don Lewis wrote:
>> >> On 17 Dec, Mateusz Guzik wrote:
>> >> > On Thu, Dec 17, 2015 at 11:48:08AM -0800, Don Lewis wrote:
>> >> >> On 17 Dec, Konstantin Belousov wrote:
>> >> >> > On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote:
>> >> >> >> I used to have a patch the deferred linking the new process into
>> >> >> >> proctree/allproc until it was fully formed.  The motivation was to 
>> >> >> >> get
>> >> >> >> rid of all of the PRS_NEW stuff scattered around the source.
>> >> >> >> Unfortunately the patch bit-rotted and I'm pretty sure that I lost 
>> >> >> >> it.
>> >> >> > 
>> >> >> > I had similar tought for a second as one of the possibilities to fix 
>> >> >> > the
>> >> >> > issue, but rejected it outright due to the way the pid allocator 
>> >> >> > works.
>> >> >> > The loop which faulted is the allocator, it depends on the new pid 
>> >> >> > being
>> >> >> > linked early to detect the duplicated alloc.
>> >> >> > 
>> >> >> > What you wrote could be done, but this restructuring requires the 
>> >> >> > separate
>> >> >> > pid allocator, and probably it must repeat all quirks and subtle 
>> >> >> > behaviour
>> >> >> > of the current algorithm.  But I do not object, PRS_NEW is a trouble
>> >> >> > on its own.
>> >> >> 
>> >> >> I don't think it requires any changes to the allocater.  It should only
>> >> >> be necessary to delay the call to fork_findpid() until we are ready to
>> >> >> link the new proc into allproc.  Basically, drop the locks at the
>> >> >> beginning of do_fork(), then grab them again somewhere near the end
>> >> >> (probably where we are currently mark the process as PRS_NORMAL) and
>> >> >> move the call to fork_findpid(), the p2->p_pid assignment, and the list
>> >> >> manipulation code to a location after that.
>> >> >> 
>> >> >> It's probably not quite that simple though ...
>> >> > 
>> >> > That would mean you would need to be able to deconstruct the process
>> >> > because you cannot guarantee there are any pids left, which may or may
>> >> > not be easily doable.
>> >> 
>> >> It doesn't look like we handle that properly in the current code.  I
>> >> think fork_findpid() will loop forever.  It shouldn't be possible if
>> >> maxproc < pid_max / 3, or maybe pid_max / 2.  It might be a good idea to
>> >> enforce this.
>> >> 
>> > 
>> > Not sure I follow, can you rephrase/elaborate?
>> 
>> The first time through, fork_findpid() will start it's search with
>> trypid=lastpid+1 and searches upwards from there.  If it reaches PID_MAX
>> (I think that should be pid_max) without finding a free pid, it does a
>> goto retry, which resets trypid back to the beginning and restarts the
>> search.  IF there are no free pids, then trypid will goto retry and
>> repeat this same loop forever.
>> 
>> There is no error return from fork_findpid() to indicate that the fork
>> should fail if there are no free pids.
>> 
> 
> Oh, you mean nprocs check used prior to calling do_fork is insufficient.

Unlikely, but possible.  Each process could theoretically consume three
pid values, if its pid, process group id, and session id are all
different.  If you crank down pid_max, you could run out of free pid
values before you hit the total process limit.

> I can't test it right now, indeed sees like a real problem. Bitmap
> switch fixes this automatically without further hackery.

The existing implementation could be fixed by detected the second retry
and returning an error, but that would require that fork_findpid() have
a way of returning and error and the addition of error handling code in
the caller.

The bitmap implementation would probably need the same sort of retry
loop if it wanted to match the current sequential pid allocation
behavior and it would need the same sort of fix.

Unfortunately this doesn't help if we call fork_findpid() late as I was
suggesting.  Instead, I think we should maintain a free pid count and
use it in the same fashion as the nproc test.
 
>> >> > The current method is going to bite us performance-wise anyway and an
>> >> > allocater which does not require a walk over the tree is necessary in
>> >> > the long run. Seems like a bitmap (or a bunch of bitmaps) is the way to
>> >> > go here.
>> >> 
>> >> I think that separate bitmaps for process, process group, and session
>> >> ids would be needed.  It would waste some space, but it's probably more
>> >> efficent to use a byte array and store all the bits for the pid
>> >> together.
>> >> 
>> > 
>> > Well I had such separate bitmaps in mind with addition of a combined
>> > "the id is in use bitmap". This would make the common case of finding a
>> > new pid reasonably fast. Access to all bitmaps would be protected with
>> > proctree lock, which matches current locking scheme anyway.
>> 
>> That's also a possibility.  Maintaining the bitmaps woul

Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-18 Thread Mateusz Guzik
On Fri, Dec 18, 2015 at 09:44:10AM -0800, Don Lewis wrote:
> On 18 Dec, Mateusz Guzik wrote:
> > On Thu, Dec 17, 2015 at 02:33:46PM -0800, Don Lewis wrote:
> >> On 17 Dec, Mateusz Guzik wrote:
> >> > On Thu, Dec 17, 2015 at 11:48:08AM -0800, Don Lewis wrote:
> >> >> On 17 Dec, Konstantin Belousov wrote:
> >> >> > On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote:
> >> >> >> I used to have a patch the deferred linking the new process into
> >> >> >> proctree/allproc until it was fully formed.  The motivation was to 
> >> >> >> get
> >> >> >> rid of all of the PRS_NEW stuff scattered around the source.
> >> >> >> Unfortunately the patch bit-rotted and I'm pretty sure that I lost 
> >> >> >> it.
> >> >> > 
> >> >> > I had similar tought for a second as one of the possibilities to fix 
> >> >> > the
> >> >> > issue, but rejected it outright due to the way the pid allocator 
> >> >> > works.
> >> >> > The loop which faulted is the allocator, it depends on the new pid 
> >> >> > being
> >> >> > linked early to detect the duplicated alloc.
> >> >> > 
> >> >> > What you wrote could be done, but this restructuring requires the 
> >> >> > separate
> >> >> > pid allocator, and probably it must repeat all quirks and subtle 
> >> >> > behaviour
> >> >> > of the current algorithm.  But I do not object, PRS_NEW is a trouble
> >> >> > on its own.
> >> >> 
> >> >> I don't think it requires any changes to the allocater.  It should only
> >> >> be necessary to delay the call to fork_findpid() until we are ready to
> >> >> link the new proc into allproc.  Basically, drop the locks at the
> >> >> beginning of do_fork(), then grab them again somewhere near the end
> >> >> (probably where we are currently mark the process as PRS_NORMAL) and
> >> >> move the call to fork_findpid(), the p2->p_pid assignment, and the list
> >> >> manipulation code to a location after that.
> >> >> 
> >> >> It's probably not quite that simple though ...
> >> > 
> >> > That would mean you would need to be able to deconstruct the process
> >> > because you cannot guarantee there are any pids left, which may or may
> >> > not be easily doable.
> >> 
> >> It doesn't look like we handle that properly in the current code.  I
> >> think fork_findpid() will loop forever.  It shouldn't be possible if
> >> maxproc < pid_max / 3, or maybe pid_max / 2.  It might be a good idea to
> >> enforce this.
> >> 
> > 
> > Not sure I follow, can you rephrase/elaborate?
> 
> The first time through, fork_findpid() will start it's search with
> trypid=lastpid+1 and searches upwards from there.  If it reaches PID_MAX
> (I think that should be pid_max) without finding a free pid, it does a
> goto retry, which resets trypid back to the beginning and restarts the
> search.  IF there are no free pids, then trypid will goto retry and
> repeat this same loop forever.
> 
> There is no error return from fork_findpid() to indicate that the fork
> should fail if there are no free pids.
> 

Oh, you mean nprocs check used prior to calling do_fork is insufficient.

I can't test it right now, indeed sees like a real problem. Bitmap
switch fixes this automatically without further hackery.

> >> > The current method is going to bite us performance-wise anyway and an
> >> > allocater which does not require a walk over the tree is necessary in
> >> > the long run. Seems like a bitmap (or a bunch of bitmaps) is the way to
> >> > go here.
> >> 
> >> I think that separate bitmaps for process, process group, and session
> >> ids would be needed.  It would waste some space, but it's probably more
> >> efficent to use a byte array and store all the bits for the pid
> >> together.
> >> 
> > 
> > Well I had such separate bitmaps in mind with addition of a combined
> > "the id is in use bitmap". This would make the common case of finding a
> > new pid reasonably fast. Access to all bitmaps would be protected with
> > proctree lock, which matches current locking scheme anyway.
> 
> That's also a possibility.  Maintaining the bitmaps would be more
> complicated because any time one of the individual bitmaps is updated,
> the combined bitmap would also have to be recalculated.  It would be
> possible to use bit_ffc() to find the first free pid, but that would
> always find the lowest available free pid and would not emulate the
> current default behaviour of allocating pids mostly sequentually.
> 

What do you mean recalculated? Maybe I got this wrong, but it seems
totally trivial.

When allocating either a new process group or a new session, set the bit
in the combined map.

When allocating a new pid, check the combined map. Free entry implies an
unused pid, so use that. Set the bit in the combined map and pid map.

Finally, when freeing either of these identifiers, unset the bit in the
combined map only if no other map has the bit set.

-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-c

Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-18 Thread Don Lewis
On 18 Dec, Mateusz Guzik wrote:
> On Thu, Dec 17, 2015 at 02:33:46PM -0800, Don Lewis wrote:
>> On 17 Dec, Mateusz Guzik wrote:
>> > On Thu, Dec 17, 2015 at 11:48:08AM -0800, Don Lewis wrote:
>> >> On 17 Dec, Konstantin Belousov wrote:
>> >> > On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote:
>> >> >> I used to have a patch the deferred linking the new process into
>> >> >> proctree/allproc until it was fully formed.  The motivation was to get
>> >> >> rid of all of the PRS_NEW stuff scattered around the source.
>> >> >> Unfortunately the patch bit-rotted and I'm pretty sure that I lost it.
>> >> > 
>> >> > I had similar tought for a second as one of the possibilities to fix the
>> >> > issue, but rejected it outright due to the way the pid allocator works.
>> >> > The loop which faulted is the allocator, it depends on the new pid being
>> >> > linked early to detect the duplicated alloc.
>> >> > 
>> >> > What you wrote could be done, but this restructuring requires the 
>> >> > separate
>> >> > pid allocator, and probably it must repeat all quirks and subtle 
>> >> > behaviour
>> >> > of the current algorithm.  But I do not object, PRS_NEW is a trouble
>> >> > on its own.
>> >> 
>> >> I don't think it requires any changes to the allocater.  It should only
>> >> be necessary to delay the call to fork_findpid() until we are ready to
>> >> link the new proc into allproc.  Basically, drop the locks at the
>> >> beginning of do_fork(), then grab them again somewhere near the end
>> >> (probably where we are currently mark the process as PRS_NORMAL) and
>> >> move the call to fork_findpid(), the p2->p_pid assignment, and the list
>> >> manipulation code to a location after that.
>> >> 
>> >> It's probably not quite that simple though ...
>> > 
>> > That would mean you would need to be able to deconstruct the process
>> > because you cannot guarantee there are any pids left, which may or may
>> > not be easily doable.
>> 
>> It doesn't look like we handle that properly in the current code.  I
>> think fork_findpid() will loop forever.  It shouldn't be possible if
>> maxproc < pid_max / 3, or maybe pid_max / 2.  It might be a good idea to
>> enforce this.
>> 
> 
> Not sure I follow, can you rephrase/elaborate?

The first time through, fork_findpid() will start it's search with
trypid=lastpid+1 and searches upwards from there.  If it reaches PID_MAX
(I think that should be pid_max) without finding a free pid, it does a
goto retry, which resets trypid back to the beginning and restarts the
search.  IF there are no free pids, then trypid will goto retry and
repeat this same loop forever.

There is no error return from fork_findpid() to indicate that the fork
should fail if there are no free pids.

>> > The current method is going to bite us performance-wise anyway and an
>> > allocater which does not require a walk over the tree is necessary in
>> > the long run. Seems like a bitmap (or a bunch of bitmaps) is the way to
>> > go here.
>> 
>> I think that separate bitmaps for process, process group, and session
>> ids would be needed.  It would waste some space, but it's probably more
>> efficent to use a byte array and store all the bits for the pid
>> together.
>> 
> 
> Well I had such separate bitmaps in mind with addition of a combined
> "the id is in use bitmap". This would make the common case of finding a
> new pid reasonably fast. Access to all bitmaps would be protected with
> proctree lock, which matches current locking scheme anyway.

That's also a possibility.  Maintaining the bitmaps would be more
complicated because any time one of the individual bitmaps is updated,
the combined bitmap would also have to be recalculated.  It would be
possible to use bit_ffc() to find the first free pid, but that would
always find the lowest available free pid and would not emulate the
current default behaviour of allocating pids mostly sequentually.

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-18 Thread Mateusz Guzik
On Thu, Dec 17, 2015 at 02:33:46PM -0800, Don Lewis wrote:
> On 17 Dec, Mateusz Guzik wrote:
> > On Thu, Dec 17, 2015 at 11:48:08AM -0800, Don Lewis wrote:
> >> On 17 Dec, Konstantin Belousov wrote:
> >> > On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote:
> >> >> I used to have a patch the deferred linking the new process into
> >> >> proctree/allproc until it was fully formed.  The motivation was to get
> >> >> rid of all of the PRS_NEW stuff scattered around the source.
> >> >> Unfortunately the patch bit-rotted and I'm pretty sure that I lost it.
> >> > 
> >> > I had similar tought for a second as one of the possibilities to fix the
> >> > issue, but rejected it outright due to the way the pid allocator works.
> >> > The loop which faulted is the allocator, it depends on the new pid being
> >> > linked early to detect the duplicated alloc.
> >> > 
> >> > What you wrote could be done, but this restructuring requires the 
> >> > separate
> >> > pid allocator, and probably it must repeat all quirks and subtle 
> >> > behaviour
> >> > of the current algorithm.  But I do not object, PRS_NEW is a trouble
> >> > on its own.
> >> 
> >> I don't think it requires any changes to the allocater.  It should only
> >> be necessary to delay the call to fork_findpid() until we are ready to
> >> link the new proc into allproc.  Basically, drop the locks at the
> >> beginning of do_fork(), then grab them again somewhere near the end
> >> (probably where we are currently mark the process as PRS_NORMAL) and
> >> move the call to fork_findpid(), the p2->p_pid assignment, and the list
> >> manipulation code to a location after that.
> >> 
> >> It's probably not quite that simple though ...
> > 
> > That would mean you would need to be able to deconstruct the process
> > because you cannot guarantee there are any pids left, which may or may
> > not be easily doable.
> 
> It doesn't look like we handle that properly in the current code.  I
> think fork_findpid() will loop forever.  It shouldn't be possible if
> maxproc < pid_max / 3, or maybe pid_max / 2.  It might be a good idea to
> enforce this.
> 

Not sure I follow, can you rephrase/elaborate?

> > The current method is going to bite us performance-wise anyway and an
> > allocater which does not require a walk over the tree is necessary in
> > the long run. Seems like a bitmap (or a bunch of bitmaps) is the way to
> > go here.
> 
> I think that separate bitmaps for process, process group, and session
> ids would be needed.  It would waste some space, but it's probably more
> efficent to use a byte array and store all the bits for the pid
> together.
> 

Well I had such separate bitmaps in mind with addition of a combined
"the id is in use bitmap". This would make the common case of finding a
new pid reasonably fast. Access to all bitmaps would be protected with
proctree lock, which matches current locking scheme anyway.

-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-18 Thread Mateusz Guzik
On Fri, Dec 18, 2015 at 11:36:44AM +0100, Fabian Keil wrote:
> Fabian Keil  wrote:
> 
> > Konstantin Belousov  wrote:
> > 
> > > On Wed, Dec 16, 2015 at 04:09:17PM +0100, Fabian Keil wrote:  
> > > > Thanks. I'm currently testing the patch on three systems but it may 
> > > > take a while ...
> > > > 
> > > 
> > > Better use mjg' patch with the small adjustment.  I put it below.  
> > 
> > Will do.
> 
> Several million forks later the systems remain stable.
> 

Thanks, fix committed in
https://svnweb.freebsd.org/base?view=revision&revision=292440


-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-18 Thread Fabian Keil
Fabian Keil  wrote:

> Konstantin Belousov  wrote:
> 
> > On Wed, Dec 16, 2015 at 04:09:17PM +0100, Fabian Keil wrote:  
> > > Thanks. I'm currently testing the patch on three systems but it may take 
> > > a while ...
> > > 
> > 
> > Better use mjg' patch with the small adjustment.  I put it below.  
> 
> Will do.

Several million forks later the systems remain stable.

Fabian


pgpw0NeXyMOPB.pgp
Description: OpenPGP digital signature


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-17 Thread Don Lewis
On 17 Dec, Mateusz Guzik wrote:
> On Thu, Dec 17, 2015 at 11:48:08AM -0800, Don Lewis wrote:
>> On 17 Dec, Konstantin Belousov wrote:
>> > On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote:
>> >> I used to have a patch the deferred linking the new process into
>> >> proctree/allproc until it was fully formed.  The motivation was to get
>> >> rid of all of the PRS_NEW stuff scattered around the source.
>> >> Unfortunately the patch bit-rotted and I'm pretty sure that I lost it.
>> > 
>> > I had similar tought for a second as one of the possibilities to fix the
>> > issue, but rejected it outright due to the way the pid allocator works.
>> > The loop which faulted is the allocator, it depends on the new pid being
>> > linked early to detect the duplicated alloc.
>> > 
>> > What you wrote could be done, but this restructuring requires the separate
>> > pid allocator, and probably it must repeat all quirks and subtle behaviour
>> > of the current algorithm.  But I do not object, PRS_NEW is a trouble
>> > on its own.
>> 
>> I don't think it requires any changes to the allocater.  It should only
>> be necessary to delay the call to fork_findpid() until we are ready to
>> link the new proc into allproc.  Basically, drop the locks at the
>> beginning of do_fork(), then grab them again somewhere near the end
>> (probably where we are currently mark the process as PRS_NORMAL) and
>> move the call to fork_findpid(), the p2->p_pid assignment, and the list
>> manipulation code to a location after that.
>> 
>> It's probably not quite that simple though ...
> 
> That would mean you would need to be able to deconstruct the process
> because you cannot guarantee there are any pids left, which may or may
> not be easily doable.

It doesn't look like we handle that properly in the current code.  I
think fork_findpid() will loop forever.  It shouldn't be possible if
maxproc < pid_max / 3, or maybe pid_max / 2.  It might be a good idea to
enforce this.

> The current method is going to bite us performance-wise anyway and an
> allocater which does not require a walk over the tree is necessary in
> the long run. Seems like a bitmap (or a bunch of bitmaps) is the way to
> go here.

I think that separate bitmaps for process, process group, and session
ids would be needed.  It would waste some space, but it's probably more
efficent to use a byte array and store all the bits for the pid
together.

> Meanwhile one can add a special process permanently in PRS_NEW state and
> poisoned pointers in debug kernels to help ensuring that all loops
> handle the case.
> 
> Not signing up for any of this work though.

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-17 Thread Mateusz Guzik
On Thu, Dec 17, 2015 at 11:48:08AM -0800, Don Lewis wrote:
> On 17 Dec, Konstantin Belousov wrote:
> > On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote:
> >> I used to have a patch the deferred linking the new process into
> >> proctree/allproc until it was fully formed.  The motivation was to get
> >> rid of all of the PRS_NEW stuff scattered around the source.
> >> Unfortunately the patch bit-rotted and I'm pretty sure that I lost it.
> > 
> > I had similar tought for a second as one of the possibilities to fix the
> > issue, but rejected it outright due to the way the pid allocator works.
> > The loop which faulted is the allocator, it depends on the new pid being
> > linked early to detect the duplicated alloc.
> > 
> > What you wrote could be done, but this restructuring requires the separate
> > pid allocator, and probably it must repeat all quirks and subtle behaviour
> > of the current algorithm.  But I do not object, PRS_NEW is a trouble
> > on its own.
> 
> I don't think it requires any changes to the allocater.  It should only
> be necessary to delay the call to fork_findpid() until we are ready to
> link the new proc into allproc.  Basically, drop the locks at the
> beginning of do_fork(), then grab them again somewhere near the end
> (probably where we are currently mark the process as PRS_NORMAL) and
> move the call to fork_findpid(), the p2->p_pid assignment, and the list
> manipulation code to a location after that.
> 
> It's probably not quite that simple though ...

That would mean you would need to be able to deconstruct the process
because you cannot guarantee there are any pids left, which may or may
not be easily doable.

The current method is going to bite us performance-wise anyway and an
allocater which does not require a walk over the tree is necessary in
the long run. Seems like a bitmap (or a bunch of bitmaps) is the way to
go here.

Meanwhile one can add a special process permanently in PRS_NEW state and
poisoned pointers in debug kernels to help ensuring that all loops
handle the case.

Not signing up for any of this work though.
-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-17 Thread Don Lewis
On 17 Dec, Konstantin Belousov wrote:
> On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote:
>> I used to have a patch the deferred linking the new process into
>> proctree/allproc until it was fully formed.  The motivation was to get
>> rid of all of the PRS_NEW stuff scattered around the source.
>> Unfortunately the patch bit-rotted and I'm pretty sure that I lost it.
> 
> I had similar tought for a second as one of the possibilities to fix the
> issue, but rejected it outright due to the way the pid allocator works.
> The loop which faulted is the allocator, it depends on the new pid being
> linked early to detect the duplicated alloc.
> 
> What you wrote could be done, but this restructuring requires the separate
> pid allocator, and probably it must repeat all quirks and subtle behaviour
> of the current algorithm.  But I do not object, PRS_NEW is a trouble
> on its own.

I don't think it requires any changes to the allocater.  It should only
be necessary to delay the call to fork_findpid() until we are ready to
link the new proc into allproc.  Basically, drop the locks at the
beginning of do_fork(), then grab them again somewhere near the end
(probably where we are currently mark the process as PRS_NORMAL) and
move the call to fork_findpid(), the p2->p_pid assignment, and the list
manipulation code to a location after that.

It's probably not quite that simple though ...


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-17 Thread Konstantin Belousov
On Wed, Dec 16, 2015 at 11:08:02AM -0800, Don Lewis wrote:
> I used to have a patch the deferred linking the new process into
> proctree/allproc until it was fully formed.  The motivation was to get
> rid of all of the PRS_NEW stuff scattered around the source.
> Unfortunately the patch bit-rotted and I'm pretty sure that I lost it.

I had similar tought for a second as one of the possibilities to fix the
issue, but rejected it outright due to the way the pid allocator works.
The loop which faulted is the allocator, it depends on the new pid being
linked early to detect the duplicated alloc.

What you wrote could be done, but this restructuring requires the separate
pid allocator, and probably it must repeat all quirks and subtle behaviour
of the current algorithm.  But I do not object, PRS_NEW is a trouble
on its own.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Fabian Keil
Konstantin Belousov  wrote:

> On Wed, Dec 16, 2015 at 04:09:17PM +0100, Fabian Keil wrote:
> > Thanks. I'm currently testing the patch on three systems but it may take a 
> > while ...
> >   
> 
> Better use mjg' patch with the small adjustment.  I put it below.

Will do.

Fabian


pgpARhuAh6qDb.pgp
Description: OpenPGP digital signature


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Don Lewis
On 16 Dec, Konstantin Belousov wrote:
> On Wed, Dec 16, 2015 at 12:21:16PM +0100, Fabian Keil wrote:
>> Konstantin Belousov  wrote:
>> > It is the values of *p and *(p->p_pgrp) that are needed, from the frame 8.
>> 
>> Unfortunately it's not available and apparently I removed the attempts
>> to get it from the previous output.
> 
>> allproc is available and the first one matches lastpid and has an invalid
>> p_pgrp, but due to trypid being optimized out as well, it's not obvious
>> (to me) that it's the right process.
> 
> p_suspcount = 0, p_xthread = 0xf801162819a0, p_boundary_count = 0, 
> p_pendingcnt = 0, p_itimers = 0x0, p_procdesc = 0x0, p_treeflag = 0, p_magic 
> = 3203398350, p_osrel = 1100090, 
>>   p_comm = 0xf800304df3c4 "privoxy",
> p_pgrp = 0x618b0080,
> 
>> I've changed p's declaration to static so hopefully its value will
>> be available the next time the panic occurs, but it may take a while
>> until that happens.
> 
> From the state of the process you provided, it is a new (zigote) of the
> forking process, which was already linked into allproc list.  Also,
> it seems that bzero part of the forking procedure was finished, but bcopy
> was not yet.  The p_pgrp cannot be a pointer, it is not yet initialized.
> 
> There, we have at least one issue, since zigote is linked before the
> p_pgrp is initialized, and the proctree/allproc locks are dropped.
> As result, fork_findpid() accesses memory with undefined content.
> 
> It seems that the least morbid solution is to slightly extend the scope
> of the allproc lock in do_fork(), to prevent fork_findpid() from working
> while we did not finished copying data from old to new process.

I used to have a patch the deferred linking the new process into
proctree/allproc until it was fully formed.  The motivation was to get
rid of all of the PRS_NEW stuff scattered around the source.
Unfortunately the patch bit-rotted and I'm pretty sure that I lost it.

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Konstantin Belousov
On Wed, Dec 16, 2015 at 04:09:17PM +0100, Fabian Keil wrote:
> Thanks. I'm currently testing the patch on three systems but it may take a 
> while ...
> 

Better use mjg' patch with the small adjustment.  I put it below.

diff --git a/sys/kern/kern_proc.c b/sys/kern/kern_proc.c
index 435a07b..fc39217 100644
--- a/sys/kern/kern_proc.c
+++ b/sys/kern/kern_proc.c
@@ -251,6 +251,7 @@ proc_init(void *mem, int size, int flags)
TAILQ_INIT(&p->p_threads);   /* all threads in proc */
EVENTHANDLER_INVOKE(process_init, p);
p->p_stats = pstats_alloc();
+   p->p_pgrp = NULL;
SDT_PROBE3(proc, kernel, init, return, p, size, flags);
return (0);
 }
diff --git a/sys/sys/proc.h b/sys/sys/proc.h
index 90effa6..cb94318 100644
--- a/sys/sys/proc.h
+++ b/sys/sys/proc.h
@@ -586,7 +586,6 @@ struct proc {
int p_osrel;/* (x) osreldate for the
   binary (from ELF note, if any) */
charp_comm[MAXCOMLEN + 1];  /* (b) Process name. */
-   struct pgrp *p_pgrp;/* (c + e) Pointer to process group. */
struct sysentvec *p_sysent; /* (b) Syscall dispatch info. */
struct pargs*p_args;/* (c) Process arguments. */
rlim_t  p_cpulimit; /* (c) Current CPU limit in seconds. */
@@ -599,6 +598,7 @@ struct proc {
u_int   p_xsig; /* (c) Stop/kill sig. */
 /* End area that is copied on creation. */
 #definep_endcopy   p_xsig
+   struct pgrp *p_pgrp;/* (c + e) Pointer to process group. */
struct knlist   p_klist;/* (c) Knotes attached to this proc. */
int p_numthreads;   /* (c) Number of threads. */
struct mdproc   p_md;   /* Any machine-dependent fields. */
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Fabian Keil
Konstantin Belousov  wrote:

> On Wed, Dec 16, 2015 at 12:21:16PM +0100, Fabian Keil wrote:
> > Konstantin Belousov  wrote:  
> > > It is the values of *p and *(p->p_pgrp) that are needed, from the frame 
> > > 8.  
> > 
> > Unfortunately it's not available and apparently I removed the attempts
> > to get it from the previous output.  
> 
> > allproc is available and the first one matches lastpid and has an invalid
> > p_pgrp, but due to trypid being optimized out as well, it's not obvious
> > (to me) that it's the right process.  
> 
> p_suspcount = 0, p_xthread = 0xf801162819a0, p_boundary_count = 0, 
> p_pendingcnt = 0, p_itimers = 0x0, p_procdesc = 0x0, p_treeflag = 0, p_magic 
> = 3203398350, p_osrel = 1100090, 
> >   p_comm = 0xf800304df3c4 "privoxy",  
> p_pgrp = 0x618b0080,
> 
> > I've changed p's declaration to static so hopefully its value will
> > be available the next time the panic occurs, but it may take a while
> > until that happens.  
> 
> From the state of the process you provided, it is a new (zigote) of the
> forking process, which was already linked into allproc list.  Also,
> it seems that bzero part of the forking procedure was finished, but bcopy
> was not yet.  The p_pgrp cannot be a pointer, it is not yet initialized.
> 
> There, we have at least one issue, since zigote is linked before the
> p_pgrp is initialized, and the proctree/allproc locks are dropped.
> As result, fork_findpid() accesses memory with undefined content.
> 
> It seems that the least morbid solution is to slightly extend the scope
> of the allproc lock in do_fork(), to prevent fork_findpid() from working
> while we did not finished copying data from old to new process.

Thanks. I'm currently testing the patch on three systems but it may take a 
while ...

Fabian


pgpvGkzQ7IxHR.pgp
Description: OpenPGP digital signature


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Fabian Keil
Oliver Pinter  wrote:

> Yes, it's a HardenedBSD commit. Currently only a workaround, because I have
> now lesser time for the real fix in this month.
> 
> Are you running on ZFS?

Yes.

Fabian


pgpuOBy_BlV8u.pgp
Description: OpenPGP digital signature


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Konstantin Belousov
On Wed, Dec 16, 2015 at 02:54:27PM +0100, Mateusz Guzik wrote:
> While I agree with analysis the patch does not look right. Since the
> struct is only assigned and all locks get dropped, there is nothing
> preventing another thread from the forking process to change the process
> group.
> 
> Interestngly very same function assigns the pointer explicitely later:
> p2->p_pgrp = p1->p_pgrp;
> 
> As such, I would argue the right solution is to move p_pgrp out of the
> copied area. Exit path clears the pointer, so it should be fine to just
> do that.
For reused struct proc it would be enough, but not for the new allocation.
Neither init nor ctr for the proc zone do not initialize p_pgrp, so
you would end up with the garbage in the pointer.

I think that your patch should add explcit zeroing of the member into
proc_init().

> 
> That is (untested):
> 
> diff --git a/sys/sys/proc.h b/sys/sys/proc.h
> index 90effa6..cb94318 100644
> --- a/sys/sys/proc.h
> +++ b/sys/sys/proc.h
> @@ -586,7 +586,6 @@ struct proc {
>   int p_osrel;/* (x) osreldate for the
>  binary (from ELF note, if any) */
>   charp_comm[MAXCOMLEN + 1];  /* (b) Process name. */
> - struct pgrp *p_pgrp;/* (c + e) Pointer to process group. */
>   struct sysentvec *p_sysent; /* (b) Syscall dispatch info. */
>   struct pargs*p_args;/* (c) Process arguments. */
>   rlim_t  p_cpulimit; /* (c) Current CPU limit in seconds. */
> @@ -599,6 +598,7 @@ struct proc {
>   u_int   p_xsig; /* (c) Stop/kill sig. */
>  /* End area that is copied on creation. */
>  #define  p_endcopy   p_xsig
> + struct pgrp *p_pgrp;/* (c + e) Pointer to process group. */
>   struct knlist   p_klist;/* (c) Knotes attached to this proc. */
>   int p_numthreads;   /* (c) Number of threads. */
>   struct mdproc   p_md;   /* Any machine-dependent fields. */
> 
> -- 
> Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Oliver Pinter
Hi!

Is this with latest 11-CURRENT or 10-STABLE?

Or contains the ad578c311ef commit?

On Tuesday, December 15, 2015, Shawn Webb 
wrote:

> On Tue, Dec 15, 2015 at 05:42:38PM +0100, Fabian Keil wrote:
> > I've seen the following panic a couple of times in the last three
> > months, usually while poudriere was running and with sh being the
> > current process.
> >
> > This one is from a system based on r290926 running with
> > kern.randompid=9001 and forking frequently (>1000 forks/second)
> > due to poudriere and afl-fuzz:
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 04
> > fault virtual address   = 0x618b00a8
> > fault code  = supervisor read data, page not present
> > instruction pointer = 0x20:0x80909158
> > stack pointer   = 0x28:0xfe011e03b940
> > frame pointer   = 0x28:0xfe011e03b960
> > code segment= base 0x0, limit 0xf, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags= interrupt enabled, resume, IOPL = 0
> > current process = 71325 (sh)
> > trap number = 12
> > panic: page fault
> > cpuid = 1
> > KDB: stack backtrace:
> > [...]
> > Uptime: 13d20h43m20s
> > [...]
>
> Hey Fabien,
>
> I'm glad you've seen this, too. We've observed this in HardenedBSD,
> especially when running Poudriere and Jenkins. I think Oliver Pinter
> might have a potential patch to fix this. I've CC'd him on this thread.
>
> Thanks,
>
> --
> Shawn Webb
> HardenedBSD
>
> GPG Key ID:  0x6A84658F52456EEE
> GPG Key Fingerprint: 2ABA B6BD EF6A F486 BE89  3D9E 6A84 658F 5245 6EEE
>
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Mateusz Guzik
On Wed, Dec 16, 2015 at 02:10:00PM +0200, Konstantin Belousov wrote:
> On Wed, Dec 16, 2015 at 12:21:16PM +0100, Fabian Keil wrote:
> > Konstantin Belousov  wrote:
> > > It is the values of *p and *(p->p_pgrp) that are needed, from the frame 8.
> > 
> > Unfortunately it's not available and apparently I removed the attempts
> > to get it from the previous output.
> 
> > allproc is available and the first one matches lastpid and has an invalid
> > p_pgrp, but due to trypid being optimized out as well, it's not obvious
> > (to me) that it's the right process.
> 
> p_suspcount = 0, p_xthread = 0xf801162819a0, p_boundary_count = 0, 
> p_pendingcnt = 0, p_itimers = 0x0, p_procdesc = 0x0, p_treeflag = 0, p_magic 
> = 3203398350, p_osrel = 1100090, 
> >   p_comm = 0xf800304df3c4 "privoxy",
> p_pgrp = 0x618b0080,
> 
> > I've changed p's declaration to static so hopefully its value will
> > be available the next time the panic occurs, but it may take a while
> > until that happens.
> 
> From the state of the process you provided, it is a new (zigote) of the
> forking process, which was already linked into allproc list.  Also,
> it seems that bzero part of the forking procedure was finished, but bcopy
> was not yet.  The p_pgrp cannot be a pointer, it is not yet initialized.
> 
> There, we have at least one issue, since zigote is linked before the
> p_pgrp is initialized, and the proctree/allproc locks are dropped.
> As result, fork_findpid() accesses memory with undefined content.
> 
> It seems that the least morbid solution is to slightly extend the scope
> of the allproc lock in do_fork(), to prevent fork_findpid() from working
> while we did not finished copying data from old to new process.
> 
> diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
> index 1b556be..ae04c9f 100644
> --- a/sys/kern/kern_fork.c
> +++ b/sys/kern/kern_fork.c
> @@ -396,13 +396,12 @@ do_fork(struct thread *td, int flags, struct proc *p2, 
> struct thread *td2,
>   PROC_LOCK(p2);
>   PROC_LOCK(p1);
>  
> - sx_xunlock(&allproc_lock);
> -
>   bcopy(&p1->p_startcopy, &p2->p_startcopy,
>   __rangeof(struct proc, p_startcopy, p_endcopy));
>   pargs_hold(p2->p_args);
>  
>   PROC_UNLOCK(p1);
> + sx_xunlock(&allproc_lock);
>  
>   bzero(&p2->p_startzero,
>   __rangeof(struct proc, p_startzero, p_endzero));

While I agree with analysis the patch does not look right. Since the
struct is only assigned and all locks get dropped, there is nothing
preventing another thread from the forking process to change the process
group.

Interestngly very same function assigns the pointer explicitely later:
p2->p_pgrp = p1->p_pgrp;

As such, I would argue the right solution is to move p_pgrp out of the
copied area. Exit path clears the pointer, so it should be fine to just
do that.

That is (untested):

diff --git a/sys/sys/proc.h b/sys/sys/proc.h
index 90effa6..cb94318 100644
--- a/sys/sys/proc.h
+++ b/sys/sys/proc.h
@@ -586,7 +586,6 @@ struct proc {
int p_osrel;/* (x) osreldate for the
   binary (from ELF note, if any) */
charp_comm[MAXCOMLEN + 1];  /* (b) Process name. */
-   struct pgrp *p_pgrp;/* (c + e) Pointer to process group. */
struct sysentvec *p_sysent; /* (b) Syscall dispatch info. */
struct pargs*p_args;/* (c) Process arguments. */
rlim_t  p_cpulimit; /* (c) Current CPU limit in seconds. */
@@ -599,6 +598,7 @@ struct proc {
u_int   p_xsig; /* (c) Stop/kill sig. */
 /* End area that is copied on creation. */
 #definep_endcopy   p_xsig
+   struct pgrp *p_pgrp;/* (c + e) Pointer to process group. */
struct knlist   p_klist;/* (c) Knotes attached to this proc. */
int p_numthreads;   /* (c) Number of threads. */
struct mdproc   p_md;   /* Any machine-dependent fields. */

-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Oliver Pinter
Yes, it's a HardenedBSD commit. Currently only a workaround, because I have
now lesser time for the real fix in this month.

Are you running on ZFS?

On Wednesday, December 16, 2015, Fabian Keil 
wrote:

> Oliver Pinter > wrote:
>
> > Is this with latest 11-CURRENT or 10-STABLE?
> >
> > Or contains the ad578c311ef commit?
>
> The panic is from a system based on 11-CURRENT r290926.
>
> Is ad578c311ef a HardenedBSD commit? It doesn't seem to
> exist in https://github.com/freebsd/freebsd.git.
>
> Fabian
>
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Fabian Keil
Oliver Pinter  wrote:

> Is this with latest 11-CURRENT or 10-STABLE?
> 
> Or contains the ad578c311ef commit?

The panic is from a system based on 11-CURRENT r290926.

Is ad578c311ef a HardenedBSD commit? It doesn't seem to
exist in https://github.com/freebsd/freebsd.git.

Fabian


pgpHRlok72ddQ.pgp
Description: OpenPGP digital signature


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Konstantin Belousov
On Tue, Dec 15, 2015 at 05:42:38PM +0100, Fabian Keil wrote:
> I've seen the following panic a couple of times in the last three
> months, usually while poudriere was running and with sh being the
> current process.
> 
> This one is from a system based on r290926 running with
> kern.randompid=9001 and forking frequently (>1000 forks/second)
> due to poudriere and afl-fuzz:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 04
> fault virtual address   = 0x618b00a8
> fault code  = supervisor read data, page not present
> instruction pointer = 0x20:0x80909158
> stack pointer   = 0x28:0xfe011e03b940
> frame pointer   = 0x28:0xfe011e03b960
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 71325 (sh)
> trap number = 12
> panic: page fault
> cpuid = 1
> KDB: stack backtrace:
> [...]
> Uptime: 13d20h43m20s
> [...]
> (kgdb) where
> #0  doadump (textdump=1) at pcpu.h:221
> #1  0x8094a923 in kern_reboot (howto=260) at 
> /usr/src/sys/kern/kern_shutdown.c:364
> #2  0x8094ae8b in vpanic (fmt=, ap= optimized out>) at /usr/src/sys/kern/kern_shutdown.c:757
> #3  0x8094acc3 in panic (fmt=0x0) at 
> /usr/src/sys/kern/kern_shutdown.c:688
> #4  0x80c2fbb1 in trap_fatal (frame=, eva= optimized out>) at /usr/src/sys/amd64/amd64/trap.c:834
> #5  0x80c2fda4 in trap_pfault (frame=0xfe011e03b890, 
> usermode=) at /usr/src/sys/amd64/amd64/trap.c:684
> #6  0x80c2f55e in trap (frame=0xfe011e03b890) at 
> /usr/src/sys/amd64/amd64/trap.c:435
> #7  0x80c120a7 in calltrap () at 
> /usr/src/sys/amd64/amd64/exception.S:234
> #8  0x80909158 in fork_findpid (flags=) at 
> /usr/src/sys/kern/kern_fork.c:281
It is the values of *p and *(p->p_pgrp) that are needed, from the frame 8.

> #9  0x80907225 in do_fork (td=0xf8009db9a9a0, flags=20, 
> p2=0xf8009dbe1a90, td2=0xf800aa6884d0, vm2=0xf800a9eee000, 
> pdflags=0) at /usr/src/sys/kern/kern_fork.c:385
> #10 0x80906c08 in fork1 (td=0xf8009db9a9a0, flags=20, 
> pages=, procp=0xfe011e03bac0, procdescp=0x0, 
> pdflags=9, fcaps=)
> at /usr/src/sys/kern/kern_fork.c:937
> #11 0x809066ca in sys_fork (td=0xf8009db9a9a0, uap= optimized out>) at /usr/src/sys/kern/kern_fork.c:108
> #12 0x80c3054b in amd64_syscall (td=0xf8009db9a9a0, traced=0) at 
> subr_syscall.c:140
> #13 0x80c1238b in Xfast_syscall () at 
> /usr/src/sys/amd64/amd64/exception.S:394
> #14 0x0008009257aa in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb) f 8
> #8  0x80909158 in fork_findpid (flags=) at 
> /usr/src/sys/kern/kern_fork.c:281
> warning: Source file is more recent than executable.
>
> 281 (p->p_pgrp != NULL &&
> (kgdb) l -
> 271  * id is kept reserved only while there is a
> 272  * non-reaped process in the subtree, so amount of
> 273  * reserved pids is limited by process limit times
> 274  * two.
> 275  */
> 276 p = LIST_FIRST(&allproc);
> 277 again:
> 278 for (; p != NULL; p = LIST_NEXT(p, p_list)) {
> 279 while (p->p_pid == trypid ||
> 280 p->p_reapsubtree == trypid ||
> (kgdb) l
> 281 (p->p_pgrp != NULL &&
> 282 (p->p_pgrp->pg_id == trypid ||
> 283 (p->p_session != NULL &&
> 284 p->p_session->s_sid == trypid {
> 285 trypid++;
> 286 if (trypid >= pidchecked)
> 287 goto retry;
> 288 }
> 289 if (p->p_pid > trypid && pidchecked > 
> p->p_pid)
> 290 pidchecked = p->p_pid;
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Konstantin Belousov
On Wed, Dec 16, 2015 at 12:21:16PM +0100, Fabian Keil wrote:
> Konstantin Belousov  wrote:
> > It is the values of *p and *(p->p_pgrp) that are needed, from the frame 8.
> 
> Unfortunately it's not available and apparently I removed the attempts
> to get it from the previous output.

> allproc is available and the first one matches lastpid and has an invalid
> p_pgrp, but due to trypid being optimized out as well, it's not obvious
> (to me) that it's the right process.

p_suspcount = 0, p_xthread = 0xf801162819a0, p_boundary_count = 0, 
p_pendingcnt = 0, p_itimers = 0x0, p_procdesc = 0x0, p_treeflag = 0, p_magic = 
3203398350, p_osrel = 1100090, 
>   p_comm = 0xf800304df3c4 "privoxy",
p_pgrp = 0x618b0080,

> I've changed p's declaration to static so hopefully its value will
> be available the next time the panic occurs, but it may take a while
> until that happens.

>From the state of the process you provided, it is a new (zigote) of the
forking process, which was already linked into allproc list.  Also,
it seems that bzero part of the forking procedure was finished, but bcopy
was not yet.  The p_pgrp cannot be a pointer, it is not yet initialized.

There, we have at least one issue, since zigote is linked before the
p_pgrp is initialized, and the proctree/allproc locks are dropped.
As result, fork_findpid() accesses memory with undefined content.

It seems that the least morbid solution is to slightly extend the scope
of the allproc lock in do_fork(), to prevent fork_findpid() from working
while we did not finished copying data from old to new process.

diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
index 1b556be..ae04c9f 100644
--- a/sys/kern/kern_fork.c
+++ b/sys/kern/kern_fork.c
@@ -396,13 +396,12 @@ do_fork(struct thread *td, int flags, struct proc *p2, 
struct thread *td2,
PROC_LOCK(p2);
PROC_LOCK(p1);
 
-   sx_xunlock(&allproc_lock);
-
bcopy(&p1->p_startcopy, &p2->p_startcopy,
__rangeof(struct proc, p_startcopy, p_endcopy));
pargs_hold(p2->p_args);
 
PROC_UNLOCK(p1);
+   sx_xunlock(&allproc_lock);
 
bzero(&p2->p_startzero,
__rangeof(struct proc, p_startzero, p_endzero));
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-16 Thread Fabian Keil
Konstantin Belousov  wrote:

> On Tue, Dec 15, 2015 at 05:42:38PM +0100, Fabian Keil wrote:
> > I've seen the following panic a couple of times in the last three
> > months, usually while poudriere was running and with sh being the
> > current process.
> > 
> > This one is from a system based on r290926 running with
> > kern.randompid=9001 and forking frequently (>1000 forks/second)
> > due to poudriere and afl-fuzz:
> > 
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 04
> > fault virtual address   = 0x618b00a8
> > fault code  = supervisor read data, page not present
> > instruction pointer = 0x20:0x80909158
> > stack pointer   = 0x28:0xfe011e03b940
> > frame pointer   = 0x28:0xfe011e03b960
> > code segment= base 0x0, limit 0xf, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags= interrupt enabled, resume, IOPL = 0
> > current process = 71325 (sh)
> > trap number = 12
> > panic: page fault
> > cpuid = 1
> > KDB: stack backtrace:
> > [...]
> > Uptime: 13d20h43m20s
> > [...]
> > (kgdb) where
> > #0  doadump (textdump=1) at pcpu.h:221
> > #1  0x8094a923 in kern_reboot (howto=260) at 
> > /usr/src/sys/kern/kern_shutdown.c:364
> > #2  0x8094ae8b in vpanic (fmt=, ap= > optimized out>) at /usr/src/sys/kern/kern_shutdown.c:757
> > #3  0x8094acc3 in panic (fmt=0x0) at 
> > /usr/src/sys/kern/kern_shutdown.c:688
> > #4  0x80c2fbb1 in trap_fatal (frame=, 
> > eva=) at /usr/src/sys/amd64/amd64/trap.c:834
> > #5  0x80c2fda4 in trap_pfault (frame=0xfe011e03b890, 
> > usermode=) at /usr/src/sys/amd64/amd64/trap.c:684
> > #6  0x80c2f55e in trap (frame=0xfe011e03b890) at 
> > /usr/src/sys/amd64/amd64/trap.c:435
> > #7  0x80c120a7 in calltrap () at 
> > /usr/src/sys/amd64/amd64/exception.S:234
> > #8  0x80909158 in fork_findpid (flags=) at 
> > /usr/src/sys/kern/kern_fork.c:281  
> It is the values of *p and *(p->p_pgrp) that are needed, from the frame 8.

Unfortunately it's not available and apparently I removed the attempts
to get it from the previous output.

#8  0x80909158 in fork_findpid (flags=) at 
/usr/src/sys/kern/kern_fork.c:281
warning: Source file is more recent than executable.

281 (p->p_pgrp != NULL &&
Current language:  auto; currently minimal
(kgdb) p p
No symbol "p" in current context.
(kgdb)  p trypid
$1 = 
(kgdb)  p pidchecked
$2 = 9
(kgdb) p lastpid
$3 = 51281

allproc is available and the first one matches lastpid and has an invalid
p_pgrp, but due to trypid being optimized out as well, it's not obvious
(to me) that it's the right process.

(kgdb)  p *allproc->lh_first
$4 = {p_list = {le_next = 0xf800a99e4548, le_prev = 0x813f3cc8}, 
p_threads = {tqh_first = 0xf801162819a0, tqh_last = 0xf801162819b0}, 
p_slock = {lock_object = {
  lo_name = 0x80e22449 "process slock", lo_flags = 537067520, 
lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, p_ucred = 0xf8009d07, 
p_fd = 0x0, p_fdtol = 0x0, p_stats = 0xf800299e5800, 
  p_limit = 0x0, p_limco = {c_links = {le = {le_next = 0x0, le_prev = 0x0}, sle 
= {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, 
c_precision = 0, c_arg = 0x0, c_func = 0, 
c_lock = 0xf800304df120, c_flags = 0, c_iflags = 0, c_cpu = 0}, 
p_sigacts = 0x0, p_flag = 268443648, p_flag2 = 0, p_state = PRS_NEW, p_pid = 
51281, p_hash = {le_next = 0x0, 
le_prev = 0xfec8a288}, p_pglist = {le_next = 0x0, le_prev = 
0xf800aa94d618}, p_pptr = 0xf800aa94d548, p_sibling = {le_next = 0x0, 
le_prev = 0xf800aa94d640}, p_children = {
lh_first = 0x0}, p_reaper = 0xf800029a5548, p_reaplist = {lh_first = 
0x0}, p_reapsibling = {le_next = 0xf8007e713548, le_prev = 
0xf80023df1110}, p_mtx = {lock_object = {
  lo_name = 0x80e2243c "process lock", lo_flags = 558039040, 
lo_data = 0, lo_witness = 0x0}, mtx_lock = 18446735280470265856}, p_statmtx = 
{lock_object = {lo_name = 0x80e22457 "pstatl", 
  lo_flags = 537067520, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, 
p_itimmtx = {lock_object = {lo_name = 0x80e2245e "pitiml", lo_flags = 
537067520, lo_data = 0, lo_witness = 0x0}, 
mtx_lock = 4}, p_profmtx = {lock_object = {lo_name = 0x80e22465 
"pprofl", lo_flags = 537067520, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, 
p_ksi = 0xf80126950380, p_sigqueue = {
sq_signals = {__bits = 0xf800304df1a8}, sq_kill = {__bits = 
0xf800304df1b8}, sq_list = {tqh_first = 0x0, tqh_last = 
0xf800304df1c8}, sq_proc = 0xf800304df000, sq_flags = 1}, p_oppid = 0, 
  p_vmspace = 0x0, p_swtick = 3344683412, p_cowgen = 0, p_realtimer = 
{it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 
0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {

Re: fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-15 Thread Shawn Webb
On Tue, Dec 15, 2015 at 05:42:38PM +0100, Fabian Keil wrote:
> I've seen the following panic a couple of times in the last three
> months, usually while poudriere was running and with sh being the
> current process.
> 
> This one is from a system based on r290926 running with
> kern.randompid=9001 and forking frequently (>1000 forks/second)
> due to poudriere and afl-fuzz:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 04
> fault virtual address   = 0x618b00a8
> fault code  = supervisor read data, page not present
> instruction pointer = 0x20:0x80909158
> stack pointer   = 0x28:0xfe011e03b940
> frame pointer   = 0x28:0xfe011e03b960
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 71325 (sh)
> trap number = 12
> panic: page fault
> cpuid = 1
> KDB: stack backtrace:
> [...]
> Uptime: 13d20h43m20s
> [...]

Hey Fabien,

I'm glad you've seen this, too. We've observed this in HardenedBSD,
especially when running Poudriere and Jenkins. I think Oliver Pinter
might have a potential patch to fix this. I've CC'd him on this thread.

Thanks,

-- 
Shawn Webb
HardenedBSD

GPG Key ID:  0x6A84658F52456EEE
GPG Key Fingerprint: 2ABA B6BD EF6A F486 BE89  3D9E 6A84 658F 5245 6EEE


signature.asc
Description: PGP signature


fork_findpid() - Fatal trap 12: page fault while in kernel mode

2015-12-15 Thread Fabian Keil
I've seen the following panic a couple of times in the last three
months, usually while poudriere was running and with sh being the
current process.

This one is from a system based on r290926 running with
kern.randompid=9001 and forking frequently (>1000 forks/second)
due to poudriere and afl-fuzz:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 04
fault virtual address   = 0x618b00a8
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x80909158
stack pointer   = 0x28:0xfe011e03b940
frame pointer   = 0x28:0xfe011e03b960
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 71325 (sh)
trap number = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
[...]
Uptime: 13d20h43m20s
[...]
(kgdb) where
#0  doadump (textdump=1) at pcpu.h:221
#1  0x8094a923 in kern_reboot (howto=260) at 
/usr/src/sys/kern/kern_shutdown.c:364
#2  0x8094ae8b in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:757
#3  0x8094acc3 in panic (fmt=0x0) at 
/usr/src/sys/kern/kern_shutdown.c:688
#4  0x80c2fbb1 in trap_fatal (frame=, eva=) at /usr/src/sys/amd64/amd64/trap.c:834
#5  0x80c2fda4 in trap_pfault (frame=0xfe011e03b890, 
usermode=) at /usr/src/sys/amd64/amd64/trap.c:684
#6  0x80c2f55e in trap (frame=0xfe011e03b890) at 
/usr/src/sys/amd64/amd64/trap.c:435
#7  0x80c120a7 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:234
#8  0x80909158 in fork_findpid (flags=) at 
/usr/src/sys/kern/kern_fork.c:281
#9  0x80907225 in do_fork (td=0xf8009db9a9a0, flags=20, 
p2=0xf8009dbe1a90, td2=0xf800aa6884d0, vm2=0xf800a9eee000, 
pdflags=0) at /usr/src/sys/kern/kern_fork.c:385
#10 0x80906c08 in fork1 (td=0xf8009db9a9a0, flags=20, pages=, procp=0xfe011e03bac0, procdescp=0x0, pdflags=9, 
fcaps=)
at /usr/src/sys/kern/kern_fork.c:937
#11 0x809066ca in sys_fork (td=0xf8009db9a9a0, uap=) at /usr/src/sys/kern/kern_fork.c:108
#12 0x80c3054b in amd64_syscall (td=0xf8009db9a9a0, traced=0) at 
subr_syscall.c:140
#13 0x80c1238b in Xfast_syscall () at 
/usr/src/sys/amd64/amd64/exception.S:394
#14 0x0008009257aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) f 8
#8  0x80909158 in fork_findpid (flags=) at 
/usr/src/sys/kern/kern_fork.c:281
warning: Source file is more recent than executable.
   
281 (p->p_pgrp != NULL &&
(kgdb) l -
271  * id is kept reserved only while there is a
272  * non-reaped process in the subtree, so amount of
273  * reserved pids is limited by process limit times
274  * two.
275  */
276 p = LIST_FIRST(&allproc);
277 again:
278 for (; p != NULL; p = LIST_NEXT(p, p_list)) {
279 while (p->p_pid == trypid ||
280 p->p_reapsubtree == trypid ||
(kgdb) l
281 (p->p_pgrp != NULL &&
282 (p->p_pgrp->pg_id == trypid ||
283 (p->p_session != NULL &&
284 p->p_session->s_sid == trypid {
285 trypid++;
286 if (trypid >= pidchecked)
287 goto retry;
288 }
289 if (p->p_pid > trypid && pidchecked > p->p_pid)
290 pidchecked = p->p_pid;
(kgdb) f 6
#6  0x80c2f55e in trap (frame=0xfe011e03b890) at 
/usr/src/sys/amd64/amd64/trap.c:435
warning: Source file is more recent than executable.
   
435 (void) trap_pfault(frame, FALSE);
(kgdb) p *frame
$2 = {tf_rdi = 1636499584, tf_rsi = 51281, tf_rdx = -8795282608128, tf_rcx = 1, 
tf_r8 = 9, tf_r9 = 9, tf_rax = 0, tf_rbx = 60137, tf_rbp = 
-2194224727712, tf_r10 = 0, tf_r11 = 0,
  tf_r12 = -8793446540656, tf_r13 = -2194224727360, tf_r14 = 0, tf_r15 = 
-8793450915184, tf_trapno = 12, tf_fs = 19, tf_gs = 27, tf_addr = 1636499624, 
tf_flags = 1, tf_es = 59, tf_ds = 59, tf_err = 0,
  tf_rip = -2138009256, tf_cs = 32, tf_rflags = 66050, tf_rsp = -2194224727728, 
tf_ss = 40}
(kgdb) f 9
#9  0x80907225 in do_fork (td=0xf8009db9a9a0, flags=20, 
p2=0xf8009dbe1a90, td2=0xf800aa6884d0, vm2=0xf800a9eee000, 
pdflags=0) at /usr/src/sys/kern/kern_fork.c:385
385 trypid = fork_findpid(flags);
(kgdb) p flags
$3 = 20
(kgdb) p *td
$4 = {td_lock = 0x8129b100, td_proc = 0xf8009d7