re: How to identify specific wait-state for a "DE" process?

2016-01-10 Thread matthew green
Stephan writes: > > # crash > > Crash version 7.99.25, image version 7.99.25. > > Output from a running system is unreliable. > > crash> trace/t 0t455 > > trace: pid 455 lid 1 at 0xfe8002ff0ce0 > > sleepq_block() at sleepq_block+0xa2 > > cv_wait() at cv_wait+0x116 > > fd_close() at

re: How to identify specific wait-state for a "DE" process?

2016-01-10 Thread matthew green
> Does anyone have any good suggestions for how to arrange for another > thread/lwp to run so it can remove the extra reference to the logging > descriptor? filemon(4) as written should just be replaced with a method that works without replacing system calls or borrowing fds or any of these

Re: How to identify specific wait-state for a "DE" process?

2016-01-08 Thread Paul Goyette
On Sat, 9 Jan 2016, Rhialto wrote: On Wed 06 Jan 2016 at 17:44:45 +, Taylor R Campbell wrote: This only fixes the problem for certain orderings of file descriptors. I was thinking of a different hack. Given that filemon now knows there are issues if it has to use a fd lower than its own

Re: How to identify specific wait-state for a "DE" process?

2016-01-08 Thread Rhialto
On Wed 06 Jan 2016 at 17:44:45 +, Taylor R Campbell wrote: > This only fixes the problem for certain orderings of file descriptors. I was thinking of a different hack. Given that filemon now knows there are issues if it has to use a fd lower than its own fd, it can avoid the situation. If it

Re: How to identify specific wait-state for a "DE" process?

2016-01-06 Thread Taylor R Campbell
Date: Wed, 6 Jan 2016 09:22:44 -0800 From: Brian Buhrow hello. Is there a particular reason file descriptors are closed in ascending order? Traditionally, file descriptors 2, 1 and 0 are always in use and it seems like it might be a good idea to have

Re: workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]

2016-01-06 Thread Brian Buhrow
hello. Is there a particular reason file descriptors are closed in ascending order? Traditionally, file descriptors 2, 1 and 0 are always in use and it seems like it might be a good idea to have those be the last to get closed. I've seen some applications that close all their

Re: How to identify specific wait-state for a "DE" process?

2016-01-06 Thread Paul Goyette
On Wed, 6 Jan 2016, David Holland wrote: On Wed, Jan 06, 2016 at 08:10:36AM +0800, Paul Goyette wrote: > Does anyone have any good suggestions for how to arrange for another > thread/lwp to run so it can remove the extra reference to the logging > descriptor? A better suggestion: remove the

Re: How to identify specific wait-state for a "DE" process?

2016-01-06 Thread David Holland
On Wed, Jan 06, 2016 at 08:10:36AM +0800, Paul Goyette wrote: > Does anyone have any good suggestions for how to arrange for another > thread/lwp to run so it can remove the extra reference to the logging > descriptor? A better suggestion: remove the broken behavior of close(). -- David A.

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Michael van Elst
p...@vps1.whooppee.com (Paul Goyette) writes: >cv_wait() at cv_wait+0x116 >fd_close() at fd_close+0x39a >fd_free() at fd_free+0x178 >exit1() at exit1+0x10a >sys_exit() at sys_exit+0x3a >syscall() at syscall+0x9c >--- syscall (number 1) --- >So I guess I need to figure out which/what condvar it

Re: workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]

2016-01-05 Thread Thor Lancelot Simon
On Wed, Jan 06, 2016 at 11:38:00AM +0800, Paul Goyette wrote: > On Wed, 6 Jan 2016, Taylor R Campbell wrote: > > > Date: Tue, 5 Jan 2016 21:48:42 -0500 > > From: Thor Lancelot Simon > > > > You can probably use workqueues for this. Looking at the manual page > > again for

workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]

2016-01-05 Thread Taylor R Campbell
Date: Tue, 5 Jan 2016 21:48:42 -0500 From: Thor Lancelot Simon You can probably use workqueues for this. Looking at the manual page again for the first time in years, I think it's a little misleading -- what I believe is meant by "A work must not be enqueued again

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Thor Lancelot Simon
On Wed, Jan 06, 2016 at 08:10:36AM +0800, Paul Goyette wrote: > > Does anyone have any good suggestions for how to arrange for another > thread/lwp to run so it can remove the extra reference to the logging > descriptor? You can probably use workqueues for this. Looking at the manual page again

Re: workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]

2016-01-05 Thread Paul Goyette
On Wed, 6 Jan 2016, Taylor R Campbell wrote: Date: Tue, 5 Jan 2016 21:48:42 -0500 From: Thor Lancelot Simon You can probably use workqueues for this. Looking at the manual page again for the first time in years, I think it's a little misleading -- what I believe is

Re: workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]

2016-01-05 Thread Paul Goyette
On Tue, 5 Jan 2016, Thor Lancelot Simon wrote: On Wed, Jan 06, 2016 at 11:38:00AM +0800, Paul Goyette wrote: On Wed, 6 Jan 2016, Taylor R Campbell wrote: Date: Tue, 5 Jan 2016 21:48:42 -0500 From: Thor Lancelot Simon You can probably use workqueues for this. Looking at

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Michael van Elst
p...@whooppee.com (Paul Goyette) writes: >I'm pretty sure that the device in question is the console terminal >driver /dev/console since the problem does not happen if filemon is >sending the entries to a "real" file. But I can't figure why it is >waiting, so I don't know what I should do to

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Paul Goyette
On Tue, 5 Jan 2016, Stephan wrote: # crash Crash version 7.99.25, image version 7.99.25. Output from a running system is unreliable. crash> trace/t 0t455 trace: pid 455 lid 1 at 0xfe8002ff0ce0 sleepq_block() at sleepq_block+0xa2 cv_wait() at cv_wait+0x116 fd_close() at fd_close+0x39a

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Paul Goyette
On Tue, 5 Jan 2016, Michael van Elst wrote: p...@vps1.whooppee.com (Paul Goyette) writes: cv_wait() at cv_wait+0x116 fd_close() at fd_close+0x39a fd_free() at fd_free+0x178 exit1() at exit1+0x10a sys_exit() at sys_exit+0x3a syscall() at syscall+0x9c --- syscall (number 1) --- So I guess I

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Stephan
> # crash > Crash version 7.99.25, image version 7.99.25. > Output from a running system is unreliable. > crash> trace/t 0t455 > trace: pid 455 lid 1 at 0xfe8002ff0ce0 > sleepq_block() at sleepq_block+0xa2 > cv_wait() at cv_wait+0x116 > fd_close() at fd_close+0x39a > fd_free() at fd_free+0x178

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Stephan
> # crash > Crash version 7.99.25, image version 7.99.25. > Output from a running system is unreliable. > crash> trace/t 0t455 > trace: pid 455 lid 1 at 0xfe8002ff0ce0 > sleepq_block() at sleepq_block+0xa2 > cv_wait() at cv_wait+0x116 > fd_close() at fd_close+0x39a > fd_free() at fd_free+0x178

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Paul Goyette
On Wed, 6 Jan 2016, Paul Goyette wrote: I need to figure out why this is a problem when filemon(4) "borrows" the fd for stdout, but is not a problem when it borrows a real file. OK, I figured out what's going on. In the failure scenario, we have the following events: 1. Process

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread bch
This scenario reminds me of: https://www.sqlite.org/compile.html#minimum_file_descriptor -bch On 1/5/16, Paul Goyette wrote: > On Wed, 6 Jan 2016, Paul Goyette wrote: > >> I need to figure out why this is a problem when filemon(4) "borrows" the >> fd >> for stdout,

Re: How to identify specific wait-state for a "DE" process?

2016-01-05 Thread Paul Goyette
On Tue, 5 Jan 2016, Michael van Elst wrote: p...@whooppee.com (Paul Goyette) writes: I'm pretty sure that the device in question is the console terminal driver /dev/console since the problem does not happen if filemon is sending the entries to a "real" file. But I can't figure why it is