Re: 11.x deadlocking during pfault (was Re: FreeBSD 11.x grinds to a halt after about 48h of uptime)

2016-10-27 Thread Ulrich Spörlein
2016-10-27 14:51 GMT+02:00 Daniel Nebdal :
>
> On Wed, Oct 26, 2016 at 6:45 PM, Ulrich Spörlein  wrote:
> >
> > On Wed, 2016-10-26 at 18:43:43 +0200, Ulrich Spörlein wrote:
> > > On Mon, 2016-10-24 at 19:43:27 +0200, Ulrich Spörlein wrote:
> > > > On Sat, 2016-10-15 at 09:36:27 -0700, Kevin Oberman wrote:
> > > > > On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky wrote:
> > > > >
> > > > > > On 10/15/16 18:18, Ulrich Spörlein wrote:
> > > > > >

Re: 11.x deadlocking during pfault (was Re: FreeBSD 11.x grinds to a halt after about 48h of uptime)

2016-10-27 Thread Daniel Nebdal
On Wed, Oct 26, 2016 at 6:45 PM, Ulrich Spörlein  wrote:
>
> On Wed, 2016-10-26 at 18:43:43 +0200, Ulrich Spörlein wrote:
> > On Mon, 2016-10-24 at 19:43:27 +0200, Ulrich Spörlein wrote:
> > > On Sat, 2016-10-15 at 09:36:27 -0700, Kevin Oberman wrote:
> > > > On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky 
> > > > wrote:
> > > >
> > > > > On 10/15/16 18:18, Ulrich Spörlein wrote:
> > > > >

Re: 11.x deadlocking during pfault (was Re: FreeBSD 11.x grinds to a halt after about 48h of uptime)

2016-10-26 Thread Ulrich Spörlein
On Wed, 2016-10-26 at 18:43:43 +0200, Ulrich Spörlein wrote:
> On Mon, 2016-10-24 at 19:43:27 +0200, Ulrich Spörlein wrote:
> > On Sat, 2016-10-15 at 09:36:27 -0700, Kevin Oberman wrote:
> > > On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky 
> > > wrote:
> > > 
> > > > On 10/15/16 18:18, Ulrich Spörlein wrote:
> > > >

11.x deadlocking during pfault (was Re: FreeBSD 11.x grinds to a halt after about 48h of uptime)

2016-10-26 Thread Ulrich Spörlein
On Mon, 2016-10-24 at 19:43:27 +0200, Ulrich Spörlein wrote:
> On Sat, 2016-10-15 at 09:36:27 -0700, Kevin Oberman wrote:
> > On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky 
> > wrote:
> > 
> > > On 10/15/16 18:18, Ulrich Spörlein wrote:
> > >

Re: FreeBSD 11.x grinds to a halt after about 48h of uptime

2016-10-24 Thread Ulrich Spörlein
On Sat, 2016-10-15 at 09:36:27 -0700, Kevin Oberman wrote:
> On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky 
> wrote:
> 
> > On 10/15/16 18:18, Ulrich Spörlein wrote:
> >

Re: FreeBSD 11.x grinds to a halt after about 48h of uptime

2016-10-15 Thread Ulrich Spörlein
2016-10-15 18:36 GMT+02:00 Kevin Oberman :
>
> On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky 
> wrote:
>
> > On 10/15/16 18:18, Ulrich Spörlein wrote:
> >
> 

Re: FreeBSD 11.x grinds to a halt after about 48h of uptime

2016-10-15 Thread Kevin Oberman
On Sat, Oct 15, 2016 at 9:26 AM, Hans Petter Selasky 
wrote:

> On 10/15/16 18:18, Ulrich Spörlein wrote:
>

Re: FreeBSD 11.x grinds to a halt after about 48h of uptime

2016-10-15 Thread Hans Petter Selasky

On 10/15/16 18:18, Ulrich Spörlein wrote:

Hey all, while 11.x is -STABLE now, this happens to my machine ever
since I upgraded it to 11-CURRENT years ago. I have no idea when this
started, actually, but what always happens is this:

- System and X11 are up and running; I keep it running overnight as I'm
too lazy to reboot and restart everything.
- There's a bunch of xterms, Chrome, Clementine-Player and some other
programs running
- Coming back to the machine the next day (or the day after) it will
exit the screensaver just fine and then either I can use it for a couple
of seconds before it freezes, or it's pretty much dead already. The
mouse cursor still moves for a bit, but then also freezes (so is this a
GPU problem??)

Now what I currently see on the screen is a clock widget stuck at 18:04
but conky itself last updated at 18:00:18 ...

This time I had some SSH sessions from another machine to see some more
useful things. There was nothing in various logs under /var/log (I also
can't run dmesg anymore ...)
I had top(1) running in a loop, this is the last output:

last pid: 25633;  load averages:  0.27,  0.39,  0.36  up 1+23:03:28  18:00:12
202 processes: 2 running, 188 sleeping, 11 zombie, 1 waiting

Mem: 8873M Active, 1783M Inact, 5072M Wired, 567M Buf, 132M Free
ARC: 1844M Total, 469M MFU, 268M MRU, 16K Anon, 96M Header, 1012M Other
Swap: 4096M Total, 2395M Used, 1701M Free, 58% Inuse


  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root        8 155 ki31     0K   128K CPU0    0 364.6H 772.95% idle
 3122 uqs        15  28    0  7113M  5861M uwait   0  94:44  13.96% chrome
 2887 uqs        28  22    0  1394M   237M select  2 172:53   6.98% chrome
 2890 uqs        11  21    0  1034M   178M select  5 231:21   1.95% chrome
 1062 root        9  21    0   440M 47220K select  0  67:09   0.98% Xorg
 3002 uqs        15  25    5  1159M   172M uwait   2  19:09   0.00% chrome
 3139 uqs        17  25    5  1163M   156M uwait   2  16:15   0.00% chrome
 3001 uqs        18  25    5  1639M   575M uwait   0  16:05   0.00% chrome
   12 root       24 -64    -     0K   384K WAIT   -1  10:53   0.00% intr
 3129 uqs        12  20    0  2820M  1746M uwait   6   8:36   0.00% chrome
 2822 uqs         9  20    0   217M 81300K select  0   5:10   0.00% conky
 3174 root        1  20    0 21532K  3188K select  0   4:20   0.00% systat
 3130 uqs        16  20    0  1058M   131M uwait   4   3:03   0.00% chrome
 2998 uqs        16  20    0  1110M   123M uwait   2   2:53   0.00% chrome
 3165 uqs        10  20    0  1209M   215M uwait   6   2:52   0.00% chrome
 3142 uqs        11  25    5  1344M   195M uwait   2   2:46   0.00% chrome
 2876 uqs        19  20    0   580M 37164K select  3   2:42   0.00% clementine-player
   20 root        2 -16    -     0K    32K psleep  6   2:25   0.00% pagedaemon

I also had systat -vm running and it continued to update its screen ...
for a short while, this is the last update before SSH died:


   Mem usage:  0k%Phy  5%Kmem
Mem: KBREALVIRTUAL  VN PAGER   SWAP PAGER
Tot   Share  TotShareFree   in   out in   out
Act  11051k   67868 71051992   255448   61840  count
All  11051k   67924 71058776   262100  pages
Proc:Interrupts
  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Fltioflt   224 total
 25 730  11   724  109  404  101   13 cow   2 ehci0 16
  zfod  3 ehci1 23
 0.0%Sys   0.1%Intr  0.0%User  0.0%Nice 99.9%Idle ozfod16 cpu0:timer
||||||||||   %ozfod   xhci0 264
  daefr 3 em0 265
50 dtbuf  prcfr94 hdac1 266
Namei Name-cache   Dir-cache349167 desvn  totfr   ahci0 270
   Callshits   %hits   %349155 numvn  react 5 cpu1:timer
 121 121 100253501 frevn  pdwak 1 cpu2:timer
  pdpgs29 cpu7:timer
Disks   md0  ada0  ada1 pass0 pass1 pass2 intrn12 cpu3:timer
KB/t   0.00  0.00  0.00  0.00  0.00  0.00 5318892 wire 41 cpu6:timer
tps   0 0 0 0 0 0 9261404 act  12 cpu5:timer
MB/s   0.00  0.00  0.00  

FreeBSD 11.x grinds to a halt after about 48h of uptime

2016-10-15 Thread Ulrich Spörlein
Hey all, while 11.x is -STABLE now, this happens to my machine ever
since I upgraded it to 11-CURRENT years ago. I have no idea when this
started, actually, but what always happens is this:

- System and X11 are up and running, I keep it running overnight as I'm
too lazy to reboot and restart everything.
- There's a bunch of xterms, Chrome, Clementine-Player and some other
programs running
- Coming back to the machine the next day (or the day after) it will
exit the screensaver just fine and then either I can use it for a couple
of seconds before it freezes, or it's pretty much dead already. The
mouse cursor still moves for a bit, but then also freezes (so is this a
GPU problem??)

Now what I currently see on the screen is a clock widget stuck at 18:04
but conky itself has last updated at 18:00:18 ...

This time I had some SSH sessions from another machine to see some more
useful things. There was nothing in various logs under /var/log (I also
can't run dmesg anymore ...)
I had top(1) running in a loop, this is the last output:

last pid: 25633;  load averages:  0.27,  0.39,  0.36  up 1+23:03:28  18:00:12
202 processes: 2 running, 188 sleeping, 11 zombie, 1 waiting

Mem: 8873M Active, 1783M Inact, 5072M Wired, 567M Buf, 132M Free
ARC: 1844M Total, 469M MFU, 268M MRU, 16K Anon, 96M Header, 1012M Other
Swap: 4096M Total, 2395M Used, 1701M Free, 58% Inuse


  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root        8 155 ki31     0K   128K CPU0    0 364.6H 772.95% idle
 3122 uqs        15  28    0  7113M  5861M uwait   0  94:44  13.96% chrome
 2887 uqs        28  22    0  1394M   237M select  2 172:53   6.98% chrome
 2890 uqs        11  21    0  1034M   178M select  5 231:21   1.95% chrome
 1062 root        9  21    0   440M 47220K select  0  67:09   0.98% Xorg
 3002 uqs        15  25    5  1159M   172M uwait   2  19:09   0.00% chrome
 3139 uqs        17  25    5  1163M   156M uwait   2  16:15   0.00% chrome
 3001 uqs        18  25    5  1639M   575M uwait   0  16:05   0.00% chrome
   12 root       24 -64    -     0K   384K WAIT   -1  10:53   0.00% intr
 3129 uqs        12  20    0  2820M  1746M uwait   6   8:36   0.00% chrome
 2822 uqs         9  20    0   217M 81300K select  0   5:10   0.00% conky
 3174 root        1  20    0 21532K  3188K select  0   4:20   0.00% systat
 3130 uqs        16  20    0  1058M   131M uwait   4   3:03   0.00% chrome
 2998 uqs        16  20    0  1110M   123M uwait   2   2:53   0.00% chrome
 3165 uqs        10  20    0  1209M   215M uwait   6   2:52   0.00% chrome
 3142 uqs        11  25    5  1344M   195M uwait   2   2:46   0.00% chrome
 2876 uqs        19  20    0   580M 37164K select  3   2:42   0.00% clementine-player
   20 root        2 -16    -     0K    32K psleep  6   2:25   0.00% pagedaemon

I also had systat -vm running and it continued to update its screen ...
for a short while, this is the last update before SSH died:


   Mem usage:  0k%Phy  5%Kmem
Mem: KBREALVIRTUAL  VN PAGER   SWAP PAGER
Tot   Share  TotShareFree   in   out in   out
Act  11051k   67868 71051992   255448   61840  count
All  11051k   67924 71058776   262100  pages  
Proc:Interrupts
  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Fltioflt   224 total
 25 730  11   724  109  404  101   13 cow   2 ehci0 16
  zfod  3 ehci1 23
 0.0%Sys   0.1%Intr  0.0%User  0.0%Nice 99.9%Idle ozfod16 cpu0:timer
||||||||||   %ozfod   xhci0 264
  daefr 3 em0 265
50 dtbuf  prcfr94 hdac1 266
Namei Name-cache   Dir-cache349167 desvn  totfr   ahci0 270
   Callshits   %hits   %349155 numvn  react 5 cpu1:timer
 121 121 100253501 frevn  pdwak 1 cpu2:timer
  pdpgs29 cpu7:timer
Disks   md0  ada0  ada1 pass0 pass1 pass2 intrn12 cpu3:timer
KB/t   0.00  0.00  0.00  0.00  0.00  0.00 5318892 wire 41 cpu6:timer
tps   0 0 0 0 0 0 9261404 act  12 cpu5:timer
MB/s   0.00  0.00  0.00  0.00  0.00  0.00 1598184