Re: buildworld without libncursesw

2015-03-04 Thread Brooks Davis
On Wed, Mar 04, 2015 at 10:47:44AM +1100, Dewayne Geraghty wrote:
 
 On 4/03/2015 8:13 AM, Brooks Davis wrote:
  On Tue, Mar 03, 2015 at 08:20:57PM +1100, Dewayne Geraghty wrote:
  Is there a preferred way to buildworld without libncursesw?
 
  When I add to /etc/src.conf
  WITHOUT_NCURSESW=yes
 
  I find that a buildworld fails due to missing libncursesw.*.
  So what uses libncurses?  These guys do
  /usr/bin/dialog
  /usr/bin/dpv
   
  /usr/sbin/sade -> /usr/libexec/bsdinstall/partedit
  /usr/sbin/tzsetup
 
  Getting a little frustrated I modified the Makefile, so for example
  dialog (/usr/src/contrib/dialog)
 
  +.include <bsd.own.mk>
  +
  +.if ${MK_NCURSESW} == no
  +DPADD= ${LIBDPV} ${LIBDIALOG} ${LIBFIGPAR} ${LIBNCURSES} ${LIBUTIL} ${LIBM}
  +LDADD= -ldpv -ldialog -lfigpar -lncurses -lutil -lm
  +.else
   DPADD= ${LIBDPV} ${LIBDIALOG} ${LIBFIGPAR} ${LIBNCURSESW} ${LIBUTIL} ${LIBM}
   LDADD= -ldpv -ldialog -lfigpar -lncursesw -lutil -lm
  +.endif
 
  And checking
  # make -VMK_NCURSESW
  no
 
  I'm at a bit of a loss as to why these are proving difficult to build,
  or what I can do to get the desired outcome, i.e. no libncursesw.so*.
  I tried to make this work a while ago and it's not practical.  Instead,
  we need to remove libncurses (or more likely replace it with a linker
  script to cause libncursesw to be used.)
 
  It should be the case that nothing in the base system uses libncurses,
  but it's all too likely that someone has broken that since I switched
  the remaining bits over.
 
  -- Brooks
 Unfortunately I can't say which ones use libncurses as I've sprinkled
 things like this over anything that uses libncursesw
 
 -DPADD= ${LIBDEVSTAT} ${LIBKVM} ${LIBGEOM} ${LIBBSDXML} ${LIBSBUF} ${LIBEDIT} ${LIBNCURSESW}
 -LDADD= -ldevstat -lkvm -lgeom -lbsdxml -lsbuf -ledit -lncursesw
 +DPADD= ${LIBDEVSTAT} ${LIBKVM} ${LIBGEOM} ${LIBBSDXML} ${LIBSBUF} ${LIBEDIT}
 +LDADD= -ldevstat -lkvm -lgeom -lbsdxml -lsbuf -ledit
 
 +.include <bsd.own.mk>
 +
 +.if ${MK_NCURSESW} == no
 +DPADD+= ${LIBNCURSES}
 +LDADD+= -lncurses
 +.else
 +DPADD+= ${LIBNCURSESW}
 +LDADD+= -lncursesw
 +.endif
 +
 
 and only the above 4 programs are more of a challenge.
 
 Any consistency is a good thing, so honouring WITHOUT_NCURSESW should be
 the trigger.  This situation arose because I needed some things in
 /rescue and there was a conflict stuffing both libncurses and
 libncursesw into the /usr/src/rescue build, as you'd expect. :)

I'd forgotten I'd merged WITHOUT_NCURSESW to 10.  That was a mistake as
it turns out to be unmaintainable.  I removed it from head long ago and
it will not return.  Unless you want to fix it and keep fixing it as
things are merged from head we should either remove it or document it as
broken.

-- Brooks
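
To check which of the binaries listed above actually pull in an ncurses variant, one option is to look at the NEEDED entries in each binary's ELF dynamic section. A minimal sketch, assuming `readelf -d` is available and prints lines of the form `(NEEDED) Shared library: [name]`; the function name `ncurses_needed` is made up for illustration and is not part of the base system:

```sh
#!/bin/sh
# Sketch only: filter `readelf -d` output (read from stdin) down to the
# ncurses-family libraries that the binary declares as NEEDED.
# Assumed input line format:
#   0x0000000000000001 (NEEDED)   Shared library: [libncursesw.so.8]
ncurses_needed() {
    # Print just the library name for NEEDED entries starting with "libncurses".
    sed -n 's/.*(NEEDED).*\[\(libncurses[^]]*\)].*/\1/p'
}

# Possible use on a live system (paths taken from the list in the thread):
# for f in /usr/bin/dialog /usr/bin/dpv /usr/sbin/sade /usr/sbin/tzsetup; do
#     printf '%s: ' "$f"
#     readelf -d "$f" | ncurses_needed | tr '\n' ' '
#     echo
# done
```

A binary that reports both libncurses and libncursesw (directly or through libdialog) would be the kind of conflict described for the /rescue build.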




Re: Stale TIME_WAIT tcp connections

2015-03-04 Thread Rumen Telbizov
Hello again,

Thank you for the responses.
No, I don't have any IPSEC in the kernel. Further observations overnight
revealed that:

a) Those stale TIME_WAIT sockets do expire at some time, since I was
watching one of them which seemed to stay around for hours but in the
morning it actually was gone.

b) It seems that both sockets which never get established (only a SYN sent)
and fully established, properly closed ones are being registered and get
stuck there. Here are a couple of examples:

Monitoring the traffic from a specific client host (server IP obfuscated to
1.2.3.4, client IP to 5.6.7.8):

IP 5.6.7.8.43440 > 1.2.3.4.5666: Flags [S], seq 4056322107, win 5840,
options [mss 1460,sackOK,TS val 729030596 ecr 0,nop,wscale 7], length 0
IP 5.6.7.8.43437 > 1.2.3.4.5666: Flags [S], seq 3979308195, win 5840,
options [mss 1460,sackOK,TS val 729031604 ecr 0,nop,wscale 7], length 0

Those are connections that never got established. I picked up and watched
one of those syn-only tuples and it seems like it does allocate and consume
a connection:

# date ; sockstat | grep 5.6.7.8:43440
Wed Mar  4 19:02:24 UTC 2015
??  ? ?  tcp4   1.2.3.4:5666 5.6.7.8:43440
# date ; sockstat | grep 5.6.7.8:43440
Wed Mar  4 19:10:11 UTC 2015
??  ? ?  tcp4   1.2.3.4:5666 5.6.7.8:43440
# date ; netstat -na | grep 5.6.7.8.43440
Wed Mar  4 19:38:56 UTC 2015
tcp4   0  0 1.2.3.4.5666  5.6.7.8.43440 TIME_WAIT


And here's a properly established and closed TCP socket between the same
client and server:

19:14:47.827359 IP 5.6.7.8.33877 > 1.2.3.4.5666: Flags [S], seq 3819001779,
win 5840, options [mss 1460,sackOK,TS val 729095309 ecr 0,nop,wscale 7],
length 0
19:14:47.827390 IP 1.2.3.4.5666 > 5.6.7.8.33877: Flags [S.], seq
2990857548, ack 3819001780, win 65535, options [mss 1436,nop,wscale
6,sackOK,TS val 2460189516 ecr 729095309], length 0
19:14:47.979287 IP 5.6.7.8.33877 > 1.2.3.4.5666: Flags [.], ack 1, win 46,
options [nop,nop,TS val 729095347 ecr 2460189516], length 0
19:14:47.979408 IP 5.6.7.8.33877 > 1.2.3.4.5666: Flags [P.], seq 1:1041,
ack 1, win 46, options [nop,nop,TS val 729095347 ecr 2460189516], length
1040
19:14:47.980136 IP 1.2.3.4.5666 > 5.6.7.8.33877: Flags [F.], seq 1, ack
1041, win 1045, options [nop,nop,TS val 2460189668 ecr 729095347], length 0
19:14:48.132156 IP 5.6.7.8.33877 > 1.2.3.4.5666: Flags [F.], seq 1041, ack
2, win 46, options [nop,nop,TS val 729095386 ecr 2460189668], length 0
19:14:48.132173 IP 1.2.3.4.5666 > 5.6.7.8.33877: Flags [.], ack 1042, win
1045, options [nop,nop,TS val 2460189821 ecr 729095386], length 0


It also gets stuck there for quite a while:

# sockstat | grep 5.6.7.8:33877
??  ? ?  tcp4   1.2.3.4:5666 5.6.7.8:33877
# date ; netstat -na | grep 5.6.7.8.33877
Wed Mar  4 19:16:09 UTC 2015
tcp4   0  0 1.2.3.4.5666  5.6.7.8.33877 TIME_WAIT
# date ; netstat -na | grep 5.6.7.8.33877
Wed Mar  4 19:31:31 UTC 2015
tcp4   0  0 1.2.3.4.5666  5.6.7.8.33877 TIME_WAIT

So naturally the server never manages to get on top of things due to not
discarding those on time.
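
To put numbers on "quite a while", the gap between two same-day `date` samples can be compared with the nominal TIME_WAIT lifetime. A rough sketch, assuming TIME_WAIT is meant to last 2*MSL and that net.inet.tcp.msl is expressed in milliseconds; the helper names `to_seconds`, `elapsed` and `time_wait_seconds` are made up for illustration:

```sh
#!/bin/sh
# Sketch only: compare observed TIME_WAIT age with the nominal 2*MSL.

# Convert HH:MM:SS to seconds since midnight.
to_seconds() {
    # awk treats the fields as decimal numbers, so "09" is 9 (no octal issue).
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Seconds between two timestamps taken on the same day: elapsed START END
elapsed() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# Nominal TIME_WAIT lifetime in seconds for a given MSL in milliseconds
# (assumption: lifetime = 2 * MSL).
time_wait_seconds() {
    awk -v msl="$1" 'BEGIN { print 2 * msl / 1000 }'
}
```

With the samples above, `elapsed 19:16:09 19:31:31` gives 922 seconds for the closed connection and `elapsed 19:02:24 19:38:56` gives 2192 seconds for the SYN-only tuple, while `time_wait_seconds 100` gives 0.2 for the tuned msl value and `time_wait_seconds 30000` gives 60. Under those assumptions both sockets outlived the nominal lifetime by a wide margin.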

Any other ideas and suggestions?

Regards,
Rumen Telbizov

On Tue, Mar 3, 2015 at 5:41 PM, Michael Ross g...@ross.cx wrote:

 On Wed, 04 Mar 2015 01:36:18 +0100, Rumen Telbizov telbi...@gmail.com
 wrote:

  Hello everyone,

 We have a server running 9.3-RELEASE which is exhibiting a high number of
 TIME_WAIT tcp connections which are NOT being recycled. That is, netstat
 reports them over and over again, no matter how long we wait for them to be
 flushed out. Currently this server has been out of rotation for a couple of
 hours and I still see the same tcp sockets there. Overall we have:

 # netstat -na | grep TIME_WAIT | wc -l
*30066*

 Tracking one particular TCP socket in TIME_WAIT proves that it stays there
 all the time.
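
A per-state tally can show whether TIME_WAIT is the only state piling up. A minimal sketch, assuming `netstat -na` prints TCP lines with six columns and the state in the last one, as in the samples in this thread; the function name `tcp_state_counts` is made up for illustration:

```sh
#!/bin/sh
# Sketch only: count TCP sockets per state from `netstat -na` output on stdin.
# Assumed line format:
#   tcp4   0  0 1.2.3.4.5666  5.6.7.8.43440 TIME_WAIT
tcp_state_counts() {
    # Keep lines whose first column starts with "tcp" and that carry a state
    # column, tally by state, then list the most frequent state first.
    awk '$1 ~ /^tcp/ && NF >= 6 { count[$6]++ }
         END { for (s in count) print count[s], s }' | sort -rn
}

# Possible use on a live system:
# netstat -na | tcp_state_counts
```

Running it at intervals and comparing the TIME_WAIT row would show whether the total is shrinking at all.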

 Another observation is that pfctl shows a very large number of state
 entries, even after pfctl -F all, or disable/enable sequence.

 # pfctl -si
 State Table  Total Rate
   current entries  *59280*

 At the same time though:

 # pfctl -ss | wc -l
   18

 After the problem was discovered we tried tweaking the following settings
 without any luck:

 net.inet.tcp.fast_finwait2_recycle=1
 net.inet.tcp.finwait2_timeout=5000
 net.inet.tcp.maxtcptw=5
 net.inet.tcp.msl=100

 So it seems like this system is stuck and doesn't recycle those TCP
 sockets. Again, the machine is out of rotation and not actively accepting
 any traffic. I will keep it like that in case further investigation is
 required. Please do let me know if there's anything else you'd like to know
 from the state of the machine or something I could try.

 Regards,


 Are you using any IPSEC?
 I observed something similar a while back; I haven't checked again since I
 reported this:
 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194690
 It affected 9.2, too.

 Michael




--