On Dec 22, 2017, at 7:46 AM, Olivier R. <m...@grammalecte.net> wrote:
>
> On 19/12/2017 at 21:49, Warren Young wrote:
>
>> If it’s a sign of a bug, then it means something very bad has
>> happened, like the network stack has lost track of its client
>> somehow. To see that, you’d need to do a network capture on that
>> fossil instance’s network sockets. Use netstat -nap or lsof -i to
>> find out which TCP ports those are.
>
> There is now more than 24 subprocesses of Fossil running, and it’s getting
> really slow.
>
> netstat -na | grep :8080
>
> tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN
> tcp 0 0 10.2.148.67:8080 125.46.156.43:59187 ESTABLISHED
> tcp 0 0 10.2.148.67:8080 175.42.2.225:51516 ESTABLISHED

Two things I learn from this:

1. Your repo is public-facing. Is this a reasonable number of clients to be connected at any given time to this repo? It seems high to me, given the transient nature of most Fossil connections.

2. I don’t see many WAIT states, so we’re probably not looking at the sort of weird bug described in that Cloudflare blog post I linked to earlier.

> After clicking on “timeline” on the web ui.
>
> netstat -nap | grep :8080

The -p in that command was intended to let you find the Fossil process by name, rather than by port. So, “netstat -nap | grep fossil” rather than grepping for the port. I’d expect it to give no big difference relative to what you got.

> lsof -i

Here again I expected you to look at the docs and infer that I meant for you to filter down to only the interesting port, 8080 in your case: -i:8080.

But since we have all the TCP/IP connections, let’s see if we can learn something interesting from it.

> fossil2 10468 myuser 3u IPv4 35083942 0t0 TCP *:http-alt (LISTEN)
> fossil2 10469 myuser 3u IPv4 35083942 0t0 TCP *:http-alt (LISTEN)
> fossil2 10470 myuser 3u IPv4 35083942 0t0 TCP *:http-alt (LISTEN)

etc.

This is odd. It looks like Fossil is forking off children which listen and then never get anything. Yet, we see the same PID having many connections each already established.

What’s the TCP connection rate to this machine? There must be nice tools for that which a network security admin would know about, but my off-the-cuff programmer brain brings up only this:

    $ sudo tshark -b duration:1 port 8080 and tcp.flags.syn==1 | wc -l

Ignore the complaint about “multiple capture files”. We’re just wanting to know how many SYNs per second appear. Consider increasing the duration to 10 seconds or so to get a better baseline if the connection rate is in the single digits.

> It seems some connections are just never closed for some reason.

Right, which makes me wonder if you’ve got some botnet on your hands, banging on the server for no good reason.
That’s why I asked for the connection rate.
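
To make the "how many WAIT states versus ESTABLISHED" question easier to answer at a glance, something along these lines might help. It is only a sketch: the function name count_states is made up for illustration, and it assumes Linux-style "netstat -na" output, where the local address is the fourth column and the TCP state is the sixth.

```shell
# Tally TCP connection states for one local port.
# Reads `netstat -na` style output on stdin; the port is the first argument.
# Assumption: column 4 is the local address, column 6 is the state.
count_states() {
  port="$1"
  awk -v port="$port" '
    # Keep only tcp/tcp6 rows whose local address ends in ":<port>".
    $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
    # Print one "STATE count" line per state seen.
    END { for (s in n) print s, n[s] }
  ' | sort
}
```

It would be used as "netstat -na | count_states 8080". Fed the three lines quoted above, it should report two ESTABLISHED and one LISTEN. A large and growing ESTABLISHED count with few WAIT states would fit the picture of connections that are never closed.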