On Dec 22, 2017, at 7:46 AM, Olivier R. <m...@grammalecte.net> wrote:
> 
> On 19/12/2017 at 21:49, Warren Young wrote:
> 
>> If it’s a sign of a bug, then it means something very bad has
>> happened, like the network stack has lost track of its client
>> somehow.  To see that, you’d need to do a network capture on that
>> fossil instance’s network sockets.  Use netstat -nap or lsof -i to
>> find out which TCP ports those are.
> 
> There are now more than 24 Fossil subprocesses running, and it’s getting 
> really slow.
> 
> netstat -na | grep :8080
> 
> tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN
> tcp        0      0 10.2.148.67:8080        125.46.156.43:59187 ESTABLISHED
> tcp        0      0 10.2.148.67:8080        175.42.2.225:51516 ESTABLISHED

Two things I learn from this:

1. Your repo is public-facing.  Is this a reasonable number of clients to be 
connected at any given time to this repo?  It seems high to me, given the 
transient nature of most Fossil connections.

2. I don’t see many WAIT states, so we’re probably not looking at the sort of 
weird bug described in that Cloudflare blog post I linked to earlier.
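
If you’d rather count the states than eyeball them, something along these 
lines should do it.  This is only a sketch: count_states is a name I made up 
for illustration, and it assumes the Linux netstat column layout shown above, 
with the local address in the fourth column and the state in the sixth.

```shell
# Tally TCP states for sockets whose local address ends in the given port.
# Reads `netstat -na`-style lines on stdin.
# count_states is a hypothetical helper, not part of netstat or Fossil.
count_states() {
    port="$1"
    awk -v port="$port" '
        # Keep TCP rows whose local address (column 4) ends in ":<port>".
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) print n[s], s }
    ' | sort -rn
}

# Usage on a live system:
#   netstat -na | count_states 8080
```

A pile of WAIT-type states at the top of that list would point back toward 
the bug theory; mostly ESTABLISHED points away from it.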

> After clicking on “timeline” on the web ui.
> 
> netstat -nap | grep :8080

The -p in that command was intended to let you find the Fossil process by name, 
rather than by port.  So, “netstat -nap | grep fossil” rather than grepping for 
the port.

I wouldn’t expect the output to differ much from what you got.

> lsof -i

Here again I expected you to look at the docs and infer that I meant for you to 
filter for only the interesting port, 8080 in your case:  lsof -i :8080

But since we have all the TCP/IP connections, let’s see if we can learn 
something interesting from it.

> fossil2 10468 myuser    3u  IPv4 35083942      0t0  TCP *:http-alt (LISTEN)
> fossil2 10469 myuser    3u  IPv4 35083942      0t0  TCP *:http-alt (LISTEN)
> fossil2 10470 myuser    3u  IPv4 35083942      0t0  TCP *:http-alt (LISTEN)

etc.  This is odd.  It looks like Fossil is forking off children which listen 
and then never get anything.  Yet, we see the same PID having many connections 
each already established.
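
To put a number on how many separate Fossil processes are sitting on that 
listening socket, a rough sketch like this would work.  count_listeners is a 
made-up name, and it assumes the lsof layout quoted above, with the PID in the 
second column and “(LISTEN)” as the last field.

```shell
# Count distinct PIDs that hold a socket in the LISTEN state.
# Reads `lsof -i`-style lines on stdin.
# count_listeners is a hypothetical helper, not part of lsof or Fossil.
count_listeners() {
    awk '
        $NF == "(LISTEN)" { pids[$2] = 1 }   # column 2 is the PID
        END { n = 0; for (p in pids) n++; print n }
    '
}

# Usage on a live system (-n and -P skip host and port name lookups):
#   lsof -nP -i :8080 | count_listeners
```

Running it a few times a minute apart would show whether that count keeps 
climbing.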

What’s the TCP connection rate to this machine?

There must be nice tools for that which a network security admin would know 
about, but my off-the-cuff programmer brain brings up only this:

   $ sudo tshark -a duration:1 -f 'tcp port 8080 and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn' | wc -l

That captures for one second and prints one line per bare SYN, so the count is 
the number of new connection attempts per second.  Consider increasing the 
duration to 10 seconds or so, and dividing the count accordingly, to get a 
better baseline if the connection rate is in the single digits.

> It seems some connections are just never closed for some reason.

Right, which makes me wonder if you’ve got some botnet on your hands, banging 
on the server for no good reason.

That’s why I asked for the connection rate.
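
A quick way to test the botnet theory from the netstat side is to see whether 
a handful of remote addresses hold most of the established connections.  The 
following is a sketch under the same assumptions as before: top_remotes is a 
name I invented, and it expects Linux netstat -na output with the remote 
address in the fifth column.

```shell
# List remote addresses by number of ESTABLISHED connections to a local port.
# Reads `netstat -na`-style lines on stdin.
# top_remotes is a hypothetical helper, not part of netstat or Fossil.
top_remotes() {
    port="$1"
    awk -v port="$port" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" && $4 ~ (":" port "$") {
            addr = $5
            sub(/:[0-9]+$/, "", addr)   # drop the remote port, keep the host
            n[addr]++
        }
        END { for (a in n) print n[a], a }
    ' | sort -rn
}

# Usage on a live system:
#   netstat -na | top_remotes 8080 | head
```

If a few addresses dominate the list, that’s worth a closer look before 
blaming Fossil.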
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users