Re: Large number of used ports from tserver

2018-01-26 Thread Adam J. Shook
I checked all tablet servers across all six of our environments and the issue seems to be present in all of them, with some having upwards of 73k connections. I disabled replication in our dev cluster and restarted the tablet servers. I left it running overnight and checked the connections -- a

Re: Large number of used ports from tserver

2018-01-25 Thread Christopher
Interesting. It's possible we're mishandling an IOException from DFSClient or something... but it's also possible there's a bug in DFSClient somewhere. I found a few similar issues from the past... some might not be fully resolved yet: https://issues.apache.org/jira/browse/HDFS-1836

Re: Large number of used ports from tserver

2018-01-25 Thread Michael Wall
What tables/tablets are on that tserver?

On Thu, Jan 25, 2018 at 11:27 AM Adam J. Shook wrote:
> We're running Ubuntu 14.04, HDFS 2.6.0, ZooKeeper 3.4.6, and Accumulo 1.8.1. I'm using `lsof -i` and grepping for the tserver PID to list all the connections. Just now

Re: Large number of used ports from tserver

2018-01-25 Thread Adam J. Shook
We're running Ubuntu 14.04, HDFS 2.6.0, ZooKeeper 3.4.6, and Accumulo 1.8.1. I'm using `lsof -i` and grepping for the tserver PID to list all the connections. Just now there are ~25k connections for this one tserver, 99.9% of which are writing to various DataNodes on port 50010. It's
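For anyone wanting to reproduce the breakdown described above, here is a minimal sketch that tallies one process's TCP connections by state and remote port. It assumes the input is saved text in the usual `lsof -nP -i` column layout (numeric hosts and ports); the function name `tally_connections` and the sample PID are hypothetical, not something taken from this thread.

```python
import re
from collections import Counter

# Matches the NAME column of a connected TCP socket, e.g.
# "TCP 10.0.0.5:40001->10.0.0.9:50010 (CLOSE_WAIT)".
_NAME_RE = re.compile(r"TCP\s+(\S+)->(\S+):(\w+)\s+\((\w+)\)\s*$")


def tally_connections(lsof_text, pid):
    """Count connections held by `pid`, keyed by (state, remote port).

    Assumes each line follows the default lsof layout, where the
    second whitespace-separated column is the PID.
    """
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[1] != str(pid):
            continue  # header line or a different process
        match = _NAME_RE.search(line)
        if match is None:
            continue  # listening sockets and UDP have no "->" peer
        _local, _remote_host, remote_port, state = match.groups()
        counts[(state, remote_port)] += 1
    return counts
```

Feeding it the captured output and the tserver PID, then calling `most_common()` on the result, would show whether a single (state, port) pair such as CLOSE_WAIT on 50010 dominates.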

Re: Large number of used ports from tserver

2018-01-24 Thread Josh Elser
+1 to looking at the remote end of the socket and seeing where the connections are going to or coming from. I've seen a few HDFS JIRA issues filed about sockets left in CLOSE_WAIT. Lucky you, this is a fun Linux rabbit hole to go down :)

Re: Large number of used ports from tserver

2018-01-24 Thread Christopher
I haven't seen that, but I'm curious what OS, Hadoop, ZooKeeper, and Accumulo version you're running. I'm assuming you verified that it was the TabletServer process holding these TCP sockets open using `netstat -p` and cross-referencing the PID with `jps -ml` (or similar)? Are you able to confirm
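The cross-referencing step mentioned above could be sketched roughly as follows. This assumes the socket listing was captured as text from something like `netstat -tnp` (the Linux layout where the last column is `PID/Program name`) and the process listing from `jps -ml` (one `<pid> <main class and args>` entry per line). The helper names `sockets_by_pid` and `label_pids` are made up for illustration.

```python
from collections import Counter


def sockets_by_pid(netstat_text, state="CLOSE_WAIT"):
    """Count TCP sockets in `state` per owning PID."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local foreign state pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != state:
            continue
        pid = fields[6].split("/", 1)[0]
        if pid.isdigit():  # "-" appears when the owner is not visible
            counts[int(pid)] += 1
    return counts


def label_pids(counts, jps_text):
    """Attach the jps description to each PID found in `counts`."""
    names = {}
    for line in jps_text.splitlines():
        pid, _, rest = line.strip().partition(" ")
        if pid.isdigit():
            names[int(pid)] = rest
    return {pid: (n, names.get(pid, "<unknown>")) for pid, n in counts.items()}
```

A PID whose label comes back as `<unknown>` is simply not a JVM visible to `jps` for that user, which is itself a useful hint when narrowing down the owner.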

Large number of used ports from tserver

2018-01-24 Thread Adam J. Shook
Hello all, has anyone come across an issue with a TabletServer occupying a large number of ports in a CLOSE_WAIT state? The 'normal' number of used ports on a 12-node cluster is around 12,000 to 20,000. In one instance, there were over 68k and it was affecting other applications from