i managed to solve this. i don't know why, exactly, but loading only the snmp, syslog, and write_graphite plugins was the culprit. adding one of the following has stopped the CLOSE_WAIT situation, even though i am not using any of them:

+LoadPlugin syslog
+LoadPlugin cpu
+LoadPlugin interface
+LoadPlugin load
+LoadPlugin memory
+LoadPlugin network

can anyone explain that to me?

On Thu, Oct 17, 2013 at 5:49 PM, ryanL <[email protected]> wrote:
> heya. i've compiled 5.4 for linux (centos) at commit 0a161fcfd, and
> seem to be having a problem that does not exist at 5.1.
>
> my collectd is pretty barebones, just doing snmp polling against
> network devices every 60s. when first starting it up, i get an
> established TCP connection to my graphite collector and values get
> written. then, we get stuck. i can see in tcpdump that collectd is
> polling the network and getting values, but can't write to graphite.
>
> i see this:
>
> # while sleep 1; do pgrep collectd | xargs sudo /usr/sbin/lsof -Pnp |
> grep TCP; done
> collectd 7996 produser 10u IPv4 36198298 0t0
> TCP 10.1.12.2:53798->10.101.3.213:2003 (CLOSE_WAIT)
> collectd 7996 produser 10u IPv4 36198298 0t0
> TCP 10.1.12.2:53798->10.101.3.213:2003 (CLOSE_WAIT)
> collectd 7996 produser 10u IPv4 36198298 0t0
> TCP 10.1.12.2:53798->10.101.3.213:2003 (CLOSE_WAIT)
> collectd 7996 produser 10u IPv4 36198298 0t0
> TCP 10.1.12.2:53798->10.101.3.213:2003 (CLOSE_WAIT)
> collectd 7996 produser 10u IPv4 36198298 0t0
> TCP 10.1.12.2:53798->10.101.3.213:2003 (CLOSE_WAIT)
>
> it stays in this state forever until i restart collectd. upon doing so
> i'll get one initial blast of collected data, and then we're jammed
> again.
>
> my relevant collectd config:
>
> <Plugin write_graphite>
>   <Carbon>
>     Host "graphite-collector"
>     Port "2003"
>     Protocol "tcp"
>     Prefix "collectd."
>     StoreRates false
>     AlwaysAppendDS false
>     Postfix ""
>     EscapeCharacter "_"
>   </Carbon>
> </Plugin>
>
> on the collectd 5.1 box, it remains like this:
>
> $ while sleep 1; do pgrep collectd | xargs sudo /usr/sbin/lsof -Pnp |
> grep TCP; done
> collectd 5638 root 9u IPv4 561435650 0t0 TCP
> 10.101.3.9:51249->10.101.3.213:2003 (ESTABLISHED)
> collectd 5638 root 9u IPv4 561435650 0t0 TCP
> 10.101.3.9:51249->10.101.3.213:2003 (ESTABLISHED)
> collectd 5638 root 9u IPv4 561435650 0t0 TCP
> 10.101.3.9:51249->10.101.3.213:2003 (ESTABLISHED)
> collectd 5638 root 9u IPv4 561435650 0t0 TCP
> 10.101.3.9:51249->10.101.3.213:2003 (ESTABLISHED)
> collectd 5638 root 9u IPv4 561435650 0t0 TCP
> 10.101.3.9:51249->10.101.3.213:2003 (ESTABLISHED)
> collectd 5638 root 9u IPv4 561435650 0t0 TCP
> 10.101.3.9:51249->10.101.3.213:2003 (ESTABLISHED)
> collectd 5638 root 9u IPv4 561435650 0t0 TCP
> 10.101.3.9:51249->10.101.3.213:2003 (ESTABLISHED)
> collectd 5638 root 9u IPv4 561435650 0t0 TCP
> 10.101.3.9:51249->10.101.3.213:2003 (ESTABLISHED)
>
> any ideas, or further info i can give you guys?
>
> thanks!
>
> ryan

_______________________________________________
collectd mailing list
[email protected]
http://mailman.verplant.org/listinfo/collectd
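
for anyone reading this later: CLOSE_WAIT in the lsof output means the remote end (here the graphite side on port 2003) has already closed the connection, while the local process still holds its socket open. the sketch below only illustrates that general TCP behaviour on loopback; it is not collectd code and does not explain why loading extra plugins changes anything. the names peer_has_closed and demo are made up for this example.

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if the remote end has closed its side of the connection."""
    # Poll without blocking: a closed peer makes the socket readable (EOF).
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending, so the connection still looks open
    try:
        # MSG_PEEK leaves real data in the buffer; b"" means the peer sent FIN.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionResetError:
        return True


def demo():
    """Open a loopback connection, let the 'server' hang up, and check both states."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    before = peer_has_closed(client)

    # The remote end hangs up. Until client.close() is called, the client
    # socket sits in CLOSE_WAIT, which is what lsof reported above.
    server_side.close()
    select.select([client], [], [], 2.0)  # wait briefly for the FIN to arrive
    after = peer_has_closed(client)

    client.close()  # closing the local socket is what ends CLOSE_WAIT
    listener.close()
    return before, after


if __name__ == "__main__":
    print(demo())
```

a writer that never performs a check like this (or never acts on a failed write by closing and reconnecting) would stay stuck in that state until restarted, which at least matches the symptom described.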
