Re: [Ganglia-developers] gmond udp receive buffer errors

2012-04-23 Thread Ramon Bastiaans
This is with gmond version 3.3.1, with a simple udp_recv_channel configured like this: udp_recv_channel { port = 8669 } - Ramon. On 23-4-2012 12:03, Ramon Bastiaans wrote: Hi, While troubleshooting another network issue, I enabled the netstats.py module to report udp_rcvbufrerrors.

[Ganglia-developers] gmond udp receive buffer errors

2012-04-23 Thread Ramon Bastiaans
Hi, While troubleshooting another network issue, I enabled the netstats.py module to report udp_rcvbufrerrors. Ironically, it seems to me as if gmond itself is experiencing udp receive buffer errors. When I check /proc/net/udp for drops, amongst other things I see: sl
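
For readers who want to repeat the check described above, here is a minimal sketch of reading the per-socket drop counters from /proc/net/udp. It assumes a Linux kernel whose /proc/net/udp ends each row with a "drops" column; the function name parse_udp_drops and the sample rows (including the inode, pointer and drop values) are made up for illustration and are not output from the original post.

```python
# Illustrative sample in the layout of /proc/net/udp (values are invented).
# Port 0x21DD is 8669 in decimal.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
  10: 00000000:21DD 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 12345 2 ffff88003a8d0000 42
  11: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 12346 2 ffff88003a8d0380 0
"""


def parse_udp_drops(text):
    """Return {local_port: drops} parsed from the text of /proc/net/udp."""
    drops_by_port = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row ("sl ...")
        fields = line.split()
        if len(fields) < 13:
            continue  # row has no drops column (assumption: older kernel)
        # local_address is HEXIP:HEXPORT; the port is the part after the colon.
        local_port = int(fields[1].split(":")[1], 16)
        drops = int(fields[-1])  # drops is the last column
        drops_by_port[local_port] = drops_by_port.get(local_port, 0) + drops
    return drops_by_port


if __name__ == "__main__":
    for port, drops in sorted(parse_udp_drops(SAMPLE).items()):
        print(port, drops)
```

On a live host the same function could be fed the contents of /proc/net/udp instead of the sample text; a non-zero count on the gmond receive port would match the symptom reported here.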

Re: [Ganglia-developers] gmond udp receive buffer errors

2012-04-23 Thread Daniel Pocock
Hi Ramon, Vladimir asked about similar errors on IRC recently. I thought buffer sizes may be an issue, so the 3.3.7 release candidate has logging of RX buffer sizes (it is logged at debug level when gmond starts). It may be interesting and helpful to compare those buffer sizes, system

Re: [Ganglia-developers] gmond udp receive buffer errors

2012-04-23 Thread Ramon Bastiaans
Hi Daniel, Ah ok. Before you sent your email I had already created a small patch for myself. It almost seems that APR ignores the OS settings (i.e. net.core.rmem_default) and creates a socket with its own default (receive) buffer size. Attached is a patch against 3.3.6 for lib/apr_net.c
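
The general technique under discussion, requesting a larger receive buffer on a UDP socket and reading back what the kernel actually granted, can be sketched as follows. This is not the attached patch (which is not reproduced in this listing) and it does not use APR; it is a plain Python illustration using the standard SO_RCVBUF socket option, and the helper name open_udp_socket_with_rcvbuf is hypothetical.

```python
import socket


def open_udp_socket_with_rcvbuf(requested_bytes):
    """Create a UDP socket, request a receive buffer size, report the result.

    Returns (sock, default_bytes, effective_bytes), where default_bytes is
    the size the socket was created with and effective_bytes is what the
    kernel reports after the request.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Size the OS gave the socket before any explicit request.
    default_bytes = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    # Ask for a larger buffer; the kernel may clamp or adjust this value.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    # Read back the size actually in effect.
    effective_bytes = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, default_bytes, effective_bytes


if __name__ == "__main__":
    sock, before, after = open_udp_socket_with_rcvbuf(1024 * 1024)
    print("default:", before, "effective:", after)
    sock.close()
```

On Linux the value read back is normally double the requested size (the kernel reserves room for bookkeeping) and the request is capped by net.core.rmem_max, so comparing the requested and effective values is a quick way to see whether a larger buffer actually took effect.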

Re: [Ganglia-developers] gmond udp receive buffer errors

2012-04-23 Thread Daniel Pocock
On 23/04/12 22:24, Vladimir Vuksan wrote: I was having identical issues. I used your patch with the exception that I bumped up the buffer size first to 10M from the 1M you had. There was a massive improvement but I was still seeing some drops so I just decided to bump it up to 30M and it's even better

Re: [Ganglia-developers] gmond udp receive buffer errors

2012-04-23 Thread Vladimir Vuksan
Right. I have a few VM aggregators as well as physical hardware. VMs have more issues than physical hardware, but physical hardware is still susceptible to loss. This is very evident with metrics that arrive at the same time, e.g. cron-triggered gmetric jobs. Also something unexpected happened. I have two VMs

Re: [Ganglia-developers] gmond udp receive buffer errors

2012-04-23 Thread Vladimir Vuksan
I was having identical issues. I used your patch with the exception that I bumped up the buffer size first to 10M from the 1M you had. There was a massive improvement but I was still seeing some drops so I just decided to bump it up to 30M and it's even better although I still see occasional drops. To