Right. I have a few VM aggregators as well as physical hardware. The VMs have 
more issues than the physical hardware, but both are susceptible to loss. This 
is most evident with metrics that arrive at the same time, e.g. cron-triggered 
gmetric jobs.

Also, something unexpected happened. I have two VMs that form a pair, i.e. all 
nodes send metrics to both, so that if one fails we still have metrics. I 
upgraded one of them (call it aggregator2). I did not touch aggregator1, yet 
the UDP errors vanished on aggregator1 as well. Puzzling.
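
For anyone who wants to compare the UDP error counters on both aggregators, 
here is a rough sketch of how they could be read. It assumes a Linux host, 
where the kernel exposes them in /proc/net/snmp (the same numbers that 
"netstat -su" reports). The function names are made up for illustration and 
nothing here is part of Ganglia.

```python
def parse_udp_counters(snmp_text):
    """Return the Udp counters from /proc/net/snmp-style text as a dict.

    The file holds pairs of lines per protocol: a header line with the
    counter names and a value line, both prefixed with "Udp:".
    """
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]  # does not match "UdpLite:"
    if len(udp_lines) < 2:
        raise ValueError("no Udp header/value pair found")
    header, values = udp_lines[0], udp_lines[1]
    # Skip the leading "Udp:" token on both lines.
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def read_udp_counters(path="/proc/net/snmp"):
    """Read the live counters (assumption: Linux procfs is available)."""
    with open(path) as f:
        return parse_udp_counters(f.read())
```

Calling read_udp_counters() on each aggregator a minute apart and comparing 
InErrors and RcvbufErrors shows whether drops are still accumulating. A 
growing RcvbufErrors value points at a full socket receive buffer.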

Vladimir

On Mon, 23 Apr 2012, Daniel Pocock wrote:

>
>
> On 23/04/12 22:24, Vladimir Vuksan wrote:
>> I was having identical issues. I used your patch, except that
>> I first bumped the buffer size up to 10M from the 1M you had. There was a
>> massive improvement, but I was still seeing some drops, so I decided to
>> bump it up to 30M. It's even better, although I still see occasional
>> drops.
>
> If you have such a big buffer, then you could also have latency issues,
> as it suggests your CPU is just not able to process all the work in time
>
> You would either need to revise the workload (by splitting clusters,
> etc) or re-write gmond to be multithreaded (so it can use more cores)
>
>
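
To illustrate the buffer-size change discussed in the quoted text, here is a 
minimal sketch of requesting a larger receive buffer on a UDP socket. This is 
not the patch from this thread and not gmond's code (gmond is written in C); 
the function name and the loopback bind address are assumptions made for the 
example.

```python
import socket


def open_udp_receiver(port, rcvbuf_bytes):
    """Open a UDP socket and request a receive buffer of rcvbuf_bytes.

    Returns the socket and the buffer size the kernel actually granted,
    which can differ from the request.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Ask for the larger buffer before traffic starts arriving.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    # Loopback only in this sketch; a real collector would bind elsewhere.
    sock.bind(("127.0.0.1", port))
    # Read the value back: the kernel may adjust or cap the request.
    effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, effective
```

On Linux the kernel doubles the requested value for bookkeeping and caps it 
at net.core.rmem_max, so asking for 10M or 30M has no effect unless that 
sysctl has been raised as well. Reading the value back, as above, shows what 
was really granted.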

_______________________________________________
Ganglia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-developers