There are two knobs that, together, throttle the agent processes. These are httpConnector.maxPostSize and httpConnector.minPostInterval.
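To make the arithmetic concrete, here is a minimal Python sketch of how the two settings bound an agent's send rate. It assumes maxPostSize is in bytes and minPostInterval is in milliseconds, and it uses the defaults as best recalled in this thread (roughly 2 MB per post, 5 seconds between posts); neither has been checked against the 0.3.0 source. The function names are hypothetical helpers for illustration, not part of Chukwa.

```python
SECONDS_PER_DAY = 86_400


def max_agent_bandwidth(max_post_size_bytes: int, min_post_interval_ms: int) -> float:
    """Upper bound on one agent's send rate in bytes/second.

    At most one post of max_post_size_bytes goes out per min_post_interval_ms.
    """
    if min_post_interval_ms <= 0:
        raise ValueError("min_post_interval_ms must be positive")
    return max_post_size_bytes * 1000 / min_post_interval_ms


def required_rate(bytes_per_day: float) -> float:
    """Average bytes/second needed to ship bytes_per_day within one day."""
    return bytes_per_day / SECONDS_PER_DAY


# Assumed defaults: 2 MB post size (taken as 2,000,000 bytes), 5000 ms interval.
ASSUMED_MAX_POST_SIZE = 2_000_000

for interval_ms in (5000, 1000, 100, 1):
    cap = max_agent_bandwidth(ASSUMED_MAX_POST_SIZE, interval_ms)
    print(f"minPostInterval={interval_ms:>5} ms -> cap {cap / 1e6:.1f} MB/s per agent")

# One machine's log volume from the original message: 80 GB/day (GB taken as 10**9 bytes).
print(f"average needed per agent: {required_rate(80 * 10**9) / 1e6:.2f} MB/s")
```

Comparing the per-agent cap against the average rate a machine has to sustain gives a quick check on whether the throttle, rather than the collectors or HDFS, could be the limit.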
The maximum configured agent bandwidth is the ratio between those. I would try reducing the min post interval. The defaults are, if I remember right, something like 2 MB / 5 seconds = 400 KB/sec. You can crank that down a long ways. Nothing should explode even if you set it to 1 ms.

--Ari

On Fri, Aug 13, 2010 at 9:11 AM, Eric Fiala <e...@fiala.ca> wrote:
> Hello all,
> We would like to bring our production Chukwa (0.3.0) infrastructure to the
> next level.
> Currently, we have 5 machines generating 400GB per day (80GB in single log,
> per machine).
> These are using chukwa-agent CharFileTailingAdaptorUTF8. Of
> note, chukwaAgent.fileTailingAdaptor.maxReadSize has been upped to 4000000.
> We've left httpConnector.maxPostSize at its default.
> The agents are sending to 3 chukwa-collectors which are simply gateways into
> HDFS (one also handles demux/processing - but this doesn't appear to be the
> wall... yet). The agents have all three collectors listed in their conf.
> We are hitting walls somewhere, the whole 400GB is worked all the way into
> our repos over the course of the day, but during peaks we are falling
> upwards of 1-2 hours behind between being written to the tailed log and
> hitting hdfs://chukwa/logs as a .chukwa.
> Further we have observed that hdfs://chukwa/logs in our setup does not fill
> faster than 2GB per 5 minute period. This is whether we use 2 chukwa
> collectors or 3. This is further discouragement once foreseeable growth
> takes us to over ~ 575GB per day.
> All the machines are definitely not load bound, have noticed that chukwa was
> built with low resource utilization in mind - one thought is if this could
> be tweaked we could probably get more data through quicker.
> We have toyed with changing default Xmx or like value but don't want to
> start turning too many knobs before consulting the experts, considering all
> the pieces involved it's probably wise. Scaling out is also an option, but
> I'm determined to squeeze x10 or more than current out of these multicore
> machines.
> Any suggestions are welcome,
> Thanks.
> EF

--
Ari Rabkin asrab...@gmail.com
UC Berkeley Computer Science Department