hi,

> On Jul 12, 2016, at 12:05 PM, Willy Tarreau <[email protected]> wrote:
>
> Hi Ruoshan,
>
> On Tue, Jul 05, 2016 at 12:28:43PM +0800, Ruoshan Huang wrote:
>> hi,
>> I use the hrsp4xx/hrsp5xx a lot to do time-series metric collecting,
>> but sometimes these counters lack detail. I'm wondering what the
>> downside would be of collecting all the HTTP statuses (just an array
>> of length 500? I know there are many empty slots in it).
>
> What I'm seeing is that it will consume 4 more kB of memory in each
> proxy. That can be a bit problematic for people who use a *lot* of
> backends. The largest configs I've been aware of were 300000 backends
> (yes I know that sounds insane). This would add 1.2 GB of RAM just
> to add these extra counters. We could imagine dynamically storing
> them into a tree but it would consume so much more per individual
> entry that it would not be worth it.
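Just to sanity-check the arithmetic above — a quick sketch, assuming one 64-bit counter per status slot (the 500-slot and 300000-backend figures are the ones quoted):

```python
# Rough memory math for a flat per-proxy status-code array.
# Assumption: one 64-bit counter per slot, covering codes 100..599.
COUNTER_SIZE = 8            # bytes per 64-bit counter
STATUS_SLOTS = 500          # codes 100..599, mostly empty slots
BACKENDS = 300_000          # the extreme config size mentioned above

per_proxy = STATUS_SLOTS * COUNTER_SIZE   # 4000 bytes, ~4 kB per proxy
total_gb = per_proxy * BACKENDS / 1e9     # ~1.2 GB across all backends
print(per_proxy, total_gb)
```

So the 4 kB per proxy and 1.2 GB total indeed line up.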
Thank you so much for elaborating in such detail! Indeed I only care about
20 or so status codes, like 400, 401, 403 (I have been bitten many times by
treating those codes as 404 :( ), and the 50x codes also help a lot in our
alerting system as a first signal that something is abnormal. By and large,
the "big" array does waste a lot of memory (I know), but it was the fastest
way to put a demo together.

>> Currently I have to do the accounting on the log file, but on a
>> high-load machine it costs a lot to process the log itself.
>
> Well, on my local machine, halog takes only 0.98 second to process a 2.3 GB
> file and give me the status distribution (8.4 million lines). That's 2.4 GB
> per second or 8.6 million lines per second. This is not something I would
> consider as a high load, because if you rotate your log files past 2 GB,
> you'd eat less than 1 second of CPU for each analysis. It's probably less
> than what can be lost per hour due to cache misses when adding 4kB per
> proxy to a large config in fact :-/

I haven't used `halog` for parsing; I will try it. But I prefer collecting
metrics remotely: there are many hosts, and the stats socket can dump the
stick table (the stick table is really amazing, thank you for developing
it!).

>> There are ways like logstash or logging to a remote machine, but doing
>> the accounting in HAProxy seems to be a very lightweight approach.
>> I have a draft patch for this feature, just adding a stats socket command
>> like `show httprsp <code start> <code end> [backend]`. I wish to get some
>> ideas on this. Thanks.
>
> I'm not opposed to trying to store a bit more stats, but as you saw we'd
> be eating a lot of memory to store a lot of holes. Could you please do me
> a favor, and count the number of different status codes you're seeing on
> your side?
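Sure. For reference, this is roughly how I do the offline accounting on the log file today — a minimal sketch, where the whitespace-split field position (`field=10`) and the sample layout are assumptions for illustration, not HAProxy's actual log format:

```python
from collections import Counter

def status_distribution(lines, field=10):
    """Count each HTTP status code seen in an iterable of log lines,
    taking the code from a fixed whitespace-separated field.
    The field index is an assumption; adjust it to the real log format."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) > field and parts[field].isdigit():
            counts[int(parts[field])] += 1
    return counts
```

In practice a proxy sees only ~20 distinct codes, so the resulting Counter stays tiny compared to a fixed 500-slot array.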
> There's a compact variant of the ebtree which saves two pointers, and
> which would require only 2 pointers, a key and a value per node, hence
> around 26 bytes, or 18 or even 14 with relative addressing if we limit the
> number of different statuses per proxy. Maybe this combined with dynamic
> allocation to move that part out of the struct proxy would make sense
> because 1) it would eat much less memory per proxy for small numbers of
> statuses (eg: break-even around 150 different statuses) and would not
> eat memory for backends with little or no traffic. It would also allow
> us to remove the rsp[6] array since we would be able to compute these
> values by walking over the tree.

At first I was wondering whether we could track the response status in a
stick table; that might be neat, but currently there is no
`http-response track-sc?` directive. Could there be one? Using a dynamic
ebtree would eventually amount to recreating a tracking table, or something
like the stick table, so the same question applies here: is it possible to
do stick-counter tracking in the response phase, so that more things can be
tracked?

--
Good day!
ruoshan
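P.S. A quick check of the break-even figure quoted above, assuming the ~26-byte compact ebtree node size against the 4 kB flat array:

```python
FLAT_ARRAY_BYTES = 500 * 8   # the full per-proxy status array, 4000 bytes
NODE_BYTES = 26              # 2 pointers + key + value per compact node

# The tree wins while node_count * NODE_BYTES < FLAT_ARRAY_BYTES,
# i.e. up to ~150 distinct status codes per proxy.
break_even = FLAT_ARRAY_BYTES // NODE_BYTES
print(break_even)            # 153
```

That matches the "break-even around 150 different statuses" estimate, and most proxies see far fewer distinct codes than that.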

