> Agreed. For this reason we've recently extended the log format to optionally
> include new fields such as the outgoing source IP:port. I would not have
> expected another request to come that soon :-)

Quite the coincidence. I was working in a haproxy-1.4 checkout, so I
missed the new logging commits.

> This is really nice that you did that, as it's been on my todo list for
> a long time. But I have comments on this before I can merge it :

Awesome, thanks. This is my first time working with the haproxy
codebase and I wasn't sure if I was on the right track. Definitely
appreciate the feedback.

>
> 1) I'd rather use "debug sessions" than "show events" because till now
>   "show anything" only lists a finite list, and "events" can be a lot
>   of things.

I was initially using "stream", then "events". I considered "show sess
events" but "show events" was simpler to implement.

I agree that re-using the "show" prefix is confusing, so I'll switch to
"debug sessions" (or maybe "stream sessions"?).

>
> 2) it looks to me like the stats socket is still usable in parallel to the
>   command. If so, we could as well add an "undebug all" command to stop
>   the debugging. But I'm not requesting that for now, it's more out of
>   curiosity.

I wasn't quite sure what to do here either. Initially, once streaming
was enabled you would have to reconnect to do anything else. In the
current implementation, if a new command is received (or invalid input
with a newline), streaming is turned off and we go back to processing
commands.

Right now it doesn't make sense to use any other commands while
streaming, because there would be no way to separate their output from
the event stream. But if we plan on adding more streaming/debug
subcommands, then an "undebug all" makes a lot of sense.

While streaming, it probably makes sense to disable commands like
"show sess" so they do not interfere with the streaming output.
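As a rough sketch of that behaviour (the state names below are stand-ins for the stats applet's real st0 values, not haproxy's actual code):

```c
#include <string.h>

/* Minimal sketch of "any input cancels streaming": while in the events
 * state, any complete input line drops back to command mode. These enum
 * values are illustrative stand-ins, not haproxy's real st0 states. */
enum cli_state { CLI_PROMPT, CLI_EVENTS };

static enum cli_state cli_handle_line(enum cli_state st, const char *line)
{
	if (st == CLI_EVENTS)     /* any input while streaming stops it */
		return CLI_PROMPT;
	if (strcmp(line, "debug sessions") == 0)
		return CLI_EVENTS;
	return CLI_PROMPT;        /* other commands dispatched elsewhere */
}
```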

> I would personally like to see 3 steps instead of 2 per session :
>     - incoming session accepted
>     - forward to server
>     - closed session to client
>
>   We could then have a single letter char instead of +/- which would report
>   the state (A/F/C) :
>
>    A 1 127.0.0.1:50869 - 127.0.0.1:9418
>    F 1 127.0.0.1:50869 - 127.0.0.1:9418 | 127.0.0.1:50870 - 127.0.0.1:6000
>    C 1 127.0.0.1:50869 - 127.0.0.1:9418

I like this.

I'm not sure how far to go in this direction, though. It would be
useful to include many of the details from the "show sess" command,
at the very least the frontend/backend/srv names. It probably also
makes sense to include the id that "show sess <id>" uses, so you can
look up detailed session information from an incoming event.

For my use case the bare minimum I needed was the proxied and
disconnected events, so that's where I started.
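To make the idea concrete, here's a rough sketch of how such an event line could be built; format_event() and its parameters are made up for illustration, but the output matches the A/F/C examples above:

```c
#include <stdio.h>

/* Hypothetical helper building one A/F/C event line, including the
 * numeric session id so it can be fed back into "show sess <id>".
 * The forward ('F') step appends the backend-side address pair. */
static int format_event(char *buf, size_t len, char state, unsigned int id,
                        const char *cli, const char *frt,
                        const char *bck, const char *srv)
{
	if (state == 'F')
		return snprintf(buf, len, "%c %u %s - %s | %s - %s",
		                state, id, cli, frt, bck, srv);
	return snprintf(buf, len, "%c %u %s - %s", state, id, cli, frt);
}
```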

>> diff --git a/src/dumpstats.c b/src/dumpstats.c
>> index f680134..959b114 100644
>> --- a/src/dumpstats.c
>> +++ b/src/dumpstats.c
> (...)
>> +void stats_event_new_session(struct session *s)
>> +{
>> +     struct session *curr;
>> +     struct stream_interface *si;
>> +     int ret;
>> +
>> +     char addrs[4][INET6_ADDRSTRLEN + strlen(":65535") + 1] = {"","","",""};
>> +     int i, fd, port;
>> +     const void *addr;
>> +     struct sockaddr_storage sock;
>> +     socklen_t addr_size;
>> +
>> +     for(i = 0; i < 4; i++) {
>> +             fd = s->si[ i/2 ].fd;
>> +
>> +             addr_size = sizeof(sock);
>> +             if (!(i%2==0 ? getpeername : getsockname)(fd, (struct sockaddr *)&sock, &addr_size)) {
>
> Here it's a waste of CPU cycles to call getpeername/getsockname 4 times
> for addresses we generally already have. You have everything in the
> stream_interface's addr (.from, .to). The local address is not always
> filled, you have to check the session's flags for SN_FRT_ADDR_SET and
> SN_BCK_ADDR_SET and call the associated functions accordingly prior
> to use the addresses. The advantage is that the getpeername/getsockname
> are already performed once in all of a session's life.

I saw cli_addr was available in the session, but wasn't sure about the
others. I'll fix this up now that I know where the addresses are
stored, and call getpeername only once when necessary.
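For reference, the caching pattern could look roughly like this; the struct, flag, and function names are simplified stand-ins for haproxy's SN_FRT_ADDR_SET machinery, not the real code:

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Stand-in flag for haproxy's SN_FRT_ADDR_SET */
#define FRT_ADDR_SET 0x1

struct mini_session {
	unsigned int flags;
	struct sockaddr_storage frt_addr;
	int lookups;   /* counts the "expensive" calls, for illustration */
};

/* stand-in for the real getpeername() call */
static void fetch_frt_addr(struct mini_session *s)
{
	struct sockaddr_in *in = (struct sockaddr_in *)&s->frt_addr;

	in->sin_family = AF_INET;
	in->sin_port = htons(50869);
	s->lookups++;
}

/* resolve at most once per session, then serve from the cache */
static const struct sockaddr_storage *get_frt_addr(struct mini_session *s)
{
	if (!(s->flags & FRT_ADDR_SET)) {
		fetch_frt_addr(s);
		s->flags |= FRT_ADDR_SET;
	}
	return &s->frt_addr;
}
```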

>
>> +                     switch (sock.ss_family) {
>> +                     case AF_INET:
>> +                             addr = (const void *)&((struct sockaddr_in *)&sock)->sin_addr;
>> +                             port = ntohs(((struct sockaddr_in *)&sock)->sin_port);
>> +                             inet_ntop(sock.ss_family, addr, addrs[i], sizeof(addrs[i]));
>> +                             snprintf(addrs[i], sizeof(addrs[i]), "%s:%d", addrs[i], port);
>> +                             break;
>> +
>> +                     case AF_INET6:
>> +                             addr = (const void *)&((struct sockaddr_in6 *)&sock)->sin6_addr;
>> +                             port = ntohs(((struct sockaddr_in6 *)&sock)->sin6_port);
>> +                             inet_ntop(sock.ss_family, addr, addrs[i], sizeof(addrs[i]));
>> +                             snprintf(addrs[i], sizeof(addrs[i]), "%s:%d", addrs[i], port);
>> +                             break;
>> +
>> +                     case AF_UNIX:
>> +                     default:
>> +                             sprintf(addrs[i], "%s", "unknown");
>
> In case of UNIX sockets, check how it's done in the logs, in short we log
> "unix:$listener_id" on the frontend and nothing right now on the backend,
> but if one day we support it it will probably be the destination socket
> path.

Cool. I'll copy the format from the logging code.
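A rough sketch of what the per-family formatting might look like with the "unix:<listener id>" fallback; all names here are illustrative. Writing into a separate output buffer also sidesteps the overlapping snprintf(addrs[i], ..., "%s:%d", addrs[i], ...) in my patch above, which is undefined behaviour:

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Hypothetical per-family address formatter following the log format:
 * "ip:port" for IPv4/IPv6, "unix:<listener id>" for UNIX sockets. The
 * ip[] scratch buffer keeps snprintf's source and destination distinct. */
static void format_addr(const struct sockaddr_storage *ss, int listener_id,
                        char *out, size_t len)
{
	char ip[INET6_ADDRSTRLEN];

	switch (ss->ss_family) {
	case AF_INET: {
		const struct sockaddr_in *in = (const struct sockaddr_in *)ss;
		inet_ntop(AF_INET, &in->sin_addr, ip, sizeof(ip));
		snprintf(out, len, "%s:%d", ip, ntohs(in->sin_port));
		break;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)ss;
		inet_ntop(AF_INET6, &in6->sin6_addr, ip, sizeof(ip));
		snprintf(out, len, "%s:%d", ip, ntohs(in6->sin6_port));
		break;
	}
	case AF_UNIX:
		snprintf(out, len, "unix:%d", listener_id);
		break;
	default:
		snprintf(out, len, "%s", "unknown");
		break;
	}
}
```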

>
>> +                     }
>> +             }
>> +     }
>> +
>> +     if (LIST_ISEMPTY(&stats_event_listeners))
>> +             return;
>
> Hmmm.. this should have been put at the top of the function in order
> to save this expensive work when nobody is listening.

Yep, I caught this yesterday in some refactoring and it's already fixed.

>> +     /* no timeouts when streaming events */
>> +     if (si->st0 == STAT_CLI_EVENTS)
>> +             s->req->rex = TICK_ETERNITY;
>> +
>
> This is not appropriate in my opinion. 1) it's a specific use, and 2)
> it's already possible to set the timeout on the stats command line,
> so better increase the CLI timeout before starting the dump.

Is it possible to use "set timeout cli" to set an infinite timeout?

I'm not convinced it makes sense to have a timeout while streaming, but
if it's possible to disable it beforehand I guess that's ok.

>
>>   out:
>>       DPRINTF(stderr, "%s@%d: st=%d, rqf=%x, rpf=%x, rql=%d, rqs=%d, rl=%d, rs=%d\n",
>>               __FUNCTION__, __LINE__,
>>               si->state, req->flags, res->flags, req->l, req->send_max, res->l, res->send_max);
>>
>>       if (unlikely(si->state == SI_ST_DIS || si->state == SI_ST_CLO)) {
>> +             stats_event_listener_remove(s);
>>               /* check that we have released everything then unregister */
>>               stream_int_unregister_handler(si);
>
> Normally you don't need this, you just need to add a .release callback
> in cli_applet (dumpstats.c) which does the work.

I can't seem to find the release callback you're referring to.

>
>> diff --git a/src/session.c b/src/session.c
>> index 6906656..ae100db 100644
>> --- a/src/session.c
>> +++ b/src/session.c
>> @@ -350,6 +350,8 @@ void sess_establish(struct session *s, struct stream_interface *si)
>>       rep->analysers |= s->fe->fe_rsp_ana | s->be->be_rsp_ana;
>>       rep->flags |= BF_READ_ATTACHED; /* producer is now attached */
>>       req->wex = TICK_ETERNITY;
>> +
>> +     stats_event_new_session(s);
>
> This call is expensive on every session establishment. I'd really like
> to see it guarded by "if (!LIST_ISEMPTY(&stats_event_listeners))".
> Alternatively you can rename your existing function
> __stats_event_new_session() and have stats_event_new_session() be
> inline and perform the test there if you prefer, I don't care. What I
> don't want is to call a function most of the time for nothing.

I fixed this yesterday by adding an enabled flag and checking if
(unlikely(stats_event_enabled)) before the function call.

I like your idea of an inline wrapper, though.
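For illustration, the wrapper could look roughly like this; the list and session types below are simplified stand-ins for haproxy's real ones:

```c
#include <stddef.h>

/* Sketch of the inline-wrapper idea: the per-session fast path costs a
 * single list-head comparison; the expensive work runs only when at
 * least one listener is registered. */
struct mlist { struct mlist *n; };
struct session;

/* simplified stand-in for haproxy's LIST_ISEMPTY() on a circular list */
#define MLIST_ISEMPTY(lh) ((lh)->n == (lh))

static struct mlist stats_event_listeners = { &stats_event_listeners };
static int events_emitted;

static void __stats_event_new_session(struct session *s)
{
	(void)s;
	events_emitted++;   /* formatting and broadcast would happen here */
}

static inline void stats_event_new_session(struct session *s)
{
	if (!MLIST_ISEMPTY(&stats_event_listeners))
		__stats_event_new_session(s);
}
```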

> Thanks for doing this, it looks really appealing ! I'm impatient to
> see what realtime info everyone will think about ! I suspect that
> filtering on a frontend will be among the first requests in the
> wish list, quickly followed by source-mask filtering !

Thanks for taking the time to code review. I'm glad this is something
you think is useful and worth merging upstream.

I did a lot of cleanup yesterday to refactor the code and bring it in
line with the style guide. My branch is at
https://github.com/tmm1/haproxy/compare/session-events (or
https://github.com/tmm1/haproxy/compare/session-events.diff for just
the patch).

I'll incorporate your fixes above and email another patch for review.

As far as filtering, I already started on a fe/be filter with the
"show events [<name>]" syntax. This uses strcmp though, since I wasn't
sure how best to store each user's filters.
(https://github.com/tmm1/haproxy/compare/session-events...session-events-filter).

Filtering by event type would also be interesting, especially as the
number of events increases.
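The current strcmp-based check is essentially this (event_matches() and its parameters are illustrative, not the branch's actual names; an empty filter matches everything):

```c
#include <string.h>

/* Hypothetical name filter: NULL/empty matches every event, otherwise
 * the filter must equal either the frontend or the backend name. */
static int event_matches(const char *filter, const char *fe_name,
                         const char *be_name)
{
	if (!filter || !*filter)
		return 1;
	return strcmp(filter, fe_name) == 0 || strcmp(filter, be_name) == 0;
}
```

An alternative would be to resolve the name once, when the user registers the filter, to a proxy pointer and compare pointers per event, which avoids the per-event strcmp.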

Finally, one thing I struggled with was the linked list of event
listeners. It went through a couple of iterations and I think it's
implemented sanely now, but if you could take a look and make sure I'm
not doing anything stupid, that would be great.

  Aman

>
> Willy
>
