On Tue, 8 Sep 2009, Willy Tarreau wrote:

On Mon, Sep 07, 2009 at 10:47:48PM +0200, Krzysztof Oledzki wrote:
However, I found that it was hard to understand the status codes in
the HTML stats page. Some people are already complaining about columns
they don't trivially understand, but here I think that status codes
are close to cryptic. Also, while it is not *that* hard to tell
which one means what when you can compare all of them, they must be
unambiguous when found individually.

I could propose some adjustments, but it's just a proposal, I'm open
to other suggestions :

Currently there are 9 statuses:
UNK       -> unknown

From my tests, this means this one is not tested.

Or not tested yet - when haproxy starts.

Ah OK. Maybe we'd need an additional 'INI' state then ?

Good idea.

(...)
L14OK     -> check passed on layer 1-4, no upper layers testing enabled
L14TMOUT  -> layer 1-4 timeout
L14UNR    -> layer 1-4 unreachable, for example
        "Connection refused" (tcp rst) or "No route to host" (icmp)

I'd suggest "L4" instead of "L14". After all, layer 4 is what
we're aware of, and people are used to associating it with TCP (even though
TCP is rather layer 5 if we're purists).

Are you sure? Last time when I checked TCP was L4,

Problem is, TCP/IP doesn't perfectly fit the ISO model. Everyone seems
to agree that IP is layer 3, and so conclude that anything above IP is
layer 4. However, while I have no problem saying that UDP is layer 4
(it contains port information), TCP is higher as it has session states
which are a very important part of the protocol, and session is layer 5.
Some (rare) firewall vendors who correctly follow TCP sequence numbers
even claim to have layer 5 firewalls as opposed to all other ones which
are more layer 4 (basically support port+flags).

Yep, it is indeed very confusing. However, for me L4 provides end-to-end connections and reliability, and there is no reliability in TCP without sequence numbers. L5 (session) is defined as "interhost communication" and covers protocols like SSH, for example, which is definitely above TCP. Maybe the name "session" is misleading? Anyway, I have no intention of starting an academic discussion, so...

and for example ICMP
(No route to host) is L3

it depends for whom. Due to the reason above, some people see it at
3, others at 4 because it's above IP, and others say "3.5" because UDP
and TCP rely on it too.

and can report problems in for example L2, like
ARP resolution issues.

same here, you can hear about layer 2 because of MAC addresses, L3
because of IP addresses and routing tables checking to decide
whether to reply or not, or layer "2.5" because IP relies on it.

So, I would rather stay with 1-4 if that is OK for you.

I find it a lot more confusing. We all know that layer 4 cannot work
if lower layers don't either. So either L4 is OK, and it implies that
all lower layers are OK, or L4 is not OK and we can't say anything
about lower layers. However, I find it interesting to have L4OK,
because if later we support ping, we'll be able to report L3OK for
example.

... OK, let's agree that L4OK means L1-4 is OK and TCP works.

Same principle here, I'd suggest L7 instead of L57 for the same reasons
as above. Let's shorten the TMOUT one too. We could also turn L57INVRSP
into L7INV.

SSL/TLS is for example L6 and one day we may support checks for L5, like
for example SSH/SCP or RTCP.

once again it depends for whom. For some people, a transport layer is
layer 4, whether it's clear or ciphered. For others, there's some encoding
notion so they'd rather say layer 6. It's all a matter of embedding low
layers into higher ones, which is not covered by the OSI model.

It depends if you want to blindly translate 4 layers of TCP/IP into 7 layers of OSI/ISO or not. Most people do it that way, so we have L3.5 for ICMP or L2.5 for ARP, but it does not mean we have to do it the same way.

After all, when you make an openvpn tunnel work on top of an HTTP session across an HTTP proxy, you provide layer 3 on top of some layer 6/7 itself on top of another layer 7...

Exactly - it is L3 over L6/7 over L7.

Anyway, what I find commonly understood by people in the field is that
L3 is routing only, L4 is routing+ports and L7 is anything implying a
payload. This is even how products are advertised and it sort of makes
sense, even though it's far from being exact.

Product names and definitions are very often mangled by marketing departments, so we end up with "hardware RAID" provided by Windows drivers or 24-bit LCD displays that are really 18-19 bits + interpolation. I think we should tend to focus on the primary definition.

But saying that L7 works
means that we could talk. L4 works means we can connect. L3 works means
we can reach.

OK, but how about pure SSL/TLS? I don't think we should call it L7. However, we could add additional L6{OK,TOUT,INV} states and teach the code to use them when necessary. What do you think?
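
To make the naming convention discussed above concrete, here is a minimal sketch of how the shortened status names could be laid out in a table. The enum, the struct and the helper are hypothetical names made up for illustration (only the short strings such as "L4OK" or "L7INV" come from this thread), and "L4TOUT", "L4UNR" and "L7TOUT" are guesses at the shortened forms. It only encodes the rule agreed on above: an OK at layer N implies that the layers below it work, while a failure says nothing about the lower layers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status list; identifiers are illustrative only and are
 * not taken from the patch under review. */
enum chk_status {
    CHK_UNK = 0,  /* unknown */
    CHK_INI,      /* not tested yet, e.g. right after startup */
    CHK_SOCKERR,  /* socket error */
    CHK_L4OK,
    CHK_L4TOUT,
    CHK_L4UNR,
    CHK_L7OK,
    CHK_L7TOUT,
    CHK_L7INV,
    CHK_SIZE
};

struct chk_info {
    const char *name;  /* short name shown on the stats page */
    int layer;         /* layer the result was obtained at, 0 if none */
    int ok;            /* 1 if the check passed at that layer */
    const char *desc;  /* longer, unambiguous description */
};

static const struct chk_info chk_table[CHK_SIZE] = {
    [CHK_UNK]     = { "UNK",     0, 0, "unknown" },
    [CHK_INI]     = { "INI",     0, 0, "not tested yet" },
    [CHK_SOCKERR] = { "SOCKERR", 0, 0, "socket error" },
    [CHK_L4OK]    = { "L4OK",    4, 1, "check passed on layer 4 (and below)" },
    [CHK_L4TOUT]  = { "L4TOUT",  4, 0, "layer 4 timeout" },
    [CHK_L4UNR]   = { "L4UNR",   4, 0, "layer 4 unreachable" },
    [CHK_L7OK]    = { "L7OK",    7, 1, "check passed on layer 7" },
    [CHK_L7TOUT]  = { "L7TOUT",  7, 0, "layer 7 timeout" },
    [CHK_L7INV]   = { "L7INV",   7, 0, "layer 7 invalid response" },
};

/* Highest layer known to work for a given status. A failed check
 * returns 0 because it tells us nothing about the lower layers. */
static int highest_layer_ok(int st)
{
    assert(st >= 0 && st < CHK_SIZE);
    return chk_table[st].ok ? chk_table[st].layer : 0;
}
```

With such a table, a future "L3OK" (ping) or the L6 states would just be extra rows, and the stats page could show the description next to the short name.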

During my tests, I got a SOCKERR while the server was stopped. I
think it's an asynchronous "connection refused" which triggered
this error, probably here (but I may be wrong) :

@@ -370,8 +468,10 @@ static int event_srv_chk_w(int fd)
                        }
                        else if (ret == 0 || errno == EAGAIN)
                                goto out_poll;
-                       else
+                       else {
+                               set_server_check_status(s, HCHK_STATUS_SOCKERR);
                                goto out_error;
+                       }
                }

How is it possible to reproduce this situation?

I found it by sending checks on 127.0.0.1 to a port without
anyone listening. If you don't have time, I can try to reproduce
and fire strace at it to figure out what errno is returned.

Thanks, I'll try to reproduce it.
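
For anyone who wants to try this outside haproxy, here is a small standalone sketch of the reproduction described above: a non-blocking connect to 127.0.0.1 on a port with nobody listening, followed by a look at the pending socket error. It is not haproxy code; both function names are made up for this example, and the "free port" helper is only a best-effort trick (another process could grab the port in between).

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: pick a loopback port that most likely has no
 * listener by binding to port 0, reading back the port chosen by the
 * kernel, then closing the socket again. Returns -1 on failure. */
static int unused_loopback_port(void)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof(sa);
    int port = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = 0;
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) == 0 &&
        getsockname(fd, (struct sockaddr *)&sa, &len) == 0)
        port = ntohs(sa.sin_port);
    close(fd);
    return port;
}

/* Hypothetical helper: non-blocking connect to 127.0.0.1:port.
 * Returns 0 if the connection was established, otherwise the errno
 * value describing the failure (reported synchronously by connect()
 * or asynchronously through SO_ERROR). */
static int probe_connect_errno(int port)
{
    struct sockaddr_in sa;
    struct pollfd pfd;
    int err = 0;
    int n;
    socklen_t len = sizeof(err);
    int fd;

    assert(port > 0 && port < 65536);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    if (fcntl(fd, F_SETFL, O_NONBLOCK) < 0) {
        err = errno;
        close(fd);
        return err;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons((unsigned short)port);

    if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) == 0) {
        close(fd);
        return 0;               /* connected immediately */
    }
    if (errno != EINPROGRESS) {
        err = errno;            /* synchronous failure */
        close(fd);
        return err;
    }

    /* Connection in progress: wait up to 1s for the outcome. */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    n = poll(&pfd, 1, 1000);
    if (n == 0)
        err = ETIMEDOUT;
    else if (n < 0)
        err = errno;
    else if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        err = errno;            /* otherwise err holds the pending error */
    close(fd);
    return err;
}
```

Calling probe_connect_errno(unused_loopback_port()) should show which errno comes back on a given system. On Linux I would expect ECONNREFUSED, but running it under strace is still the way to confirm what haproxy itself sees.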

Best regards,

                        Krzysztof Olędzki
