Re: inspecting incoming tcp content
On Tue, 4 Mar 2014 07:40:48 +0100 Willy Tarreau w...@1wt.eu wrote:

> Hi,
>
> On Mon, Mar 03, 2014 at 09:12:27PM +0100, PiBa-NL wrote:
> > Hi,
> >
> > I'm not sure if this is the exact issue that Anup was having, and maybe I'm hijacking his thread; if so, I'm sorry for that. But when I tried to check how it works, I also had difficulties getting it to work as I expected.
> >
> > I'm using HAProxy v1.5dev21 on FreeBSD 8.3.
> >
> > I've written the following in a frontend, which checks for a GET web request to determine which backend to use. This works:
> >
> >     mode tcp
> >     tcp-request inspect-delay 5s
> >     acl PAYLOADcheck req.payload(0,3) -m bin 474554
> >     use_backend web_80_tcp if PAYLOADcheck
> >     tcp-request content accept if PAYLOADcheck
> >
> > However, when changing the match line to any of the following, it fails:
> >
> >     acl PAYLOADcheck req.payload(0,3) -m str GET
> > or
> >     acl PAYLOADcheck req.payload(0,3) -m sub GET
> > or
> >     acl PAYLOADcheck req.payload(0,3) -m reg -i GET
> >
> > The req.payload fetch returns a piece of 'binary' data, but the 'compatibility matrix' seems to say that converting it for use with sub/reg/others should not be an issue.
> >
> > Then the next step is of course to match not only the first 3 characters but also some content further in the 'middle' of the data stream.
> >
> > Am I missing something? Or might there be an issue with the implementation?
>
> What you've done is absolutely correct. It is possible that there's a bug somewhere in the cast. I'm CCing Thierry, who has a pending patch set of about 50 patches to rework ACLs (merge ACL+map and allow updating them on-the-fly), to ensure he checks this case.

The bin match takes the configuration string 474554 and converts it to the binary sequence GET. The str match takes the configuration string GET and uses it as is.

The req.payload() fetch returns binary content. When you try to match it with the str method, the binary content is converted to a string. The converter produces a string representing the hexadecimal content: 474554.

If you write

    acl PAYLOADcheck req.payload(0,3) -m str 474554

the system works perfectly.

This behavior is not intuitive. Maybe it can be changed later.

Thierry
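The cast described above can be modelled outside HAProxy. The following is a minimal Python sketch, not HAProxy code: the function names cast_bin_to_str, match_bin and match_str are hypothetical, and the use of uppercase hex digits is an assumption (it makes no difference for 474554, which contains digits only).

```python
def cast_bin_to_str(sample: bytes) -> str:
    """Model of the bin-to-str cast described in the thread: a hex dump,
    two hex digits per input byte (uppercase digits are an assumption)."""
    return sample.hex().upper()


def match_bin(sample: bytes, pattern_hex: str) -> bool:
    """Model of '-m bin': the configured hex string is turned into bytes
    and compared with the raw sample."""
    return sample == bytes.fromhex(pattern_hex)


def match_str(sample: bytes, pattern: str) -> bool:
    """Model of '-m str': the sample is cast to a string first, then
    compared with the configured pattern taken as is."""
    return cast_bin_to_str(sample) == pattern


# Rough equivalent of req.payload(0,3) on a buffered HTTP request.
first3 = b"GET / HTTP/1.1\r\n"[0:3]

bin_result = match_bin(first3, "474554")      # the configuration that works
str_text_result = match_str(first3, "GET")    # the configuration that fails
str_hex_result = match_str(first3, "474554")  # the workaround given above
```

With this model, the bin match and the str match against 474554 both succeed, while the str match against GET fails because the value being compared is the hex dump, not the text.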
Re: inspecting incoming tcp content
On Tue, Mar 04, 2014 at 04:51:56PM +0100, Thierry FOURNIER wrote:
> The bin match takes the configuration string 474554 and converts it to the binary sequence GET. The str match takes the configuration string GET and uses it as is.
>
> The req.payload() fetch returns binary content. When you try to match it with the str method, the binary content is converted to a string. The converter produces a string representing the hexadecimal content: 474554.
>
> If you write
>
>     acl PAYLOADcheck req.payload(0,3) -m str 474554
>
> the system works perfectly.
>
> This behavior is not intuitive. Maybe it can be changed later.

Indeed, thank you for diagnosing this. Originally we chose to cast bin to str as a hex dump because it was only used in stick tables. But now that we support other storage and usages, it becomes less and less natural. I think we'll change this before the final release so that bin automatically casts to str as-is, and we'll add a tohex converter for people who want to explicitly convert a bin to a hex string.

Willy
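The proposed change can be sketched the same way. This is a Python model of the behaviour described in this message only, not of any released HAProxy version: cast_bin_to_str_as_is and tohex are hypothetical names (tohex simply mirrors the converter name suggested above), and decoding as latin-1 is an assumption chosen because it maps every byte to exactly one character.

```python
def cast_bin_to_str_as_is(sample: bytes) -> str:
    """Model of the proposed cast: the bytes become the string unchanged.
    latin-1 is used so that every byte value maps to one character."""
    return sample.decode("latin-1")


def tohex(sample: bytes) -> str:
    """Model of the proposed explicit converter: two hex digits per byte
    (uppercase digits are an assumption)."""
    return sample.hex().upper()


# Rough equivalent of req.payload(0,3) on a buffered HTTP request.
first3 = b"GET / HTTP/1.1\r\n"[0:3]

as_text = cast_bin_to_str_as_is(first3)  # what '-m str GET' would compare
as_hex = tohex(first3)                   # opt-in hex string for binary data
```

Under this model a textual pattern such as GET compares directly against the payload, and the hex form remains available to configurations that ask for it explicitly.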
Re: inspecting incoming tcp content
OK, it seems to work now that I know this, though it has some side effects. I could now match param=TEST using the following acl:

    acl PAYLOADcheck req.payload(0,0) -m reg -i 706172616d3D54455354

Case-insensitive matching works 'perfectly', but only for the hex code (see the D and d above); it doesn't match different cases of the letters, which is what one would probably expect. So even though I use -i, if I use the word TEST in lower case it doesn't match anymore.

There might be a workaround for that with the ,lower option (I didn't confirm whether that is applied before the hex conversion).

Also, the current documentation gives several examples which indicate a different behaviour:

    On systems where the regex library is much slower when using -i, it is possible to convert the sample to lowercase before matching, like this :

        acl script_tag payload(0,500),lower -m reg script

This doesn't work for detecting the text script, as its hex equivalent should be there; also, if fewer than 500 bytes are sent in the initial request, it doesn't match at all.

So it seems this part of the manual could use a little more clarification. (Praise though for the overall completeness/clarity of the manual!) Though if the implementation now changes to match the manual, possibly with an additional tohex option, that would be great.

As it's used in mode tcp, the option should certainly exist to match binary/hex values that cannot easily be expressed as normal text. So the original design implementation does make sense, just not for 'textual' protocols.

Thanks for investigating.

PiBa-NL

Willy Tarreau wrote on 4-3-2014 17:28:
> On Tue, Mar 04, 2014 at 04:51:56PM +0100, Thierry FOURNIER wrote:
> > The bin match takes the configuration string 474554 and converts it to the binary sequence GET. The str match takes the configuration string GET and uses it as is.
> >
> > The req.payload() fetch returns binary content. When you try to match it with the str method, the binary content is converted to a string. The converter produces a string representing the hexadecimal content: 474554.
> >
> > If you write
> >
> >     acl PAYLOADcheck req.payload(0,3) -m str 474554
> >
> > the system works perfectly.
> >
> > This behavior is not intuitive. Maybe it can be changed later.
>
> Indeed, thank you for diagnosing this. Originally we chose to cast bin to str as a hex dump because it was only used in stick tables. But now that we support other storage and usages, it becomes less and less natural. I think we'll change this before the final release so that bin automatically casts to str as-is, and we'll add a tohex converter for people who want to explicitly convert a bin to a hex string.
>
> Willy
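The side effect reported in this message can be reproduced with a small Python model. It is a sketch under stated assumptions, not HAProxy code: hex_dump and matches_hex_regex are hypothetical names, Python's re module stands in for the regex library, and uppercase hex digits are assumed for the dump.

```python
import re


def hex_dump(payload: bytes) -> str:
    """Model of the bin-to-str cast: two hex digits per byte
    (uppercase digits are an assumption)."""
    return payload.hex().upper()


# The pattern from the acl above; -i is modelled with re.IGNORECASE.
PATTERN = re.compile("706172616d3D54455354", re.IGNORECASE)


def matches_hex_regex(payload: bytes) -> bool:
    """Model of 'req.payload(0,0) -m reg -i <hex>': the regex is searched
    in the hex dump of the whole buffer, not in the text itself."""
    return PATTERN.search(hex_dump(payload)) is not None


upper_result = matches_hex_regex(b"param=TEST")      # matches
lower_result = matches_hex_regex(b"param=test")      # does not match
inside_result = matches_hex_regex(b"a=1&param=TEST") # matches mid-buffer
```

The -i flag only folds the case of the hex digits (6d versus 6D). The bytes for TEST are 54 45 53 54 and those for test are 74 65 73 74, so the two hex strings differ in digits, not merely in letter case.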
Re: inspecting incoming tcp content
On Wed, Mar 05, 2014 at 12:55:47AM +0100, PiBa-NL wrote:
> OK, it seems to work now that I know this, though it has some side effects. I could now match param=TEST using the following acl:
>
>     acl PAYLOADcheck req.payload(0,0) -m reg -i 706172616d3D54455354
>
> Case-insensitive matching works 'perfectly', but only for the hex code (see the D and d above); it doesn't match different cases of the letters, which is what one would probably expect. So even though I use -i, if I use the word TEST in lower case it doesn't match anymore.

Indeed, you'd have to match it this way in order to match the input bytes, not the hex string:

    acl PAYLOADcheck req.payload(0,0) -m reg -i [57]0[46]1[57]2[46]1[46]d3D[57]4[46]5[57]3[57]4

> There might be a workaround for that with the ,lower option (I didn't confirm whether that is applied before the hex conversion).

Yes, it would be much easier. The way the match is done is:

  1) sample fetch function. Here, it is req.payload().
  2) converters. Here none, unless you add ,lower
  3) cast to the input type of the ACL match (here, reg takes a string so it remains the same)
  4) execution of the match function (here reg) for all patterns.

> Also, the current documentation gives several examples which indicate a different behaviour:
>
>     On systems where the regex library is much slower when using -i, it is possible to convert the sample to lowercase before matching, like this :
>
>         acl script_tag payload(0,500),lower -m reg script
>
> This doesn't work for detecting the text script, as its hex equivalent should be there; also, if fewer than 500 bytes are sent in the initial request, it doesn't match at all.

You're absolutely right. We really need to change this confusing behaviour before the release. I'm sure we'll break one or two setups, but we're still in the development phase until we release, and the fix will be trivial.

> So it seems this part of the manual could use a little more clarification. (Praise though for the overall completeness/clarity of the manual!)

I tend to consider that the doc is the reference which people use to write their confs. So when something has never been working properly, I prefer to make the code work as documented rather than fix the doc.

> Though if the implementation now changes to match the manual, possibly with an additional tohex option, that would be great.

Yes, it will be necessary so that the very few users (if any) who rely on the current behaviour can fix their configs.

> As it's used in mode tcp, the option should certainly exist to match binary/hex values that cannot easily be expressed as normal text. So the original design implementation does make sense, just not for 'textual' protocols.

I agree.

Thanks,
Willy
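The character-class pattern given in this message can be checked, and generated for other strings, with a short Python sketch. Everything here is an illustration under stated assumptions: hex_dump, case_insensitive_hex_pattern, case_variants and matches are hypothetical helpers, Python's re module stands in for the regex library, the input is assumed to be plain ASCII, and uppercase hex digits are assumed for the dump.

```python
import itertools
import re

# Pattern quoted from the message above.
WILLY_PATTERN = "[57]0[46]1[57]2[46]1[46]d3D[57]4[46]5[57]3[57]4"


def hex_dump(payload: bytes) -> str:
    """Model of the bin-to-str cast: two hex digits per byte."""
    return payload.hex().upper()


def case_insensitive_hex_pattern(text: str) -> str:
    """Build a regex over the hex dump that accepts either case of each
    ASCII letter. Upper and lower case differ only in bit 0x20, so only
    the high hex digit needs a character class."""
    parts = []
    for ch in text:
        upper, lower = ord(ch.upper()), ord(ch.lower())
        if upper != lower:
            parts.append("[%X%X]%X" % (upper >> 4, lower >> 4, upper & 0xF))
        else:
            parts.append("%02X" % upper)
    return "".join(parts)


def case_variants(text: str) -> list:
    """Every upper/lower case combination of the letters in text."""
    choices = [sorted({ch.lower(), ch.upper()}) for ch in text]
    return ["".join(combo) for combo in itertools.product(*choices)]


def matches(pattern: str, payload: bytes) -> bool:
    """Model of '-m reg -i': case-insensitive search in the hex dump."""
    return re.search(pattern, hex_dump(payload), re.IGNORECASE) is not None


GENERATED = case_insensitive_hex_pattern("param=TEST")
VARIANTS = case_variants("param=TEST")
ALL_MATCH = all(matches(WILLY_PATTERN, v.encode("ascii")) for v in VARIANTS)
```

The generated pattern equals the quoted one apart from the case of one hex digit, which -i ignores anyway. One design caveat of this model: the search runs over a string of hex digits, so a match could in principle start on an odd digit and straddle two bytes, which is another reason a direct text match is easier to reason about.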
Re: inspecting incoming tcp content
Hi,

I'm not sure if this is the exact issue that Anup was having, and maybe I'm hijacking his thread; if so, I'm sorry for that. But when I tried to check how it works, I also had difficulties getting it to work as I expected.

I'm using HAProxy v1.5dev21 on FreeBSD 8.3.

I've written the following in a frontend, which checks for a GET web request to determine which backend to use. This works:

    mode tcp
    tcp-request inspect-delay 5s
    acl PAYLOADcheck req.payload(0,3) -m bin 474554
    use_backend web_80_tcp if PAYLOADcheck
    tcp-request content accept if PAYLOADcheck

However, when changing the match line to any of the following, it fails:

    acl PAYLOADcheck req.payload(0,3) -m str GET
or
    acl PAYLOADcheck req.payload(0,3) -m sub GET
or
    acl PAYLOADcheck req.payload(0,3) -m reg -i GET

The req.payload fetch returns a piece of 'binary' data, but the 'compatibility matrix' seems to say that converting it for use with sub/reg/others should not be an issue.

Then the next step is of course to match not only the first 3 characters but also some content further in the 'middle' of the data stream.

Am I missing something? Or might there be an issue with the implementation?

This is currently only for finding out if and how that req.payload check can be used. Of course, using 'mode http' would be much better for this purpose when running http traffic, but that isn't the purpose of this question.

I've spoken on IRC with mculp, who was trying something similar but couldn't get it to work either, and I've seen a previous question, http://comments.gmane.org/gmane.comp.web.haproxy/11942, which seems to have gone without a final solution as well.

So the question is: is this possible, or might there be some issues in 'converting' the checks?

Thanks for your time.

Greets,
PiBa-NL

Baptiste wrote on 28-2-2014 10:57:
> Hi,
>
> and where is your problem exactly?
>
> Baptiste
>
> On Tue, Feb 25, 2014 at 7:39 AM, anup katariya anup.katar...@gmail.com wrote:
> > Hi,
> >
> > I wanted to inspect an incoming tcp request. I wanted to do something like the below: match payload(0, 100) against a string like 49=ABC.
> >
> > Thanks,
> > Anup
Re: inspecting incoming tcp content
Hi,

On Mon, Mar 03, 2014 at 09:12:27PM +0100, PiBa-NL wrote:
> Hi,
>
> I'm not sure if this is the exact issue that Anup was having, and maybe I'm hijacking his thread; if so, I'm sorry for that. But when I tried to check how it works, I also had difficulties getting it to work as I expected.
>
> I'm using HAProxy v1.5dev21 on FreeBSD 8.3.
>
> I've written the following in a frontend, which checks for a GET web request to determine which backend to use. This works:
>
>     mode tcp
>     tcp-request inspect-delay 5s
>     acl PAYLOADcheck req.payload(0,3) -m bin 474554
>     use_backend web_80_tcp if PAYLOADcheck
>     tcp-request content accept if PAYLOADcheck
>
> However, when changing the match line to any of the following, it fails:
>
>     acl PAYLOADcheck req.payload(0,3) -m str GET
> or
>     acl PAYLOADcheck req.payload(0,3) -m sub GET
> or
>     acl PAYLOADcheck req.payload(0,3) -m reg -i GET
>
> The req.payload fetch returns a piece of 'binary' data, but the 'compatibility matrix' seems to say that converting it for use with sub/reg/others should not be an issue.
>
> Then the next step is of course to match not only the first 3 characters but also some content further in the 'middle' of the data stream.
>
> Am I missing something? Or might there be an issue with the implementation?

What you've done is absolutely correct. It is possible that there's a bug somewhere in the cast. I'm CCing Thierry, who has a pending patch set of about 50 patches to rework ACLs (merge ACL+map and allow updating them on-the-fly), to ensure he checks this case.

Thanks,
Willy
Re: inspecting incoming tcp content
Hi,

and where is your problem exactly?

Baptiste

On Tue, Feb 25, 2014 at 7:39 AM, anup katariya anup.katar...@gmail.com wrote:
> Hi,
>
> I wanted to inspect an incoming tcp request. I wanted to do something like the below: match payload(0, 100) against a string like 49=ABC.
>
> Thanks,
> Anup