Re: inspecting incoming tcp content

2014-03-04 Thread Thierry FOURNIER
On Tue, 4 Mar 2014 07:40:48 +0100
Willy Tarreau w...@1wt.eu wrote:

> Hi,
>
> On Mon, Mar 03, 2014 at 09:12:27PM +0100, PiBa-NL wrote:
> > Hi,
> >
> > I'm not sure if this is the exact issue that Anup was having, and maybe
> > I'm hijacking his thread (if so, I'm sorry for that), but when I try to
> > check how it works I'm also having difficulties getting it to work as I
> > expected it to.
> >
> > I'm using HAProxy v1.5dev21 on FreeBSD 8.3.
> >
> > I've written the following in a frontend, which checks for a GET web
> > request to determine which backend to use. This works:
> >   mode tcp
> >   tcp-request inspect-delay 5s
> >   acl PAYLOADcheck req.payload(0,3) -m bin 474554
> >   use_backend web_80_tcp if PAYLOADcheck
> >   tcp-request content accept if PAYLOADcheck
> >
> > However, when changing the match line to any of the following, it fails:
> >   acl PAYLOADcheck req.payload(0,3) -m str GET
> > or
> >   acl PAYLOADcheck req.payload(0,3) -m sub GET
> > or
> >   acl PAYLOADcheck req.payload(0,3) -m reg -i GET
> >
> > req.payload returns a piece of 'binary' data, but the 'compatibility
> > matrix' seems to say that converting it for use with sub/reg/others
> > should not be an issue.
> >
> > The next step is of course to match not only the first 3 characters
> > but some content further in the 'middle' of the data stream.
> >
> > Am I missing something? Or might there be an issue with the implementation?
>
> What you've done is absolutely correct. It is possible that there's a
> bug somewhere in the cast. I'm CCing Thierry who has a pending patch
> set of about 50 patches to rework ACLs (merge ACL+map and allow to update
> them on-the-fly) to ensure he checks this case.
 


The bin match takes the configuration string 474554 and converts it to
the binary sequence GET. The str match takes the configuration string
GET and uses it as is.

The req.payload() fetch returns binary content. When you try to match
it with the str method, the binary content is converted to a string, and
the converter produces a string representing the content in hexadecimal:
474554.

If you write

   acl PAYLOADcheck req.payload(0,3) -m str 474554

the system works perfectly.

This behavior is not intuitive. Maybe it can be changed later.

Thierry
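
To make the mismatch concrete, here is a minimal Python sketch of the cast described in this message. It only models the behaviour as explained above and is not HAProxy code: the helper names cast_bin_to_str and match_str are invented for illustration, and the uppercase hex digits are an assumption (the thread does not say which case is produced).

```python
def cast_bin_to_str(sample: bytes) -> str:
    """Model of the bin-to-str cast described above: a hex dump.

    Assumption: hex digits are emitted in uppercase.
    """
    return sample.hex().upper()


def match_str(sample: bytes, pattern: str) -> bool:
    """Model of '-m str': exact comparison against the cast sample."""
    return cast_bin_to_str(sample) == pattern


# What req.payload(0,3) would return for a request starting with "GET".
payload = b"GET"

print(cast_bin_to_str(payload))      # 474554
print(match_str(payload, "GET"))     # False: "474554" is compared to "GET"
print(match_str(payload, "474554"))  # True: the pattern is the hex dump
```

The pattern given in the configuration is never converted in this model; only the sample is, which is why the textual pattern GET can never equal the cast sample.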





Re: inspecting incoming tcp content

2014-03-04 Thread Willy Tarreau
On Tue, Mar 04, 2014 at 04:51:56PM +0100, Thierry FOURNIER wrote:
> The bin match takes the configuration string 474554 and converts it to
> the binary sequence GET. The str match takes the configuration string
> GET and uses it as is.
>
> The req.payload() fetch returns binary content. When you try to match
> it with the str method, the binary content is converted to a string, and
> the converter produces a string representing the content in hexadecimal:
> 474554.
>
> If you write
>
>    acl PAYLOADcheck req.payload(0,3) -m str 474554
>
> the system works perfectly.
>
> This behavior is not intuitive. Maybe it can be changed later.

Indeed, thank you for diagnosing this. Originally we chose to cast
bin to str as hex dump because it was only used in stick tables. But
now that we support other storage and usages, it becomes less and
less natural. I think we'll change this before the final release so
that bin automatically casts to str as-is and we'll add a tohex
converter for people who want to explicitly convert a bin to a hex
string.

Willy
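
The change proposed in this message can be pictured with a small Python sketch. It models the intent only, not HAProxy's implementation: cast_bin_to_str_as_is and tohex are illustrative names (the thread only mentions a "tohex" converter as a plan), latin-1 decoding stands in for "keep the bytes unchanged", and uppercase hex digits are an assumption.

```python
def cast_bin_to_str_as_is(sample: bytes) -> str:
    """Proposed default cast: the bytes become the string unchanged.

    latin-1 maps each byte to exactly one character, so nothing is
    rejected or altered (a modelling choice, not HAProxy behaviour).
    """
    return sample.decode("latin-1")


def tohex(sample: bytes) -> str:
    """Hypothetical explicit converter for users who want the hex dump."""
    return sample.hex().upper()


payload = b"GET"

# With the proposed cast, a textual pattern matches directly.
print(cast_bin_to_str_as_is(payload) == "GET")  # True

# Configurations relying on the hex dump would opt in explicitly.
print(tohex(payload) == "474554")               # True
```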




Re: inspecting incoming tcp content

2014-03-04 Thread PiBa-NL

OK, it seems to work now that I know this, though it has some side effects.

I can now match param=TEST using the following acl:
acl PAYLOADcheck req.payload(0,0) -m reg -i 706172616d3D54455354

Case-insensitive matching works 'perfectly', but only for the hex digits
(see the D and d above). It doesn't match different cases of the letters in
the payload, which is what one would probably expect. So even though I use
-i, if the word TEST arrives in lower case it doesn't match anymore.
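
The side effect can be reproduced with a short Python sketch. This is a model under assumptions: Python's re module stands in for HAProxy's regex engine, hex_dump is an invented helper mimicking the bin-to-str cast, and the uppercase hex output is assumed.

```python
import re


def hex_dump(sample: bytes) -> str:
    """Stand-in for the bin-to-str cast (assumed uppercase hex)."""
    return sample.hex().upper()


# The pattern from the acl above, compiled case-insensitively like '-i'.
pattern = re.compile("706172616d3D54455354", re.IGNORECASE)


def matches(payload: bytes) -> bool:
    return pattern.search(hex_dump(payload)) is not None


print(matches(b"param=TEST"))  # True: only the case of hex digits differs
print(matches(b"param=test"))  # False: lowercase letters are other bytes
```

The -i flag only relaxes the case of the hex digits A to F; a lowercase "t" is byte 0x74 rather than 0x54, so its hex digits differ and no case folding can bridge that.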


There might be a workaround for that with the ,lower option (I didn't
confirm whether it is applied before the hex conversion).


Also, the current documentation gives several examples that suggest
different behaviour:


On systems where the regex library is much slower when using -i, it is
possible to convert the sample to lowercase before matching, like this :
acl script_tag payload(0,500),lower -m reg <script>


This doesn't work for detecting the text <script>, because its hex
equivalent would have to be given instead. Also, if fewer than 500 bytes
are sent in the initial request, it doesn't match at all.


So it seems this part of the manual could use a little more
clarification. (Praise, though, for the overall completeness and clarity
of the manual!)
If the implementation now changes to match the manual, possibly with an
additional tohex option, that would be great. As it's used in mode tcp,
the option should certainly exist to match binary/hex values that cannot
easily be expressed as normal text. So the original design does make
sense, just not for 'textual' protocols.


Thanks for investigating.
PiBa-NL







Re: inspecting incoming tcp content

2014-03-04 Thread Willy Tarreau
On Wed, Mar 05, 2014 at 12:55:47AM +0100, PiBa-NL wrote:
> OK, it seems to work now that I know this, though it has some side effects.
>
> I can now match param=TEST using the following acl:
> acl PAYLOADcheck req.payload(0,0) -m reg -i 706172616d3D54455354
>
> Case-insensitive matching works 'perfectly', but only for the hex digits
> (see the D and d above). It doesn't match different cases of the letters in
> the payload, which is what one would probably expect. So even though I use
> -i, if the word TEST arrives in lower case it doesn't match anymore.

Indeed, you'd have to match it this way in order to match the
input bytes, not the hex string :

 acl PAYLOADcheck req.payload(0,0) -m reg -i [57]0[46]1[57]2[46]1[46]d3D[57]4[46]5[57]3[57]4
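
As a sanity check on the pattern above, the following Python sketch tries it against every upper/lower case variant of param=TEST. It is only a model: Python's re module stands in for HAProxy's regex engine, hex_dump and case_variants are invented helpers, and the uppercase hex output of the cast is an assumption.

```python
import itertools
import re

# The pattern from the acl above, compiled case-insensitively like '-i'.
HEX_CLASS_REGEX = re.compile(
    "[57]0[46]1[57]2[46]1[46]d3D[57]4[46]5[57]3[57]4", re.IGNORECASE
)


def hex_dump(sample: bytes) -> str:
    """Stand-in for the bin-to-str cast (assumed uppercase hex)."""
    return sample.hex().upper()


def case_variants(text: str) -> set:
    """Every combination of upper and lower case for the letters in text."""
    choices = [(ch.lower(), ch.upper()) for ch in text]
    return {"".join(combo) for combo in itertools.product(*choices)}


variants = case_variants("param=TEST")
all_match = all(
    HEX_CLASS_REGEX.search(hex_dump(v.encode("ascii"))) for v in variants
)

print(len(variants))  # 512 variants for the nine letters
print(all_match)      # True
```

Each bracket pair covers the two possible high nibbles of a letter (for example 0x70 for "p" and 0x50 for "P"), which is why the classes are [57] or [46] depending on the letter.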

> There might be a workaround for that with the ,lower option (I didn't
> confirm whether it is applied before the hex conversion).

Yes it would be much easier. The way the match is done is :

  1) sample fetch function. Here, it is req.payload().
  2) converters. Here none, unless you add ,lower
  3) cast to the input type of the ACL match (here, reg takes a string
 so it remains the same)
  4) execution of the match function (here reg) for all patterns.
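
The four steps above can be sketched as a small Python pipeline. This is a simplified model, not HAProxy's code: evaluate_acl, hex_cast and reg_i are invented names, Python's re module stands in for the regex engine, and the sketch assumes that converters run on the raw bytes before the cast, as the list describes. The actual implementation may cast earlier depending on a converter's input type, which this thread does not settle.

```python
import re
from typing import Callable, Iterable, Sequence


def evaluate_acl(
    fetch: Callable[[], bytes],
    converters: Sequence[Callable[[bytes], bytes]],
    cast: Callable[[bytes], str],
    match: Callable[[str, str], bool],
    patterns: Iterable[str],
) -> bool:
    sample = fetch()                # 1) sample fetch function
    for convert in converters:      # 2) converters, applied in order
        sample = convert(sample)
    value = cast(sample)            # 3) cast to the match's input type
    return any(match(value, p) for p in patterns)  # 4) match all patterns


def hex_cast(sample: bytes) -> str:
    """Stand-in for the current bin-to-str cast (assumed uppercase hex)."""
    return sample.hex().upper()


def reg_i(value: str, pattern: str) -> bool:
    """Stand-in for '-m reg -i'."""
    return re.search(pattern, value, re.IGNORECASE) is not None


pattern = "706172616d3D74657374"  # hex dump of "param=test"

without_lower = evaluate_acl(
    lambda: b"param=TEST", [], hex_cast, reg_i, [pattern]
)
with_lower = evaluate_acl(
    lambda: b"param=TEST", [bytes.lower], hex_cast, reg_i, [pattern]
)

print(without_lower)  # False
print(with_lower)     # True
```

Under that ordering assumption, lowercasing in step 2 lets a single hex pattern for the lowercase text cover every case variant of the input.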

> Also, the current documentation gives several examples that suggest
> different behaviour:
>
> On systems where the regex library is much slower when using -i, it is
> possible to convert the sample to lowercase before matching, like this :
> acl script_tag payload(0,500),lower -m reg <script>
>
> This doesn't work for detecting the text <script>, because its hex
> equivalent would have to be given instead. Also, if fewer than 500 bytes
> are sent in the initial request, it doesn't match at all.

You're absolutely right. We really need to change this confusing behaviour
before the release. I'm sure we'll break one or two setups, but we're
still in the development phase until we release, and the fix will be
trivial.

> So it seems this part of the manual could use a little more
> clarification. (Praise, though, for the overall completeness and clarity
> of the manual!)

I tend to consider that the doc is the reference which people use to
write their confs. So when something has never been working properly,
I prefer to make the code work as documented than fix the doc.

> If the implementation now changes to match the manual, possibly with an
> additional tohex option, that would be great.

Yes it will be necessary so that the very few users (if any) who rely on
the current behaviour can fix their configs.

> As it's used in mode tcp, the option should certainly exist to match
> binary/hex values that cannot easily be expressed as normal text. So the
> original design does make sense, just not for 'textual' protocols.

I agree.

Thanks,
Willy




Re: inspecting incoming tcp content

2014-03-03 Thread PiBa-NL

Hi,

I'm not sure if this is the exact issue that Anup was having, and maybe
I'm hijacking his thread (if so, I'm sorry for that), but when I try to
check how it works I'm also having difficulties getting it to work as I
expected it to.


I'm using HAProxy v1.5dev21 on FreeBSD 8.3.

I've written the following in a frontend, which checks for a GET web
request to determine which backend to use. This works:

mode tcp
tcp-request inspect-delay 5s
acl PAYLOADcheck req.payload(0,3) -m bin 474554
use_backend web_80_tcp if PAYLOADcheck
tcp-request content accept if PAYLOADcheck

However, when changing the match line to any of the following, it fails:
acl PAYLOADcheck req.payload(0,3) -m str GET
or
acl PAYLOADcheck req.payload(0,3) -m sub GET
or
acl PAYLOADcheck req.payload(0,3) -m reg -i GET

req.payload returns a piece of 'binary' data, but the 'compatibility
matrix' seems to say that converting it for use with sub/reg/others
should not be an issue.


The next step is of course to match not only the first 3 characters
but some content further in the 'middle' of the data stream.


Am I missing something? Or might there be an issue with the implementation?

This is currently only for finding if and how that req.payload check can 
be used. Of course using 'mode http' would be much better for this 
purpose when running http traffic, but that isn't the purpose of this 
question..


I've spoken on IRC with mculp, who was trying something similar but
couldn't get it to work either, and I've seen a previous question,
http://comments.gmane.org/gmane.comp.web.haproxy/11942, which seems to
have gone without a final solution as well.


So the question is, is this possible or might there be some issues in 
'converting' the checks?

Thanks for your time.

Greets PiBa-NL









Re: inspecting incoming tcp content

2014-03-03 Thread Willy Tarreau
Hi,

On Mon, Mar 03, 2014 at 09:12:27PM +0100, PiBa-NL wrote:
> Hi,
>
> I'm not sure if this is the exact issue that Anup was having, and maybe
> I'm hijacking his thread (if so, I'm sorry for that), but when I try to
> check how it works I'm also having difficulties getting it to work as I
> expected it to.
>
> I'm using HAProxy v1.5dev21 on FreeBSD 8.3.
>
> I've written the following in a frontend, which checks for a GET web
> request to determine which backend to use. This works:
>   mode tcp
>   tcp-request inspect-delay 5s
>   acl PAYLOADcheck req.payload(0,3) -m bin 474554
>   use_backend web_80_tcp if PAYLOADcheck
>   tcp-request content accept if PAYLOADcheck
>
> However, when changing the match line to any of the following, it fails:
>   acl PAYLOADcheck req.payload(0,3) -m str GET
> or
>   acl PAYLOADcheck req.payload(0,3) -m sub GET
> or
>   acl PAYLOADcheck req.payload(0,3) -m reg -i GET
>
> req.payload returns a piece of 'binary' data, but the 'compatibility
> matrix' seems to say that converting it for use with sub/reg/others
> should not be an issue.
>
> The next step is of course to match not only the first 3 characters
> but some content further in the 'middle' of the data stream.
>
> Am I missing something? Or might there be an issue with the implementation?

What you've done is absolutely correct. It is possible that there's a
bug somewhere in the cast. I'm CCing Thierry who has a pending patch
set of about 50 patches to rework ACLs (merge ACL+map and allow to update
them on-the-fly) to ensure he checks this case.

Thanks,
Willy




Re: inspecting incoming tcp content

2014-02-28 Thread Baptiste
Hi,

and where is your problem exactly?

Baptiste

On Tue, Feb 25, 2014 at 7:39 AM, anup katariya anup.katar...@gmail.com wrote:
> Hi,
>
> I wanted to inspect incoming tcp requests. I wanted to do something like
> the below:
>
> payload(0, 100) match with a string like 49=ABC.
>
> Thanks,
> Anup