Hi.

This looks like it works as designed, because the matching currently uses
the "str*cmp" functions, which treat a '\0' byte as the end of the string.
Your workaround with the hex converter looks like a reasonable way to avoid
the '\0' byte issue.

http://git.haproxy.org/?p=haproxy-2.0.git;a=blob;f=src/pattern.c;hb=6d9a455da17251c34d3c552f2a963447f52fdd80#l724

HAProxy could try to use memcmp instead, but the "sub" match would then
behave differently.

https://stackoverflow.com/questions/13095513/what-is-the-difference-between-memcmp-strcmp-and-strncmp-in-c/13095574#13095574

From a binary point of view, is '\0<tag>' not the same as '<tag>'?

Maybe the Boyer-Moore algorithm mentioned in the code
(https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string-search_algorithm)
would fix this behavior, but as far as I know no one is working on such a
patch.

It would be nice if you could send us a patch to fix the doc.

Regards
Aleks

Nov 30, 2019 11:35:24 AM Mathias Weiersmüller (cyberheads GmbH) <[email protected]>:

> (CCing Thierry Fournier as maintainer of the pattern matching part)
>
> > We use HAProxy in TCP Mode for non-HTTP protocols.
> >
> > The request of one particular protocol looks like this:
> >
> > - length of message (binary value, 4 bytes long)
> > - binary part (40-200 bytes)
> > - XML part
> >
> > Goal: We want to use a particular backend when the XML part of the
> > request contains the string "<tag>".
> >
> > We used this ACL:
> > acl tag_found req.payload(0,0) -m sub <tag>
> >
> > The problem:
> > The substring matching stops on a Null byte (\0) in a binary fetch. We
> > always have this case (the request normally starts with Null bytes).
> > Therefore, the match never succeeds. As there might be null bytes in the
> > binary part too, we cannot just start the payload fetch after byte 4.
> >
> > ==========================
> > frontend fe_test
> >     bind *:3000
> >
> >     tcp-request inspect-delay 5s
> >
> >     acl content_present req_len gt 0
> >     acl tag_found req.payload(0,0) -m sub <tag>
> >
> >     tcp-request content accept if content_present
> >     tcp-request content reject
> >
> >     # depending on if the payload contains the string "<tag>", we use
> >     # different backends
> >     # right now, the two backends are exactly the same.
> >     use_backend be_tag if tag_found
> >     default_backend be_default
> >
> > backend be_tag
> >     server srv_1:4000
> >
> > backend be_default
> >     server srv_1:4000
> >
> > Test cases:
> > (tested on versions 2.0.10, 1.5.18)
> > echo -e '<tag>' | nc 127.0.0.1 3000    # will use backend be_tag
> > echo -e '\0<tag>' | nc 127.0.0.1 3000  # will use backend be_default,
> >                                        # but should use be_tag
> > ==========================
> >
> > Workaround:
> > => convert payload into hexified string, parse against hex:
> > acl tag_found req.payload(0,0),hex -m sub 3C7461673E # this is <tag> in hexadecimal
> >
> > Dear list members, these are the questions I am twisting my mind with.
> > Do you have a good take on these?
> >
> > - Is there another (better) way to do a substring match on a payload
> >   which contains Null bytes?
> > - Would another, new match method make sense here (something like
> >   sub_bin?)
> > - Do we run into a problem with the hex conversion because the sample
> >   is twice the size of the original (maybe bigger than bufsize)?
>
> If this behavior is intended, then the configuration manual (7.1.3 Matching
> strings) should be updated to reflect this:
>
> Do not use string matches for binary fetches which might contain null bytes
> (0x00), as the comparison stops at the occurrence of the first null byte.
> Instead, convert the binary fetch to a hex string with the hex converter
> first.
>
> Example:
> acl tag_found req.payload(0,0),hex -m sub 3C7461673E # this is <tag> in hexadecimal
>
> Does that make sense?
>
> Best regards
>
> Mathias
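
For anyone who wants to reproduce the effect outside HAProxy, here is a minimal C sketch of the behaviour discussed in this thread. It is not HAProxy's code: the helper names contains_cstr, contains_bytes and to_hex are made up for illustration, and uppercase hex digits are assumed only because the ACL example above uses them. It contrasts a string-style search (stops at the first '\0'), a length-aware search based on memcmp, and the hex workaround.

```c
#include <stddef.h>
#include <string.h>

/* String-style search: strstr() treats the first '\0' as the end of the
 * haystack, so nothing after a NUL byte is ever examined. */
static int contains_cstr(const char *hay, const char *needle)
{
    return strstr(hay, needle) != NULL;
}

/* Length-aware search (hypothetical helper): '\0' is ordinary data.
 * Returns 1 if needle occurs anywhere in the first hay_len bytes. */
static int contains_bytes(const char *hay, size_t hay_len,
                          const char *needle, size_t needle_len)
{
    size_t i;

    if (needle_len == 0)
        return 1;
    if (needle_len > hay_len)
        return 0;
    for (i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return 1;
    }
    return 0;
}

/* Hex-encode len bytes into out, which must hold 2 * len + 1 bytes.
 * The output contains no NUL before its terminator, so string
 * functions work on it again, at the cost of doubling the size. */
static void to_hex(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];
        out[2 * i + 1] = digits[in[i] & 0x0F];
    }
    out[2 * len] = '\0';
}
```

With the 6-byte payload '\0<tag>', contains_cstr() does not find "<tag>", contains_bytes() does, and to_hex() yields "003C7461673E", in which "3C7461673E" can be found with a plain string search. One thing to keep in mind with the hex approach: a substring match on the hex text can in principle also hit at an odd digit offset, that is, across byte boundaries, which a byte-wise comparison would not do.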

