Re: Bug when passing variable to mapping function

2018-08-01 Thread Jarno Huuskonen
Hi,

On Tue, Jul 17, Emeric Brun wrote:
> > On Fri, 29 Jun 2018 at 07:15, Jarno Huuskonen  
> > wrote:
> >> On Thu, Jun 28, Jarno Huuskonen wrote:
> >>> I think this is the commit that breaks map_regm in this case:
> >>> b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> >>> acls/maps thread safe).
> >>>
> >>> If I revert this commit from pattern.c:pattern_exec_match
> >>> then the map_regm \1 backref seems to work.
> >>
> >> I think I found what's replacing the \000 as first char:
> >> in (map.c) sample_conv_map:
> >> /* In the regm case, merge the sample with the input. */
> >> if ((long)private == PAT_MATCH_REGM) {
> >> str = get_trash_chunk();
> >> str->len = exp_replace(str->str, str->size, 
> >> smp->data.u.str.str,
> >>pat->data->u.str.str,
> >>(regmatch_t *)smp->ctx.a[0]);
> >>
> >> Before call to get_trash_chunk() smp->data.u.str.str is for example
> >> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
> >> is '\000istri.com'.
> 
> Could you try the patch in attachment? i hope it will fix the issue

Sorry I've been away from keyboard. Just tested the patch w/1.8.12 and
for me the patch fixes the map_regm issue with Daniel's example config
(https://www.mail-archive.com/haproxy@formilux.org/msg30523.html).

Thanks,
-Jarno

-- 
Jarno Huuskonen



Re: Bug when passing variable to mapping function

2018-07-17 Thread Emeric Brun
Hi Jarno, and thanks Lukas

On 07/16/2018 07:27 AM, Lukas Tribus wrote:
> Hello,
> 
> 
> 
> On Fri, 29 Jun 2018 at 07:15, Jarno Huuskonen  wrote:
>>
>> Hi,
>>
>> On Thu, Jun 28, Jarno Huuskonen wrote:
>>> I think this is the commit that breaks map_regm in this case:
>>> b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
>>> acls/maps thread safe).
>>>
>>> If I revert this commit from pattern.c:pattern_exec_match
>>> then the map_regm \1 backref seems to work.
>>
>> I think I found what's replacing the \000 as first char:
>> in (map.c) sample_conv_map:
>> /* In the regm case, merge the sample with the input. */
>> if ((long)private == PAT_MATCH_REGM) {
>> str = get_trash_chunk();
>> str->len = exp_replace(str->str, str->size, 
>> smp->data.u.str.str,
>>pat->data->u.str.str,
>>(regmatch_t *)smp->ctx.a[0]);
>>
>> Before call to get_trash_chunk() smp->data.u.str.str is for example
>> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
>> is '\000istri.com'.
>>
>> At the moment I don't have time to dig deeper, but hopefully this
>> helps a little bit.
> 
> Thanks for the detailed analysis, relaying to Emeric ...
> 
> 
> 
> Lukas
> 

Could you try the patch in attachment? i hope it will fix the issue

R,
Emeric
>From b6d8df3387a7ff9fe781d0315b0ee1de627679cf Mon Sep 17 00:00:00 2001
From: Emeric Brun 
Date: Tue, 17 Jul 2018 09:47:07 -0400
Subject: [PATCH] BUG/MINOR: map: fix map_regm with backref

Due to a cascade of get_trash_chunk calls the sample is
corrupted when we want to read it.

The fix consist to use a temporary chunk to copy the sample
value and use it.
---
 src/map.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/map.c b/src/map.c
index 0f1b754..3d8ec20 100644
--- a/src/map.c
+++ b/src/map.c
@@ -184,10 +184,27 @@ static int sample_conv_map(const struct arg *arg_p, struct sample *smp, void *pr
 		if (pat->data) {
 			/* In the regm case, merge the sample with the input. */
 			if ((long)private == PAT_MATCH_REGM) {
+struct chunk *tmptrash;
+
+/* Copy the content of the sample because it could
+   be scratched by incoming get_trash_chunk */
+tmptrash = alloc_trash_chunk();
+if (!tmptrash)
+	return 0;
+
+tmptrash->len = smp->data.u.str.len;
+if (tmptrash->len > (tmptrash->size-1))
+	tmptrash->len = tmptrash->size-1;
+
+memcpy(tmptrash->str, smp->data.u.str.str, tmptrash->len);
+tmptrash->str[tmptrash->len] = 0;
+
 str = get_trash_chunk();
-str->len = exp_replace(str->str, str->size, smp->data.u.str.str,
+str->len = exp_replace(str->str, str->size, tmptrash->str,
    pat->data->u.str.str,
    (regmatch_t *)smp->ctx.a[0]);
+
+free_trash_chunk(tmptrash);
 if (str->len == -1)
 	return 0;
 smp->data.u.str = *str;
-- 
2.7.4



Re: Bug when passing variable to mapping function

2018-07-16 Thread Lukas Tribus
Hello,



On Fri, 29 Jun 2018 at 07:15, Jarno Huuskonen  wrote:
>
> Hi,
>
> On Thu, Jun 28, Jarno Huuskonen wrote:
> > I think this is the commit that breaks map_regm in this case:
> > b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> > acls/maps thread safe).
> >
> > If I revert this commit from pattern.c:pattern_exec_match
> > then the map_regm \1 backref seems to work.
>
> I think I found what's replacing the \000 as first char:
> in (map.c) sample_conv_map:
> /* In the regm case, merge the sample with the input. */
> if ((long)private == PAT_MATCH_REGM) {
> str = get_trash_chunk();
> str->len = exp_replace(str->str, str->size, 
> smp->data.u.str.str,
>pat->data->u.str.str,
>(regmatch_t *)smp->ctx.a[0]);
>
> Before call to get_trash_chunk() smp->data.u.str.str is for example
> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
> is '\000istri.com'.
>
> At the moment I don't have time to dig deeper, but hopefully this
> helps a little bit.

Thanks for the detailed analysis, relaying to Emeric ...



Lukas



Re: Bug when passing variable to mapping function

2018-07-09 Thread Daniel Schneller
Hi!

Thanks for your analysis. I was away for a few days, hence the late
response.

> AFAIK this works on 1.7.11 but seems to be broken on all 1.8.x.

Interesting. I tried the new config locally with 1.8 and found the bug.
Production servers are still on 1.7, so it would even have worked :)
I am glad, though, that I found this on 1.8, sparing me the trouble some
time down the road when they get updated.


> > I think this is the commit that breaks map_regm in this case:
> > b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> > acls/maps thread safe).
> >
> > If I revert this commit from pattern.c:pattern_exec_match
> > then the map_regm \1 backref seems to work.
>
> I think I found what's replacing the \000 as first char:
> in (map.c) sample_conv_map:
> /* In the regm case, merge the sample with the input. */
> if ((long)private == PAT_MATCH_REGM) {
> str = get_trash_chunk();
> str->len = exp_replace(str->str, str->size,
smp->data.u.str.str,
>pat->data->u.str.str,
>(regmatch_t *)smp->ctx.a[0]);
>
> Before call to get_trash_chunk() smp->data.u.str.str is for example
> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
> is '\000istri.com'.

I had a look at that code, but I must admit my understanding of the
concepts (trash chunk? some optimization, I assume?) and C as a language is
too limited to make a patch myself.
Is this on any of the developers' radar?

Thanks a lot :)

Daniel

On 29 June 2018 at 07:14, Jarno Huuskonen  wrote:

> Hi,
>
> On Thu, Jun 28, Jarno Huuskonen wrote:
> > I think this is the commit that breaks map_regm in this case:
> > b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> > acls/maps thread safe).
> >
> > If I revert this commit from pattern.c:pattern_exec_match
> > then the map_regm \1 backref seems to work.
>
> I think I found what's replacing the \000 as first char:
> in (map.c) sample_conv_map:
> /* In the regm case, merge the sample with the input. */
> if ((long)private == PAT_MATCH_REGM) {
> str = get_trash_chunk();
> str->len = exp_replace(str->str, str->size,
> smp->data.u.str.str,
>pat->data->u.str.str,
>(regmatch_t *)smp->ctx.a[0]);
>
> Before call to get_trash_chunk() smp->data.u.str.str is for example
> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
> is '\000istri.com'.
>
> At the moment I don't have time to dig deeper, but hopefully this
> helps a little bit.
>
> -Jarno
>
> --
> Jarno Huuskonen
>
>


-- 

-- 
Daniel Schneller
Principal Cloud Engineer

CenterDevice GmbH  | Hochstraße 11
   | 42697 Solingen
tel: +49 1754155711| Deutschland
daniel.schnel...@centerdevice.de   | www.centerdevice.de

Geschäftsführung: Dr. Patrick Peschlow, Dr. Lukas Pustina,
Michael Rosbach, Handelsregister-Nr.: HRB 18655,
HR-Gericht: Bonn, USt-IdNr.: DE-815299431


Re: Bug when passing variable to mapping function

2018-06-28 Thread Jarno Huuskonen
Hi,

On Thu, Jun 28, Jarno Huuskonen wrote:
> I think this is the commit that breaks map_regm in this case:
> b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> acls/maps thread safe).
> 
> If I revert this commit from pattern.c:pattern_exec_match
> then the map_regm \1 backref seems to work.

I think I found what's replacing the \000 as first char:
in (map.c) sample_conv_map:
/* In the regm case, merge the sample with the input. */
if ((long)private == PAT_MATCH_REGM) {
str = get_trash_chunk();
str->len = exp_replace(str->str, str->size, smp->data.u.str.str,
   pat->data->u.str.str,
   (regmatch_t *)smp->ctx.a[0]);

Before call to get_trash_chunk() smp->data.u.str.str is for example
'distri.com' and after get_trash_chunk() smp->data.u.str.str
is '\000istri.com'.

At the moment I don't have time to dig deeper, but hopefully this
helps a little bit.

-Jarno

-- 
Jarno Huuskonen



Re: Bug when passing variable to mapping function

2018-06-28 Thread Jarno Huuskonen
Hi,

On Mon, Jun 25, Daniel Schneller wrote:
> This is the contents of the map file:
>  hostmap.txt -
> ^(.*)\.(.*)$ \1
> --

Setting this to:
^(.*)\.(.*)$ \2

And I get
X-Distri-Mapped-From-Var: com

and with map_regm: ^(.*)\.(.*)\.(.*)$ \2.\3
(Host: www.distri.com)

I get X-Distri-Mapped-From-Var: distri.com

So it looks like only backref \1 has first char set to \000

> See the X-Distri-Mapped-From-Var header's value. It has what seems to be a 
> nul-byte
> instead of the first character of the domain name. The other X- headers
> before it are meant to narrow down where the bug actually happens.
> 
> It would appear that it is somehow related to passing a variable's value
> into the mapping function or its return from there. Interestingly, the
> issue does _not_ show when simply putting the variable value into a header
> (X-Distri-Direct-From-Var) or when calling the mapping function with the
> header lookup instead of the intermediate variable 
> (X-Distri-Mapped-From-Header).
> 
> 
> One more tidbit: If I change the mapping file to this:
> --
> ^(.*)\.(.*)$ a\1
> --
> 
> The generated header header changes to:
> --
> X-Distri-Mapped-From-Var: aaistri
> --
> 
> Looks like some off-by-one error?

AFAIK this works on 1.7.11 but seems to be broken on all 1.8.x.

I think this is the commit that breaks map_regm in this case:
b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
acls/maps thread safe).

If I revert this commit from pattern.c:pattern_exec_match
then the map_regm \1 backref seems to work.

-Jarno

-- 
Jarno Huuskonen



Bug when passing variable to mapping function

2018-06-25 Thread Daniel Schneller
Hi!

While playing around with map_regm I noticed some strange behavior when using 
variables
and map_regm. I managed to reduce it so a small test case and believe this is 
an actual bug.

It tested this on macOS, should it be relevant. haproxy is installed via 
homebrew:

- haproxy version ---
$ haproxy -vvv
HA-Proxy version 1.8.10-ec17d7a 2018/06/22
Copyright 2000-2018 Willy Tarreau 

Build options :
  TARGET  = generic
  CPU = generic
  CC  = clang
  CFLAGS  =
  OPTIONS = USE_ZLIB=1 USE_POLL=1 USE_KQUEUE=1 USE_THREAD=1 USE_OPENSSL=1 
USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2o  27 Mar 2018
Running on OpenSSL version : OpenSSL 1.0.2o  27 Mar 2018
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using:
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.42 2018-03-20
Running on PCRE version : 8.42 2018-03-20
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), 
raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
 kqueue : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use kqueue.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace
-


This is my haproxy configuration file:
--- bla.cfg -
defaults
mode http

frontend fe_test
bind 127.0.0.1:2

# Test Setup
# ---
# Remove port from Host header
http-request replace-value Host '(.*):.*' '\1'

# Store host header in variable
http-request set-var(txn.host) req.hdr(Host)
# ---


# Test cases:
# ---
# This works correctly
http-request set-var(txn.manual) str("distri")
http-request set-header X-Distri-Direct-From-Manual-Var %[var(txn.manual)]

# This works correctly
http-request set-header X-Distri-Mapped-From-Header 
%[req.hdr(Host),map_regm(hostmap.txt,"unknown"),lower]

# This works correctly
http-request set-header X-Distri-Direct-From-Var %[var(txn.host)]

# This breaks
http-request set-header X-Distri-Mapped-From-Var 
%[var(txn.host),map_regm(hostmap.txt,"unknown"),lower]

# ---

default_backend be_test

backend be_test
server s 127.0.0.1:8111
-

The sever is just a Python SimpleHTTPServer, dumping the request headers.


This is the contents of the map file:
 hostmap.txt -
^(.*)\.(.*)$ \1
--


This is the sample request I send:
-- request ---
$ curl -v http://127.0.0.1:2/example.txt -H 'Host: distri.com:1234'
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 2 (#0)
> GET /example.txt HTTP/1.1
> Host: distri.com:1234
> User-Agent: curl/7.54.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: SimpleHTTP/0.6 Python/2.7.10
< Date: Mon, 25 Jun 2018 14:15:41 GMT
< Content-type: text/plain
< Content-Length: 0
< Last-Modified: Mon, 25 Jun 2018 14:13:09 GMT
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive
<
* Connection #0 to host 127.0.0.1 left intact
-


HAproxy is started in the Terminal with debug output:
- HAProxy Output 
$ haproxy -d -f bla.cfg
...
Available polling systems :
 kqueue : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result FAILED
Total: 3 (2 usable), will use kqueue.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace
Using kqueue() as the polling mechanism.
:fe_test.accept(0004)=0007 from [127.0.0.1:64880] ALPN=
:fe_test.clireq[0007:]: GET /example.txt HTTP/1.1
:fe_test.clihdr[0007:]: Host: distri.com:1234
:fe_test.clihdr[0007:]: User-Agent: curl/7.54.0
:fe_test.clihdr[0007:]: Accept: */*
:be_test.srvrep[0007:0008]: HTTP/1.0 200 OK
:be_test.srvhdr[0007:0008]: Server: SimpleHTTP/0.6 Python/2.7.10
:be_test.srvhdr[0007:0008]: Date: Mon, 25 Jun 2018 14:15:41 GMT
:be_test.srvhdr[0007:0008]: Content-type: text/plain
:be_test.srvhdr[0007:0008]: Content-Length: 0
:be_test.srvhdr[0007:0008]: Last-Modified: Mon, 25 Jun 2018 14:13:09 GMT
:be_test.srvcls[0007:adfd]
0001:fe_test.clicls[0007:]