Re: [PATCH] JWT payloads break b64dec convertor

2021-04-13 Thread Willy Tarreau
On Tue, Apr 13, 2021 at 04:44:38PM +0200, Moemen MHEDHBI wrote:
> > But then how about having just *your* functions
> > without relying on the other ones ? Now that you've extended the existing
> > function, you can declare it yours, remove all the configurable stuff and
> > keep the simplified version as the one you need. I'm sure it will be the
> > best tradeoff overall.
> >
> 
> Yes that makes sense to me too, the attached patch deal with it as
> suggested.

Yeah much cleaner now, indeed. I've merged it, thank you!

Willy



Re: [PATCH] JWT payloads break b64dec convertor

2021-04-13 Thread Moemen MHEDHBI

On 13/04/2021 11:39, Willy Tarreau wrote:

>> You can find attached the patches 0001-bis and 0002-bis modifying the
>> existing functions (introducing an url flag) to see how it looks like.
>> This solution may be cleaner (no chunk allocation and we don't loop
>> twice over input string) but has the drawbacks of being intrusive with
>> the rest of the code and less clearer imo regarding how url variant is
>> different from standard base64.
> 
> I agree they're not pretty due to the change of logic around the padding,
> thanks for having tested! But then how about having just *your* functions
> without relying on the other ones ? Now that you've extended the existing
> function, you can declare it yours, remove all the configurable stuff and
> keep the simplified version as the one you need. I'm sure it will be the
> best tradeoff overall.
>

Yes that makes sense to me too, the attached patch deal with it as
suggested.


>> diff --git a/src/base64.c b/src/base64.c
>> index 53e4d65b2..f2768d980 100644
>> --- a/src/base64.c
>> +++ b/src/base64.c
>> @@ -1,5 +1,5 @@
>>  /*
>> - * ASCII <-> Base64 conversion as described in RFC1421.
>> + * ASCII <-> Base64 conversion as described in RFC4648.
>>   *
>>   * Copyright 2006-2010 Willy Tarreau 
>>   * Copyright 2009-2010 Krzysztof Piotr Oledzki 
>> @@ -17,50 +17,70 @@
>>  #include 
>>  #include 
>>  
>> -#define B64BASE '#' /* arbitrary chosen base value */
>> -#define B64CMIN '+'
>> -#define B64CMAX 'z'
>> -#define B64PADV 64  /* Base64 chosen special pad value */
>> +#define B64BASE  '#'   /* arbitrary chosen base value */
>> +#define B64CMIN  '+'
>> +#define UB64CMIN '-'
>> +#define B64CMAX  'z'
>> +#define B64PADV  64   /* Base64 chosen special pad value */
> 
> Please do not needlessly reindent code parts for no reason. It seems that
> only the "-" was added there, the rest shouldn't change.

The reason is I was following the doc/coding-style where alignment
should use spaces, but since the existing bloc was aligned via tabs, I
thought about fixing that instead of repeating the issue. I understand
though that in such case better have this in separate commit, so I have
stuck with the tabs alignment.

> By the way, contrib/ was move to dev/ during your changes so if you keep
> this comment please update it.

Done.

On 13/04/2021 08:19, Jarno Huuskonen wrote:
> Could you add a cross reference from b64dec/base64 to ub64dec/ub64enc in
> configuration.txt.

Done thanks.


-- 
Moemen
>From b526416364b98afaa2d2b421fbf27f80bc4e8732 Mon Sep 17 00:00:00 2001
From: Moemen MHEDHBI 
Date: Fri, 2 Apr 2021 01:05:07 +0200
Subject: [PATCH 2/2] CLEANUP: align samples list in sample.c

---
 src/sample.c | 54 ++--
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/src/sample.c b/src/sample.c
index 04635a91f..7337ba06a 100644
--- a/src/sample.c
+++ b/src/sample.c
@@ -4129,33 +4129,33 @@ INITCALL1(STG_REGISTER, sample_register_fetches, _kws);
 
 /* Note: must not be declared  as its list will be overwritten */
 static struct sample_conv_kw_list sample_conv_kws = {ILH, {
-	{ "debug",  sample_conv_debug, ARG2(0,STR,STR), smp_check_debug, SMP_T_ANY,  SMP_T_ANY },
-	{ "b64dec", sample_conv_base642bin,0,NULL, SMP_T_STR,  SMP_T_BIN  },
-	{ "base64", sample_conv_bin2base64,0,NULL, SMP_T_BIN,  SMP_T_STR  },
-	{ "ub64dec", sample_conv_base64url2bin,0,NULL, SMP_T_STR,  SMP_T_BIN  },
-	{ "ub64enc", sample_conv_bin2base64url,0,NULL, SMP_T_BIN,  SMP_T_STR  },
-	{ "upper",  sample_conv_str2upper, 0,NULL, SMP_T_STR,  SMP_T_STR  },
-	{ "lower",  sample_conv_str2lower, 0,NULL, SMP_T_STR,  SMP_T_STR  },
-	{ "length", sample_conv_length,0,NULL, SMP_T_STR,  SMP_T_SINT },
-	{ "hex",sample_conv_bin2hex,   0,NULL, SMP_T_BIN,  SMP_T_STR  },
-	{ "hex2i",  sample_conv_hex2int,   0,NULL, SMP_T_STR,  SMP_T_SINT },
-	{ "ipmask", sample_conv_ipmask,ARG2(1,MSK4,MSK6), NULL, SMP_T_ADDR, SMP_T_IPV4 },
-	{ "ltime",  sample_conv_ltime, ARG2(1,STR,SINT), NULL, SMP_T_SINT, SMP_T_STR },
-	{ "utime",  sample_conv_utime, ARG2(1,STR,SINT), NULL, SMP_T_SINT, SMP_T_STR },
-	{ "crc32",  sample_conv_crc32, ARG1(0,SINT), NULL, SMP_T_BIN,  SMP_T_SINT  },
-	{ "crc32c", sample_conv_crc32c,ARG1(0,SINT), NULL, SMP_T_BIN,  SMP_T_SINT  },
-	{ "djb2",   sample_conv_djb2,  ARG1(0,SINT), NULL, SMP_T_BIN,  SMP_T_SINT  },
-	{ "sdbm",   sample_conv_sdbm,  ARG1(0,SINT), NULL, SMP_T_BIN,  SMP_T_SINT  },
-	{ "wt6",sample_conv_wt6,   ARG1(0,SINT), NULL, SMP_T_BIN,  SMP_T_SINT  },
-	{ "xxh3",   sample_conv_xxh3,  ARG1(0,SINT), NULL, SMP_T_BIN,  SMP_T_SINT  },
-	{ "xxh32",  sample_conv_xxh32, ARG1(0,SINT), NULL, SMP_T_BIN,  SMP_T_SINT  },
-	{ "xxh64",  sample_conv_xxh64, ARG1(0,SINT), NULL, SMP_T_BIN,  SMP_T_SINT  },
-	{ "json",   sample_conv_json,  ARG1(1,STR),  

Re: [PATCH] JWT payloads break b64dec convertor

2021-04-13 Thread Willy Tarreau
Hi Jarno,

On Tue, Apr 13, 2021 at 06:19:47AM +, Jarno Huuskonen wrote:
> Hello,
> 
> On Tue, 2021-04-06 at 01:58 +0200, Moemen MHEDHBI wrote:
> > Thanks Willy and Tim for your feedback.
> > 
> > You can find attached the updated patches with fixed coding style (now
> > set correctly in my editor), updated commit message, entry doc in sorted
> > order, size_t instead of int in both enc/dec  and corresponding reg-test.
> 
> Could you add a cross reference from b64dec/base64 to ub64dec/ub64enc in
> configuration.txt. Something like:
> --- a/doc/configuration.txt
> +++ b/doc/configuration.txt
> @@ -15020,11 +15020,14 @@ and()
>  b64dec
>Converts (decodes) a base64 encoded input string to its binary
>representation. It performs the inverse operation of base64().
> +  For base64url("URL and Filename Safe Alphabet" (RFC 4648)) variant
> +  see "ub64dec".
>  
>  base64
>Converts a binary input sample to a base64 string. It is used to log or
>transfer binary content in a way that can be reliably transferred (e.g.
> -  an SSL ID can be copied in a header).
> +  an SSL ID can be copied in a header). For base64url("URL and Filename
> Safe
> +  Alphabet" (RFC 4648)) variant see "ub64enc".
>  
>  bool
>Returns a boolean TRUE if the input value of type signed integer is

Yes very good point indeed, and similarly the ub64 ones should refer to
these ones.

Willy



Re: [PATCH] JWT payloads break b64dec convertor

2021-04-13 Thread Willy Tarreau
Hi Moemen,

On Tue, Apr 13, 2021 at 12:41:39AM +0200, Moemen MHEDHBI wrote:
> >> in such case should we rather use dynamic allocation ?
> >
> > No, there are two possible approaches. One of them is to use a trash
> > buffer using get_trash_chunk(). The trash buffers are "large enough"
> > for anything that comes from outside. A second, cleaner solution
> > simply consists in not using a temporary buffer but doing the conversion
> > on the fly. Indeed, looking closer, what the function does is to first
> > replace a few chars on the whole chain to then call the base64 conversion
> > function. So it doubles the work on the string and one side effect of
> > this double work is that you need a temporary storage.
> 
> The url variant is not only about a different alphabet that needs to be
> replaced but also is a non padding variant. So the straightforward
> algorithm to decoding it is to add the padding to the input encoded in
> url variant and then use the standard base64 decoder.

Ah OK I wasn't aware of this!

> Even doing this on the fly requires extending input with two more bytes
> max. Unless I am missing smth but in such case on the fly conversion
> will result in a out of bound array access.

There is never a reason for an out-of-bounds access but of course it
will require to add the checks everywhere while right now it knows that
it can read 4 bytes at once. So yeah, that woul result in two different
paths and that's a pain.

> That's why I have copied input in a "inlen+2" string.

Got it, that makes sense.

> In the end I have updated patch to handle extending input in decoding
> function via get_trash_chunk to make sure a buffer of size input+2 is
> available.
> 
> You can find attached the patches 0001 and 0002 for this implementation.

We could do with that, I just still find it a bit awkward to write so
much code to modify an input and adapt it to a parser instead of writing
a parser. That's more operations and more code to inspect in case of
trouble.

> You can find attached the patches 0001-bis and 0002-bis modifying the
> existing functions (introducing an url flag) to see how it looks like.
> This solution may be cleaner (no chunk allocation and we don't loop
> twice over input string) but has the drawbacks of being intrusive with
> the rest of the code and less clearer imo regarding how url variant is
> different from standard base64.

I agree they're not pretty due to the change of logic around the padding,
thanks for having tested! But then how about having just *your* functions
without relying on the other ones ? Now that you've extended the existing
function, you can declare it yours, remove all the configurable stuff and
keep the simplified version as the one you need. I'm sure it will be the
best tradeoff overall.

> diff --git a/src/base64.c b/src/base64.c
> index 53e4d65b2..f2768d980 100644
> --- a/src/base64.c
> +++ b/src/base64.c
> @@ -1,5 +1,5 @@
>  /*
> - * ASCII <-> Base64 conversion as described in RFC1421.
> + * ASCII <-> Base64 conversion as described in RFC4648.
>   *
>   * Copyright 2006-2010 Willy Tarreau 
>   * Copyright 2009-2010 Krzysztof Piotr Oledzki 
> @@ -17,50 +17,70 @@
>  #include 
>  #include 
>  
> -#define B64BASE  '#' /* arbitrary chosen base value */
> -#define B64CMIN  '+'
> -#define B64CMAX  'z'
> -#define B64PADV  64  /* Base64 chosen special pad value */
> +#define B64BASE  '#'   /* arbitrary chosen base value */
> +#define B64CMIN  '+'
> +#define UB64CMIN '-'
> +#define B64CMAX  'z'
> +#define B64PADV  64   /* Base64 chosen special pad value */

Please do not needlessly reindent code parts for no reason. It seems that
only the "-" was added there, the rest shouldn't change.
 
> @@ -73,29 +93,59 @@ int a2base64(char *in, int ilen, char *out, int olen)
>   *  or  to be NULL. Returns -1 if  is invalid or ilen
>   * has wrong size, -2 if  is too short.
>   * 1 to 3 output bytes are produced for 4 input bytes.
> + * The reverse tab used to decode base64 is generated via 
> /contrib/base64/base64rev-gen.c

By the way, contrib/ was move to dev/ during your changes so if you keep
this comment please update it.

Thanks,
Willy



Re: [PATCH] JWT payloads break b64dec convertor

2021-04-13 Thread Jarno Huuskonen
Hello,

On Tue, 2021-04-06 at 01:58 +0200, Moemen MHEDHBI wrote:
> Thanks Willy and Tim for your feedback.
> 
> You can find attached the updated patches with fixed coding style (now
> set correctly in my editor), updated commit message, entry doc in sorted
> order, size_t instead of int in both enc/dec  and corresponding reg-test.

Could you add a cross reference from b64dec/base64 to ub64dec/ub64enc in
configuration.txt. Something like:
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15020,11 +15020,14 @@ and()
 b64dec
   Converts (decodes) a base64 encoded input string to its binary
   representation. It performs the inverse operation of base64().
+  For base64url("URL and Filename Safe Alphabet" (RFC 4648)) variant
+  see "ub64dec".
 
 base64
   Converts a binary input sample to a base64 string. It is used to log or
   transfer binary content in a way that can be reliably transferred (e.g.
-  an SSL ID can be copied in a header).
+  an SSL ID can be copied in a header). For base64url("URL and Filename
Safe
+  Alphabet" (RFC 4648)) variant see "ub64enc".
 
 bool
   Returns a boolean TRUE if the input value of type signed integer is


-Jarno

-- 
Jarno Huuskonen


Re: [PATCH] JWT payloads break b64dec convertor

2021-04-12 Thread Moemen MHEDHBI


On 12/04/2021 23:13, Aleksandar Lazic wrote:
> Hi Moemen,
> 
> any chance to get this feature before 2.4 will be realeased?
> 
> Regards
> Aleks
> 

Hi Aleksandar,
I have updated the patch (attached) so it gets reviewed and eventually
merged.
I know this is going to be useful with what you are trying to do with
the json converter so I will try to be more active on this.


On 06/04/2021 09:13, Willy Tarreau wrote:

>> in such case should we rather use dynamic allocation ?
>
> No, there are two possible approaches. One of them is to use a trash
> buffer using get_trash_chunk(). The trash buffers are "large enough"
> for anything that comes from outside. A second, cleaner solution
> simply consists in not using a temporary buffer but doing the conversion
> on the fly. Indeed, looking closer, what the function does is to first
> replace a few chars on the whole chain to then call the base64 conversion
> function. So it doubles the work on the string and one side effect of
> this double work is that you need a temporary storage.

The url variant is not only about a different alphabet that needs to be
replaced but also is a non padding variant. So the straightforward
algorithm to decoding it is to add the padding to the input encoded in
url variant and then use the standard base64 decoder.
Even doing this on the fly requires extending input with two more bytes
max. Unless I am missing smth but in such case on the fly conversion
will result in a out of bound array access. That's why I have copied
input in a "inlen+2" string.

In the end I have updated patch to handle extending input in decoding
function via get_trash_chunk to make sure a buffer of size input+2 is
available.

You can find attached the patches 0001 and 0002 for this implementation.


> Other approaches would consist in either reimplementing the functions
> with a different alphabet, or modifying the existing ones to take an
> extra argument for the conversion table, and make one set of functions
> making use of the current table and another set making use of your new
> table.
>
> Willy
>

You can find attached the patches 0001-bis and 0002-bis modifying the
existing functions (introducing an url flag) to see how it looks like.
This solution may be cleaner (no chunk allocation and we don't loop
twice over input string) but has the drawbacks of being intrusive with
the rest of the code and less clearer imo regarding how url variant is
different from standard base64.
Feel free to pick the one that looks better otherwise I can continue
with a different implementation if needbe.
-- 
Moemen
>From cf0a43dab4f5f88ddf5e5e736127132721b7f18e Mon Sep 17 00:00:00 2001
From: Moemen MHEDHBI 
Date: Thu, 1 Apr 2021 20:53:59 +0200
Subject: [PATCH 1/2] MINOR: sample: add ub64dec and ub64enc converters

ub64dec and ub64enc are the base64url equivalent of b64dec and base64
converters. base64url encoding is the "URL and Filename Safe Alphabet"
variant of base64 encoding. It is also used in in JWT (JSON Web Token)
standard.
RFC1421 mention in base64.c file is deprecated so it was replaced with
RFC4648 to which existing converters, base64/b64dec, still apply.

Example:
  HAProxy:
http-request return content-type text/plain lf-string %[req.hdr(Authorization),word(2,.),ub64dec]
  Client:
Token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg
$ curl -H "Authorization: Bearer ${TOKEN}" http://haproxy.local
{"user":"foo","key":"chae6AhXai6e"}
---
 doc/configuration.txt| 12 ++
 include/haproxy/base64.h |  2 +
 reg-tests/sample_fetches/ubase64.vtc | 45 +++
 src/base64.c | 64 +++-
 src/sample.c | 38 +
 5 files changed, 160 insertions(+), 1 deletion(-)
 create mode 100644 reg-tests/sample_fetches/ubase64.vtc

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 7048fb63e..c7fe416e5 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -16393,6 +16393,18 @@ table_trackers()
   connections there are from a given address for example. See also the
   sc_trackers sample fetch keyword.
 
+ub64dec
+  This converter is the base64url variant of b64dec converter. base64url
+	encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding.
+	It is also the encoding used in JWT (JSON Web Token) standard.
+
+	Example:
+	  # Decoding a JWT payload:
+	  http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec
+
+ub64enc
+  This converter is the base64url variant of base64 converter.
+
 upper
   Convert a string sample to upper case. This can only be placed after a string
   sample fetch function or after a transformation keyword returning a string
diff --git a/include/haproxy/base64.h b/include/haproxy/base64.h
index 1756bc058..532c46a44 100644
--- a/include/haproxy/base64.h
+++ b/include/haproxy/base64.h

Re: [PATCH] JWT payloads break b64dec convertor

2021-04-12 Thread Aleksandar Lazic

Hi Moemen,

any chance to get this feature before 2.4 will be realeased?

Regards
Aleks

On 06.04.21 09:13, Willy Tarreau wrote:

Hi Moemen,

On Tue, Apr 06, 2021 at 01:58:11AM +0200, Moemen MHEDHBI wrote:

Only part unclear:
On 02/04/2021 15:04, Tim Düsterhus wrote:

+int base64urldec(const char *in, size_t ilen, char *out, size_t olen) {
+char conv[ilen+2];


This looks like a remotely triggerable stack overflow.


You mean in case ilen is too big?


Yes that's it, I didn't notice it during the first review. It's
particularly uncommon to use variable sized arrays and should never
be done. The immediate effect of this is that it will reserve some
room in the stack for a size as large as ilen+2 bytes. The problem
is that on most platforms the stack grows down, so the beginning of
the buffer is located at a point which is very far away from the
current stack. This memory is in fact not allocated, so the system
detects the first usage through a page fault and allocates the
necessary space. But in order to know that the page fault is within
the stack, it has to apply a certain margin. And this margin varies
between OSes and platforms. Some compilers will explicitly initialize
such a large stack from top to bottom to avoid a crash. Other ones
will not do and may very well crash at 64kB. On Linux, I can make the
above crash by using a 8 MB ilen, just because by default the stack
size limit is 8 MB. That's large but not overly excessive for those
who would like to perform some processing on bodies. And I recall
that some other OSes default to way smaller limits (I recall 64kB
on OpenBSD a long time ago though this might have been raised to a
megabyte or so by now).


in such case should we rather use dynamic allocation ?


No, there are two possible approaches. One of them is to use a trash
buffer using get_trash_chunk(). The trash buffers are "large enough"
for anything that comes from outside. A second, cleaner solution
simply consists in not using a temporary buffer but doing the conversion
on the fly. Indeed, looking closer, what the function does is to first
replace a few chars on the whole chain to then call the base64 conversion
function. So it doubles the work on the string and one side effect of
this double work is that you need a temporary storage.

Other approaches would consist in either reimplementing the functions
with a different alphabet, or modifying the existing ones to take an
extra argument for the conversion table, and make one set of functions
making use of the current table and another set making use of your new
table.

Willy






Re: [PATCH] JWT payloads break b64dec convertor

2021-04-06 Thread Willy Tarreau
Hi Moemen,

On Tue, Apr 06, 2021 at 01:58:11AM +0200, Moemen MHEDHBI wrote:
> Only part unclear:
> On 02/04/2021 15:04, Tim Düsterhus wrote:
> >> +int base64urldec(const char *in, size_t ilen, char *out, size_t olen) {
> >> +char conv[ilen+2];
> >
> > This looks like a remotely triggerable stack overflow.
> 
> You mean in case ilen is too big?

Yes that's it, I didn't notice it during the first review. It's
particularly uncommon to use variable sized arrays and should never
be done. The immediate effect of this is that it will reserve some
room in the stack for a size as large as ilen+2 bytes. The problem
is that on most platforms the stack grows down, so the beginning of
the buffer is located at a point which is very far away from the
current stack. This memory is in fact not allocated, so the system
detects the first usage through a page fault and allocates the
necessary space. But in order to know that the page fault is within
the stack, it has to apply a certain margin. And this margin varies
between OSes and platforms. Some compilers will explicitly initialize
such a large stack from top to bottom to avoid a crash. Other ones
will not do and may very well crash at 64kB. On Linux, I can make the
above crash by using a 8 MB ilen, just because by default the stack
size limit is 8 MB. That's large but not overly excessive for those
who would like to perform some processing on bodies. And I recall
that some other OSes default to way smaller limits (I recall 64kB
on OpenBSD a long time ago though this might have been raised to a
megabyte or so by now).

> in such case should we rather use dynamic allocation ?

No, there are two possible approaches. One of them is to use a trash
buffer using get_trash_chunk(). The trash buffers are "large enough"
for anything that comes from outside. A second, cleaner solution
simply consists in not using a temporary buffer but doing the conversion
on the fly. Indeed, looking closer, what the function does is to first
replace a few chars on the whole chain to then call the base64 conversion
function. So it doubles the work on the string and one side effect of
this double work is that you need a temporary storage.

Other approaches would consist in either reimplementing the functions
with a different alphabet, or modifying the existing ones to take an
extra argument for the conversion table, and make one set of functions
making use of the current table and another set making use of your new
table.

Willy



Re: [PATCH] JWT payloads break b64dec convertor

2021-04-05 Thread Moemen MHEDHBI
Thanks Willy and Tim for your feedback.

You can find attached the updated patches with fixed coding style (now
set correctly in my editor), updated commit message, entry doc in sorted
order, size_t instead of int in both enc/dec  and corresponding reg-test.

Only part unclear:
On 02/04/2021 15:04, Tim Düsterhus wrote:
>> +int base64urldec(const char *in, size_t ilen, char *out, size_t olen) {
>> +char conv[ilen+2];
>
> This looks like a remotely triggerable stack overflow.

You mean in case ilen is too big? in such case should we rather use
dynamic allocation ?

-- 
Moemen MHEDHBI
>From bae8d3890be6d2f5a58697bf7b8e9f01f4589d3b Mon Sep 17 00:00:00 2001
From: Moemen MHEDHBI 
Date: Thu, 1 Apr 2021 20:53:59 +0200
Subject: [PATCH 1/2] MINOR: sample: add ub64dec and ub64enc converters

ub64dec and ub64enc are the base64url equivalent of b64dec and base64
converters. base64url encoding is the "URL and Filename Safe Alphabet"
variant of base64 encoding. It is also used in in JWT (JSON Web Token)
standard.
RFC1421 mention in base64.c file is deprecated so it was replaced with
RFC4648 to which existing converters, base64/b64dec, still apply.

Example:
  HAProxy:
http-request return content-type text/plain lf-string %[req.hdr(Authorization),word(2,.),ub64dec]
  Client:
Token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg
$ curl -H "Authorization: Bearer ${TOKEN}" http://haproxy.local
{"user":"foo","key":"chae6AhXai6e"}
---
 doc/configuration.txt| 12 ++
 include/haproxy/base64.h |  2 +
 reg-tests/sample_fetches/ubase64.vtc | 24 +++
 src/base64.c | 59 +++-
 src/sample.c | 38 ++
 5 files changed, 134 insertions(+), 1 deletion(-)
 create mode 100644 reg-tests/sample_fetches/ubase64.vtc

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 7048fb63e..c7fe416e5 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -16393,6 +16393,18 @@ table_trackers()
   connections there are from a given address for example. See also the
   sc_trackers sample fetch keyword.
 
+ub64dec
+  This converter is the base64url variant of b64dec converter. base64url
+	encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding.
+	It is also the encoding used in JWT (JSON Web Token) standard.
+
+	Example:
+	  # Decoding a JWT payload:
+	  http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec
+
+ub64enc
+  This converter is the base64url variant of base64 converter.
+
 upper
   Convert a string sample to upper case. This can only be placed after a string
   sample fetch function or after a transformation keyword returning a string
diff --git a/include/haproxy/base64.h b/include/haproxy/base64.h
index 1756bc058..532c46a44 100644
--- a/include/haproxy/base64.h
+++ b/include/haproxy/base64.h
@@ -17,7 +17,9 @@
 #include 
 
 int a2base64(char *in, int ilen, char *out, int olen);
+int a2base64url(char *in, size_t ilen, char *out, size_t olen);
 int base64dec(const char *in, size_t ilen, char *out, size_t olen);
+int base64urldec(const char *in, size_t ilen, char *out, size_t olen);
 const char *s30tob64(int in, char *out);
 int b64tos30(const char *in);
 
diff --git a/reg-tests/sample_fetches/ubase64.vtc b/reg-tests/sample_fetches/ubase64.vtc
new file mode 100644
index 0..a273321d2
--- /dev/null
+++ b/reg-tests/sample_fetches/ubase64.vtc
@@ -0,0 +1,24 @@
+varnishtest "ub64dec sample fetche Test"
+
+#REQUIRE_VERSION=2.4
+
+feature ignore_unknown_macro
+
+haproxy h1 -conf {
+defaults
+mode http
+timeout connect 1s
+timeout client  1s
+timeout server  1s
+
+frontend fe
+bind "fd@${fe}"
+http-request return content-type text/plain hdr encode %[hdr(input),ub64enc] lf-string %[req.hdr(Authorization),word(2,.),ub64dec]
+} -start
+
+client c1 -connect ${h1_fe_sock} {
+txreq -url "/" -hdr "input: biduule" -hdr "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg"
+rxresp
+expect resp.http.encode == "YmlkdXVsZQ"
+expect resp.body == "{\"user\":\"foo\",\"key\":\"chae6AhXai6e\"}"
+} -run
diff --git a/src/base64.c b/src/base64.c
index 53e4d65b2..c53c8b076 100644
--- a/src/base64.c
+++ b/src/base64.c
@@ -1,5 +1,5 @@
 /*
- * ASCII <-> Base64 conversion as described in RFC1421.
+ * ASCII <-> Base64 conversion as described in RFC4648.
  *
  * Copyright 2006-2010 Willy Tarreau 
  * Copyright 2009-2010 Krzysztof Piotr Oledzki 
@@ -138,6 +138,63 @@ int base64dec(const char *in, size_t ilen, char *out, size_t olen) {
 	return convlen;
 }
 
+/* url variant of a2base64 */
+int a2base64url(char *in, size_t ilen, char *out, size_t olen)
+{
+	int convlen, i;
+
+	convlen = a2base64(in, ilen, out, olen);
+	while 

Re: [PATCH] JWT payloads break b64dec convertor

2021-04-02 Thread Tim Düsterhus

Moemen,

On 4/2/21 1:38 AM, Moemen MHEDHBI wrote:

Subject: [PATCH 1/2] MINOR: sample: add ub64dec and ubase64 converters

ub64dec and ubase64 are the base64url equivalent of b64dec and base64
converters. base64url encoding is the "URL and Filename Safe Alphabet"
variant of base64 encoding. It is also used in in JWT (JSON Web Token)
standard.
---
 doc/configuration.txt| 11 
 include/haproxy/base64.h |  2 ++
 src/base64.c | 54 +++-
 src/sample.c | 38 
 4 files changed, 104 insertions(+), 1 deletion(-)


Consider adding a reg-test.


diff --git a/doc/configuration.txt b/doc/configuration.txt
index 7048fb63e..10098adef 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15494,6 +15494,17 @@ base64
   transfer binary content in a way that can be reliably transferred (e.g.
   an SSL ID can be copied in a header).
 
+ub64dec

+  This converter is the base64url variant of b64dec converter. base64url
+   encoding is the "URL and Filename Safe Alphabet" variant of base64 
encoding.
+   It is also the encoding used in JWT (JSON Web Token) standard.
+
+   Example:
+ http-request set-var(txn.token_payload) 
req.hdr(Authorization),word(2,.),ub64dec
+
+ubase64
+  This converter is the base64url variant of base64 converter.


This is not alphabetically sorted.


 bool
   Returns a boolean TRUE if the input value of type signed integer is
   non-null, otherwise returns FALSE. Used in conjunction with and(), it can be
diff --git a/src/base64.c b/src/base64.c
index 53e4d65b2..38902523f 100644
--- a/src/base64.c
+++ b/src/base64.c
@@ -138,6 +138,58 @@ int base64dec(const char *in, size_t ilen, char *out, 
size_t olen) {
return convlen;
 }
 
+/* url variant of a2base64 */

+int a2base64url(char *in, int ilen, char *out, int olen){


I know that the existing functions also use 'int', but I'm pretty 
annoyed by not using 'size_t' for buffer sizes :-)



[...]
+int base64urldec(const char *in, size_t ilen, char *out, size_t olen) {
+   char conv[ilen+2];


This looks like a remotely triggerable stack overflow.


[...]
 
 /* Converts the lower 30 bits of an integer to a 5-char base64 string. The

  * caller is responsible for ensuring that the output buffer can accept 6 bytes


Best regards
Tim Düsterhus



Re: [PATCH] JWT payloads break b64dec convertor

2021-04-02 Thread Willy Tarreau
Hi Moemen,

On Fri, Apr 02, 2021 at 01:38:59AM +0200, Moemen MHEDHBI wrote:
> I have came across the same use-case as Jonathan so I gave it a try and
> implemented the converters for base64url variant.
> 
> - Regarding the converters name, I have just prefixed "u" and used
> ubase64/ub64dec. Let me know if the names are not appropriate or if you
> rather prefer to add an argument to existing converters.

Thanks for this. I know it's not your fault but the initial names
"base64" and "b64dec" are quite messy because the former doesn't
indicate the direction. Thus I'd rather not replicate this error,
and at least make sure that "enc" or "dec" is explicit in both
keywords. Maybe "ub64enc" and "ub64dec" or ubase64enc/ubase64dec
are better (though shorter forms are often preferred for converters).

> - RFC1421 mention in base64.c is deprecated so I replaced it with
> RFC4648 to which base64/b64dec converters seem to still apply.

OK! Please do mention this in your commit message as well, after
all you made the effort of justifying it here, but the commit
message is the place for justifying a change :-)

> - not sure if samples list in sample.c should be formatted/aligned after
> the converters added in this patch. They seemed to be already not
> completely aligned. Anyway I have done the aligning on a separate patch
> so you can squash it or drop it at your convenience.

It's often the problem with such lists. Nobody wants to realign
everything when they're the first to break alignment, and at some point
too much is too much and some cleanup is needed. Thanks for doign this
one!

> Testing Example:
>  haproxy.cfg:
>http-request return content-type text/plain lf-string
> %[req.hdr(Authorization),word(2,.),ub64dec]
> 
>  client:
>TOKEN =
> eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg
>$ curl -H "Authorization: Bearer ${TOKEN}" 127.0.0.1:8080
>{"user":"foo","key":"chae6AhXai6e"}

You can put that in the commit message, it's extremely useful when
bissecting or even for those who would like to backport it.

I'm having some comments below about style issues however.

> --- a/src/base64.c
> +++ b/src/base64.c
> @@ -138,6 +138,58 @@ int base64dec(const char *in, size_t ilen, char *out, 
> size_t olen) {
>   return convlen;
>  }
>  
> +/* url variant of a2base64 */
> +int a2base64url(char *in, int ilen, char *out, int olen){
   ^
Opening brace on the next line please.

> + int convlen;

^ empty line after variables declarations

> + convlen = a2base64(in,ilen,out,olen);
 ^^   ^
Spaces after commas in function arguments

> + while (out[convlen-1]=='='){
  ^  ^ ^
spaces around operators

> + convlen--;
> + out[convlen]='\0';
^^
> + }
> + for(int i=0;i +/* url variant of base64dec */
> +int base64urldec(const char *in, size_t ilen, char *out, size_t olen) {
(...)
> + switch (ilen%4){
   ^
space before opening brace

> + case 0:
> + break;
> + case 2:
> + conv[ilen]= '=';
> + conv[ilen+1]= '=';
> + conv[ilen+2]= '\0';
> + ilen +=2;
> + break;

Please keep your indentation consistent.

> + case 3:
> + conv[ilen]= '=';
> + conv[ilen+1]= '\0';
> + ilen +=1;
> + break;
> + default:
> + return -1;
> + }
> +  return base64dec(conv,ilen,out,olen);
   ^^
indentation issue here as well

Thanks!
Willy



Re: [PATCH] JWT payloads break b64dec convertor

2021-04-01 Thread Moemen MHEDHBI
> On Mon, May 28, 2018 at 01:43:41PM +0100, Jonathan Matthews wrote:
>> Improvements and suggestions welcome; flames and horror -> /dev/null ;-)
>
> Would anyone be interested in adding two new converters for this,
> working exactly like base64/b64dec but with the URL-compatible
> base64 encoding instead ? We could call them :
>
>  u64dec
>  u64enc
>
> Note that it can be a nice and useful exercise for a first-time
contribution,
> don't be ashamed guys!
>
> Willy


Hi,
I have came across the same use-case as Jonathan so I gave it a try and
implemented the converters for base64url variant.

- Regarding the converters name, I have just prefixed "u" and used
ubase64/ub64dec. Let me know if the names are not appropriate or if you
rather prefer to add an argument to existing converters.

- RFC1421 mention in base64.c is deprecated so I replaced it with
RFC4648 to which base64/b64dec converters seem to still apply.

- not sure if samples list in sample.c should be formatted/aligned after
the converters added in this patch. They seemed to be already not
completely aligned. Anyway I have done the aligning on a separate patch
so you can squash it or drop it at your convenience.

Testing Example:
 haproxy.cfg:
   http-request return content-type text/plain lf-string
%[req.hdr(Authorization),word(2,.),ub64dec]

 client:
   TOKEN =
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg
   $ curl -H "Authorization: Bearer ${TOKEN}" 127.0.0.1:8080
   {"user":"foo","key":"chae6AhXai6e"}


-- 
Moemen MHEDHBI
>From e599ada315d01513e21f11cdff176cff1639b25c Mon Sep 17 00:00:00 2001
From: Moemen MHEDHBI 
Date: Thu, 1 Apr 2021 20:53:59 +0200
Subject: [PATCH 1/2] MINOR: sample: add ub64dec and ubase64 converters

ub64dec and ubase64 are the base64url equivalent of b64dec and base64
converters. base64url encoding is the "URL and Filename Safe Alphabet"
variant of base64 encoding. It is also used in in JWT (JSON Web Token)
standard.
---
 doc/configuration.txt| 11 
 include/haproxy/base64.h |  2 ++
 src/base64.c | 54 +++-
 src/sample.c | 38 
 4 files changed, 104 insertions(+), 1 deletion(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 7048fb63e..10098adef 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15494,6 +15494,17 @@ base64
   transfer binary content in a way that can be reliably transferred (e.g.
   an SSL ID can be copied in a header).
 
+ub64dec
+  This converter is the base64url variant of b64dec converter. base64url
+	encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding.
+	It is also the encoding used in JWT (JSON Web Token) standard.
+
+	Example:
+	  http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec
+
+ubase64
+  This converter is the base64url variant of base64 converter.
+
 bool
   Returns a boolean TRUE if the input value of type signed integer is
   non-null, otherwise returns FALSE. Used in conjunction with and(), it can be
diff --git a/include/haproxy/base64.h b/include/haproxy/base64.h
index 1756bc058..aea4e7a73 100644
--- a/include/haproxy/base64.h
+++ b/include/haproxy/base64.h
@@ -18,6 +18,8 @@
 
 int a2base64(char *in, int ilen, char *out, int olen);
 int base64dec(const char *in, size_t ilen, char *out, size_t olen);
+int a2base64url(char *in, int ilen, char *out, int olen);
+int base64urldec(const char *in, size_t ilen, char *out, size_t olen);
 const char *s30tob64(int in, char *out);
 int b64tos30(const char *in);
 
diff --git a/src/base64.c b/src/base64.c
index 53e4d65b2..38902523f 100644
--- a/src/base64.c
+++ b/src/base64.c
@@ -1,5 +1,5 @@
 /*
- * ASCII <-> Base64 conversion as described in RFC1421.
+ * ASCII <-> Base64 conversion as described in RFC4648.
  *
  * Copyright 2006-2010 Willy Tarreau 
  * Copyright 2009-2010 Krzysztof Piotr Oledzki 
@@ -138,6 +138,58 @@ int base64dec(const char *in, size_t ilen, char *out, size_t olen) {
 	return convlen;
 }
 
+/* url variant of a2base64 */
+int a2base64url(char *in, int ilen, char *out, int olen){
+	int convlen;
+	convlen = a2base64(in,ilen,out,olen);
+	while (out[convlen-1]=='='){
+		convlen--;
+		out[convlen]='\0';
+	}
+	for(int i=0;idata = 0;
+	bin_len = base64urldec(smp->data.u.str.area, smp->data.u.str.data,
+			trash->area, trash->size);
+	if (bin_len < 0)
+		return 0;
+
+	trash->data = bin_len;
+	smp->data.u.str = *trash;
+	smp->data.type = SMP_T_BIN;
+	smp->flags &= ~SMP_F_CONST;
+	return 1;
+}
+
 static int sample_conv_bin2base64(const struct arg *arg_p, struct sample *smp, void *private)
 {
 	struct buffer *trash = get_trash_chunk();
@@ -1585,6 +1603,24 @@ static int sample_conv_bin2base64(const struct arg *arg_p, struct sample *smp, v
 	return 1;
 }
 
+static int sample_conv_bin2base64url(const struct arg *arg_p, struct sample *smp, void *private)
+{
+	struct buffer *trash =