Re: [PATCH] JWT payloads break b64dec convertor
On Tue, Apr 13, 2021 at 04:44:38PM +0200, Moemen MHEDHBI wrote: > > But then how about having just *your* functions > > without relying on the other ones ? Now that you've extended the existing > > function, you can declare it yours, remove all the configurable stuff and > > keep the simplified version as the one you need. I'm sure it will be the > > best tradeoff overall. > > > > Yes that makes sense to me too, the attached patch deal with it as > suggested. Yeah much cleaner now, indeed. I've merged it, thank you! Willy
Re: [PATCH] JWT payloads break b64dec convertor
On 13/04/2021 11:39, Willy Tarreau wrote: >> You can find attached the patches 0001-bis and 0002-bis modifying the >> existing functions (introducing an url flag) to see how it looks like. >> This solution may be cleaner (no chunk allocation and we don't loop >> twice over input string) but has the drawbacks of being intrusive with >> the rest of the code and less clearer imo regarding how url variant is >> different from standard base64. > > I agree they're not pretty due to the change of logic around the padding, > thanks for having tested! But then how about having just *your* functions > without relying on the other ones ? Now that you've extended the existing > function, you can declare it yours, remove all the configurable stuff and > keep the simplified version as the one you need. I'm sure it will be the > best tradeoff overall. > Yes that makes sense to me too, the attached patch deal with it as suggested. >> diff --git a/src/base64.c b/src/base64.c >> index 53e4d65b2..f2768d980 100644 >> --- a/src/base64.c >> +++ b/src/base64.c >> @@ -1,5 +1,5 @@ >> /* >> - * ASCII <-> Base64 conversion as described in RFC1421. >> + * ASCII <-> Base64 conversion as described in RFC4648. >> * >> * Copyright 2006-2010 Willy Tarreau >> * Copyright 2009-2010 Krzysztof Piotr Oledzki >> @@ -17,50 +17,70 @@ >> #include >> #include >> >> -#define B64BASE '#' /* arbitrary chosen base value */ >> -#define B64CMIN '+' >> -#define B64CMAX 'z' >> -#define B64PADV 64 /* Base64 chosen special pad value */ >> +#define B64BASE '#' /* arbitrary chosen base value */ >> +#define B64CMIN '+' >> +#define UB64CMIN '-' >> +#define B64CMAX 'z' >> +#define B64PADV 64 /* Base64 chosen special pad value */ > > Please do not needlessly reindent code parts for no reason. It seems that > only the "-" was added there, the rest shouldn't change. The reason is I was following the doc/coding-style where alignment should use spaces, but since the existing bloc was aligned via tabs, I thought about fixing that instead of repeating the issue. I understand though that in such case better have this in separate commit, so I have stuck with the tabs alignment. > By the way, contrib/ was move to dev/ during your changes so if you keep > this comment please update it. Done. On 13/04/2021 08:19, Jarno Huuskonen wrote: > Could you add a cross reference from b64dec/base64 to ub64dec/ub64enc in > configuration.txt. Done thanks. -- Moemen >From b526416364b98afaa2d2b421fbf27f80bc4e8732 Mon Sep 17 00:00:00 2001 From: Moemen MHEDHBI Date: Fri, 2 Apr 2021 01:05:07 +0200 Subject: [PATCH 2/2] CLEANUP: align samples list in sample.c --- src/sample.c | 54 ++-- 1 file changed, 27 insertions(+), 27 deletions(-) diff --git a/src/sample.c b/src/sample.c index 04635a91f..7337ba06a 100644 --- a/src/sample.c +++ b/src/sample.c @@ -4129,33 +4129,33 @@ INITCALL1(STG_REGISTER, sample_register_fetches, &smp_kws); /* Note: must not be declared as its list will be overwritten */ static struct sample_conv_kw_list sample_conv_kws = {ILH, { - { "debug", sample_conv_debug, ARG2(0,STR,STR), smp_check_debug, SMP_T_ANY, SMP_T_ANY }, - { "b64dec", sample_conv_base642bin,0,NULL, SMP_T_STR, SMP_T_BIN }, - { "base64", sample_conv_bin2base64,0,NULL, SMP_T_BIN, SMP_T_STR }, - { "ub64dec", sample_conv_base64url2bin,0,NULL, SMP_T_STR, SMP_T_BIN }, - { "ub64enc", sample_conv_bin2base64url,0,NULL, SMP_T_BIN, SMP_T_STR }, - { "upper", sample_conv_str2upper, 0,NULL, SMP_T_STR, SMP_T_STR }, - { "lower", sample_conv_str2lower, 0,NULL, SMP_T_STR, SMP_T_STR }, - { "length", sample_conv_length,0,NULL, SMP_T_STR, SMP_T_SINT }, - { "hex",sample_conv_bin2hex, 0,NULL, SMP_T_BIN, SMP_T_STR }, - { "hex2i", sample_conv_hex2int, 0,NULL, SMP_T_STR, SMP_T_SINT }, - { "ipmask", sample_conv_ipmask,ARG2(1,MSK4,MSK6), NULL, SMP_T_ADDR, SMP_T_IPV4 }, - { "ltime", sample_conv_ltime, ARG2(1,STR,SINT), NULL, SMP_T_SINT, SMP_T_STR }, - { "utime", sample_conv_utime, ARG2(1,STR,SINT), NULL, SMP_T_SINT, SMP_T_STR }, - { "crc32", sample_conv_crc32, ARG1(0,SINT), NULL, SMP_T_BIN, SMP_T_SINT }, - { "crc32c", sample_conv_crc32c,ARG1(0,SINT), NULL, SMP_T_BIN, SMP_T_SINT }, - { "djb2", sample_conv_djb2, ARG1(0,SINT), NULL, SMP_T_BIN, SMP_T_SINT }, - { "sdbm", sample_conv_sdbm, ARG1(0,SINT), NULL, SMP_T_BIN, SMP_T_SINT }, - { "wt6",sample_conv_wt6, ARG1(0,SINT), NULL, SMP_T_BIN, SMP_T_SINT }, - { "xxh3", sample_conv_xxh3, ARG1(0,SINT), NULL, SMP_T_BIN, SMP_T_SINT }, - { "xxh32", sample_conv_xxh32, ARG1(0,SINT), NULL, SMP_T_BIN, SMP_T_SINT }, - { "xxh64", sample_conv_xxh64, ARG1(0,SINT), NULL, SMP_T_BIN, SMP_T_SINT }, - { "json", sample_conv_json, ARG1(1,STR), sample_conv_json_ch
Re: [PATCH] JWT payloads break b64dec convertor
Hi Jarno, On Tue, Apr 13, 2021 at 06:19:47AM +, Jarno Huuskonen wrote: > Hello, > > On Tue, 2021-04-06 at 01:58 +0200, Moemen MHEDHBI wrote: > > Thanks Willy and Tim for your feedback. > > > > You can find attached the updated patches with fixed coding style (now > > set correctly in my editor), updated commit message, entry doc in sorted > > order, size_t instead of int in both enc/dec and corresponding reg-test. > > Could you add a cross reference from b64dec/base64 to ub64dec/ub64enc in > configuration.txt. Something like: > --- a/doc/configuration.txt > +++ b/doc/configuration.txt > @@ -15020,11 +15020,14 @@ and() > b64dec >Converts (decodes) a base64 encoded input string to its binary >representation. It performs the inverse operation of base64(). > + For base64url("URL and Filename Safe Alphabet" (RFC 4648)) variant > + see "ub64dec". > > base64 >Converts a binary input sample to a base64 string. It is used to log or >transfer binary content in a way that can be reliably transferred (e.g. > - an SSL ID can be copied in a header). > + an SSL ID can be copied in a header). For base64url("URL and Filename > Safe > + Alphabet" (RFC 4648)) variant see "ub64enc". > > bool >Returns a boolean TRUE if the input value of type signed integer is Yes very good point indeed, and similarly the ub64 ones should refer to these ones. Willy
Re: [PATCH] JWT payloads break b64dec convertor
Hi Moemen, On Tue, Apr 13, 2021 at 12:41:39AM +0200, Moemen MHEDHBI wrote: > >> in such case should we rather use dynamic allocation ? > > > > No, there are two possible approaches. One of them is to use a trash > > buffer using get_trash_chunk(). The trash buffers are "large enough" > > for anything that comes from outside. A second, cleaner solution > > simply consists in not using a temporary buffer but doing the conversion > > on the fly. Indeed, looking closer, what the function does is to first > > replace a few chars on the whole chain to then call the base64 conversion > > function. So it doubles the work on the string and one side effect of > > this double work is that you need a temporary storage. > > The url variant is not only about a different alphabet that needs to be > replaced but also is a non padding variant. So the straightforward > algorithm to decoding it is to add the padding to the input encoded in > url variant and then use the standard base64 decoder. Ah OK I wasn't aware of this! > Even doing this on the fly requires extending input with two more bytes > max. Unless I am missing smth but in such case on the fly conversion > will result in a out of bound array access. There is never a reason for an out-of-bounds access but of course it will require to add the checks everywhere while right now it knows that it can read 4 bytes at once. So yeah, that woul result in two different paths and that's a pain. > That's why I have copied input in a "inlen+2" string. Got it, that makes sense. > In the end I have updated patch to handle extending input in decoding > function via get_trash_chunk to make sure a buffer of size input+2 is > available. > > You can find attached the patches 0001 and 0002 for this implementation. We could do with that, I just still find it a bit awkward to write so much code to modify an input and adapt it to a parser instead of writing a parser. That's more operations and more code to inspect in case of trouble. > You can find attached the patches 0001-bis and 0002-bis modifying the > existing functions (introducing an url flag) to see how it looks like. > This solution may be cleaner (no chunk allocation and we don't loop > twice over input string) but has the drawbacks of being intrusive with > the rest of the code and less clearer imo regarding how url variant is > different from standard base64. I agree they're not pretty due to the change of logic around the padding, thanks for having tested! But then how about having just *your* functions without relying on the other ones ? Now that you've extended the existing function, you can declare it yours, remove all the configurable stuff and keep the simplified version as the one you need. I'm sure it will be the best tradeoff overall. > diff --git a/src/base64.c b/src/base64.c > index 53e4d65b2..f2768d980 100644 > --- a/src/base64.c > +++ b/src/base64.c > @@ -1,5 +1,5 @@ > /* > - * ASCII <-> Base64 conversion as described in RFC1421. > + * ASCII <-> Base64 conversion as described in RFC4648. > * > * Copyright 2006-2010 Willy Tarreau > * Copyright 2009-2010 Krzysztof Piotr Oledzki > @@ -17,50 +17,70 @@ > #include > #include > > -#define B64BASE '#' /* arbitrary chosen base value */ > -#define B64CMIN '+' > -#define B64CMAX 'z' > -#define B64PADV 64 /* Base64 chosen special pad value */ > +#define B64BASE '#' /* arbitrary chosen base value */ > +#define B64CMIN '+' > +#define UB64CMIN '-' > +#define B64CMAX 'z' > +#define B64PADV 64 /* Base64 chosen special pad value */ Please do not needlessly reindent code parts for no reason. It seems that only the "-" was added there, the rest shouldn't change. > @@ -73,29 +93,59 @@ int a2base64(char *in, int ilen, char *out, int olen) > * or to be NULL. Returns -1 if is invalid or ilen > * has wrong size, -2 if is too short. > * 1 to 3 output bytes are produced for 4 input bytes. > + * The reverse tab used to decode base64 is generated via > /contrib/base64/base64rev-gen.c By the way, contrib/ was move to dev/ during your changes so if you keep this comment please update it. Thanks, Willy
Re: [PATCH] JWT payloads break b64dec convertor
Hello, On Tue, 2021-04-06 at 01:58 +0200, Moemen MHEDHBI wrote: > Thanks Willy and Tim for your feedback. > > You can find attached the updated patches with fixed coding style (now > set correctly in my editor), updated commit message, entry doc in sorted > order, size_t instead of int in both enc/dec and corresponding reg-test. Could you add a cross reference from b64dec/base64 to ub64dec/ub64enc in configuration.txt. Something like: --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -15020,11 +15020,14 @@ and() b64dec Converts (decodes) a base64 encoded input string to its binary representation. It performs the inverse operation of base64(). + For base64url("URL and Filename Safe Alphabet" (RFC 4648)) variant + see "ub64dec". base64 Converts a binary input sample to a base64 string. It is used to log or transfer binary content in a way that can be reliably transferred (e.g. - an SSL ID can be copied in a header). + an SSL ID can be copied in a header). For base64url("URL and Filename Safe + Alphabet" (RFC 4648)) variant see "ub64enc". bool Returns a boolean TRUE if the input value of type signed integer is -Jarno -- Jarno Huuskonen
Re: [PATCH] JWT payloads break b64dec convertor
On 12/04/2021 23:13, Aleksandar Lazic wrote: > Hi Moemen, > > any chance to get this feature before 2.4 will be realeased? > > Regards > Aleks > Hi Aleksandar, I have updated the patch (attached) so it gets reviewed and eventually merged. I know this is going to be useful with what you are trying to do with the json converter so I will try to be more active on this. On 06/04/2021 09:13, Willy Tarreau wrote: >> in such case should we rather use dynamic allocation ? > > No, there are two possible approaches. One of them is to use a trash > buffer using get_trash_chunk(). The trash buffers are "large enough" > for anything that comes from outside. A second, cleaner solution > simply consists in not using a temporary buffer but doing the conversion > on the fly. Indeed, looking closer, what the function does is to first > replace a few chars on the whole chain to then call the base64 conversion > function. So it doubles the work on the string and one side effect of > this double work is that you need a temporary storage. The url variant is not only about a different alphabet that needs to be replaced but also is a non padding variant. So the straightforward algorithm to decoding it is to add the padding to the input encoded in url variant and then use the standard base64 decoder. Even doing this on the fly requires extending input with two more bytes max. Unless I am missing smth but in such case on the fly conversion will result in a out of bound array access. That's why I have copied input in a "inlen+2" string. In the end I have updated patch to handle extending input in decoding function via get_trash_chunk to make sure a buffer of size input+2 is available. You can find attached the patches 0001 and 0002 for this implementation. > Other approaches would consist in either reimplementing the functions > with a different alphabet, or modifying the existing ones to take an > extra argument for the conversion table, and make one set of functions > making use of the current table and another set making use of your new > table. > > Willy > You can find attached the patches 0001-bis and 0002-bis modifying the existing functions (introducing an url flag) to see how it looks like. This solution may be cleaner (no chunk allocation and we don't loop twice over input string) but has the drawbacks of being intrusive with the rest of the code and less clearer imo regarding how url variant is different from standard base64. Feel free to pick the one that looks better otherwise I can continue with a different implementation if needbe. -- Moemen >From cf0a43dab4f5f88ddf5e5e736127132721b7f18e Mon Sep 17 00:00:00 2001 From: Moemen MHEDHBI Date: Thu, 1 Apr 2021 20:53:59 +0200 Subject: [PATCH 1/2] MINOR: sample: add ub64dec and ub64enc converters ub64dec and ub64enc are the base64url equivalent of b64dec and base64 converters. base64url encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. It is also used in in JWT (JSON Web Token) standard. RFC1421 mention in base64.c file is deprecated so it was replaced with RFC4648 to which existing converters, base64/b64dec, still apply. Example: HAProxy: http-request return content-type text/plain lf-string %[req.hdr(Authorization),word(2,.),ub64dec] Client: Token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg $ curl -H "Authorization: Bearer ${TOKEN}" http://haproxy.local {"user":"foo","key":"chae6AhXai6e"} --- doc/configuration.txt| 12 ++ include/haproxy/base64.h | 2 + reg-tests/sample_fetches/ubase64.vtc | 45 +++ src/base64.c | 64 +++- src/sample.c | 38 + 5 files changed, 160 insertions(+), 1 deletion(-) create mode 100644 reg-tests/sample_fetches/ubase64.vtc diff --git a/doc/configuration.txt b/doc/configuration.txt index 7048fb63e..c7fe416e5 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -16393,6 +16393,18 @@ table_trackers() connections there are from a given address for example. See also the sc_trackers sample fetch keyword. +ub64dec + This converter is the base64url variant of b64dec converter. base64url + encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. + It is also the encoding used in JWT (JSON Web Token) standard. + + Example: + # Decoding a JWT payload: + http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec + +ub64enc + This converter is the base64url variant of base64 converter. + upper Convert a string sample to upper case. This can only be placed after a string sample fetch function or after a transformation keyword returning a string diff --git a/include/haproxy/base64.h b/include/haproxy/base64.h index 1756bc058..532c46a44 100644 --- a/include/haproxy/base64.h +++ b/include/haproxy/base64.h @
Re: [PATCH] JWT payloads break b64dec convertor
Hi Moemen, any chance to get this feature before 2.4 will be realeased? Regards Aleks On 06.04.21 09:13, Willy Tarreau wrote: Hi Moemen, On Tue, Apr 06, 2021 at 01:58:11AM +0200, Moemen MHEDHBI wrote: Only part unclear: On 02/04/2021 15:04, Tim Düsterhus wrote: +int base64urldec(const char *in, size_t ilen, char *out, size_t olen) { +char conv[ilen+2]; This looks like a remotely triggerable stack overflow. You mean in case ilen is too big? Yes that's it, I didn't notice it during the first review. It's particularly uncommon to use variable sized arrays and should never be done. The immediate effect of this is that it will reserve some room in the stack for a size as large as ilen+2 bytes. The problem is that on most platforms the stack grows down, so the beginning of the buffer is located at a point which is very far away from the current stack. This memory is in fact not allocated, so the system detects the first usage through a page fault and allocates the necessary space. But in order to know that the page fault is within the stack, it has to apply a certain margin. And this margin varies between OSes and platforms. Some compilers will explicitly initialize such a large stack from top to bottom to avoid a crash. Other ones will not do and may very well crash at 64kB. On Linux, I can make the above crash by using a 8 MB ilen, just because by default the stack size limit is 8 MB. That's large but not overly excessive for those who would like to perform some processing on bodies. And I recall that some other OSes default to way smaller limits (I recall 64kB on OpenBSD a long time ago though this might have been raised to a megabyte or so by now). in such case should we rather use dynamic allocation ? No, there are two possible approaches. One of them is to use a trash buffer using get_trash_chunk(). The trash buffers are "large enough" for anything that comes from outside. A second, cleaner solution simply consists in not using a temporary buffer but doing the conversion on the fly. Indeed, looking closer, what the function does is to first replace a few chars on the whole chain to then call the base64 conversion function. So it doubles the work on the string and one side effect of this double work is that you need a temporary storage. Other approaches would consist in either reimplementing the functions with a different alphabet, or modifying the existing ones to take an extra argument for the conversion table, and make one set of functions making use of the current table and another set making use of your new table. Willy
Re: [PATCH] JWT payloads break b64dec convertor
Hi Moemen, On Tue, Apr 06, 2021 at 01:58:11AM +0200, Moemen MHEDHBI wrote: > Only part unclear: > On 02/04/2021 15:04, Tim Düsterhus wrote: > >> +int base64urldec(const char *in, size_t ilen, char *out, size_t olen) { > >> +char conv[ilen+2]; > > > > This looks like a remotely triggerable stack overflow. > > You mean in case ilen is too big? Yes that's it, I didn't notice it during the first review. It's particularly uncommon to use variable sized arrays and should never be done. The immediate effect of this is that it will reserve some room in the stack for a size as large as ilen+2 bytes. The problem is that on most platforms the stack grows down, so the beginning of the buffer is located at a point which is very far away from the current stack. This memory is in fact not allocated, so the system detects the first usage through a page fault and allocates the necessary space. But in order to know that the page fault is within the stack, it has to apply a certain margin. And this margin varies between OSes and platforms. Some compilers will explicitly initialize such a large stack from top to bottom to avoid a crash. Other ones will not do and may very well crash at 64kB. On Linux, I can make the above crash by using a 8 MB ilen, just because by default the stack size limit is 8 MB. That's large but not overly excessive for those who would like to perform some processing on bodies. And I recall that some other OSes default to way smaller limits (I recall 64kB on OpenBSD a long time ago though this might have been raised to a megabyte or so by now). > in such case should we rather use dynamic allocation ? No, there are two possible approaches. One of them is to use a trash buffer using get_trash_chunk(). The trash buffers are "large enough" for anything that comes from outside. A second, cleaner solution simply consists in not using a temporary buffer but doing the conversion on the fly. Indeed, looking closer, what the function does is to first replace a few chars on the whole chain to then call the base64 conversion function. So it doubles the work on the string and one side effect of this double work is that you need a temporary storage. Other approaches would consist in either reimplementing the functions with a different alphabet, or modifying the existing ones to take an extra argument for the conversion table, and make one set of functions making use of the current table and another set making use of your new table. Willy
Re: [PATCH] JWT payloads break b64dec convertor
Thanks Willy and Tim for your feedback. You can find attached the updated patches with fixed coding style (now set correctly in my editor), updated commit message, entry doc in sorted order, size_t instead of int in both enc/dec and corresponding reg-test. Only part unclear: On 02/04/2021 15:04, Tim Düsterhus wrote: >> +int base64urldec(const char *in, size_t ilen, char *out, size_t olen) { >> +char conv[ilen+2]; > > This looks like a remotely triggerable stack overflow. You mean in case ilen is too big? in such case should we rather use dynamic allocation ? -- Moemen MHEDHBI >From bae8d3890be6d2f5a58697bf7b8e9f01f4589d3b Mon Sep 17 00:00:00 2001 From: Moemen MHEDHBI Date: Thu, 1 Apr 2021 20:53:59 +0200 Subject: [PATCH 1/2] MINOR: sample: add ub64dec and ub64enc converters ub64dec and ub64enc are the base64url equivalent of b64dec and base64 converters. base64url encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. It is also used in in JWT (JSON Web Token) standard. RFC1421 mention in base64.c file is deprecated so it was replaced with RFC4648 to which existing converters, base64/b64dec, still apply. Example: HAProxy: http-request return content-type text/plain lf-string %[req.hdr(Authorization),word(2,.),ub64dec] Client: Token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg $ curl -H "Authorization: Bearer ${TOKEN}" http://haproxy.local {"user":"foo","key":"chae6AhXai6e"} --- doc/configuration.txt| 12 ++ include/haproxy/base64.h | 2 + reg-tests/sample_fetches/ubase64.vtc | 24 +++ src/base64.c | 59 +++- src/sample.c | 38 ++ 5 files changed, 134 insertions(+), 1 deletion(-) create mode 100644 reg-tests/sample_fetches/ubase64.vtc diff --git a/doc/configuration.txt b/doc/configuration.txt index 7048fb63e..c7fe416e5 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -16393,6 +16393,18 @@ table_trackers() connections there are from a given address for example. See also the sc_trackers sample fetch keyword. +ub64dec + This converter is the base64url variant of b64dec converter. base64url + encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. + It is also the encoding used in JWT (JSON Web Token) standard. + + Example: + # Decoding a JWT payload: + http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec + +ub64enc + This converter is the base64url variant of base64 converter. + upper Convert a string sample to upper case. This can only be placed after a string sample fetch function or after a transformation keyword returning a string diff --git a/include/haproxy/base64.h b/include/haproxy/base64.h index 1756bc058..532c46a44 100644 --- a/include/haproxy/base64.h +++ b/include/haproxy/base64.h @@ -17,7 +17,9 @@ #include int a2base64(char *in, int ilen, char *out, int olen); +int a2base64url(char *in, size_t ilen, char *out, size_t olen); int base64dec(const char *in, size_t ilen, char *out, size_t olen); +int base64urldec(const char *in, size_t ilen, char *out, size_t olen); const char *s30tob64(int in, char *out); int b64tos30(const char *in); diff --git a/reg-tests/sample_fetches/ubase64.vtc b/reg-tests/sample_fetches/ubase64.vtc new file mode 100644 index 0..a273321d2 --- /dev/null +++ b/reg-tests/sample_fetches/ubase64.vtc @@ -0,0 +1,24 @@ +varnishtest "ub64dec sample fetche Test" + +#REQUIRE_VERSION=2.4 + +feature ignore_unknown_macro + +haproxy h1 -conf { +defaults +mode http +timeout connect 1s +timeout client 1s +timeout server 1s + +frontend fe +bind "fd@${fe}" +http-request return content-type text/plain hdr encode %[hdr(input),ub64enc] lf-string %[req.hdr(Authorization),word(2,.),ub64dec] +} -start + +client c1 -connect ${h1_fe_sock} { +txreq -url "/" -hdr "input: biduule" -hdr "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg" +rxresp +expect resp.http.encode == "YmlkdXVsZQ" +expect resp.body == "{\"user\":\"foo\",\"key\":\"chae6AhXai6e\"}" +} -run diff --git a/src/base64.c b/src/base64.c index 53e4d65b2..c53c8b076 100644 --- a/src/base64.c +++ b/src/base64.c @@ -1,5 +1,5 @@ /* - * ASCII <-> Base64 conversion as described in RFC1421. + * ASCII <-> Base64 conversion as described in RFC4648. * * Copyright 2006-2010 Willy Tarreau * Copyright 2009-2010 Krzysztof Piotr Oledzki @@ -138,6 +138,63 @@ int base64dec(const char *in, size_t ilen, char *out, size_t olen) { return convlen; } +/* url variant of a2base64 */ +int a2base64url(char *in, size_t ilen, char *out, size_t olen) +{ + int convlen, i; + + convlen = a2base64(in, ilen, out, olen); + while (
Re: [PATCH] JWT payloads break b64dec convertor
Moemen, On 4/2/21 1:38 AM, Moemen MHEDHBI wrote: Subject: [PATCH 1/2] MINOR: sample: add ub64dec and ubase64 converters ub64dec and ubase64 are the base64url equivalent of b64dec and base64 converters. base64url encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. It is also used in in JWT (JSON Web Token) standard. --- doc/configuration.txt| 11 include/haproxy/base64.h | 2 ++ src/base64.c | 54 +++- src/sample.c | 38 4 files changed, 104 insertions(+), 1 deletion(-) Consider adding a reg-test. diff --git a/doc/configuration.txt b/doc/configuration.txt index 7048fb63e..10098adef 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -15494,6 +15494,17 @@ base64 transfer binary content in a way that can be reliably transferred (e.g. an SSL ID can be copied in a header). +ub64dec + This converter is the base64url variant of b64dec converter. base64url + encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. + It is also the encoding used in JWT (JSON Web Token) standard. + + Example: + http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec + +ubase64 + This converter is the base64url variant of base64 converter. This is not alphabetically sorted. bool Returns a boolean TRUE if the input value of type signed integer is non-null, otherwise returns FALSE. Used in conjunction with and(), it can be diff --git a/src/base64.c b/src/base64.c index 53e4d65b2..38902523f 100644 --- a/src/base64.c +++ b/src/base64.c @@ -138,6 +138,58 @@ int base64dec(const char *in, size_t ilen, char *out, size_t olen) { return convlen; } +/* url variant of a2base64 */ +int a2base64url(char *in, int ilen, char *out, int olen){ I know that the existing functions also use 'int', but I'm pretty annoyed by not using 'size_t' for buffer sizes :-) [...] +int base64urldec(const char *in, size_t ilen, char *out, size_t olen) { + char conv[ilen+2]; This looks like a remotely triggerable stack overflow. [...] /* Converts the lower 30 bits of an integer to a 5-char base64 string. The * caller is responsible for ensuring that the output buffer can accept 6 bytes Best regards Tim Düsterhus
Re: [PATCH] JWT payloads break b64dec convertor
Hi Moemen, On Fri, Apr 02, 2021 at 01:38:59AM +0200, Moemen MHEDHBI wrote: > I have came across the same use-case as Jonathan so I gave it a try and > implemented the converters for base64url variant. > > - Regarding the converters name, I have just prefixed "u" and used > ubase64/ub64dec. Let me know if the names are not appropriate or if you > rather prefer to add an argument to existing converters. Thanks for this. I know it's not your fault but the initial names "base64" and "b64dec" are quite messy because the former doesn't indicate the direction. Thus I'd rather not replicate this error, and at least make sure that "enc" or "dec" is explicit in both keywords. Maybe "ub64enc" and "ub64dec" or ubase64enc/ubase64dec are better (though shorter forms are often preferred for converters). > - RFC1421 mention in base64.c is deprecated so I replaced it with > RFC4648 to which base64/b64dec converters seem to still apply. OK! Please do mention this in your commit message as well, after all you made the effort of justifying it here, but the commit message is the place for justifying a change :-) > - not sure if samples list in sample.c should be formatted/aligned after > the converters added in this patch. They seemed to be already not > completely aligned. Anyway I have done the aligning on a separate patch > so you can squash it or drop it at your convenience. It's often the problem with such lists. Nobody wants to realign everything when they're the first to break alignment, and at some point too much is too much and some cleanup is needed. Thanks for doign this one! > Testing Example: > haproxy.cfg: >http-request return content-type text/plain lf-string > %[req.hdr(Authorization),word(2,.),ub64dec] > > client: >TOKEN = > eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg >$ curl -H "Authorization: Bearer ${TOKEN}" 127.0.0.1:8080 >{"user":"foo","key":"chae6AhXai6e"} You can put that in the commit message, it's extremely useful when bissecting or even for those who would like to backport it. I'm having some comments below about style issues however. > --- a/src/base64.c > +++ b/src/base64.c > @@ -138,6 +138,58 @@ int base64dec(const char *in, size_t ilen, char *out, > size_t olen) { > return convlen; > } > > +/* url variant of a2base64 */ > +int a2base64url(char *in, int ilen, char *out, int olen){ ^ Opening brace on the next line please. > + int convlen; ^ empty line after variables declarations > + convlen = a2base64(in,ilen,out,olen); ^^ ^ Spaces after commas in function arguments > + while (out[convlen-1]=='='){ ^ ^ ^ spaces around operators > + convlen--; > + out[convlen]='\0'; ^^ > + } > + for(int i=0;i +/* url variant of base64dec */ > +int base64urldec(const char *in, size_t ilen, char *out, size_t olen) { (...) > + switch (ilen%4){ ^ space before opening brace > + case 0: > + break; > + case 2: > + conv[ilen]= '='; > + conv[ilen+1]= '='; > + conv[ilen+2]= '\0'; > + ilen +=2; > + break; Please keep your indentation consistent. > + case 3: > + conv[ilen]= '='; > + conv[ilen+1]= '\0'; > + ilen +=1; > + break; > + default: > + return -1; > + } > + return base64dec(conv,ilen,out,olen); ^^ indentation issue here as well Thanks! Willy
Re: [PATCH] JWT payloads break b64dec convertor
> On Mon, May 28, 2018 at 01:43:41PM +0100, Jonathan Matthews wrote: >> Improvements and suggestions welcome; flames and horror -> /dev/null ;-) > > Would anyone be interested in adding two new converters for this, > working exactly like base64/b64dec but with the URL-compatible > base64 encoding instead ? We could call them : > > u64dec > u64enc > > Note that it can be a nice and useful exercise for a first-time contribution, > don't be ashamed guys! > > Willy Hi, I have came across the same use-case as Jonathan so I gave it a try and implemented the converters for base64url variant. - Regarding the converters name, I have just prefixed "u" and used ubase64/ub64dec. Let me know if the names are not appropriate or if you rather prefer to add an argument to existing converters. - RFC1421 mention in base64.c is deprecated so I replaced it with RFC4648 to which base64/b64dec converters seem to still apply. - not sure if samples list in sample.c should be formatted/aligned after the converters added in this patch. They seemed to be already not completely aligned. Anyway I have done the aligning on a separate patch so you can squash it or drop it at your convenience. Testing Example: haproxy.cfg: http-request return content-type text/plain lf-string %[req.hdr(Authorization),word(2,.),ub64dec] client: TOKEN = eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg $ curl -H "Authorization: Bearer ${TOKEN}" 127.0.0.1:8080 {"user":"foo","key":"chae6AhXai6e"} -- Moemen MHEDHBI >From e599ada315d01513e21f11cdff176cff1639b25c Mon Sep 17 00:00:00 2001 From: Moemen MHEDHBI Date: Thu, 1 Apr 2021 20:53:59 +0200 Subject: [PATCH 1/2] MINOR: sample: add ub64dec and ubase64 converters ub64dec and ubase64 are the base64url equivalent of b64dec and base64 converters. base64url encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. It is also used in in JWT (JSON Web Token) standard. --- doc/configuration.txt| 11 include/haproxy/base64.h | 2 ++ src/base64.c | 54 +++- src/sample.c | 38 4 files changed, 104 insertions(+), 1 deletion(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index 7048fb63e..10098adef 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -15494,6 +15494,17 @@ base64 transfer binary content in a way that can be reliably transferred (e.g. an SSL ID can be copied in a header). +ub64dec + This converter is the base64url variant of b64dec converter. base64url + encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. + It is also the encoding used in JWT (JSON Web Token) standard. + + Example: + http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec + +ubase64 + This converter is the base64url variant of base64 converter. + bool Returns a boolean TRUE if the input value of type signed integer is non-null, otherwise returns FALSE. Used in conjunction with and(), it can be diff --git a/include/haproxy/base64.h b/include/haproxy/base64.h index 1756bc058..aea4e7a73 100644 --- a/include/haproxy/base64.h +++ b/include/haproxy/base64.h @@ -18,6 +18,8 @@ int a2base64(char *in, int ilen, char *out, int olen); int base64dec(const char *in, size_t ilen, char *out, size_t olen); +int a2base64url(char *in, int ilen, char *out, int olen); +int base64urldec(const char *in, size_t ilen, char *out, size_t olen); const char *s30tob64(int in, char *out); int b64tos30(const char *in); diff --git a/src/base64.c b/src/base64.c index 53e4d65b2..38902523f 100644 --- a/src/base64.c +++ b/src/base64.c @@ -1,5 +1,5 @@ /* - * ASCII <-> Base64 conversion as described in RFC1421. + * ASCII <-> Base64 conversion as described in RFC4648. * * Copyright 2006-2010 Willy Tarreau * Copyright 2009-2010 Krzysztof Piotr Oledzki @@ -138,6 +138,58 @@ int base64dec(const char *in, size_t ilen, char *out, size_t olen) { return convlen; } +/* url variant of a2base64 */ +int a2base64url(char *in, int ilen, char *out, int olen){ + int convlen; + convlen = a2base64(in,ilen,out,olen); + while (out[convlen-1]=='='){ + convlen--; + out[convlen]='\0'; + } + for(int i=0;idata = 0; + bin_len = base64urldec(smp->data.u.str.area, smp->data.u.str.data, + trash->area, trash->size); + if (bin_len < 0) + return 0; + + trash->data = bin_len; + smp->data.u.str = *trash; + smp->data.type = SMP_T_BIN; + smp->flags &= ~SMP_F_CONST; + return 1; +} + static int sample_conv_bin2base64(const struct arg *arg_p, struct sample *smp, void *private) { struct buffer *trash = get_trash_chunk(); @@ -1585,6 +1603,24 @@ static int sample_conv_bin2base64(const struct arg *arg_p, struct sample *smp, v return 1; } +static int sample_conv_bin2base64url(const struct arg *arg_p, struct sample *smp, void *private) +{ + struct buffer *trash =
Re: JWT payloads break b64dec convertor
Le 28/05/2018 à 10:19, Adis Nezirovic a écrit : > On 05/26/2018 04:27 PM, Jonathan Matthews wrote: >> Hello folks, >> >> The payload (and other parts) of a JSON Web Token (JWT, a popular and >> growing auth standard: https://tools.ietf.org/html/rfc7519) is base64 >> encoded. >> >> Unfortunately, the payload encoding (specified in >> https://tools.ietf.org/html/rfc7515) is defined as the "URL safe" >> variant. This variant allows for the lossless omission of base64 >> padding ("=" or "=="), which the haproxy b64dec convertor doesn't >> appear to be able cope with. The result of > Jonathan, > > It's not just padding, urlsafe base64 replaces '+' with '-', and '/' > with '_'. For now, I guess the easiest way would be to write a simple > converter in Lua, which just returns the original string, and send > payload somewhere for further processing. For those who like "standards", these are all base64 variants : https://en.wikipedia.org/wiki/Base64#Variants_summary_table Oh, and don't forget to develop a new standard if those don't cover yours needs: https://xkcd.com/927/ > Best regards, > Adis Bunch --- L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast. https://www.avast.com/antivirus
Re: JWT payloads break b64dec convertor
On 28/05/2018 17:42, Norman Branitsky wrote: https://en.wikipedia.org/wiki/The_C_Programming_Language Oh, of course *clap on head* how embarrassing 8-O. I haven't thought to THIS Book, sorry. Best regards Aleks -Original Message- From: Aleksandar Lazic Sent: Monday, May 28, 2018 12:34 PM To: Jonathan Matthews Cc: Willy Tarreau ; haproxy Subject: Re: JWT payloads break b64dec convertor On 28/05/2018 15:10, Jonathan Matthews wrote: On Mon, 28 May 2018 at 14:26, Willy Tarreau wrote: On Mon, May 28, 2018 at 01:43:41PM +0100, Jonathan Matthews wrote: > Improvements and suggestions welcome; flames and horror -> > /dev/null ;-) Would anyone be interested in adding two new converters for this, working exactly like base64/b64dec but with the URL-compatible base64 encoding instead ? We could call them : u64dec u64enc I like that idea, and have already retrieved my K&R from the loft :-) Sorry for my ignorance but what's K&R ? J -- Jonathan Matthews London, UK http://www.jpluscplusm.com/contact.html
RE: JWT payloads break b64dec convertor
https://en.wikipedia.org/wiki/The_C_Programming_Language -Original Message- From: Aleksandar Lazic Sent: Monday, May 28, 2018 12:34 PM To: Jonathan Matthews Cc: Willy Tarreau ; haproxy Subject: Re: JWT payloads break b64dec convertor On 28/05/2018 15:10, Jonathan Matthews wrote: >On Mon, 28 May 2018 at 14:26, Willy Tarreau wrote: > >> On Mon, May 28, 2018 at 01:43:41PM +0100, Jonathan Matthews wrote: >> > Improvements and suggestions welcome; flames and horror -> >> > /dev/null ;-) >> >> Would anyone be interested in adding two new converters for this, >> working exactly like base64/b64dec but with the URL-compatible >> base64 encoding instead ? We could call them : >> >> u64dec >> u64enc > > >I like that idea, and have already retrieved my K&R from the loft :-) Sorry for my ignorance but what's K&R ? >J > >> -- >Jonathan Matthews >London, UK >http://www.jpluscplusm.com/contact.html
Re: JWT payloads break b64dec convertor
On 28/05/2018 15:10, Jonathan Matthews wrote: On Mon, 28 May 2018 at 14:26, Willy Tarreau wrote: On Mon, May 28, 2018 at 01:43:41PM +0100, Jonathan Matthews wrote: > Improvements and suggestions welcome; flames and horror -> /dev/null ;-) Would anyone be interested in adding two new converters for this, working exactly like base64/b64dec but with the URL-compatible base64 encoding instead ? We could call them : u64dec u64enc I like that idea, and have already retrieved my K&R from the loft :-) Sorry for my ignorance but what's K&R ? J -- Jonathan Matthews London, UK http://www.jpluscplusm.com/contact.html
Re: JWT payloads break b64dec convertor
On Mon, May 28, 2018 at 03:10:01PM +0100, Jonathan Matthews wrote: > On Mon, 28 May 2018 at 14:26, Willy Tarreau wrote: > > > On Mon, May 28, 2018 at 01:43:41PM +0100, Jonathan Matthews wrote: > > > Improvements and suggestions welcome; flames and horror -> /dev/null ;-) > > > > Would anyone be interested in adding two new converters for this, > > working exactly like base64/b64dec but with the URL-compatible > > base64 encoding instead ? We could call them : > > > > u64dec > > u64enc > > > I like that idea, and have already retrieved my K&R from the loft :-) By the way, another approach (cleaner?) could be to add an argument to the existing b64dec()/base64() to indicate how it should work. I think it's the utf8 or json or something like this which already does exactly this. Willy
Re: JWT payloads break b64dec convertor
On Mon, 28 May 2018 at 14:26, Willy Tarreau wrote: > On Mon, May 28, 2018 at 01:43:41PM +0100, Jonathan Matthews wrote: > > Improvements and suggestions welcome; flames and horror -> /dev/null ;-) > > Would anyone be interested in adding two new converters for this, > working exactly like base64/b64dec but with the URL-compatible > base64 encoding instead ? We could call them : > > u64dec > u64enc I like that idea, and have already retrieved my K&R from the loft :-) J > -- Jonathan Matthews London, UK http://www.jpluscplusm.com/contact.html
Re: JWT payloads break b64dec convertor
On Mon, May 28, 2018 at 01:43:41PM +0100, Jonathan Matthews wrote: > Improvements and suggestions welcome; flames and horror -> /dev/null ;-) Would anyone be interested in adding two new converters for this, working exactly like base64/b64dec but with the URL-compatible base64 encoding instead ? We could call them : u64dec u64enc Note that it can be a nice and useful exercise for a first-time contribution, don't be ashamed guys! Willy
Re: JWT payloads break b64dec convertor
On 28 May 2018 at 12:32, Jonathan Matthews wrote: > I think with your points and ccripy's sneaky (kudos!) padding > insertion, I can do something which suffices for my current audit > needs. For the list, here's my working v1 that I ended up with. I'm sure various things can be improved! :-) I couldn't get ccripy's concat() and length() converters to work, but I've stolen the basic idea - many thanks! acl ACL_jwt_payload_4x_chars_long var(txn.jwtpayload) -m reg ^(.{4})+$ http-request set-var(txn.jwtpayload) req.hdr(jwt) http-request set-var(txn.jwtpayload) var(txn.jwtpayload),regsub($,=) if !ACL_jwt_payload_4x_chars_long http-request set-var(txn.jwtpayload) var(txn.jwtpayload),regsub($,=) if !ACL_jwt_payload_4x_chars_long http-request set-var(txn.jwtpayload) var(txn.jwtpayload),regsub(-,+,g) http-request set-var(txn.jwtpayload) var(txn.jwtpayload),regsub(_,/,g) log-format " jwt-payload:%[var(txn.jwtpayload),b64dec]" Improvements and suggestions welcome; flames and horror -> /dev/null ;-) Jonathan
Re: JWT payloads break b64dec convertor
On 28 May 2018 at 09:19, Adis Nezirovic wrote: > On 05/26/2018 04:27 PM, Jonathan Matthews wrote: >> Hello folks, >> >> The payload (and other parts) of a JSON Web Token (JWT, a popular and >> growing auth standard: https://tools.ietf.org/html/rfc7519) is base64 >> encoded. >> >> Unfortunately, the payload encoding (specified in >> https://tools.ietf.org/html/rfc7515) is defined as the "URL safe" >> variant. This variant allows for the lossless omission of base64 >> padding ("=" or "=="), which the haproxy b64dec convertor doesn't >> appear to be able cope with. The result of > > Jonathan, > > It's not just padding, urlsafe base64 replaces '+' with '-', and '/' > with '_'. You're right. I'd noticed those extra substitutions but, for some reason I'd assumed they were applied after decoding. Brain fart! > For now, I guess the easiest way would be to write a simple > converter in Lua, which just returns the original string, and send > payload somewhere for further processing. One nice thing about the JWT format is that it's unambiguously formatted as "header.payload.signature", so the payload can be trivially parsed out of a sacrificial header with a http-request replace-header copy-of-jwt [^.]+\.([^.]+)\..+ \1 ... or some such manipulation. Here, for clarity, I'm double-passing it through an abns@ frontend-backend-listen chain, hence the additional header and not a variable, as per your example. I think with your points and ccripy's sneaky (kudos!) padding insertion, I can do something which suffices for my current audit needs. I suspect you're right that a Lua convertor is probably the more supportable way forwards, however. Many thanks, both! J
Re: JWT payloads break b64dec convertor
On 05/26/2018 04:27 PM, Jonathan Matthews wrote: > Hello folks, > > The payload (and other parts) of a JSON Web Token (JWT, a popular and > growing auth standard: https://tools.ietf.org/html/rfc7519) is base64 > encoded. > > Unfortunately, the payload encoding (specified in > https://tools.ietf.org/html/rfc7515) is defined as the "URL safe" > variant. This variant allows for the lossless omission of base64 > padding ("=" or "=="), which the haproxy b64dec convertor doesn't > appear to be able cope with. The result of Jonathan, It's not just padding, urlsafe base64 replaces '+' with '-', and '/' with '_'. For now, I guess the easiest way would be to write a simple converter in Lua, which just returns the original string, and send payload somewhere for further processing. Best regards, Adis
Re: JWT payloads break b64dec convertor
while probably not the most ideal solution... i found a quick method to do this using the builtin converters within the configuration to append the padding where necessary. here is an example: log-format %[var(txn.jwtpayload),b64dec] http-request set-var(txn.jwtpayload) req.hdr('x-jwtpayload') http-request set-var(txn.jwtpayload) str(),concat(,txn.jwtpayload,==) if !{ var(txn.jwtpayload),length,mod(4) -m int eq 0 } { var(txn.jwtpayload),length,mod(2) -m int eq 0 } http-request set-var(txn.jwtpayload) str(),concat(,txn.jwtpayload,=) if !{ var(txn.jwtpayload),length,mod(4) -m int eq 0 }
JWT payloads break b64dec convertor
Hello folks, The payload (and other parts) of a JSON Web Token (JWT, a popular and growing auth standard: https://tools.ietf.org/html/rfc7519) is base64 encoded. Unfortunately, the payload encoding (specified in https://tools.ietf.org/html/rfc7515) is defined as the "URL safe" variant. This variant allows for the lossless omission of base64 padding ("=" or "=="), which the haproxy b64dec convertor doesn't appear to be able cope with. The result of log-format %[,b64dec] ... when faced with such an unpadded string is just "-", which I take to mean decoding failed. I believe it's failing on line 84 of src/base64.c. I've tried and failed to use a regex convertor to add padding to the end, based on looking at the string's remainder after matching clusters with '(.{4})+'. Annoyingly I can't make this work in the regsub convertor as I believe it would require the use of grouping parentheses, which aren't permitted by the parser currently. I'm personally interested in this for logging the contents of JWT payloads for audit. Is anyone else working with JWT in haproxy, in this or any other context, and could share any tactics for dealing with this problem? Many thanks! Jonathan