Re: [PATCH] MINOR: sample: add json_string

2021-04-20 Thread Tim Düsterhus

Willy,

On 4/20/21 8:35 PM, Willy Tarreau wrote:

But who am I to complain? Updated patches attached. I left the 'trash'
one as-is, because that one really is a common pattern. I hope I did not
miss anything else :-)


I believe you might've missed my updated patches.


Oops you're right, I was a bit quick at marking this read but forgot to
apply them. Now done, sorry for this!



No need to apologize. Things happen and usually you are super quick to 
review and apply, so I can't complain :-) If I forget to remind then it 
wasn't important enough.


Best regards
Tim Düsterhus



Re: [PATCH] MINOR: sample: add json_string

2021-04-20 Thread Willy Tarreau
Hi Tim!

On Tue, Apr 20, 2021 at 08:29:30PM +0200, Tim Düsterhus wrote:
> Willy,
> 
> On 4/16/21 8:30 PM, Tim Düsterhus wrote:
> > But who am I to complain? Updated patches attached. I left the 'trash'
> > one as-is, because that one really is a common pattern. I hope I did not
> > miss anything else :-)
> 
> I believe you might've missed my updated patches.

Oops you're right, I was a bit quick at marking this read but forgot to
apply them. Now done, sorry for this!

Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-20 Thread Tim Düsterhus

Willy,

On 4/16/21 8:30 PM, Tim Düsterhus wrote:
But who am I to complain? Updated patches attached. I left the 'trash' 
one as-is, because that one really is a common pattern. I hope I did not 
miss anything else :-)


I believe you might've missed my updated patches.

Best regards
Tim Düsterhus



Re: [PATCH] MINOR: sample: add json_string

2021-04-16 Thread Tim Düsterhus

Willy,

On 4/16/21 8:24 AM, Willy Tarreau wrote:

-   tok = mjson_find(smp->data.u.str.area, smp->data.u.str.data, 
args[0].data.str.area, , );
+   enum mjson_tok token_type = mjson_find(smp->data.u.str.area, 
smp->data.u.str.data, args[0].data.str.area, , _size);


As much as possible, please try to avoid assignments in declarations,
especially when using functions. There are several reasons that make
this annoying:
   - sometimes just changing one arg requires to reorder the declarations
   - debugging with them is painful
   - trying to comment some of them out is painful as well
   - it discourages from adding finer control (e.g. if you need to do
 this or that using and if/else block, you also have to significantly
 modify them).
   - quite a few times I've seen code like this been deleted because when
 the compiler says "unused variable foo", it makes one think it's now
 OK to delete it, except that the function possibly had a desired side
 effect

=> thus better declare the enum with the variables

Same goes for one or two other ones in the switch/case statements.



I forgot to do in this patch, but I prefer directly assigning, because I 
can mark the variables `const` that way. It also increases verbosity for 
these single-assignment variables which IMO reduces readability.


Also I really dislike the (old) C style of declaring all variables at 
the very top of the scope. Most notably not being allowed to `for 
(size_t i = 0; ...; ...)` is bothering me quite a bit.


But who am I to complain? Updated patches attached. I left the 'trash' 
one as-is, because that one really is a common pattern. I hope I did not 
miss anything else :-)


Best regards
Tim Düsterhus
>From d37a4788b6be31a7dd53a0724baaaeb4e160b85d Mon Sep 17 00:00:00 2001
From: Tim Duesterhus 
Date: Thu, 15 Apr 2021 18:08:48 +0200
Subject: [PATCH v2 1/3] CLEANUP: sample: Improve local variables in
 sample_conv_json_query
To: haproxy@formilux.org
Cc: w...@1wt.eu

This improves the use of local variables in sample_conv_json_query:

- Use the enum type for the return value of `mjson_find`.
- Do not use single letter variables.
- Reduce the scope of variables that are only needed in a single branch.
- Add missing newlines after variable declaration.
---
 src/sample.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/sample.c b/src/sample.c
index 728c7c76d..c2d9beda3 100644
--- a/src/sample.c
+++ b/src/sample.c
@@ -3723,16 +3723,17 @@ static int sample_check_json_query(struct arg *arg, struct sample_conv *conv,
 static int sample_conv_json_query(const struct arg *args, struct sample *smp, void *private)
 {
 	struct buffer *trash = get_trash_chunk();
-	const char *p; /* holds the temporary string from mjson_find */
-	int tok, n;/* holds the token enum and the length of the value */
-	int rc;/* holds the return code from mjson_get_string */
+	const char *token; /* holds the temporary string from mjson_find */
+	int token_size;/* holds the length of  */
 
-	tok = mjson_find(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, , );
+	enum mjson_tok token_type;
 
-	switch(tok) {
+	token_type = mjson_find(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, , _size);
+
+	switch (token_type) {
 		case MJSON_TOK_NUMBER:
 			if (args[1].type == ARGT_SINT) {
-smp->data.u.sint = atoll(p);
+smp->data.u.sint = atoll(token);
 
 if (smp->data.u.sint < JSON_INT_MIN || smp->data.u.sint > JSON_INT_MAX)
 	return 0;
@@ -3740,6 +3741,7 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo
 smp->data.type = SMP_T_SINT;
 			} else {
 double double_val;
+
 if (mjson_get_number(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, _val) == 0) {
 	return 0;
 } else {
@@ -3757,17 +3759,21 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo
 			smp->data.type = SMP_T_BOOL;
 			smp->data.u.sint = 0;
 			break;
-		case MJSON_TOK_STRING:
-			rc = mjson_get_string(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, trash->area, trash->size);
-			if (rc == -1) {
+		case MJSON_TOK_STRING: {
+			int len;
+
+			len = mjson_get_string(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, trash->area, trash->size);
+
+			if (len == -1) {
 /* invalid string */
 return 0;
 			} else {
-trash->data = rc;
+trash->data = len;
 smp->data.u.str = *trash;
 smp->data.type = SMP_T_STR;
 			}
 			break;
+		}
 		default:
 			/* no valid token found */
 			return 0;
-- 
2.31.1

>From ecc5b3f91d8fc02f1cd5ed84949cb0aef02df075 Mon Sep 17 00:00:00 2001
From: Tim Duesterhus 
Date: Thu, 15 Apr 2021 18:14:32 +0200
Subject: [PATCH v2 2/3] CLEANUP: sample: Explicitly handle all possible enum
 values from mjson
To: haproxy@formilux.org
Cc: w...@1wt.eu

This makes it easier to find bugs, because -Wswitch can help us.
---

Re: [PATCH] MINOR: sample: add json_string

2021-04-16 Thread Willy Tarreau
On Thu, Apr 15, 2021 at 06:47:29PM +0200, Tim Düsterhus wrote:
> You know that I'm obsessed with the correct use of data types, so please
> find three CLEANUP patches attached. Feel free to drop the ones you don't
> agree with :-)

I don't disagree with them, but am just having a few comments on style
in turn :-)

>  static int sample_conv_json_query(const struct arg *args, struct sample 
> *smp, void *private)
>  {
>   struct buffer *trash = get_trash_chunk();
> - const char *p; /* holds the temporary string from mjson_find */
> - int tok, n;/* holds the token enum and the length of the value */
> - int rc;/* holds the return code from mjson_get_string */
> + const char *token; /* holds the temporary string from mjson_find */
> + int token_size;/* holds the length of  */
>  
> - tok = mjson_find(smp->data.u.str.area, smp->data.u.str.data, 
> args[0].data.str.area, , );
> + enum mjson_tok token_type = mjson_find(smp->data.u.str.area, 
> smp->data.u.str.data, args[0].data.str.area, , _size);

As much as possible, please try to avoid assignments in declarations,
especially when using functions. There are several reasons that make
this annoying:
  - sometimes just changing one arg requires to reorder the declarations
  - debugging with them is painful
  - trying to comment some of them out is painful as well
  - it discourages from adding finer control (e.g. if you need to do
this or that using and if/else block, you also have to significantly
modify them).
  - quite a few times I've seen code like this been deleted because when
the compiler says "unused variable foo", it makes one think it's now
OK to delete it, except that the function possibly had a desired side
effect

=> thus better declare the enum with the variables

Same goes for one or two other ones in the switch/case statements.

I'm otherwise OK.

Thanks!
Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-16 Thread Willy Tarreau
On Thu, Apr 15, 2021 at 05:44:44PM +0200, Aleksandar Lazic wrote:
> Now the Statement what I wanted to say ;-)
> 
> HAProxy have now at least 4 possibilities to route traffic and
> catch some data.
> 
> HTTP fields
> GRPC fields
> FIX fields
> JSON fields
> 
> Have I missed something?

You can extract various data from TCP payloads or even HTTP bodies,
so it's just a matter of how simple you want things to be done. For
example we've long supported RDP cookies, and I even deployed a
somewhat limited but effective Corba analyser a long time ago :-)

Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Tim Düsterhus

Willy,

On 4/15/21 5:09 PM, Willy Tarreau wrote:

On Thu, Apr 15, 2021 at 04:49:00PM +0200, Aleksandar Lazic wrote:

#define JSON_INT_MAX ((1ULL << 53) - 1)

^
Sorry I was not clear, please drop that 'U' here.


I'm also sorry, I was in a tunnel :-/

Attached now the next patches.


Thank you! Now applied. I just fixed this remaining double indent issue
and that was all:



You know that I'm obsessed with the correct use of data types, so please 
find three CLEANUP patches attached. Feel free to drop the ones you 
don't agree with :-)


Aleks: Thanks for quickly addressing my feedback, even if you might 
think I was overly pedantic. The final version looks pretty good to me, 
my CLEANUP really is meant to be a final polishing!


Best regards
Tim Düsterhus
>From 1af53a6e73e75f48793720017fe07d0f57e8c74a Mon Sep 17 00:00:00 2001
From: Tim Duesterhus 
Date: Thu, 15 Apr 2021 18:08:48 +0200
Subject: [PATCH 1/3] CLEANUP: sample: Improve local variables in
 sample_conv_json_query
To: haproxy@formilux.org
Cc: w...@1wt.eu

This improves the use of local variables in sample_conv_json_query:

- Use the enum type for the return value of `mjson_find`.
- Do not use single letter variables.
- Reduce the scope of variables that are only needed in a single branch.
- Add missing newlines after variable declaration.
---
 src/sample.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/sample.c b/src/sample.c
index 728c7c76d..a56d59245 100644
--- a/src/sample.c
+++ b/src/sample.c
@@ -3723,16 +3723,15 @@ static int sample_check_json_query(struct arg *arg, struct sample_conv *conv,
 static int sample_conv_json_query(const struct arg *args, struct sample *smp, void *private)
 {
 	struct buffer *trash = get_trash_chunk();
-	const char *p; /* holds the temporary string from mjson_find */
-	int tok, n;/* holds the token enum and the length of the value */
-	int rc;/* holds the return code from mjson_get_string */
+	const char *token; /* holds the temporary string from mjson_find */
+	int token_size;/* holds the length of  */
 
-	tok = mjson_find(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, , );
+	enum mjson_tok token_type = mjson_find(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, , _size);
 
-	switch(tok) {
+	switch (token_type) {
 		case MJSON_TOK_NUMBER:
 			if (args[1].type == ARGT_SINT) {
-smp->data.u.sint = atoll(p);
+smp->data.u.sint = atoll(token);
 
 if (smp->data.u.sint < JSON_INT_MIN || smp->data.u.sint > JSON_INT_MAX)
 	return 0;
@@ -3740,6 +3739,7 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo
 smp->data.type = SMP_T_SINT;
 			} else {
 double double_val;
+
 if (mjson_get_number(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, _val) == 0) {
 	return 0;
 } else {
@@ -3757,17 +3757,19 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo
 			smp->data.type = SMP_T_BOOL;
 			smp->data.u.sint = 0;
 			break;
-		case MJSON_TOK_STRING:
-			rc = mjson_get_string(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, trash->area, trash->size);
-			if (rc == -1) {
+		case MJSON_TOK_STRING: {
+			int len = mjson_get_string(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, trash->area, trash->size);
+
+			if (len == -1) {
 /* invalid string */
 return 0;
 			} else {
-trash->data = rc;
+trash->data = len;
 smp->data.u.str = *trash;
 smp->data.type = SMP_T_STR;
 			}
 			break;
+		}
 		default:
 			/* no valid token found */
 			return 0;
-- 
2.31.1

>From 166ceb7c9a82f87f35580e3c379103d4d213fb18 Mon Sep 17 00:00:00 2001
From: Tim Duesterhus 
Date: Thu, 15 Apr 2021 18:14:32 +0200
Subject: [PATCH 2/3] CLEANUP: sample: Explicitly handle all possible enum
 values from mjson
To: haproxy@formilux.org
Cc: w...@1wt.eu

This makes it easier to find bugs, because -Wswitch can help us.
---
 src/sample.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/sample.c b/src/sample.c
index a56d59245..3cc571560 100644
--- a/src/sample.c
+++ b/src/sample.c
@@ -3770,8 +3770,19 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo
 			}
 			break;
 		}
-		default:
-			/* no valid token found */
+		case MJSON_TOK_NULL:
+		case MJSON_TOK_ARRAY:
+		case MJSON_TOK_OBJECT:
+			/* We cannot handle these. */
+			return 0;
+		case MJSON_TOK_INVALID:
+			/* Nothing matches the query. */
+			return 0;
+		case MJSON_TOK_KEY:
+			/* This is not a valid return value according to the
+			 * mjson documentation, but we handle it to benefit
+			 * from '-Wswitch'.
+			 */
 			return 0;
 	}
 	return 1;
-- 
2.31.1

>From 722b41673c0d31fa99583e75947451e47f506855 Mon Sep 17 00:00:00 2001
From: Tim Duesterhus 
Date: Thu, 15 Apr 2021 18:40:06 +0200
Subject: [PATCH 3/3] CLEANUP: sample: Use explicit return 

Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Aleksandar Lazic

On 15.04.21 17:09, Willy Tarreau wrote:

On Thu, Apr 15, 2021 at 04:49:00PM +0200, Aleksandar Lazic wrote:

#define JSON_INT_MAX ((1ULL << 53) - 1)

^
Sorry I was not clear, please drop that 'U' here.


I'm also sorry, I was in a tunnel :-/

Attached now the next patches.


Thank you! Now applied. I just fixed this remaining double indent issue
and that was all:


+   if (arg[1].data.str.data != 0) {
+   if (strcmp(arg[1].data.str.area, "int") != 0) {
+   memprintf(err, "output_type only supports \"int\" as 
argument");
+   return 0;
+   } else {
+   arg[1].type = ARGT_SINT;
+   arg[1].data.sint = 0;
+   }
+   }


Thanks Aleks! You see it wasn't that hard in the end :-)


Cool ;-) :-)

Now the Statement what I wanted to say ;-)

HAProxy have now at least 4 possibilities to route traffic and
catch some data.

HTTP fields
GRPC fields
FIX fields
JSON fields

Have I missed something?

I love this Project and the community.
Thanks willy and tim for your passion and precise reviews ;-)


Willy



Best regards
Aleks



Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Willy Tarreau
On Thu, Apr 15, 2021 at 04:49:00PM +0200, Aleksandar Lazic wrote:
> > > #define JSON_INT_MAX ((1ULL << 53) - 1)
> >^
> > Sorry I was not clear, please drop that 'U' here.
> 
> I'm also sorry, I was in a tunnel :-/
> 
> Attached now the next patches.

Thank you! Now applied. I just fixed this remaining double indent issue
and that was all:

> + if (arg[1].data.str.data != 0) {
> + if (strcmp(arg[1].data.str.area, "int") != 0) {
> + memprintf(err, "output_type only supports 
> \"int\" as argument");
> + return 0;
> + } else {
> + arg[1].type = ARGT_SINT;
> + arg[1].data.sint = 0;
> + }
> + }

Thanks Aleks! You see it wasn't that hard in the end :-)
Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Aleksandar Lazic

On 15.04.21 16:09, Willy Tarreau wrote:

On Thu, Apr 15, 2021 at 04:05:27PM +0200, Aleksandar Lazic wrote:

Well I don't think so because 4 is still bigger then -9007199254740991 ;-)


This is because *you* think it is -9007199254740991 but the reality
is that it's not this.due to ULL:

   #define JSON_INT_MAX ((1ULL << 53) - 1)
   #define JSON_INT_MIN (-JSON_INT_MAX)

=> it's -9007199254740991ULL hence 18437736874454810625 so 4 is
definitely not larger than this.



Never the less I have changed the defines and rerun the tests.
Btw, this vtest is a great enhancement to haproxy ;-)


Yes I totally agree. And you can't imagine how many times I'm angry
at it when it detects an error after a tiny change I make, just to
realize that I did really break something and that it was right :-)
Like all tools it just needs to be reasonably used, not excessively
trusted but used as a good hint that something unexpected changed,
and it helps a lot!



```
#define JSON_INT_MAX ((1ULL << 53) - 1)

   ^
Sorry I was not clear, please drop that 'U' here.


I'm also sorry, I was in a tunnel :-/

Attached now the next patches.


Willy



Regards
Aleks
>From 2f0673eb3e8a41e173221933021af2392d9a8ca4 Mon Sep 17 00:00:00 2001
From: Alex 
Date: Thu, 15 Apr 2021 16:45:15 +0200
Subject: [PATCH 2/2] MINOR: sample: converter: Add json_query converter

With the json_query can a JSON value be extacted from a Header
or body of the request and saved to a variable.

This converter makes it possible to handle some JSON Workload
to route requests to different backends.
---
 doc/configuration.txt  | 24 
 reg-tests/converter/json_query.vtc | 95 ++
 src/sample.c   | 88 +++
 3 files changed, 207 insertions(+)
 create mode 100644 reg-tests/converter/json_query.vtc

diff --git a/doc/configuration.txt b/doc/configuration.txt
index f242300e7..61c2a6dd9 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15961,6 +15961,30 @@ json([])
   Output log:
  {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}
 
+json_query(,[])
+  The json_query converter supports the JSON types string, boolean and
+  number. Floating point numbers will be returned as a string. By
+  specifying the output_type 'int' the value will be converted to an
+  Integer. If conversion is not possible the json_query converter fails.
+
+   must be a valid JSON Path string as defined in
+  https://datatracker.ietf.org/doc/draft-ietf-jsonpath-base/
+
+  Example:
+ # get a integer value from the request body
+ # "{"integer":4}" => 5
+ http-request set-var(txn.pay_int) req.body,json_query('$.integer','int'),add(1)
+
+ # get a key with '.' in the name
+ # {"my.key":"myvalue"} => myvalue
+ http-request set-var(txn.pay_mykey) req.body,json_query('$.my\\.key')
+
+ # {"boolean-false":false} => 0
+ http-request set-var(txn.pay_boolean_false) req.body,json_query('$.boolean-false')
+
+ # get the value of the key 'iss' from a JWT Bearer token
+ http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec,json_query('$.iss')
+
 language([,])
   Returns the value with the highest q-factor from a list as extracted from the
   "accept-language" header using "req.fhdr". Values with no q-factor have a
diff --git a/reg-tests/converter/json_query.vtc b/reg-tests/converter/json_query.vtc
new file mode 100644
index 0..ade7b4ccb
--- /dev/null
+++ b/reg-tests/converter/json_query.vtc
@@ -0,0 +1,95 @@
+varnishtest "JSON Query converters Test"
+#REQUIRE_VERSION=2.4
+
+feature ignore_unknown_macro
+
+server s1 {
+	rxreq
+	txresp
+} -repeat 8 -start
+
+haproxy h1 -conf {
+defaults
+	mode http
+	timeout connect 1s
+	timeout client  1s
+	timeout server  1s
+	option http-buffer-request
+
+frontend fe
+	bind "fd@${fe}"
+	tcp-request inspect-delay 1s
+
+	http-request set-var(sess.header_json) req.hdr(Authorization),json_query('$.iss')
+	http-request set-var(sess.pay_json) req.body,json_query('$.iss')
+	http-request set-var(sess.pay_int) req.body,json_query('$.integer',"int"),add(1)
+	http-request set-var(sess.pay_neg_int) req.body,json_query('$.negativ-integer',"int"),add(1)
+	http-request set-var(sess.pay_double) req.body,json_query('$.double')
+	http-request set-var(sess.pay_boolean_true) req.body,json_query('$.boolean-true')
+	http-request set-var(sess.pay_boolean_false) req.body,json_query('$.boolean-false')
+	http-request set-var(sess.pay_mykey) req.body,json_query('$.my\\.key')
+
+	http-response set-header x-var_header %[var(sess.header_json)]
+	http-response set-header x-var_body %[var(sess.pay_json)]
+	http-response set-header x-var_body_int %[var(sess.pay_int)]
+	http-response set-header x-var_body_neg_int %[var(sess.pay_neg_int)]
+	http-response set-header x-var_body_double %[var(sess.pay_double)]
+	http-response set-header x-var_body_boolean_true %[var(sess.pay_boolean_true)]
+	http-response 

Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Willy Tarreau
On Thu, Apr 15, 2021 at 04:05:27PM +0200, Aleksandar Lazic wrote:
> Well I don't think so because 4 is still bigger then -9007199254740991 ;-)

This is because *you* think it is -9007199254740991 but the reality
is that it's not this.due to ULL:

  #define JSON_INT_MAX ((1ULL << 53) - 1)
  #define JSON_INT_MIN (-JSON_INT_MAX)

=> it's -9007199254740991ULL hence 18437736874454810625 so 4 is
   definitely not larger than this.


> Never the less I have changed the defines and rerun the tests.
> Btw, this vtest is a great enhancement to haproxy ;-)

Yes I totally agree. And you can't imagine how many times I'm angry
at it when it detects an error after a tiny change I make, just to
realize that I did really break something and that it was right :-)
Like all tools it just needs to be reasonably used, not excessively
trusted but used as a good hint that something unexpected changed,
and it helps a lot!


> ```
> #define JSON_INT_MAX ((1ULL << 53) - 1)
  ^
Sorry I was not clear, please drop that 'U' here.

Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Aleksandar Lazic

On 15.04.21 15:55, Willy Tarreau wrote:

On Thu, Apr 15, 2021 at 03:41:18PM +0200, Aleksandar Lazic wrote:

Now when I remove the check "smp->data.u.sint < 0" every positive value is
bigger then JSON INT_MIN and returns 0.


But don't you agree that this test DOES nothing ? If it changes anything
it means the issue is somewhere else and is a hidden bug waiting to strike
and that we must address it.

Look:

  if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)

is exactly equivalent to:

  if (smp->data.u.sint < JSON_INT_MIN && smp->data.u.sint < 0)

JSON_INT_MIN < 0 so the first part implies the second one. Said differently,
there is no value of sint that validates 

I think it checks if the value is negative or positive and then verify if the
value is bigger then the the max allowed value, +/-.

Maybe I thing wrong, so let us work with concrete values.

```
printf("\n\n>> smp->data.u.sint :%lld: < JSON_INT_MIN :%lld: if-no-check:%d:<<\n",   
smp->data.u.sint,JSON_INT_MIN,(smp->data.u.sint < JSON_INT_MIN));
printf(">> smp->data.u.sint :%lld: > JSON_INT_MAX :%lld: if-no-check:%d:<<\n\n", 
smp->data.u.sint,JSON_INT_MAX,(smp->data.u.sint > JSON_INT_MAX));

if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)
return 0;
else if (smp->data.u.sint > 0 && smp->data.u.sint > JSON_INT_MAX)
return 0;

```

Input is here 4.

smp->data.u.sint :4: < JSON_INT_MIN :-9007199254740991: if-no-check:1:<<
smp->data.u.sint :4: > JSON_INT_MAX :9007199254740991:  if-no-check:0:<<


Input is here -4.


smp->data.u.sint :-4: < JSON_INT_MIN :-9007199254740991: if-no-check:0:<<
smp->data.u.sint :-4: > JSON_INT_MAX :9007199254740991:  if-no-check:1:<<


OK I think I got it. It's just because your definitions of JSON_INT_MIN
and JSON_INT_MAX are unsigned and the comparison is made in unsigned mode.
So when you do "4 < JSON_INT_MIN" it's in fact "4 < 2^64-(1<53)-1" so it's
true. And conversely for the other one.

I'm pretty sure that if you change your constants to:

   #define JSON_INT_MAX ((1LL << 53) - 1)
   #define JSON_INT_MIN (-JSON_INT_MAX)

It will work :-)


Well I don't think so because 4 is still bigger then -9007199254740991 ;-)

Never the less I have changed the defines and rerun the tests.
Btw, this vtest is a great enhancement to haproxy ;-)

```
#define JSON_INT_MAX ((1ULL << 53) - 1)
#define JSON_INT_MIN (-JSON_INT_MAX)

printf("\n\n>> smp->data.u.sint :%lld: < JSON_INT_MIN :%lld: if-no-check:%d:<<\n", 
smp->data.u.sint,JSON_INT_MIN,(smp->data.u.sint < JSON_INT_MIN));
printf(">> smp->data.u.sint :%lld: > JSON_INT_MAX :%lld: if-no-check:%d:<<\n\n", 
smp->data.u.sint,JSON_INT_MAX, (smp->data.u.sint > JSON_INT_MAX));

if (smp->data.u.sint < JSON_INT_MIN)
return 0;
else if (smp->data.u.sint > JSON_INT_MAX)
return 0;
```

>> smp->data.u.sint :4: < JSON_INT_MIN :-9007199254740991: if-no-check:1:<<
>> smp->data.u.sint :4: > JSON_INT_MAX :9007199254740991: if-no-check:0:<<



That's among the historical idiocies of the C language that considers
the signedness as part of the variable instead of being the mode of the
operation applied to the variable. This results in absurd combinations.

Willy






Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Willy Tarreau
On Thu, Apr 15, 2021 at 03:41:18PM +0200, Aleksandar Lazic wrote:
> > > Now when I remove the check "smp->data.u.sint < 0" every positive value is
> > > bigger then JSON INT_MIN and returns 0.
> > 
> > But don't you agree that this test DOES nothing ? If it changes anything
> > it means the issue is somewhere else and is a hidden bug waiting to strike
> > and that we must address it.
> > 
> > Look:
> > 
> >  if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)
> > 
> > is exactly equivalent to:
> > 
> >  if (smp->data.u.sint < JSON_INT_MIN && smp->data.u.sint < 0)
> > 
> > JSON_INT_MIN < 0 so the first part implies the second one. Said differently,
> > there is no value of sint that validates  > <0.
> 
> I think it checks if the value is negative or positive and then verify if the
> value is bigger then the the max allowed value, +/-.
> 
> Maybe I thing wrong, so let us work with concrete values.
> 
> ```
> printf("\n\n>> smp->data.u.sint :%lld: < JSON_INT_MIN :%lld: 
> if-no-check:%d:<<\n",   smp->data.u.sint,JSON_INT_MIN,(smp->data.u.sint < 
> JSON_INT_MIN));
> printf(">> smp->data.u.sint :%lld: > JSON_INT_MAX :%lld: 
> if-no-check:%d:<<\n\n", smp->data.u.sint,JSON_INT_MAX,(smp->data.u.sint > 
> JSON_INT_MAX));
> 
> if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)
>   return 0;
> else if (smp->data.u.sint > 0 && smp->data.u.sint > JSON_INT_MAX)
>   return 0;
> 
> ```
> 
> Input is here 4.
> >> smp->data.u.sint :4: < JSON_INT_MIN :-9007199254740991: if-no-check:1:<<
> >> smp->data.u.sint :4: > JSON_INT_MAX :9007199254740991:  if-no-check:0:<<
> 
> Input is here -4.
> 
> >> smp->data.u.sint :-4: < JSON_INT_MIN :-9007199254740991: if-no-check:0:<<
> >> smp->data.u.sint :-4: > JSON_INT_MAX :9007199254740991:  if-no-check:1:<<

OK I think I got it. It's just because your definitions of JSON_INT_MIN
and JSON_INT_MAX are unsigned and the comparison is made in unsigned mode.
So when you do "4 < JSON_INT_MIN" it's in fact "4 < 2^64-(1<53)-1" so it's
true. And conversely for the other one.

I'm pretty sure that if you change your constants to:

  #define JSON_INT_MAX ((1LL << 53) - 1)
  #define JSON_INT_MIN (-JSON_INT_MAX)

It will work :-)

That's among the historical idiocies of the C language that considers
the signedness as part of the variable instead of being the mode of the
operation applied to the variable. This results in absurd combinations.

Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Aleksandar Lazic

On 15.04.21 14:48, Willy Tarreau wrote:

On Thu, Apr 15, 2021 at 02:17:45PM +0200, Aleksandar Lazic wrote:

I, by far, prefer Tim's proposal here, as I do not even understand the
first one, sorry Aleks, please don't feel offended :-)


Well you know my focus is to support HAProxy and therefore it's okay.
The contribution was in the past much easier, but you know time changes.


It's not getting harder, we've always had numerous round trips,
however now there are more people participating and it's getting
increasingly difficult to maintain a constant level of quality so
it is important to take care about maintainability, which implies
being carefull about the coding style (which is really not strict)
and a good level of English in the doc (which remains achievable
as most of the contributors are not native speakers so we're not
using advanced English). In addition there's nothing wrong with
saying "I need someone to reword this part where I don't feel at
ease", it's just that nobody will force it on you as it would not
be kind nor respectful of your work.

In fact I'd say that it's got easier because most of the requirements
have been formalized by now, or are not unique to this project but
shared with other ones.


Okay, got you.


  From my point of view is it necessary to check if the value is a negative
value and only then should be checked if the max '-' range is reached.


But the first one is implied by the second. It looks like a logical
error when read like this, it makes one think the author had something
different in mind. It's like writing "if (a < 0 && a < -2)". It is
particularly confusing.


Well then then this does not work anymore


If so it precisely shows that a problem remains somewhere else.


Hm, maybe.


http-request set-var(sess.pay_int) req.body,json_query('$.integer',"int"),add(1)

with the given defines.

#define JSON_INT_MAX ((1ULL << 53) - 1)
#define JSON_INT_MIN (0 - JSON_INT_MAX)

Because "{"integer":4}" => 5" and 5 is bigger then JSON_INT_MIN which is 
(0-JSON_INT_MAX)

This sequence works because I check if the value is negative "smp->data.u.sint < 
0"
and only then check if the negative max border "JSON_INT_MIN"  is reached.


I'm sorry but I don't get it.


if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)

The same belongs to the positive max int.

Now when I remove the check "smp->data.u.sint < 0" every positive value is
bigger then JSON INT_MIN and returns 0.


But don't you agree that this test DOES nothing ? If it changes anything
it means the issue is somewhere else and is a hidden bug waiting to strike
and that we must address it.

Look:

 if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)

is exactly equivalent to:

 if (smp->data.u.sint < JSON_INT_MIN && smp->data.u.sint < 0)

JSON_INT_MIN < 0 so the first part implies the second one. Said differently,
there is no value of sint that validates 

I think it checks if the value is negative or positive and then verify if the
value is bigger then the the max allowed value, +/-.

Maybe I thing wrong, so let us work with concrete values.

```
printf("\n\n>> smp->data.u.sint :%lld: < JSON_INT_MIN :%lld: if-no-check:%d:<<\n",   
smp->data.u.sint,JSON_INT_MIN,(smp->data.u.sint < JSON_INT_MIN));
printf(">> smp->data.u.sint :%lld: > JSON_INT_MAX :%lld: if-no-check:%d:<<\n\n", 
smp->data.u.sint,JSON_INT_MAX,(smp->data.u.sint > JSON_INT_MAX));

if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)
return 0;
else if (smp->data.u.sint > 0 && smp->data.u.sint > JSON_INT_MAX)
return 0;

```

Input is here 4.
>> smp->data.u.sint :4: < JSON_INT_MIN :-9007199254740991: if-no-check:1:<<
>> smp->data.u.sint :4: > JSON_INT_MAX :9007199254740991:  if-no-check:0:<<

Input is here -4.

>> smp->data.u.sint :-4: < JSON_INT_MIN :-9007199254740991: if-no-check:0:<<
>> smp->data.u.sint :-4: > JSON_INT_MAX :9007199254740991:  if-no-check:1:<<

This looks to me when the comparing is done with a positive value then it
will be true for the JSON_INT_MIN and when the comparing is done with a negative
value then will it be true for JSON_INT_MAX.

So the concrete question is how to check the value in the positive and negative
range without the "smp->data.u.sint < 0" or "smp->data.u.sint > 0".

I haven't found any other solution, I'm open for any suggestion?


Willy



Regards
Aleks



Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Willy Tarreau
On Thu, Apr 15, 2021 at 02:17:45PM +0200, Aleksandar Lazic wrote:
> > I, by far, prefer Tim's proposal here, as I do not even understand the
> > first one, sorry Aleks, please don't feel offended :-)
> 
> Well you know my focus is to support HAProxy and therefore it's okay.
> The contribution was in the past much easier, but you know time changes.

It's not getting harder, we've always had numerous round trips,
however now there are more people participating and it's getting
increasingly difficult to maintain a constant level of quality so
it is important to take care about maintainability, which implies
being carefull about the coding style (which is really not strict)
and a good level of English in the doc (which remains achievable
as most of the contributors are not native speakers so we're not
using advanced English). In addition there's nothing wrong with
saying "I need someone to reword this part where I don't feel at
ease", it's just that nobody will force it on you as it would not
be kind nor respectful of your work.

In fact I'd say that it's got easier because most of the requirements
have been formalized by now, or are not unique to this project but
shared with other ones.

> > >  From my point of view is it necessary to check if the value is a negative
> > > value and only then should be checked if the max '-' range is reached.
> > 
> > But the first one is implied by the second. It looks like a logical
> > error when read like this, it makes one think the author had something
> > different in mind. It's like writing "if (a < 0 && a < -2)". It is
> > particularly confusing.
> 
> Well then then this does not work anymore

If so it precisely shows that a problem remains somewhere else.

> http-request set-var(sess.pay_int) 
> req.body,json_query('$.integer',"int"),add(1)
> 
> with the given defines.
> 
> #define JSON_INT_MAX ((1ULL << 53) - 1)
> #define JSON_INT_MIN (0 - JSON_INT_MAX)
> 
> Because "{"integer":4}" => 5" and 5 is bigger then JSON_INT_MIN which is 
> (0-JSON_INT_MAX)
> 
> This sequence works because I check if the value is negative 
> "smp->data.u.sint < 0"
> and only then check if the negative max border "JSON_INT_MIN"  is reached.

I'm sorry but I don't get it.

> if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)
> 
> The same belongs to the positive max int.
> 
> Now when I remove the check "smp->data.u.sint < 0" every positive value is
> bigger then JSON INT_MIN and returns 0.

But don't you agree that this test DOES nothing ? If it changes anything
it means the issue is somewhere else and is a hidden bug waiting to strike
and that we must address it.

Look:

if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)

is exactly equivalent to:

if (smp->data.u.sint < JSON_INT_MIN && smp->data.u.sint < 0)

JSON_INT_MIN < 0 so the first part implies the second one. Said differently,
there is no value of sint that validates 

Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Aleksandar Lazic

On 15.04.21 09:08, Willy Tarreau wrote:

On Wed, Apr 14, 2021 at 09:52:31PM +0200, Aleksandar Lazic wrote:

+   - string  : This is the default search type and returns a String;
+   - boolean : If the JSON value is not a String or a Number
+   - number  : When the JSON value is a Number then will the value be
+   converted to a String. If its known that the value is a
+   integer then add 'int' to the  which helps
+   haproxy to convert the value to a integer for further usage;


I'd probably completely rephrase this as:

The json_query converter supports the JSON types string, boolean and
number. Floating point numbers will be returned as a string. By specifying
the output_type 'int' the value will be converted to an Integer. If
conversion is not possible the json_query converter fails.


Well I would like to here also some other opinions about the wording.


I, by far, prefer Tim's proposal here, as I do not even understand the
first one, sorry Aleks, please don't feel offended :-)


Well you know my focus is to support HAProxy and therefore it's okay.
The contribution was in the past much easier, but you know time changes.


+    switch(tok) {
+    case MJSON_TOK_NUMBER:
+    if (args[1].type == ARGT_SINT) {
+    smp->data.u.sint = atoll(p);
+
+    if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) {
+    /* JSON integer too big negativ value */


This comment appears to be useless. It is implied by the 'if'. I also believe that 
the 'sint < 0' check is not needed.


Well I prefer to document in the comment what the if is doing.


OK but then please be careful about spelling, or it will force Ilya to
send yet another spell-checker patch.


 From my point of view is it necessary to check if the value is a negative
value and only then should be checked if the max '-' range is reached.


But the first one is implied by the second. It looks like a logical
error when read like this, it makes one think the author had something
different in mind. It's like writing "if (a < 0 && a < -2)". It is
particularly confusing.


Well then then this does not work anymore

http-request set-var(sess.pay_int) req.body,json_query('$.integer',"int"),add(1)

with the given defines.

#define JSON_INT_MAX ((1ULL << 53) - 1)
#define JSON_INT_MIN (0 - JSON_INT_MAX)

Because "{"integer":4}" => 5" and 5 is bigger then JSON_INT_MIN which is 
(0-JSON_INT_MAX)

This sequence works because I check if the value is negative "smp->data.u.sint < 
0"
and only then check if the negative max border "JSON_INT_MIN"  is reached.

if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)

The same belongs to the positive max int.

Now when I remove the check "smp->data.u.sint < 0" every positive value is 
bigger then JSON INT_MIN
and returns 0.

How about to add this information into the comments?


Maybe there is a better solution, I'm open for suggestions.

I can move the comment above the 'if'.


You have the choice as long as it's clear:
   - above the if, you describe what you're testing and why
   - inside the if, you describe the condition you've validated.

As it is now, it's best inside the if.

Thanks!
Willy






Re: [PATCH] MINOR: sample: add json_string

2021-04-15 Thread Willy Tarreau
On Wed, Apr 14, 2021 at 09:52:31PM +0200, Aleksandar Lazic wrote:
> > > +   - string  : This is the default search type and returns a String;
> > > +   - boolean : If the JSON value is not a String or a Number
> > > +   - number  : When the JSON value is a Number then will the value be
> > > +   converted to a String. If its known that the value is a
> > > +   integer then add 'int' to the  which helps
> > > +   haproxy to convert the value to a integer for further 
> > > usage;
> > 
> > I'd probably completely rephrase this as:
> > 
> > The json_query converter supports the JSON types string, boolean and
> > number. Floating point numbers will be returned as a string. By specifying
> > the output_type 'int' the value will be converted to an Integer. If
> > conversion is not possible the json_query converter fails.
> 
> Well I would like to here also some other opinions about the wording.

I, by far, prefer Tim's proposal here, as I do not even understand the
first one, sorry Aleks, please don't feel offended :-)

> > > +  Example:
> > > + # get the value of the key 'iss' from a JWT Bearer token
> > > + http-request set-var(txn.token_payload) 
> > > req.hdr(Authorization),word(2,.),ub64dec,json_query('$.iss')
> > > +
> > > + # get a integer value from the request body
> > > + # "{"integer":4}" => 5
> > > + http-request set-var(txn.pay_int) 
> > > req.body,json_query('$.integer','int'),add(1)
> > > +
> > > + # get a key with '.' in the name
> > > + # {"my.key":"myvalue"} => myvalue
> > > + http-request set-var(txn.pay_mykey) 
> > > req.body,json_query('$.my\\.key')
> > > +
> > > + # {"boolean-false":false} => 0
> > > + http-request set-var(txn.pay_boolean_false) 
> > > req.body,json_query('$.boolean-false')
> > 
> > These examples look good to me. I'd just move the JWT example to the 
> > bottom, so that the simple examples come first.
> 
> I prefer to keep it like this.

I would also have preferred to start with the simpler ones but that's
not important as long as there are both simple ones and advanced ones.

> > > +    switch(tok) {
> > > +    case MJSON_TOK_NUMBER:
> > > +    if (args[1].type == ARGT_SINT) {
> > > +    smp->data.u.sint = atoll(p);
> > > +
> > > +    if (smp->data.u.sint < 0 && smp->data.u.sint < 
> > > JSON_INT_MIN) {
> > > +    /* JSON integer too big negativ value */
> > 
> > This comment appears to be useless. It is implied by the 'if'. I also 
> > believe that the 'sint < 0' check is not needed.
> 
> Well I prefer to document in the comment what the if is doing.

OK but then please be careful about spelling, or it will force Ilya to
send yet another spell-checker patch.

> From my point of view is it necessary to check if the value is a negative
> value and only then should be checked if the max '-' range is reached.

But the first one is implied by the second. It looks like a logical
error when read like this, it makes one think the author had something
different in mind. It's like writing "if (a < 0 && a < -2)". It is
particularly confusing.

> Maybe there is a better solution, I'm open for suggestions.
> 
> I can move the comment above the 'if'.

You have the choice as long as it's clear:
  - above the if, you describe what you're testing and why
  - inside the if, you describe the condition you've validated.

As it is now, it's best inside the if.

Thanks!
Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-14 Thread Aleksandar Lazic

On 14.04.21 18:41, Tim Düsterhus wrote:

Aleks,

On 4/14/21 1:19 PM, Aleksandar Lazic wrote:

From 46ddac8379324b645c662e19de39d5de4ac74a77 Mon Sep 17 00:00:00 2001
From: Aleksandar Lazic 
Date: Wed, 14 Apr 2021 13:11:26 +0200
Subject: [PATCH 2/2] MINOR: sample: converter: Add json_query converter

With the json_query can a JSON value be extacted from a Header
or body of the request and saved to a variable.

This converter makes it possible to handle some JSON Workload
to route requests to differnt backends.


Typo: different.


---
 doc/configuration.txt  |  32 
 reg-tests/converter/json_query.vtc | 116 +
 src/sample.c   |  95 +++
 3 files changed, 243 insertions(+)
 create mode 100644 reg-tests/converter/json_query.vtc

diff --git a/doc/configuration.txt b/doc/configuration.txt
index f242300e7..374e7939b 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15961,6 +15961,38 @@ json([])
   Output log:
  {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}

+


This empty line should not be there.


+json_query(,[])
+  This converter searches for the key given by  and returns
+  the value.
+   must be a valid JSONPath String as defined in


I'd use string in lowercase.


+  https://datatracker.ietf.org/doc/draft-ietf-jsonpath-base/
+
+  A floating point value will always be returned as String.
+
+  The follwing JSON types are recognized.


Typo: following.
I'd also use a ':' instead of '.'.


+   - string  : This is the default search type and returns a String;
+   - boolean : If the JSON value is not a String or a Number
+   - number  : When the JSON value is a Number then will the value be
+   converted to a String. If its known that the value is a
+   integer then add 'int' to the  which helps
+   haproxy to convert the value to a integer for further usage;


I'd probably completely rephrase this as:

The json_query converter supports the JSON types string, boolean and number. 
Floating point numbers will be returned as a string. By specifying the 
output_type 'int' the value will be converted to an Integer. If conversion is 
not possible the json_query converter fails.


Well I would like to here also some other opinions about the wording.


+  Example:
+ # get the value of the key 'iss' from a JWT Bearer token
+ http-request set-var(txn.token_payload) 
req.hdr(Authorization),word(2,.),ub64dec,json_query('$.iss')
+
+ # get a integer value from the request body
+ # "{"integer":4}" => 5
+ http-request set-var(txn.pay_int) 
req.body,json_query('$.integer','int'),add(1)
+
+ # get a key with '.' in the name
+ # {"my.key":"myvalue"} => myvalue
+ http-request set-var(txn.pay_mykey) req.body,json_query('$.my\\.key')
+
+ # {"boolean-false":false} => 0
+ http-request set-var(txn.pay_boolean_false) 
req.body,json_query('$.boolean-false')


These examples look good to me. I'd just move the JWT example to the bottom, so 
that the simple examples come first.


I prefer to keep it like this.



 language([,])
   Returns the value with the highest q-factor from a list as extracted from the
   "accept-language" header using "req.fhdr". Values with no q-factor have a
diff --git a/reg-tests/converter/json_query.vtc 
b/reg-tests/converter/json_query.vtc
new file mode 100644
index 0..88ef58a0c
--- /dev/null
+++ b/reg-tests/converter/json_query.vtc
@@ -0,0 +1,116 @@
+varnishtest "JSON Query converters Test"
+#REQUIRE_VERSION=2.4
+
+feature ignore_unknown_macro
+
+server s1 {
+    rxreq
+    txresp
+
+    rxreq
+    txresp
+
+    rxreq
+    txresp
+
+ rxreq
+ txresp
+
+ rxreq
+ txresp
+
+ rxreq
+ txresp
+ + rxreq
+ txresp
+ + rxreq
+ txresp
+} -start


You can use '-repeat 8' to simplify the server definition.


Good hint, thanks.


+haproxy h1 -conf {
+    defaults
+    mode http
+    timeout connect 1s
+    timeout client  1s
+    timeout server  1s
+    option http-buffer-request
+
+    frontend fe
+    bind "fd@${fe}"
+    tcp-request inspect-delay 1s
+
+    http-request set-var(sess.header_json) 
req.hdr(Authorization),json_query('$.iss')
+ http-request set-var(sess.pay_json) req.body,json_query('$.iss')
+ http-request set-var(sess.pay_int) 
req.body,json_query('$.integer',"int"),add(1)
+ http-request set-var(sess.pay_neg_int) 
req.body,json_query('$.negativ-integer',"int"),add(1)


Inconsistent indentation here.


+    http-request set-var(sess.pay_double) req.body,json_query('$.double')
+ http-request set-var(sess.pay_boolean_true) 
req.body,json_query('$.boolean-true')
+ http-request set-var(sess.pay_boolean_false) 
req.body,json_query('$.boolean-false')
+ http-request set-var(sess.pay_mykey) req.body,json_query('$.my\\.key')
+
+    http-response set-header x-var_header %[var(sess.header_json)]
+    http-response set-header x-var_body %[var(sess.pay_json)]
+    

Re: [PATCH] MINOR: sample: add json_string

2021-04-14 Thread Tim Düsterhus

Aleks,

On 4/14/21 1:19 PM, Aleksandar Lazic wrote:

From 46ddac8379324b645c662e19de39d5de4ac74a77 Mon Sep 17 00:00:00 2001
From: Aleksandar Lazic 
Date: Wed, 14 Apr 2021 13:11:26 +0200
Subject: [PATCH 2/2] MINOR: sample: converter: Add json_query converter

With the json_query can a JSON value be extacted from a Header
or body of the request and saved to a variable.

This converter makes it possible to handle some JSON Workload
to route requests to differnt backends.


Typo: different.


---
 doc/configuration.txt  |  32 
 reg-tests/converter/json_query.vtc | 116 +
 src/sample.c   |  95 +++
 3 files changed, 243 insertions(+)
 create mode 100644 reg-tests/converter/json_query.vtc

diff --git a/doc/configuration.txt b/doc/configuration.txt
index f242300e7..374e7939b 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15961,6 +15961,38 @@ json([])
   Output log:
  {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}
 
+


This empty line should not be there.


+json_query(,[])
+  This converter searches for the key given by  and returns
+  the value.
+   must be a valid JSONPath String as defined in


I'd use string in lowercase.


+  https://datatracker.ietf.org/doc/draft-ietf-jsonpath-base/
+
+  A floating point value will always be returned as String.
+
+  The follwing JSON types are recognized.


Typo: following.
I'd also use a ':' instead of '.'.


+   - string  : This is the default search type and returns a String;
+   - boolean : If the JSON value is not a String or a Number
+   - number  : When the JSON value is a Number then will the value be
+   converted to a String. If its known that the value is a
+   integer then add 'int' to the  which helps
+   haproxy to convert the value to a integer for further usage;


I'd probably completely rephrase this as:

The json_query converter supports the JSON types string, boolean and 
number. Floating point numbers will be returned as a string. By 
specifying the output_type 'int' the value will be converted to an 
Integer. If conversion is not possible the json_query converter fails.



+  Example:
+ # get the value of the key 'iss' from a JWT Bearer token
+ http-request set-var(txn.token_payload) 
req.hdr(Authorization),word(2,.),ub64dec,json_query('$.iss')
+
+ # get a integer value from the request body
+ # "{"integer":4}" => 5
+ http-request set-var(txn.pay_int) 
req.body,json_query('$.integer','int'),add(1)
+
+ # get a key with '.' in the name
+ # {"my.key":"myvalue"} => myvalue
+ http-request set-var(txn.pay_mykey) req.body,json_query('$.my\\.key')
+
+ # {"boolean-false":false} => 0
+ http-request set-var(txn.pay_boolean_false) 
req.body,json_query('$.boolean-false')


These examples look good to me. I'd just move the JWT example to the 
bottom, so that the simple examples come first.



 language([,])
   Returns the value with the highest q-factor from a list as extracted from the
   "accept-language" header using "req.fhdr". Values with no q-factor have a
diff --git a/reg-tests/converter/json_query.vtc 
b/reg-tests/converter/json_query.vtc
new file mode 100644
index 0..88ef58a0c
--- /dev/null
+++ b/reg-tests/converter/json_query.vtc
@@ -0,0 +1,116 @@
+varnishtest "JSON Query converters Test"
+#REQUIRE_VERSION=2.4
+
+feature ignore_unknown_macro
+
+server s1 {
+   rxreq
+   txresp
+
+   rxreq
+   txresp
+
+   rxreq
+   txresp
+
+   rxreq
+   txresp
+
+   rxreq
+   txresp
+
+   rxreq
+   txresp
+ 
+ 	rxreq

+   txresp
+ 
+ 	rxreq

+   txresp
+} -start


You can use '-repeat 8' to simplify the server definition.


+haproxy h1 -conf {
+defaults
+   mode http
+   timeout connect 1s
+   timeout client  1s
+   timeout server  1s
+   option http-buffer-request
+
+frontend fe
+   bind "fd@${fe}"
+   tcp-request inspect-delay 1s
+
+   http-request set-var(sess.header_json) 
req.hdr(Authorization),json_query('$.iss')
+   http-request set-var(sess.pay_json) req.body,json_query('$.iss')
+   http-request set-var(sess.pay_int) 
req.body,json_query('$.integer',"int"),add(1)
+http-request set-var(sess.pay_neg_int) 
req.body,json_query('$.negativ-integer',"int"),add(1)


Inconsistent indentation here.


+   http-request set-var(sess.pay_double) req.body,json_query('$.double')
+   http-request set-var(sess.pay_boolean_true) 
req.body,json_query('$.boolean-true')
+   http-request set-var(sess.pay_boolean_false) 
req.body,json_query('$.boolean-false')
+   http-request set-var(sess.pay_mykey) req.body,json_query('$.my\\.key')
+
+   http-response set-header x-var_header %[var(sess.header_json)]
+   http-response set-header x-var_body %[var(sess.pay_json)]
+   http-response set-header x-var_body_int %[var(sess.pay_int)]
+   http-response 

Re: [PATCH] MINOR: sample: add json_string

2021-04-14 Thread Aleksandar Lazic

Hi.

here now the current version of the patches.

Regards
Aleks.

On 14.04.21 10:45, Aleksandar Lazic wrote:

On 14.04.21 04:36, Willy Tarreau wrote:

On Wed, Apr 14, 2021 at 03:02:20AM +0200, Aleksandar Lazic wrote:

But then, could it make sense to also support "strict integers": values
that can accurately be represented as integers and which are within the
JSON valid range for integers (-2^52 to 2^52 with no decimal part) ?
This would then make the converter return nothing in case of violation
(i.e. risk of losing precision). This would also reject NaN and infinite
that the lib supports.


You mean the same check which is done in arith_add().


Not exactly because arith_add only checks for overflows after addition
and tries to cap the result, but I'd rather just say that if the decoded
number is <= -2^53 or >= 2^53 then the converter should return a no match
in case an integer was requested.


Okay got you.

There is such a check in stats.c which I copied to sample.c but this does
not look right.

Maybe I should create a include/haproxy/json-t.h and add the values there,
what do you think?


```
/* Limit JSON integer values to the range [-(2**53)+1, (2**53)-1] as per
* the recommendation for interoperable integers in section 6 of RFC 7159.
*/
#define JSON_INT_MAX ((1ULL << 53) - 1)
#define JSON_INT_MIN (0 - JSON_INT_MAX)

/* This sample function get the value from a given json string.
* The mjson library is used to parse the json struct
*/
static int sample_conv_json_query(const struct arg *args, struct sample *smp, 
void *private)
{
 struct buffer *trash = get_trash_chunk();
 const char *p; /* holds the temporay string from mjson_find */
 int tok, n;    /* holds the token enum and the length of the value */
 int rc;    /* holds the return code from mjson_get_string */

 tok = mjson_find(smp->data.u.str.area, smp->data.u.str.data, 
args[0].data.str.area, , );

 switch(tok) {
     case MJSON_TOK_NUMBER:
     if (args[1].type == ARGT_SINT) {
     smp->data.u.sint = atoll(p);

     if (smp->data.u.sint < JSON_INT_MIN || smp->data.u.sint > 
JSON_INT_MAX)
     return 0;

     smp->data.type = SMP_T_SINT;
     } else {
...
```


H that's not your fault but now I'm seeing that we already have a
converter inappropriately called "json", so we don't even know in which
direction it works by just looking at its name :-(  Same issue as for
base64.

May I suggest that you call yours "json_decode" or maybe shorter
"json_dec" so that it's more explicit that it's the decode one ? Because
for me "json_string" is the one that will emit a json string from some
input (which it is not). Then we could later create "json_enc" and warn
when "json" alone is used. Or even "jsdec" and "jsenc" which are much
shorter and still quite explicit.


How about "json_query" because it's exactly what it does :-)


I'm not familiar with the notion of "query" to decode and extract contents
but I'm not the most representative user and am aware of the "jq" command-
line utility that does this. So if it sounds natural to others I'm fine
with this.


I'm seeing that there's a very nice mjson_find() which does *exactly* what
you need:

 "In a JSON string s, len, find an element by its JSONPATH path. Save
  found element in tokptr, toklen. If not found, return JSON_TOK_INVALID.
  If found, return one of: MJSON_TOK_STRING, MJSON_TOK_NUMBER,
  MJSON_TOK_TRUE, MJSON_TOK_FALSE, MJSON_TOK_NULL, MJSON_TOK_ARRAY,
  MJSON_TOK_OBJECT.
  If a searched key contains ., [ or ] characters, they should be escaped
  by a backslash."

So you get the type in return. I think you can then call one of the
related functions depending on what is found, which is more reliable
than iterating over multiple attempts.


Oh yes, this sounds like a better approach.
I have now used this suggestion and I hope you can help me to fix the double 
parsing
issue or is it acceptable to parse the input twice?


 From what I've seen in the code in the lib you have no other option.
I thought it might be possible to call mjson_get_string() on the
resulting pointer but you would need to find a way to express that
you want to extract the immediate content, maybe by having an empty
key designation or something like this. This point is not clear to
me and the unit tests in the project all re-parse the input string
after mjson_find(), so probably this is the way to do it.


The check functions handles the int arg now as suggested.

```
/* This function checks the "json_query" converter's arguments. */
static int sample_check_json_query(struct arg *arg, struct sample_conv *conv,
    const char *file, int line, char **err)
{
if (arg[0].data.str.data == 0) { /* empty */
    memprintf(err, "json_path must not be empty");
    return 0;
}

if (arg[1].data.str.data != 0) {
    if (strncmp(arg[1].data.str.area, "int", 

Re: [PATCH] MINOR: sample: add json_string

2021-04-14 Thread Aleksandar Lazic

On 14.04.21 04:36, Willy Tarreau wrote:

On Wed, Apr 14, 2021 at 03:02:20AM +0200, Aleksandar Lazic wrote:

But then, could it make sense to also support "strict integers": values
that can accurately be represented as integers and which are within the
JSON valid range for integers (-2^52 to 2^52 with no decimal part) ?
This would then make the converter return nothing in case of violation
(i.e. risk of losing precision). This would also reject NaN and infinite
that the lib supports.


You mean the same check which is done in arith_add().


Not exactly because arith_add only checks for overflows after addition
and tries to cap the result, but I'd rather just say that if the decoded
number is <= -2^53 or >= 2^53 then the converter should return a no match
in case an integer was requested.


Okay got you.

There is such a check in stats.c which I copied to sample.c but this does
not look right.

Maybe I should create a include/haproxy/json-t.h and add the values there,
what do you think?


```
/* Limit JSON integer values to the range [-(2**53)+1, (2**53)-1] as per
 * the recommendation for interoperable integers in section 6 of RFC 7159.
 */
#define JSON_INT_MAX ((1ULL << 53) - 1)
#define JSON_INT_MIN (0 - JSON_INT_MAX)

/* This sample function get the value from a given json string.
 * The mjson library is used to parse the json struct
 */
static int sample_conv_json_query(const struct arg *args, struct sample *smp, 
void *private)
{
struct buffer *trash = get_trash_chunk();
const char *p; /* holds the temporay string from mjson_find */
int tok, n;/* holds the token enum and the length of the value */
int rc;/* holds the return code from mjson_get_string */

tok = mjson_find(smp->data.u.str.area, smp->data.u.str.data, 
args[0].data.str.area, , );

switch(tok) {
case MJSON_TOK_NUMBER:
if (args[1].type == ARGT_SINT) {
smp->data.u.sint = atoll(p);

if (smp->data.u.sint < JSON_INT_MIN || 
smp->data.u.sint > JSON_INT_MAX)
return 0;

smp->data.type = SMP_T_SINT;
} else {
...
```


H that's not your fault but now I'm seeing that we already have a
converter inappropriately called "json", so we don't even know in which
direction it works by just looking at its name :-(  Same issue as for
base64.

May I suggest that you call yours "json_decode" or maybe shorter
"json_dec" so that it's more explicit that it's the decode one ? Because
for me "json_string" is the one that will emit a json string from some
input (which it is not). Then we could later create "json_enc" and warn
when "json" alone is used. Or even "jsdec" and "jsenc" which are much
shorter and still quite explicit.


How about "json_query" because it's exactly what it does :-)


I'm not familiar with the notion of "query" to decode and extract contents
but I'm not the most representative user and am aware of the "jq" command-
line utility that does this. So if it sounds natural to others I'm fine
with this.


I'm seeing that there's a very nice mjson_find() which does *exactly* what
you need:

 "In a JSON string s, len, find an element by its JSONPATH path. Save
  found element in tokptr, toklen. If not found, return JSON_TOK_INVALID.
  If found, return one of: MJSON_TOK_STRING, MJSON_TOK_NUMBER,
  MJSON_TOK_TRUE, MJSON_TOK_FALSE, MJSON_TOK_NULL, MJSON_TOK_ARRAY,
  MJSON_TOK_OBJECT.
  If a searched key contains ., [ or ] characters, they should be escaped
  by a backslash."

So you get the type in return. I think you can then call one of the
related functions depending on what is found, which is more reliable
than iterating over multiple attempts.


Oh yes, this sounds like a better approach.
I have now used this suggestion and I hope you can help me to fix the double 
parsing
issue or is it acceptable to parse the input twice?


 From what I've seen in the code in the lib you have no other option.
I thought it might be possible to call mjson_get_string() on the
resulting pointer but you would need to find a way to express that
you want to extract the immediate content, maybe by having an empty
key designation or something like this. This point is not clear to
me and the unit tests in the project all re-parse the input string
after mjson_find(), so probably this is the way to do it.


The check functions handles the int arg now as suggested.

```
/* This function checks the "json_query" converter's arguments. */
static int sample_check_json_query(struct arg *arg, struct sample_conv *conv,
const char *file, int line, char **err)
{
if (arg[0].data.str.data == 0) { /* empty */
memprintf(err, "json_path must not be empty");
return 0;
}

if (arg[1].data.str.data != 0) {
  

Re: [PATCH] MINOR: sample: add json_string

2021-04-14 Thread Willy Tarreau
On Wed, Apr 14, 2021 at 09:05:53AM +0200, Tim Düsterhus wrote:
> Willy,
> 
> On 4/14/21 4:36 AM, Willy Tarreau wrote:
> > > How about "json_query" because it's exactly what it does :-)
> > 
> > I'm not familiar with the notion of "query" to decode and extract contents
> > but I'm not the most representative user and am aware of the "jq" command-
> > line utility that does this. So if it sounds natural to others I'm fine
> > with this.
> 
> +1 for json_query. It was among my suggestions for an improved name.
> 
> We are not decoding the full JSON into a data structure, we are querying the
> value of a single entry (as in jq or SQL). json_decode or something like
> that would be misleading.

OK, fine by me then.

Thanks,
Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-14 Thread Tim Düsterhus

Willy,

On 4/14/21 4:36 AM, Willy Tarreau wrote:

How about "json_query" because it's exactly what it does :-)


I'm not familiar with the notion of "query" to decode and extract contents
but I'm not the most representative user and am aware of the "jq" command-
line utility that does this. So if it sounds natural to others I'm fine
with this.


+1 for json_query. It was among my suggestions for an improved name.

We are not decoding the full JSON into a data structure, we are querying 
the value of a single entry (as in jq or SQL). json_decode or something 
like that would be misleading.


Best regards
Tim Düsterhus



Re: [PATCH] MINOR: sample: add json_string

2021-04-13 Thread Willy Tarreau
On Wed, Apr 14, 2021 at 03:02:20AM +0200, Aleksandar Lazic wrote:
> > But then, could it make sense to also support "strict integers": values
> > that can accurately be represented as integers and which are within the
> > JSON valid range for integers (-2^52 to 2^52 with no decimal part) ?
> > This would then make the converter return nothing in case of violation
> > (i.e. risk of losing precision). This would also reject NaN and infinite
> > that the lib supports.
> 
> You mean the same check which is done in arith_add().

Not exactly because arith_add only checks for overflows after addition
and tries to cap the result, but I'd rather just say that if the decoded
number is <= -2^53 or >= 2^53 then the converter should return a no match
in case an integer was requested.

> > H that's not your fault but now I'm seeing that we already have a
> > converter inappropriately called "json", so we don't even know in which
> > direction it works by just looking at its name :-(  Same issue as for
> > base64.
> > 
> > May I suggest that you call yours "json_decode" or maybe shorter
> > "json_dec" so that it's more explicit that it's the decode one ? Because
> > for me "json_string" is the one that will emit a json string from some
> > input (which it is not). Then we could later create "json_enc" and warn
> > when "json" alone is used. Or even "jsdec" and "jsenc" which are much
> > shorter and still quite explicit.
> 
> How about "json_query" because it's exactly what it does :-)

I'm not familiar with the notion of "query" to decode and extract contents
but I'm not the most representative user and am aware of the "jq" command-
line utility that does this. So if it sounds natural to others I'm fine
with this.

> > I'm seeing that there's a very nice mjson_find() which does *exactly* what
> > you need:
> > 
> > "In a JSON string s, len, find an element by its JSONPATH path. Save
> >  found element in tokptr, toklen. If not found, return JSON_TOK_INVALID.
> >  If found, return one of: MJSON_TOK_STRING, MJSON_TOK_NUMBER,
> >  MJSON_TOK_TRUE, MJSON_TOK_FALSE, MJSON_TOK_NULL, MJSON_TOK_ARRAY,
> >  MJSON_TOK_OBJECT.
> >  If a searched key contains ., [ or ] characters, they should be escaped
> >  by a backslash."
> > 
> > So you get the type in return. I think you can then call one of the
> > related functions depending on what is found, which is more reliable
> > than iterating over multiple attempts.
> 
> Oh yes, this sounds like a better approach.
> I have now used this suggestion and I hope you can help me to fix the double 
> parsing
> issue or is it acceptable to parse the input twice?

>From what I've seen in the code in the lib you have no other option.
I thought it might be possible to call mjson_get_string() on the
resulting pointer but you would need to find a way to express that
you want to extract the immediate content, maybe by having an empty
key designation or something like this. This point is not clear to
me and the unit tests in the project all re-parse the input string
after mjson_find(), so probably this is the way to do it.

> The check functions handles the int arg now as suggested.
> 
> ```
> /* This function checks the "json_query" converter's arguments. */
> static int sample_check_json_query(struct arg *arg, struct sample_conv *conv,
>const char *file, int line, char **err)
> {
>   if (arg[0].data.str.data == 0) { /* empty */
>   memprintf(err, "json_path must not be empty");
>   return 0;
>   }
> 
>   if (arg[1].data.str.data != 0) {
>   if (strncmp(arg[1].data.str.area, "int", sizeof("int")) 
> != 0) {
>   memprintf(err, "output_type only supports 
> \"int\" as argument");
>   return 0;
>   } else {
>   arg[1].type = ARGT_SINT;
>   arg[1].data.sint = 0;

OK!

> I use now the token enum but I have some troubles to handle the c types 
> properly.

OK that's better, but be careful about this:

>   trash->data = mjson_get_string(smp->data.u.str.area, 
> smp->data.u.str.data, args[0].data.str.area, trash->area, trash->size);
>   if (trash->data == -1 ) {

You're sending the result into trash->data whose type you don't know,
and expect it to match against -1. I'd rather store the result into
a temporary variable of the same type as the function's return (int),
test it, and only if equal, I'd assign this value to the return sample's
length.

Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-13 Thread Aleksandar Lazic

On 13.04.21 11:26, Willy Tarreau wrote:

Hi Aleks,

On Mon, Apr 12, 2021 at 10:09:08PM +0200, Aleksandar Lazic wrote:

Hi.

another patch which honer the feedback.


Thank you. FWIW I agree with all the points reported by Tim. I'll add
a few comments and/or suggestions below. On a general note, please be
careful about your indenting, as it very can quickly become a total
mess. Similarly please pay attention not to leave trailing spaces
that may Git complain:

   Applying: MINOR: sample: converter: add JSON Path handling
   .git/rebase-apply/patch:39: trailing whitespace.
  - number  : When the JSON value is a number then will the value be
   .git/rebase-apply/patch:40: trailing whitespace.
  converted to a string. If you know that the value is a
   .git/rebase-apply/patch:41: trailing whitespace.
  integer then can you help haproxy to convert the value
   .git/rebase-apply/patch:46: trailing whitespace.
 This converter extracts the value located at  from the JSON
   .git/rebase-apply/patch:47: trailing whitespace.
 string in the input value.
   warning: squelched 10 whitespace errors
   warning: 15 lines add whitespace errors.

All these lines are easily noticed this way:

 $ git show | grep -c '^+.*\s$'
 15

A good way to avoid this once for all it to enable colors in Git and to
always make sure not to leave red areas in "git diff" or "git show" :

 $ git config --global color.ui true


Cool tip, I have set it now.


And even if it's of low importance for the code itself, it's particularly
important in a review because such cosmetic issues constantly remind the
reader that the patch is far from being final, so it's possibly not yet
the moment to focus on not critically important stuff. Thus in the end
they increase the number of round trips.


Thanks. I will take care about it.


The doc will be enhanced but I have a question about that sequence.
This should write the double value to the string but I think I have here some
issue.

```
printf("\n>>>DOUBLE rc:%d: double:%f:\n",rc, 
double_val);
trash->size = snprintf(trash->area,
trash->data,

"%g",double_val);
smp->data.u.str = *trash;
smp->data.type = SMP_T_STR;
```


Yeah, as Tim mentioned, you mixed size and data. "data" is the amount of
data bytes used in a chunk. "size" is its allocated size.


Fixed now, you can see it in then snipplet below.


>From 8cb1bc4aaedd17c7189d4985a57f662ab1b533a4 Mon Sep 17 00:00:00 2001
From: Aleksandar Lazic 
Date: Mon, 12 Apr 2021 22:01:04 +0200
Subject: [PATCH] MINOR: sample: converter: add JSON Path handling

With json_path can a JSON value be extacted from a Header
or body


In the final version, please add a few more lines to describe the name
of the added converter and what it's used for. As a reminder, think that
you're trying to sell your artwork to me or anyone else who would make
you proud bybackporting your work into their version :-)


Will do it :-)


+json_query(,[])
+  The  is mandatory.
+  By default will the follwing JSON types recognized.
+   - string  : This is the default search type and returns a string;
+   - number  : When the JSON value is a number then will the value be
+   converted to a string. If you know that the value is a
+   integer then can you help haproxy to convert the value
+   to a integer when you add "sint" to the ;


Just thinking loud, I looked at the rest of the doc and noticed we never
mention "sint" anywhere else, so I think it's entirely an internal type.
However we do mention "int" which is used as the matching method for
integers, so we could have:

  ... json_query("blah",sint) -m int 12

As such I would find it more natural to call this type "int" so that it
matches the same as the one used in the match. Maps already use "int" as
the output type name by the way.

In any case, I too am a bit confused by the need to force an output type.
As a user, I'd expect the type to be implicit and not to have to know
about it in the configuration. Of course we can imagine situations where
we'd want to force the type (like we sometimes do by adding 0 or
concatenating an empty string for example) but this is still not very
clear to me if we want it by default. Or maybe when dealing with floats
where we'd have to decide whether to emit them verbatim as strings or
to convert them to integers.

But then, could it make sense to also support "strict integers": values
that can accurately be represented as integers and which are within the
JSON valid range for integers (-2^52 to 2^52 with no decimal part) ?
This would then make the converter return nothing in case of violation
(i.e. risk of losing precision). This would also reject NaN and infinite
that the lib supports.


You 

Re: [PATCH] MINOR: sample: add json_string

2021-04-13 Thread Willy Tarreau
Hi Aleks,

On Mon, Apr 12, 2021 at 10:09:08PM +0200, Aleksandar Lazic wrote:
> Hi.
> 
> another patch which honer the feedback.

Thank you. FWIW I agree with all the points reported by Tim. I'll add
a few comments and/or suggestions below. On a general note, please be
careful about your indenting, as it very can quickly become a total
mess. Similarly please pay attention not to leave trailing spaces
that may Git complain:

  Applying: MINOR: sample: converter: add JSON Path handling
  .git/rebase-apply/patch:39: trailing whitespace.
 - number  : When the JSON value is a number then will the value be 
  .git/rebase-apply/patch:40: trailing whitespace.
 converted to a string. If you know that the value is a 
  .git/rebase-apply/patch:41: trailing whitespace.
 integer then can you help haproxy to convert the value 
  .git/rebase-apply/patch:46: trailing whitespace.
This converter extracts the value located at  from the JSON 
  .git/rebase-apply/patch:47: trailing whitespace.
string in the input value. 
  warning: squelched 10 whitespace errors
  warning: 15 lines add whitespace errors.

All these lines are easily noticed this way:

$ git show | grep -c '^+.*\s$'
15

A good way to avoid this once for all it to enable colors in Git and to
always make sure not to leave red areas in "git diff" or "git show" :

$ git config --global color.ui true

And even if it's of low importance for the code itself, it's particularly
important in a review because such cosmetic issues constantly remind the
reader that the patch is far from being final, so it's possibly not yet
the moment to focus on not critically important stuff. Thus in the end
they increase the number of round trips.

> The doc will be enhanced but I have a question about that sequence.
> This should write the double value to the string but I think I have here some
> issue.
> 
> ```
>   printf("\n>>>DOUBLE rc:%d: double:%f:\n",rc, 
> double_val);
>   trash->size = snprintf(trash->area,
>   trash->data,
>   
> "%g",double_val);
>   smp->data.u.str = *trash;
>   smp->data.type = SMP_T_STR;
> ```

Yeah, as Tim mentioned, you mixed size and data. "data" is the amount of
data bytes used in a chunk. "size" is its allocated size.


> >From 8cb1bc4aaedd17c7189d4985a57f662ab1b533a4 Mon Sep 17 00:00:00 2001
> From: Aleksandar Lazic 
> Date: Mon, 12 Apr 2021 22:01:04 +0200
> Subject: [PATCH] MINOR: sample: converter: add JSON Path handling
> 
> With json_path can a JSON value be extacted from a Header
> or body

In the final version, please add a few more lines to describe the name
of the added converter and what it's used for. As a reminder, think that
you're trying to sell your artwork to me or anyone else who would make
you proud bybackporting your work into their version :-)

> +json_query(,[])
> +  The  is mandatory.
> +  By default will the follwing JSON types recognized.
> +   - string  : This is the default search type and returns a string;
> +   - number  : When the JSON value is a number then will the value be 
> +   converted to a string. If you know that the value is a 
> +   integer then can you help haproxy to convert the value 
> +   to a integer when you add "sint" to the ;

Just thinking loud, I looked at the rest of the doc and noticed we never
mention "sint" anywhere else, so I think it's entirely an internal type.
However we do mention "int" which is used as the matching method for
integers, so we could have:

 ... json_query("blah",sint) -m int 12

As such I would find it more natural to call this type "int" so that it
matches the same as the one used in the match. Maps already use "int" as
the output type name by the way.

In any case, I too am a bit confused by the need to force an output type.
As a user, I'd expect the type to be implicit and not to have to know
about it in the configuration. Of course we can imagine situations where
we'd want to force the type (like we sometimes do by adding 0 or
concatenating an empty string for example) but this is still not very
clear to me if we want it by default. Or maybe when dealing with floats
where we'd have to decide whether to emit them verbatim as strings or
to convert them to integers.

But then, could it make sense to also support "strict integers": values
that can accurately be represented as integers and which are within the
JSON valid range for integers (-2^52 to 2^52 with no decimal part) ?
This would then make the converter return nothing in case of violation
(i.e. risk of losing precision). This would also reject NaN and infinite
that the lib supports.

A small detail, in general, prefer a passive form in the text rather than
speaking to the reader using "you". You'll notice that using this more

Re: [PATCH] MINOR: sample: add json_string

2021-04-12 Thread Willy Tarreau
On Mon, Apr 12, 2021 at 10:36:30PM +0200, Tim Düsterhus wrote:
> Aleks,
> 
> [Willy: I believe the patch is in a state that warrants you taking a look at
> it!]

Yeah, I'll have a look today, thanks for your exchanges on this.

Willy



Re: [PATCH] MINOR: sample: add json_string

2021-04-12 Thread Moemen MHEDHBI



On 08/04/2021 21:55, Aleksandar Lazic wrote:
> Hi.
> 
> Attached the patch to add the json_string sample.
> 
> In combination with the JWT patch is a pre-validation of a bearer token
> part possible.
> 
> I have something like this in mind.
> 
> http-request set-var(sess.json)
> req.hdr(Authorization),word(2,.),ub64dec,json_string('$.iss')
> http-request deny unless { var(sess.json) -m str
> 'kubernetes/serviceaccount' }
> 
> Regards
> Aleks

Hi,
I have also thought about something similar.
However I am not sure using a third party library is encouraged because
it may make the code less portable. Also using a third party library by
directly importing its code may be hard to maintain later.
In the end I am wondering if it is not easier to handle json parsing via
a LUA module for example.

-- 
Moemen



Re: [PATCH] MINOR: sample: add json_string

2021-04-12 Thread Tim Düsterhus

Aleks,

[Willy: I believe the patch is in a state that warrants you taking a 
look at it!]


thank you. This looks *much* better now. I primarily have some style 
complaints. I'll probably also complain about the documentation a bit 
more, but you already said that you are still working on it.


On 4/12/21 10:09 PM, Aleksandar Lazic wrote:
This should write the double value to the string but I think I have here 
some

issue.


I've responded inline to that.

Patch feedback:


From 8cb1bc4aaedd17c7189d4985a57f662ab1b533a4 Mon Sep 17 00:00:00 2001
From: Aleksandar Lazic 
Date: Mon, 12 Apr 2021 22:01:04 +0200
Subject: [PATCH] MINOR: sample: converter: add JSON Path handling

With json_path can a JSON value be extacted from a Header
or body
---
 Makefile   |3 +-
 doc/configuration.txt  |   29 +
 include/import/mjson.h |  209 ++
 reg-tests/converter/json_query.vtc |   94 +++
 src/mjson.c| 1048 
 src/sample.c   |   94 +++
 6 files changed, 1476 insertions(+), 1 deletion(-)
 create mode 100644 include/import/mjson.h
 create mode 100644 reg-tests/converter/json_query.vtc
 create mode 100644 src/mjson.c

diff --git a/Makefile b/Makefile
index 9b22fe4be..56d0aa28d 100644
--- a/Makefile
+++ b/Makefile
@@ -883,7 +883,8 @@ OBJS += src/mux_h2.o src/mux_fcgi.o src/http_ana.o 
src/stream.o\
 src/ebistree.o src/auth.o src/wdt.o src/http_acl.o 
\
 src/hpack-enc.o src/hpack-huff.o src/ebtree.o src/base64.o 
\
 src/hash.o src/dgram.o src/version.o src/fix.o src/mqtt.o src/dns.o
\
-src/server_state.o src/proto_uxdg.o src/init.o src/cfgdiag.o
+src/server_state.o src/proto_uxdg.o src/init.o src/cfgdiag.o   
\
+   src/mjson.o


Incorrect indentation here.


 ifneq ($(TRACE),)
 OBJS += src/calltrace.o
diff --git a/doc/configuration.txt b/doc/configuration.txt
index f21a29a68..4393d5c1f 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15958,6 +15958,35 @@ json([])
   Output log:
  {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}
 
+json_query(,[])

+  The  is mandatory.


This is redundant. It is implied by the method signature (no square 
brackets around json_path).



+  By default will the follwing JSON types recognized.
+   - string  : This is the default search type and returns a string;
+   - number  : When the JSON value is a number then will the value be 
+   converted to a string. If you know that the value is a 
+   integer then can you help haproxy to convert the value 
+   to a integer when you add "sint" to the ;

+   - boolean : If the JSON value is not a String
+
+  This converter uses the mjson library https://github.com/cesanta/mjson
+  This converter extracts the value located at  from the JSON 
+  string in the input value. 
+   must be a valid JsonPath string as defined at 
+  https://goessner.net/articles/JsonPath/

+
+  A floating point value will always be returned as string!
+  
+  Example:

+ # get the value of the key kubernetes.io/serviceaccount/namespace
+ # => openshift-logging
+ http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
+ 
+ # get the value of the key 'iss' from a JWT

+ # => kubernetes/serviceaccount
+ http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.iss')
+
+
+
 language([,])
   Returns the value with the highest q-factor from a list as extracted from the
   "accept-language" header using "req.fhdr". Values with no q-factor have a
diff --git a/reg-tests/converter/json_query.vtc 
b/reg-tests/converter/json_query.vtc
new file mode 100644
index 0..b7c0e2c3a
--- /dev/null
+++ b/reg-tests/converter/json_query.vtc
@@ -0,0 +1,94 @@
+varnishtest "JSON Query converters Test"
+#REQUIRE_VERSION=2.4
+
+feature ignore_unknown_macro
+
+server s1 {
+   rxreq
+   txresp
+
+rxreq
+expect req.url == "/"
+txresp  -body "{\"iss\":\"kubernetes/serviceaccount\"}"
+
+rxreq
+expect req.url == "/"
+txresp  -body "{\"integer\":4}"
+
+rxreq
+   txresp -body "{\"boolean-true\":true}"
+
+rxreq
+   txresp -body "{\"boolean-false\":false}"
+
+rxreq
+   txresp -body "{\"my.key\":\"myvalue\"}"


Incorrect indentation above. You are mixing tabs and spaces.


+} -start
+
+haproxy h1 -conf {
+defaults
+   mode http
+   timeout connect 1s
+   timeout client  1s
+   timeout server  1s
+
+frontend fe
+   bind "fd@${fe}"
+
+http-request set-var(sess.header_json) 
req.hdr(Authorization),json_string('$.iss')
+http-request set-var(sess.pay_json) req.body,json_string('$.iss')
+http-request set-var(sess.pay_int) 
req.body,json_string('$.integer',"sint"),add(1)
+http-request set-var(sess.pay_boolean_true) 

Re: [PATCH] MINOR: sample: add json_string

2021-04-12 Thread Aleksandar Lazic

Hi.

another patch which honer the feedback.

The doc will be enhanced but I have a question about that sequence.
This should write the double value to the string but I think I have here some
issue.

```
printf("\n>>>DOUBLE rc:%d: double:%f:\n",rc, 
double_val);
trash->size = snprintf(trash->area,
trash->data,

"%g",double_val);
smp->data.u.str = *trash;
smp->data.type = SMP_T_STR;
```

I have also add more tests with some specific JSON types.

Regards
Aleks

On 11.04.21 13:04, Tim Düsterhus wrote:

Aleks,

On 4/11/21 12:28 PM, Aleksandar Lazic wrote:

Agree. I have now rethink how to do it and suggest to add a output type.

```
json_query(,)
   The  and  are mandatory.
   This converter uses the mjson library https://github.com/cesanta/mjson
   This converter extracts the value located at  from the JSON
   string in the input value.
    must be a valid JsonPath string as defined at
   https://goessner.net/articles/JsonPath/

   These are the possible output types.
    - "bool"   : A boolean is expected;
    - "sint"   : A signed 64bits integer type is expected;
    - "str"    : A string is expected. This could be a simple string or
 a JSON sub-object;

   A floating point value will always be converted to sint!
```


The converter should be able to detect the type on its own. The types are part 
of the JSON after all! The output_type argument just moves the explicit type 
specification from the converter name into an argument. Not much of an 
improvement.

I don't know how the library works exactly, but after extracting the value 
something like the following should work:

If the first character is '"' -> string
If the first character is 't' -> bool(true)
If the first character is 'f' -> bool(false)
If the first character is 'n' -> null (This should probably result in the 
converter failing).
If the first character is a digit -> number


+    { "json_string", sample_conv_json_string, ARG1(1,STR), 
sample_check_json_string , SMP_T_STR, SMP_USE_CONST },


While testing something I also just notice that SMP_USE_CONST is incorrect 
here. I cannot apply e.g. the sha1 converter to the output of json_string.


Okay. I will change both to SMP_T_ANY because the return values can be bool, 
int or str.


The input type should remain as SMP_T_STR, because you are parsing a JSON 
*string*.


While implmenting the suggested options abouve I stuggle with checking the 
params.
Arg0 is quite clear but how make a efficient check for Arg1, the output type?


The efficiency of the check is less of a concern. That happens only once during 
configuration checking.



```
/* This function checks the "json_query" converter's arguments.
  */
static int sample_check_json_query(struct arg *arg, struct sample_conv *conv,
    const char *file, int line, char **err)
{
 if (arg[0].data.str.data == 0) { /* empty */
 memprintf(err, "json_path must not be empty");
 return 0;
 }

 /* this doen't work */
 int type = smp_to_type[arg[1].data.str.area];


The output_type argument should not exist. I'll answer the question 
nonetheless: You have to compare strings explicitly in C. So you would have use 
strcmp for each of the cases.


 switch (type) {
 case SMP_T_BOOL:
 case SMP_T_SINT:
 /* These type are not const. */
 break;

 case SMP_T_STR:

```

I would to the conversation from double to int like "smp->data.u.sint = (long long 
int ) double_val;"
is this efficient. I haven't done this for a long time so I would like to have a 
"2nd eye pair" on this.



I'd probably return a double as a string instead. At least that doesn't destroy 
information.

Best regards
Tim Düsterhus


>From 8cb1bc4aaedd17c7189d4985a57f662ab1b533a4 Mon Sep 17 00:00:00 2001
From: Aleksandar Lazic 
Date: Mon, 12 Apr 2021 22:01:04 +0200
Subject: [PATCH] MINOR: sample: converter: add JSON Path handling

With json_path can a JSON value be extacted from a Header
or body
---
 Makefile   |3 +-
 doc/configuration.txt  |   29 +
 include/import/mjson.h |  209 ++
 reg-tests/converter/json_query.vtc |   94 +++
 src/mjson.c| 1048 
 src/sample.c   |   94 +++
 6 files changed, 1476 insertions(+), 1 deletion(-)
 create mode 100644 include/import/mjson.h
 create mode 100644 reg-tests/converter/json_query.vtc
 create mode 100644 src/mjson.c

diff --git a/Makefile b/Makefile
index 9b22fe4be..56d0aa28d 100644
--- a/Makefile
+++ b/Makefile
@@ -883,7 +883,8 @@ OBJS += src/mux_h2.o src/mux_fcgi.o src/http_ana.o src/stream.o\
 src/ebistree.o src/auth.o 

Re: [PATCH] MINOR: sample: add json_string

2021-04-11 Thread Tim Düsterhus

Aleks,

On 4/11/21 12:28 PM, Aleksandar Lazic wrote:

Agree. I have now rethink how to do it and suggest to add a output type.

```
json_query(,)
   The  and  are mandatory.
   This converter uses the mjson library https://github.com/cesanta/mjson
   This converter extracts the value located at  from the JSON
   string in the input value.
    must be a valid JsonPath string as defined at
   https://goessner.net/articles/JsonPath/

   These are the possible output types.
    - "bool"   : A boolean is expected;
    - "sint"   : A signed 64bits integer type is expected;
    - "str"    : A string is expected. This could be a simple string or
     a JSON sub-object;

   A floating point value will always be converted to sint!
```


The converter should be able to detect the type on its own. The types 
are part of the JSON after all! The output_type argument just moves the 
explicit type specification from the converter name into an argument. 
Not much of an improvement.


I don't know how the library works exactly, but after extracting the 
value something like the following should work:


If the first character is '"' -> string
If the first character is 't' -> bool(true)
If the first character is 'f' -> bool(false)
If the first character is 'n' -> null (This should probably result in 
the converter failing).

If the first character is a digit -> number

+    { "json_string", sample_conv_json_string, ARG1(1,STR), 
sample_check_json_string , SMP_T_STR, SMP_USE_CONST },


While testing something I also just notice that SMP_USE_CONST is 
incorrect here. I cannot apply e.g. the sha1 converter to the output 
of json_string.


Okay. I will change both to SMP_T_ANY because the return values can be 
bool, int or str.


The input type should remain as SMP_T_STR, because you are parsing a 
JSON *string*.


While implmenting the suggested options abouve I stuggle with checking 
the params.
Arg0 is quite clear but how make a efficient check for Arg1, the output 
type?


The efficiency of the check is less of a concern. That happens only once 
during configuration checking.




```
/* This function checks the "json_query" converter's arguments.
  */
static int sample_check_json_query(struct arg *arg, struct sample_conv 
*conv,

    const char *file, int line, char **err)
{
     if (arg[0].data.str.data == 0) { /* empty */
     memprintf(err, "json_path must not be empty");
     return 0;
     }

     /* this doen't work */
     int type = smp_to_type[arg[1].data.str.area];


The output_type argument should not exist. I'll answer the question 
nonetheless: You have to compare strings explicitly in C. So you would 
have use strcmp for each of the cases.



     switch (type) {
     case SMP_T_BOOL:
     case SMP_T_SINT:
     /* These type are not const. */
     break;

     case SMP_T_STR:

```

I would to the conversation from double to int like "smp->data.u.sint = 
(long long int ) double_val;"
is this efficient. I haven't done this for a long time so I would like 
to have a "2nd eye pair" on this.




I'd probably return a double as a string instead. At least that doesn't 
destroy information.


Best regards
Tim Düsterhus



Re: [PATCH] MINOR: sample: add json_string

2021-04-11 Thread Aleksandar Lazic

On 10.04.21 13:22, Tim Düsterhus wrote:

Aleks,

On 4/10/21 12:24 AM, Aleksandar Lazic wrote:

+json_string() : string

I don't like the name. A few suggestions:

- json_query
- json_get
- json_decode


maybe json_get_string because there could be some more getter like bool, int, 
...


The '_string' suffix does not make sense to me, because why should the user 
need to write about the expected type when using the converter? Samples already 
store their type in HAProxy and they are automatically casted to an appropriate 
type if required (i.e. there is little difference between a numeric string and 
an int).

It should be valid to do something like this.

str('{"s": "foo", "i": 1}'),json_query('$.s'),sha1,hex

and likewise

str('{"s": "foo", "i": 1}'),json_query('$.i'),add(7)


Agree. I have now rethink how to do it and suggest to add a output type.

```
json_query(,)
  The  and  are mandatory.
  This converter uses the mjson library https://github.com/cesanta/mjson
  This converter extracts the value located at  from the JSON
  string in the input value.
   must be a valid JsonPath string as defined at
  https://goessner.net/articles/JsonPath/

  These are the possible output types.
   - "bool"   : A boolean is expected;
   - "sint"   : A signed 64bits integer type is expected;
   - "str": A string is expected. This could be a simple string or
a JSON sub-object;

  A floating point value will always be converted to sint!
```


+  # get the value from the key kubernetes.io/serviceaccount/namespace
+  # => openshift-logging
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
+ +  # get the value from the key iss
+  # => kubernetes/serviceaccount
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.iss')


I don't like that the example is that specific for Kubernetes usage. A more 
general example would be preferred, because it makes it easier to understand 
the concept.


The '$.iss' is the generic JWT field.
https://tools.ietf.org/html/rfc7519#section-4.1
"iss" (Issuer) Claim


But even a JWT is a very narrow use-case ...


Agree. I will ad some generic examples.


But maybe I could look for a "normal" JSON sting and only JWT.



... I suggest to use something generic like my example above (with "foo" as a 
common placeholder value). Examples should explain the concept, not a specific use case. 
Users are smart enough to understand that they can use this to extract values from a JWT 
if this is what they need to do.


diff --git a/reg-tests/sample_fetches/json_string.vtc 
b/reg-tests/sample_fetches/json_string.vtc
new file mode 100644
index 0..fc387519b
--- /dev/null
+++ b/reg-tests/sample_fetches/json_string.vtc


Again, this is a converter. Move the test into the appropriate folder. And 
please make sure you understand the difference between fetches and converters.


Yeah the difference between fetchers and converters in not fully clear for me.
I think when a value is fetched from any data then it's a fetcher like this
JSON "fetcher".


The use of correct terminology is important, because everything else introduces 
confusion. It is extra important if it is used in persistent documentation (vs. 
say a discussion in IRC where it can easily be clarified).

The difference is explained in configuration.txt in the introduction of section 
7 and again at the beginning of section 7.3.1:


Sample fetch methods may be combined with transformations to be appliedon top
of the fetched sample (also called "converters"). These combinations form what
is called "sample expressions" and the result is a "sample".


Fetches *fetch* data from e.g. the connection and then return a *sample*.

Converters *convert* data from an existing *sample* and then return a new 
*sample*.



That nails it down, thanks.


-


+ */
+static int sample_check_json_string(struct arg *arg, struct sample_conv *conv,
+   const char *file, int line, char **err)
+{
+    DPRINTF(stderr, "%s: arg->type=%d, arg->data.str.data=%ld\n",
+    __FUNCTION__,
+    arg->type, arg->data.str.data);


Debug code above.


This was intentionally. I asked my self why no Debug option is set.
This will only be printed with 'DEBUG=-DDEBUG_FULL'.
Maybe there should be a "DBUG_SAMPLES" like the other "DEBUG_*" options.


Imagine how the code and also the debug output would look if every converter would 
output several lines of debug output. Additionally there's not much useful 
information in the output here. arg->type is always going to be ARGT_STR, 
because HAProxy will automatically cast the argument based on the converter 
definition. The length of the string also is pretty much what you expect it to be.

There's also the 'debug' converter that effectively does what that line does.

Don't get me wrong. I also enjoy using 'printf' while debugging my code. But it 

Re: [PATCH] MINOR: sample: add json_string

2021-04-10 Thread Tim Düsterhus

Aleks,

On 4/10/21 12:24 AM, Aleksandar Lazic wrote:

+json_string() : string

I don't like the name. A few suggestions:

- json_query
- json_get
- json_decode


maybe json_get_string because there could be some more getter like bool, 
int, ...


The '_string' suffix does not make sense to me, because why should the 
user need to write about the expected type when using the converter? 
Samples already store their type in HAProxy and they are automatically 
casted to an appropriate type if required (i.e. there is little 
difference between a numeric string and an int).


It should be valid to do something like this.

str('{"s": "foo", "i": 1}'),json_query('$.s'),sha1,hex

and likewise

str('{"s": "foo", "i": 1}'),json_query('$.i'),add(7)


+  # get the value from the key 
kubernetes.io/serviceaccount/namespace

+  # => openshift-logging
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace') 


+ +  # get the value from the key iss
+  # => kubernetes/serviceaccount
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.iss')


I don't like that the example is that specific for Kubernetes usage. A 
more general example would be preferred, because it makes it easier to 
understand the concept.


The '$.iss' is the generic JWT field.
https://tools.ietf.org/html/rfc7519#section-4.1
"iss" (Issuer) Claim


But even a JWT is a very narrow use-case ...


But maybe I could look for a "normal" JSON sting and only JWT.



... I suggest to use something generic like my example above (with "foo" 
as a common placeholder value). Examples should explain the concept, not 
a specific use case. Users are smart enough to understand that they can 
use this to extract values from a JWT if this is what they need to do.


diff --git a/reg-tests/sample_fetches/json_string.vtc 
b/reg-tests/sample_fetches/json_string.vtc

new file mode 100644
index 0..fc387519b
--- /dev/null
+++ b/reg-tests/sample_fetches/json_string.vtc


Again, this is a converter. Move the test into the appropriate folder. 
And please make sure you understand the difference between fetches and 
converters.


Yeah the difference between fetchers and converters in not fully clear 
for me.

I think when a value is fetched from any data then it's a fetcher like this
JSON "fetcher".


The use of correct terminology is important, because everything else 
introduces confusion. It is extra important if it is used in persistent 
documentation (vs. say a discussion in IRC where it can easily be 
clarified).


The difference is explained in configuration.txt in the introduction of 
section 7 and again at the beginning of section 7.3.1:



Sample fetch methods may be combined with transformations to be applied on top
of the fetched sample (also called "converters"). These combinations form what
is called "sample expressions" and the result is a "sample".


Fetches *fetch* data from e.g. the connection and then return a *sample*.

Converters *convert* data from an existing *sample* and then return a 
new *sample*.


-


+ */
+static int sample_check_json_string(struct arg *arg, struct 
sample_conv *conv,

+   const char *file, int line, char **err)
+{
+    DPRINTF(stderr, "%s: arg->type=%d, arg->data.str.data=%ld\n",
+    __FUNCTION__,
+    arg->type, arg->data.str.data);


Debug code above.


This was intentionally. I asked my self why no Debug option is set.
This will only be printed with 'DEBUG=-DDEBUG_FULL'.
Maybe there should be a "DBUG_SAMPLES" like the other "DEBUG_*" options.


Imagine how the code and also the debug output would look if every 
converter would output several lines of debug output. Additionally 
there's not much useful information in the output here. arg->type is 
always going to be ARGT_STR, because HAProxy will automatically cast the 
argument based on the converter definition. The length of the string 
also is pretty much what you expect it to be.


There's also the 'debug' converter that effectively does what that line 
does.


Don't get me wrong. I also enjoy using 'printf' while debugging my code. 
But it is going to be removed after I finish debugging. The parts I 
touch are usually not that complex or deep in the internals that such 
output would be useful enough in case issues arise.



+    { "json_string", sample_conv_json_string, ARG1(1,STR),  
sample_check_json_string , SMP_T_STR, SMP_USE_CONST },


While testing something I also just notice that SMP_USE_CONST is 
incorrect here. I cannot apply e.g. the sha1 converter to the output of 
json_string.


Best regards
Tim Düsterhus



Re: [PATCH] MINOR: sample: add json_string

2021-04-09 Thread Aleksandar Lazic

Tim.

On 09.04.21 18:55, Tim Düsterhus wrote:

Aleks,


> I have taken a first look. Find my remarks below. Please note that for the 
actual
> source code there might be further remarks by Willy (put in CC) or so. I 
might have
> missed something or might have told you something incorrect. So maybe before 
making
> changes wait for their opinion.

Thank you for your feedback.

> Generally I must say that I don't like the mjson library, because it uses 
'int' for
> sizes. It doesn't really bring the point home that it is a safe library. This 
one
> looks much better to me: https://github.com/FreeRTOS/coreJSON. It does not 
support
> JSON path, though. Not sure how much of an issue that would be?

Well I have created a issue in coreJSON how to handle the "." in the key.
https://github.com/FreeRTOS/coreJSON/issues/92

I have choose the mjson library because it was small and offers the JSON path 
feature.


On 4/8/21 10:21 PM, Aleksandar Lazic wrote:

From 7ecb80b1dfe37c013cf79bc5b5b1caa3c0112a6a Mon Sep 17 00:00:00 2001
From: Alekesandar Lazic 
Date: Thu, 8 Apr 2021 21:42:00 +0200
Subject: [PATCH] MINOR: sample: add json_string


I'd add 'converter' to the subject line to make it clear that this is a 
conveter.



This sample get's the value of a JSON key


Typo: It should be 'gets'.


---
 Makefile |    3 +-
 doc/configuration.txt    |   15 +
 include/import/mjson.h   |  213 +
 reg-tests/sample_fetches/json_string.vtc |   25 +
 src/mjson.c  | 1052 ++
 src/sample.c |   63 ++
 6 files changed, 1370 insertions(+), 1 deletion(-)
 create mode 100644 include/import/mjson.h
 create mode 100644 reg-tests/sample_fetches/json_string.vtc
 create mode 100644 src/mjson.c

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 01a01eccc..7f2732668 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -19043,6 +19043,21 @@ http_first_req : boolean


This is the 'fetch' section. Move the documentation to the 'converter' section.

>

   from some requests when a request is not the first one, or to help grouping
   requests in the logs.

+json_string() : string

I don't like the name. A few suggestions:

- json_query
- json_get
- json_decode


maybe json_get_string because there could be some more getter like bool, int, 
...


+  Returns the string value of the given json path.


It should be "JSON" (in uppercase) here and everywhere else.


Okay and agree.


+  The  is required.
+  This sample uses the mjson library https://github.com/cesanta/mjson
+  The json path syntax is defined in this repo 
https://github.com/json-path/JsonPath


Overall the description of the converter does not read nicely / feels 
inconsistent compared to other converters / uses colloquial language.


Let me suggest something like:

Extracts the value located at  from the JSON string in the input 
value.  must be a valid JsonPath string as defined at https://goessner.net/articles/JsonPath/


I changed the link, because that appears to be the canonical reference.


Okay.


+  Example :


No space in front of the colon.


+  # get the value from the key kubernetes.io/serviceaccount/namespace
+  # => openshift-logging
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
+ +  # get the value from the key iss
+  # => kubernetes/serviceaccount
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.iss')


I don't like that the example is that specific for Kubernetes usage. A more 
general example would be preferred, because it makes it easier to understand the concept.


The '$.iss' is the generic JWT field.
https://tools.ietf.org/html/rfc7519#section-4.1
"iss" (Issuer) Claim

But maybe I could look for a "normal" JSON sting and only JWT.



 method : integer + string
   Returns an integer value corresponding to the method in the HTTP request. For
   example, "GET" equals 1 (check sources to establish the matching). Value 9
diff --git a/include/import/mjson.h b/include/import/mjson.h
new file mode 100644
index 0..ff46e7950
--- /dev/null
+++ b/include/import/mjson.h
@@ -0,0 +1,213 @@
[...]
+// Aleksandar Lazic
+// git clone from 2021-08-04 because of this fix
+// 
https://github.com/cesanta/mjson/commit/7d8daa8586d2bfd599775f049f26d2645c25a8ee


Please don't edit third party libraries, even if it is just a comment. This 
makes updating hard.


Okay.


diff --git a/reg-tests/sample_fetches/json_string.vtc 
b/reg-tests/sample_fetches/json_string.vtc
new file mode 100644
index 0..fc387519b
--- /dev/null
+++ b/reg-tests/sample_fetches/json_string.vtc


Again, this is a converter. Move the test into the appropriate folder. And please make 
sure you understand the difference between fetches and converters.


Yeah the difference between fetchers and converters in 

Re: [PATCH] MINOR: sample: add json_string

2021-04-09 Thread Tim Düsterhus

Aleks,

I have taken a first look. Find my remarks below. Please note that for 
the actual source code there might be further remarks by Willy (put in 
CC) or so. I might have missed something or might have told you 
something incorrect. So maybe before making changes wait for their opinion.


Generally I must say that I don't like the mjson library, because it 
uses 'int' for sizes. It doesn't really bring the point home that it is 
a safe library. This one looks much better to me: 
https://github.com/FreeRTOS/coreJSON. It does not support JSON path, 
though. Not sure how much of an issue that would be?


On 4/8/21 10:21 PM, Aleksandar Lazic wrote:

From 7ecb80b1dfe37c013cf79bc5b5b1caa3c0112a6a Mon Sep 17 00:00:00 2001
From: Alekesandar Lazic 
Date: Thu, 8 Apr 2021 21:42:00 +0200
Subject: [PATCH] MINOR: sample: add json_string


I'd add 'converter' to the subject line to make it clear that this is a 
conveter.




This sample get's the value of a JSON key


Typo: It should be 'gets'.


---
 Makefile |3 +-
 doc/configuration.txt|   15 +
 include/import/mjson.h   |  213 +
 reg-tests/sample_fetches/json_string.vtc |   25 +
 src/mjson.c  | 1052 ++
 src/sample.c |   63 ++
 6 files changed, 1370 insertions(+), 1 deletion(-)
 create mode 100644 include/import/mjson.h
 create mode 100644 reg-tests/sample_fetches/json_string.vtc
 create mode 100644 src/mjson.c

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 01a01eccc..7f2732668 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -19043,6 +19043,21 @@ http_first_req : boolean


This is the 'fetch' section. Move the documentation to the 'converter' 
section.



   from some requests when a request is not the first one, or to help grouping
   requests in the logs.
 
+json_string() : string

I don't like the name. A few suggestions:

- json_query
- json_get
- json_decode


+  Returns the string value of the given json path.


It should be "JSON" (in uppercase) here and everywhere else.


+  The  is required.
+  This sample uses the mjson library https://github.com/cesanta/mjson
+  The json path syntax is defined in this repo 
https://github.com/json-path/JsonPath


Overall the description of the converter does not read nicely / feels 
inconsistent compared to other converters / uses colloquial language.


Let me suggest something like:

Extracts the value located at  from the JSON string in the 
input value.  must be a valid JsonPath string as defined at 
https://goessner.net/articles/JsonPath/


I changed the link, because that appears to be the canonical reference.


+  Example :


No space in front of the colon.


+  # get the value from the key kubernetes.io/serviceaccount/namespace
+  # => openshift-logging
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
+  
+  # get the value from the key iss

+  # => kubernetes/serviceaccount
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.iss')


I don't like that the example is that specific for Kubernetes usage. A 
more general example would be preferred, because it makes it easier to 
understand the concept.



 method : integer + string
   Returns an integer value corresponding to the method in the HTTP request. For
   example, "GET" equals 1 (check sources to establish the matching). Value 9
diff --git a/include/import/mjson.h b/include/import/mjson.h
new file mode 100644
index 0..ff46e7950
--- /dev/null
+++ b/include/import/mjson.h
@@ -0,0 +1,213 @@
[...]
+// Aleksandar Lazic
+// git clone from 2021-08-04 because of this fix
+// 
https://github.com/cesanta/mjson/commit/7d8daa8586d2bfd599775f049f26d2645c25a8ee


Please don't edit third party libraries, even if it is just a comment. 
This makes updating hard.



diff --git a/reg-tests/sample_fetches/json_string.vtc 
b/reg-tests/sample_fetches/json_string.vtc
new file mode 100644
index 0..fc387519b
--- /dev/null
+++ b/reg-tests/sample_fetches/json_string.vtc


Again, this is a converter. Move the test into the appropriate folder. 
And please make sure you understand the difference between fetches and 
converters.



@@ -0,0 +1,25 @@
+varnishtest "Test to get value from json sting"
+#REQUIRE_VERSION=2.4
+
+feature ignore_unknown_macro
+
+haproxy h1 -conf {
+defaults
+mode http
+timeout connect 1s
+timeout client  1s
+timeout server  1s
+
+frontend fe1
+bind "fd@${fe1}"
+http-request set-var(sess.iss) 
req.hdr(Authorization),b64dec,json_string('$.iss')
+http-request return status 200 hdr x-var "json-value=%[var(sess.iss)]"
+} -start
+
+client c1 -connect ${h1_fe1_sock} {
+txreq -req GET -url "/" \
+  -hdr "Authorization: 

Re: [PATCH] MINOR: sample: add json_string

2021-04-08 Thread Aleksandar Lazic

Hi.

Sorry I have now seen the copy paste error.
please use this patch

Regards
Alex

On 08.04.21 21:55, Aleksandar Lazic wrote:

Hi.

Attached the patch to add the json_string sample.

In combination with the JWT patch is a pre-validation of a bearer token part 
possible.

I have something like this in mind.

http-request set-var(sess.json) 
req.hdr(Authorization),word(2,.),ub64dec,json_string('$.iss')
http-request deny unless { var(sess.json) -m str 'kubernetes/serviceaccount' }

Regards
Aleks


>From 7ecb80b1dfe37c013cf79bc5b5b1caa3c0112a6a Mon Sep 17 00:00:00 2001
From: Alekesandar Lazic 
Date: Thu, 8 Apr 2021 21:42:00 +0200
Subject: [PATCH] MINOR: sample: add json_string

This sample get's the value of a JSON key
---
 Makefile |3 +-
 doc/configuration.txt|   15 +
 include/import/mjson.h   |  213 +
 reg-tests/sample_fetches/json_string.vtc |   25 +
 src/mjson.c  | 1052 ++
 src/sample.c |   63 ++
 6 files changed, 1370 insertions(+), 1 deletion(-)
 create mode 100644 include/import/mjson.h
 create mode 100644 reg-tests/sample_fetches/json_string.vtc
 create mode 100644 src/mjson.c

diff --git a/Makefile b/Makefile
index 9b22fe4be..559248867 100644
--- a/Makefile
+++ b/Makefile
@@ -883,7 +883,8 @@ OBJS += src/mux_h2.o src/mux_fcgi.o src/http_ana.o src/stream.o\
 src/ebistree.o src/auth.o src/wdt.o src/http_acl.o \
 src/hpack-enc.o src/hpack-huff.o src/ebtree.o src/base64.o \
 src/hash.o src/dgram.o src/version.o src/fix.o src/mqtt.o src/dns.o\
-src/server_state.o src/proto_uxdg.o src/init.o src/cfgdiag.o
+src/server_state.o src/proto_uxdg.o src/init.o src/cfgdiag.o   \
+src/mjson.o
 
 ifneq ($(TRACE),)
 OBJS += src/calltrace.o
diff --git a/doc/configuration.txt b/doc/configuration.txt
index 01a01eccc..7f2732668 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -19043,6 +19043,21 @@ http_first_req : boolean
   from some requests when a request is not the first one, or to help grouping
   requests in the logs.
 
+json_string() : string
+  Returns the string value of the given json path.
+  The  is required.
+  This sample uses the mjson library https://github.com/cesanta/mjson
+  The json path syntax is defined in this repo https://github.com/json-path/JsonPath
+
+  Example :
+  # get the value from the key kubernetes.io/serviceaccount/namespace
+  # => openshift-logging
+  http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
+  
+  # get the value from the key iss
+  # => kubernetes/serviceaccount
+  http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.iss')
+
 method : integer + string
   Returns an integer value corresponding to the method in the HTTP request. For
   example, "GET" equals 1 (check sources to establish the matching). Value 9
diff --git a/include/import/mjson.h b/include/import/mjson.h
new file mode 100644
index 0..ff46e7950
--- /dev/null
+++ b/include/import/mjson.h
@@ -0,0 +1,213 @@
+// Copyright (c) 2018-2020 Cesanta Software Limited
+// All rights reserved
+//
+// Permission is hereby granted, free of charge, to any person obtaining a copy
+// of this software and associated documentation files (the "Software"), to deal
+// in the Software without restriction, including without limitation the rights
+// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+// copies of the Software, and to permit persons to whom the Software is
+// furnished to do so, subject to the following conditions:
+//
+// The above copyright notice and this permission notice shall be included in
+// all copies or substantial portions of the Software.
+//
+// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+// SOFTWARE.
+
+// Aleksandar Lazic
+// git clone from 2021-08-04 because of this fix
+// https://github.com/cesanta/mjson/commit/7d8daa8586d2bfd599775f049f26d2645c25a8ee
+
+#ifndef MJSON_H
+#define MJSON_H
+
+#include 
+#include 
+#include 
+
+#ifndef MJSON_ENABLE_PRINT
+#define MJSON_ENABLE_PRINT 1
+#endif
+
+#ifndef MJSON_ENABLE_RPC
+#define MJSON_ENABLE_RPC 1
+#endif
+
+#ifndef MJSON_ENABLE_BASE64
+#define MJSON_ENABLE_BASE64 1
+#endif
+
+#ifndef MJSON_ENABLE_MERGE
+#define MJSON_ENABLE_MERGE 0
+#elif MJSON_ENABLE_MERGE
+#define MJSON_ENABLE_NEXT 1
+#endif
+
+#ifndef