[PATCH] BUG/MINOR: sample: Fix adjusting size in field converter

2021-04-11 Thread Thayne McCombs
Adjust the size of the sample buffer before we change the "area"
pointer. The change in size is calculated as the difference between the
original pointer and the new start pointer. But since the
`smp->data.u.str.area` assignment results in `smp->data.u.str.area` and
`start` being the same pointer, we always ended up substracting zero.
This changes it to change the size by the actual amount it changed.

I'm not entirely sure what the impact of this is, but the previous code
seemed wrong.
---
 src/sample.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/sample.c b/src/sample.c
index eac489538..05d78bcb9 100644
--- a/src/sample.c
+++ b/src/sample.c
@@ -2561,13 +2561,13 @@ static int sample_conv_field(const struct arg *arg_p, 
struct sample *smp, void *
if (!smp->data.u.str.data)
return 1;
 
-   smp->data.u.str.area = start;
-
/* Compute remaining size if needed
Note: smp->data.u.str.size cannot be set to 0 */
if (smp->data.u.str.size)
smp->data.u.str.size -= start - smp->data.u.str.area;
 
+   smp->data.u.str.area = start;
+
return 1;
 }
 
-- 
2.31.1




Re: [PATCH] MINOR: sample: add json_string

2021-04-11 Thread Tim Düsterhus

Aleks,

On 4/11/21 12:28 PM, Aleksandar Lazic wrote:

Agree. I have now rethink how to do it and suggest to add a output type.

```
json_query(,)
   The  and  are mandatory.
   This converter uses the mjson library https://github.com/cesanta/mjson
   This converter extracts the value located at  from the JSON
   string in the input value.
    must be a valid JsonPath string as defined at
   https://goessner.net/articles/JsonPath/

   These are the possible output types.
    - "bool"   : A boolean is expected;
    - "sint"   : A signed 64bits integer type is expected;
    - "str"    : A string is expected. This could be a simple string or
     a JSON sub-object;

   A floating point value will always be converted to sint!
```


The converter should be able to detect the type on its own. The types 
are part of the JSON after all! The output_type argument just moves the 
explicit type specification from the converter name into an argument. 
Not much of an improvement.


I don't know how the library works exactly, but after extracting the 
value something like the following should work:


If the first character is '"' -> string
If the first character is 't' -> bool(true)
If the first character is 'f' -> bool(false)
If the first character is 'n' -> null (This should probably result in 
the converter failing).

If the first character is a digit -> number

+    { "json_string", sample_conv_json_string, ARG1(1,STR), 
sample_check_json_string , SMP_T_STR, SMP_USE_CONST },


While testing something I also just notice that SMP_USE_CONST is 
incorrect here. I cannot apply e.g. the sha1 converter to the output 
of json_string.


Okay. I will change both to SMP_T_ANY because the return values can be 
bool, int or str.


The input type should remain as SMP_T_STR, because you are parsing a 
JSON *string*.


While implmenting the suggested options abouve I stuggle with checking 
the params.
Arg0 is quite clear but how make a efficient check for Arg1, the output 
type?


The efficiency of the check is less of a concern. That happens only once 
during configuration checking.




```
/* This function checks the "json_query" converter's arguments.
  */
static int sample_check_json_query(struct arg *arg, struct sample_conv 
*conv,

    const char *file, int line, char **err)
{
     if (arg[0].data.str.data == 0) { /* empty */
     memprintf(err, "json_path must not be empty");
     return 0;
     }

     /* this doen't work */
     int type = smp_to_type[arg[1].data.str.area];


The output_type argument should not exist. I'll answer the question 
nonetheless: You have to compare strings explicitly in C. So you would 
have use strcmp for each of the cases.



     switch (type) {
     case SMP_T_BOOL:
     case SMP_T_SINT:
     /* These type are not const. */
     break;

     case SMP_T_STR:

```

I would to the conversation from double to int like "smp->data.u.sint = 
(long long int ) double_val;"
is this efficient. I haven't done this for a long time so I would like 
to have a "2nd eye pair" on this.




I'd probably return a double as a string instead. At least that doesn't 
destroy information.


Best regards
Tim Düsterhus



Re: [PATCH] MINOR: sample: add json_string

2021-04-11 Thread Aleksandar Lazic

On 10.04.21 13:22, Tim Düsterhus wrote:

Aleks,

On 4/10/21 12:24 AM, Aleksandar Lazic wrote:

+json_string() : string

I don't like the name. A few suggestions:

- json_query
- json_get
- json_decode


maybe json_get_string because there could be some more getter like bool, int, 
...


The '_string' suffix does not make sense to me, because why should the user 
need to write about the expected type when using the converter? Samples already 
store their type in HAProxy and they are automatically casted to an appropriate 
type if required (i.e. there is little difference between a numeric string and 
an int).

It should be valid to do something like this.

str('{"s": "foo", "i": 1}'),json_query('$.s'),sha1,hex

and likewise

str('{"s": "foo", "i": 1}'),json_query('$.i'),add(7)


Agree. I have now rethink how to do it and suggest to add a output type.

```
json_query(,)
  The  and  are mandatory.
  This converter uses the mjson library https://github.com/cesanta/mjson
  This converter extracts the value located at  from the JSON
  string in the input value.
   must be a valid JsonPath string as defined at
  https://goessner.net/articles/JsonPath/

  These are the possible output types.
   - "bool"   : A boolean is expected;
   - "sint"   : A signed 64bits integer type is expected;
   - "str": A string is expected. This could be a simple string or
a JSON sub-object;

  A floating point value will always be converted to sint!
```


+  # get the value from the key kubernetes.io/serviceaccount/namespace
+  # => openshift-logging
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
+ +  # get the value from the key iss
+  # => kubernetes/serviceaccount
+  http-request set-var(sess.json) 
req.hdr(Authorization),b64dec,json_string('$.iss')


I don't like that the example is that specific for Kubernetes usage. A more 
general example would be preferred, because it makes it easier to understand 
the concept.


The '$.iss' is the generic JWT field.
https://tools.ietf.org/html/rfc7519#section-4.1
"iss" (Issuer) Claim


But even a JWT is a very narrow use-case ...


Agree. I will ad some generic examples.


But maybe I could look for a "normal" JSON sting and only JWT.



... I suggest to use something generic like my example above (with "foo" as a 
common placeholder value). Examples should explain the concept, not a specific use case. 
Users are smart enough to understand that they can use this to extract values from a JWT 
if this is what they need to do.


diff --git a/reg-tests/sample_fetches/json_string.vtc 
b/reg-tests/sample_fetches/json_string.vtc
new file mode 100644
index 0..fc387519b
--- /dev/null
+++ b/reg-tests/sample_fetches/json_string.vtc


Again, this is a converter. Move the test into the appropriate folder. And 
please make sure you understand the difference between fetches and converters.


Yeah the difference between fetchers and converters in not fully clear for me.
I think when a value is fetched from any data then it's a fetcher like this
JSON "fetcher".


The use of correct terminology is important, because everything else introduces 
confusion. It is extra important if it is used in persistent documentation (vs. 
say a discussion in IRC where it can easily be clarified).

The difference is explained in configuration.txt in the introduction of section 
7 and again at the beginning of section 7.3.1:


Sample fetch methods may be combined with transformations to be appliedon top
of the fetched sample (also called "converters"). These combinations form what
is called "sample expressions" and the result is a "sample".


Fetches *fetch* data from e.g. the connection and then return a *sample*.

Converters *convert* data from an existing *sample* and then return a new 
*sample*.



That nails it down, thanks.


-


+ */
+static int sample_check_json_string(struct arg *arg, struct sample_conv *conv,
+   const char *file, int line, char **err)
+{
+    DPRINTF(stderr, "%s: arg->type=%d, arg->data.str.data=%ld\n",
+    __FUNCTION__,
+    arg->type, arg->data.str.data);


Debug code above.


This was intentionally. I asked my self why no Debug option is set.
This will only be printed with 'DEBUG=-DDEBUG_FULL'.
Maybe there should be a "DBUG_SAMPLES" like the other "DEBUG_*" options.


Imagine how the code and also the debug output would look if every converter would 
output several lines of debug output. Additionally there's not much useful 
information in the output here. arg->type is always going to be ARGT_STR, 
because HAProxy will automatically cast the argument based on the converter 
definition. The length of the string also is pretty much what you expect it to be.

There's also the 'debug' converter that effectively does what that line does.

Don't get me wrong. I also enjoy using 'printf' while debugging my code. But it