Re: [PATCH] MEDIUM: sample: Enhances converter "bytes" to take variable names as arguments

2023-09-22 Thread Willy Tarreau
Hi Lokesh,

On Sun, Sep 17, 2023 at 09:45:31PM -0700, Lokesh Jindal wrote:
> Hello Willy
> 
> Thanks for the detailed explanations and the feedback.
> PFA the two patches based on your feedback. My apologies for the delay.

Thanks, I've applied them now.

> >   - maybe a combination of both, i.e. return 0 as long as the caller does
> > not pass SMP_OPT_FINAL, or return a truncated response.
> 
> 
> Sounds good. The revised patch uses option 3 above.

OK I also think it's the most versatile choice.

> > >   /* Try to decode a variable. */
> > > - if (vars_check_arg([0], NULL))
> > > + if (vars_check_arg([0], err))
> > >   return 1;
> 
> 
> > I can't say if this change was intentional or not, nor if the NULL had a
> > particular reason or not, but at least this is unrelated to your patch
> > and would need to be in its own patch, so that in case of regression we
> > can decide whether to fix or revert it. And if it really fixes an issue
> > maybe we'll want to backport it.
> 
> This was intentional. I thought it was a "miss" in the existing code. This
> change is now part of the second patch.

Perfect, thanks.

> I have also fixed indentation/tabs, column width, etc. based on your
> feedback.
> 
> 
> I have a Q for you. Pasting code from the revised patch.
> + // arg1 value is greater than the remaining length
> + if (smp->opt & SMP_OPT_FINAL) {
> +   // truncate to remaining length
> +   length = smp->data.u.str.data - start_idx;
> + } else {
> +   smp->data.u.str.data = 0;
> +   return 0;
> + }
> 
> I tried to write a test in the vtc file for the "else" clause above, but
> could not figure out when we will hit that condition. Here is what I tried.
> Added following the config section.
> +   http-response set-header hdr_bytes_0_7
>  "%[res.hdr(input),bytes(0,7)]"
> +   http-response set-header hdr_bytes_0_7_concat
> "%[res.hdr(input),bytes(0,7),concat()]"
> 
> Added following in the client section
> +   # since "bytes" is the non-final operator, response would be empty
> +   expect resp.http.hdr_bytes_0_7_concat == ""
> 
> But the test failed:
>  c2EXPECT resp.http.hdr_bytes_0_7_concat (012345) == "" failed
> 
> Anyway, I can add this such a test in another patch if needed and time
> permits. Let me know what you think.

You can't do these using HTTP rules because they're only evaluated once
the message is complete. The only case these are used is in TCP rules,
because TCP rules can also process HTTP by decoding it on the fly. For
example you can have a TCP frontend that inspects protocol, and if it
detects HTTP with a certain header or content, routes it to an HTTP
backend, otherwise routes it to a TCP frontend. And more generally at
the TCP level it's more common to process bytes than at the HTTP level.
We could for instance imagine that you extract bytes from an SSL Client
Hello message to figure an offset and a length, then assign these to a
pair of variables, and reuse them with bytes() with your modification.

But raw TCP is less easy to manipulate in VTC, and stuff like this is
prone to time races, which regularly cause false positives, and the best
is the ennemy of good, so no worries :-)

Thanks!
Willy



Re: [PATCH] MEDIUM: sample: Enhances converter "bytes" to take variable names as arguments

2023-09-17 Thread Lokesh Jindal
Hello Willy

Thanks for the detailed explanations and the feedback.
PFA the two patches based on your feedback. My apologies for the delay.

Truncation can be useful but I could understand that when extracting bytes
> one would want this exact number of bytes to continue the processing. I
> just wanted to mention this to make sure this is what you had in mind as
> well

Yes, this is what I was thinking.

Here I'm wondering what we should do if the user asks for more bytes than
> available. I think we have 3 options:
>   - return 0 so that the sample's flags are consulted by the caller (e.g.
> SMP_F_MAY_CHANGE will allow the caller to come back later trying to
> get more bytes, or fail if not enough)


>   - truncate to what is available


>   - maybe a combination of both, i.e. return 0 as long as the caller does
> not pass SMP_OPT_FINAL, or return a truncated response.


Sounds good. The revised patch uses option 3 above.


> >   /* Try to decode a variable. */
> > - if (vars_check_arg([0], NULL))
> > + if (vars_check_arg([0], err))
> >   return 1;


> I can't say if this change was intentional or not, nor if the NULL had a
> particular reason or not, but at least this is unrelated to your patch
> and would need to be in its own patch, so that in case of regression we
> can decide whether to fix or revert it. And if it really fixes an issue
> maybe we'll want to backport it.

This was intentional. I thought it was a "miss" in the existing code. This
change is now part of the second patch.

I have also fixed indentation/tabs, column width, etc. based on your
feedback.


I have a Q for you. Pasting code from the revised patch.
+ // arg1 value is greater than the remaining length
+ if (smp->opt & SMP_OPT_FINAL) {
+   // truncate to remaining length
+   length = smp->data.u.str.data - start_idx;
+ } else {
+   smp->data.u.str.data = 0;
+   return 0;
+ }

I tried to write a test in the vtc file for the "else" clause above, but
could not figure out when we will hit that condition. Here is what I tried.
Added following the config section.
+   http-response set-header hdr_bytes_0_7
 "%[res.hdr(input),bytes(0,7)]"
+   http-response set-header hdr_bytes_0_7_concat
"%[res.hdr(input),bytes(0,7),concat()]"

Added following in the client section
+   # since "bytes" is the non-final operator, response would be empty
+   expect resp.http.hdr_bytes_0_7_concat == ""

But the test failed:
 c2EXPECT resp.http.hdr_bytes_0_7_concat (012345) == "" failed

Anyway, I can add this such a test in another patch if needed and time
permits. Let me know what you think.

Thanks
Lokesh

On Fri, Sep 8, 2023 at 10:05 AM Willy Tarreau  wrote:

> On Thu, Sep 07, 2023 at 10:45:58AM -0700, Lokesh Jindal wrote:
> > Resending the patch with the right email subject. Previos email
> > discussion - here
> > .
>
> I saw your previous mail, just didn't have time yet to review it.
>
> > PFA the patch file.
>
> Thanks!
>
> > I have two questions in general:
> > 1. Can you explain what this method does and why is it needed? (I used it
> > in my patch following the pattern in the converter "sub")
> > *smp_set_owner(_arg0, smp->px, smp->sess, smp->strm, smp->opt);**
>
> It just makes sure that the newly created sample properly references
> the context from which it was created. I remember that *some* functions
> retrieve some elements there, maybe ACLs or Lua actions that need to
> yield, etc.
>
> > 2. In my patch, *sample_conv_bytes* returns 0 in case of an invalid arg
> > e.g. negative offset. Can you explain when a converter should return 0 V
> 1?
> > I am not sure how that impacts the haproxy behavior.
>
> Normally when evaluating an expression, 0 is returned to indicate
> "nothing for now" but this doesn't preclude that it will not be possible
> to have more info later (depending on extra flags added such as
> SMP_F_MAY_CHANGE). 1 normally indicates that we have a result and that
> the sample is correctly filled. If a zero is returned without any flag,
> it will usually abort the evaluation of the expression (e.g. it cannot
> be converted thus processed). ACL evaluation will not go further with
> that sample for example. If 1 is returned but the sample type is not
> filled, implicitly the caller will see that SMP_T_ANY is returned and
> not compatible with the next expected type so it will stop there as well.
> Thus I think that returning 0 or 1 in case of failure to process args will
> achieve the same result but for slightly different reasons. 0 is probably
> cleaner though.
>
> > + if (smp_arg1.data.u.sint < 0 || (smp_arg1.data.u.sint >
> (smp->data.u.str.data - start_idx))) {
> > + // empty output if the arg2 value is negative or
> bigger than the remaining length
> > + smp->data.u.str.data = 0;
> > + 

Re: [PATCH] MEDIUM: sample: Enhances converter "bytes" to take variable names as arguments

2023-09-08 Thread Willy Tarreau
On Thu, Sep 07, 2023 at 10:45:58AM -0700, Lokesh Jindal wrote:
> Resending the patch with the right email subject. Previos email
> discussion - here
> .

I saw your previous mail, just didn't have time yet to review it.

> PFA the patch file.

Thanks!

> I have two questions in general:
> 1. Can you explain what this method does and why is it needed? (I used it
> in my patch following the pattern in the converter "sub")
> *smp_set_owner(_arg0, smp->px, smp->sess, smp->strm, smp->opt);**

It just makes sure that the newly created sample properly references
the context from which it was created. I remember that *some* functions
retrieve some elements there, maybe ACLs or Lua actions that need to
yield, etc.

> 2. In my patch, *sample_conv_bytes* returns 0 in case of an invalid arg
> e.g. negative offset. Can you explain when a converter should return 0 V 1?
> I am not sure how that impacts the haproxy behavior.

Normally when evaluating an expression, 0 is returned to indicate
"nothing for now" but this doesn't preclude that it will not be possible
to have more info later (depending on extra flags added such as
SMP_F_MAY_CHANGE). 1 normally indicates that we have a result and that
the sample is correctly filled. If a zero is returned without any flag,
it will usually abort the evaluation of the expression (e.g. it cannot
be converted thus processed). ACL evaluation will not go further with
that sample for example. If 1 is returned but the sample type is not
filled, implicitly the caller will see that SMP_T_ANY is returned and
not compatible with the next expected type so it will stop there as well.
Thus I think that returning 0 or 1 in case of failure to process args will
achieve the same result but for slightly different reasons. 0 is probably
cleaner though.

> + if (smp_arg1.data.u.sint < 0 || (smp_arg1.data.u.sint > 
> (smp->data.u.str.data - start_idx))) {
> + // empty output if the arg2 value is negative or bigger 
> than the remaining length
> + smp->data.u.str.data = 0;
> + return 0;
> + }

Here I'm wondering what we should do if the user asks for more bytes than
available. I think we have 3 options:
  - return 0 so that the sample's flags are consulted by the caller (e.g.
SMP_F_MAY_CHANGE will allow the caller to come back later trying to
get more bytes, or fail if not enough)

  - truncate to what is available

  - maybe a combination of both, i.e. return 0 as long as the caller does
not pass SMP_OPT_FINAL, or return a truncated response.

Truncation can be useful but I could understand that when extracting bytes
one would want this exact number of bytes to continue the processing. I
just wanted to mention this to make sure this is what you had in mind as
well.

> + length = smp_arg1.data.u.sint;
> + } else {
> + length = smp->data.u.str.data - start_idx;
> + }
>  
> - if ((arg_p[1].type == ARGT_SINT) && (arg_p[1].data.sint < 
> smp->data.u.str.data))
> - smp->data.u.str.data = arg_p[1].data.sint;

It seems to me that previously we would truncate if less bytes were
available, so maybe we'll need to preserve this or do a mix of the
two as I mentioned above.

> + // update the output using the start_idx and length
> + smp->data.u.str.area += start_idx;
> + smp->data.u.str.data = length;
>  
>   return 1;
>  }
> @@ -3120,7 +3144,7 @@ static int check_operator(struct arg *args, struct 
> sample_conv *conv,
>   long long int i;
>  
>   /* Try to decode a variable. */
> - if (vars_check_arg([0], NULL))
> + if (vars_check_arg([0], err))
>   return 1;

I can't say if this change was intentional or not, nor if the NULL had a
particular reason or not, but at least this is unrelated to your patch
and would need to be in its own patch, so that in case of regression we
can decide whether to fix or revert it. And if it really fixes an issue
maybe we'll want to backport it.

>   /* Try to convert an integer */
> @@ -4905,7 +4929,34 @@ static int smp_fetch_conn_timers(const struct arg 
> *args, struct sample *smp, con
>   return 0;
>  }
>  
> +static int sample_conv_bytes_check(struct arg *args, struct sample_conv 
> *conv,
> +  const char *file, int line, char **err)
> +{
> + // arg0 is not optional, must be >= 0
> + if (!check_operator([0], conv, file, line, err)) {
> + return 0;
> + }
> +if (args[0].type != ARGT_VAR) {
> +if (args[0].type != ARGT_SINT || args[0].data.sint < 0) {
> +memprintf(err, "expects a non-negative integer");
> +return 0;
> +}
> +}

You're having indentation issues in the "if" block above, which only
uses 4-spaces instead of a tab.

> + // arg1 is optional, must be > 0
> + if (args[1].type != ARGT_STOP) {
> + if