Hello,

Here is the updated patch, I hope it now covers all things you have
mentioned.

Since one of the purposes of this feature was to mitigate wasting of
processing time by using 2 concat lines I decided to add the checks and
forbid the user to use rules which don't make sense like:
add_item(",",) or add_item(",",,).

Although, I've done it a bit different than suggested:

   - if there's a 3rd arg => OK
>     - else if there's a non-empty 2nd arg that resolves as a variable => OK
>     - otherwise error


because I believe, even if there is a nonempty 3rd argument, we should
check if the 2nd argument(if it's nonempty) resolves as a variable. If not
it should mean that the user made a typo. Let me know if you disagree.

Best Regards,
Nikola

On Fri, 1 Apr 2022 at 17:23, Tim Düsterhus <t...@bastelstu.be> wrote:

> Nikola,
>
> On 4/1/22 16:08, Nikola Šale wrote:
> > - The square brackets in the converter "signature" in configuration.txt
> are
> >> mismatched.
> >> - In the explanation there should not be square brackets around
> [<delim>].
> >
> >
> > Actually, there are no square brackets around <delim>:
> >
> > +add_item(<delim>,[<var>][,<suff>]])
> >
>
> See Willy's response. This was about the one in the description, not in
> the "signature".
>
> > - The documentation states "It(<suff>) may also be omitted but only if
> the
> >> <var> is supplied". You should verify this with a 'check' function
> (you're
> >> currently reusing smp_check_concat)
> >
> >
> > Since the minimum number of arguments is set to 2, maybe there is no need
> > to verify it because if both <suff> and <var> are omitted the parser will
> > complain about missing arguments. So maybe the best solution would be to
> > remove the "but only if the <var> is supplied" part from the
> documentation
> > and still use the smp_check_concat check. Let me know how this sounds to
> > you.
>
> Documentation and code should be consistent. I won't decide which is the
> best solution, I'll leave this up to you.
>
> Generally you should make sure to reject any invalid configuration with
> a 'check' function, so that it fails loudly instead of silently. So if
> the documentation says that at least one must be non-empty, then the
> code should verify this. If the documentation says nothing about this,
> then the code does not need to do anything. Not saying anything and not
> verifying anything probably is the easier solution :-)
>
> >
> > Please place it at the top to fix the alphabetical order, and feel free
> >> to move concat with it as it was obviously misplaced.
> >
> >
> > Sure, but I guess that would require a separate patch(the concat part) to
> > keep this one clean and focused on the new feature?
> >
> > Also, apologies for the cosmetic issues, I will try to take better care.
> >
>
> As Willy already said: Don't worry about your first attempt not being
> perfect. Your patch is definitely already looking great (especially
> since you followed the advice in CONTRIBUTING regarding the commit
> message and not using pull requests):
>
> Best regards
> Tim Düsterhus
>
From ac25f1bf4e7d18db4c63684a1192602d681da83f Mon Sep 17 00:00:00 2001
From: Nikola Sale <nikola.sal...@gmail.com>
Date: Sun, 3 Apr 2022 18:11:53 +0200
Subject: [PATCH] MINOR: sample: converter: Add add_item convertor

This new converter is similar to the concat converter and can be used to
build new variables made of a succession of other variables but the main
difference is that it does the checks if adding a delimiter makes sense as
wouldn't be the case if e.g the current input sample is empty. That
situation would require 2 separate rules using concat converter where the
first rule would have to check if the current sample string is empty before
adding a delimiter. This resolves GitHub Issue #1621.
---
 doc/configuration.txt            |  29 +++++++++
 reg-tests/converter/add_item.vtc |  50 ++++++++++++++++
 src/sample.c                     | 100 ++++++++++++++++++++++++++++++-
 3 files changed, 178 insertions(+), 1 deletion(-)
 create mode 100644 reg-tests/converter/add_item.vtc

diff --git a/doc/configuration.txt b/doc/configuration.txt
index e184f4e76..72c6ddcb9 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -16308,6 +16308,35 @@ add(<value>)
   This prefix is followed by a name. The separator is a '.'. The name may only
   contain characters 'a-z', 'A-Z', '0-9', '.' and '_'.
 
+add_item(<delim>,[<var>][,<suff>]])
+  Concatenates a minimum of 2 and up to 3 fields after the current sample which
+  is then turned into a string. The first one, <delim>, is a constant string,
+  that will be appended immediately after the existing sample if an existing
+  sample is not empty and either the <var> or the <suff> is not empty. The
+  second one, <var>, is a variable name. The variable will be looked up, its
+  contents converted to a string, and it will be appended immediately after
+  the <delim> part. If the variable is not found, nothing is appended. It is
+  optional and may optionally be followed by a constant string <suff>, however
+  if <var> is omitted, then <suff> is mandatory. This converter is similar to
+  the concat converter and can be used to build new variables made of a
+  succession of other variables but the main difference is that it does the
+  checks if adding a delimiter makes sense as wouldn't be the case if e.g. the
+  current sample is empty. That situation would require 2 separate rules using
+  concat converter where the first rule would have to check if the current
+  sample string is empty before adding a delimiter. If commas or closing
+  parenthesis are needed as delimiters, they must be protected by quotes or
+  backslashes, themselves protected so that they are not stripped by the first
+  level parser. See examples below.
+
+  Example:
+    http-request set-var(req.tagged) 'var(req.tagged),add_item(",",req.score1,"(site1)") if src,in_table(site1)'
+    http-request set-var(req.tagged) 'var(req.tagged),add_item(",",req.score2,"(site2)") if src,in_table(site2)'
+    http-request set-var(req.tagged) 'var(req.tagged),add_item(",",req.score3,"(site3)") if src,in_table(site3)'
+    http-request set-header x-tagged %[var(req.tagged)]
+
+    http-request set-var(req.tagged) 'var(req.tagged),add_item(",",req.score1),add_item(",",req.score2)'
+    http-request set-var(req.tagged) 'var(req.tagged),add_item(",",,(site1))' if src,in_table(site1)
+
 aes_gcm_dec(<bits>,<nonce>,<key>,<aead_tag>)
   Decrypts the raw byte input using the AES128-GCM, AES192-GCM or
   AES256-GCM algorithm, depending on the <bits> parameter. All other parameters
diff --git a/reg-tests/converter/add_item.vtc b/reg-tests/converter/add_item.vtc
new file mode 100644
index 000000000..fe5ed26b4
--- /dev/null
+++ b/reg-tests/converter/add_item.vtc
@@ -0,0 +1,50 @@
+varnishtest "be2dec converter Test"
+
+feature cmd "$HAPROXY_PROGRAM -cc 'version_atleast(2.5-dev0)'"
+feature ignore_unknown_macro
+
+server s1 {
+	rxreq
+	txresp
+} -repeat 3 -start
+
+haproxy h1 -conf {
+    defaults
+	mode http
+	timeout connect "${HAPROXY_TEST_TIMEOUT-5s}"
+	timeout client  "${HAPROXY_TEST_TIMEOUT-5s}"
+	timeout server  "${HAPROXY_TEST_TIMEOUT-5s}"
+
+    frontend fe
+	bind "fd@${fe}"
+
+	#### requests
+	http-request  set-var(txn.input) req.hdr(input)
+	http-request  set-var(txn.var) str("var_content")
+
+	http-response set-header add_item-1   "%[var(txn.input),add_item(',',txn.var,_suff_)]"
+	http-response set-header add_item-2   "%[var(txn.input),add_item(',',txn.var)]"
+	http-response set-header add_item-3   "%[var(txn.input),add_item(',',,_suff_)]"
+
+	default_backend be
+
+    backend be
+	server s1 ${s1_addr}:${s1_port}
+} -start
+
+client c1 -connect ${h1_fe_sock} {
+	txreq -url "/" \
+	  -hdr "input:"
+	rxresp
+	expect resp.status == 200
+	expect resp.http.add_item-1 == "var_content_suff_"
+	expect resp.http.add_item-2 == "var_content"
+	expect resp.http.add_item-3 == "_suff_"
+	txreq -url "/" \
+	  -hdr "input: input_string"
+	rxresp
+	expect resp.status == 200
+	expect resp.http.add_item-1 == "input_string,var_content_suff_"
+	expect resp.http.add_item-2 == "input_string,var_content"
+	expect resp.http.add_item-3 == "input_string,_suff_"
+} -run
diff --git a/src/sample.c b/src/sample.c
index 766e87ad9..21bbadd9a 100644
--- a/src/sample.c
+++ b/src/sample.c
@@ -3122,6 +3122,103 @@ static int smp_check_concat(struct arg *args, struct sample_conv *conv,
 	return 1;
 }
 
+/* Append delimiter (only to a non empty input) followed by the optional
+ * variable contents concatenated with the optional sufix.
+ */
+static int sample_conv_add_item(const struct arg *arg_p, struct sample *smp, void *private)
+{
+	struct buffer *tmpbuf;
+	struct sample tmp;
+	size_t max;
+	int var_available;
+
+	tmpbuf = alloc_trash_chunk();
+	if (!tmpbuf)
+		return 0;
+
+	tmpbuf->data = smp->data.u.str.data;
+	if (tmpbuf->data > tmpbuf->size - 1)
+		tmpbuf->data = tmpbuf->size - 1;
+
+	memcpy(tmpbuf->area, smp->data.u.str.area, tmpbuf->data);
+	tmpbuf->area[tmpbuf->data] = 0;
+
+	/* Check if variable is found and we can turn into a string. */
+	var_available = 0;
+	smp_set_owner(&tmp, smp->px, smp->sess, smp->strm, smp->opt);
+	if (arg_p[1].type == ARGT_VAR && vars_get_by_desc(&arg_p[1].data.var, &tmp, NULL) &&
+	    (sample_casts[tmp.data.type][SMP_T_STR] == c_none ||
+	     sample_casts[tmp.data.type][SMP_T_STR](&tmp)))
+		var_available = 1;
+
+	/* Append delimiter only if input is not empty and either
+	 * the variable or the suffix are not empty
+	 */
+	if (smp->data.u.str.data && ((var_available && tmp.data.u.str.data) ||
+	    arg_p[2].data.str.data)) {
+		max = arg_p[0].data.str.data;
+		if (max > tmpbuf->size - 1 - tmpbuf->data)
+			max = tmpbuf->size - 1 - tmpbuf->data;
+
+		if (max) {
+			memcpy(tmpbuf->area + tmpbuf->data, arg_p[0].data.str.area, max);
+			tmpbuf->data += max;
+			tmpbuf->area[tmpbuf->data] = 0;
+		}
+	}
+
+	/* Append variable contents if variable is found and turned into string. */
+	if (var_available) {
+		max = tmp.data.u.str.data;
+		if (max > tmpbuf->size - 1 - tmpbuf->data)
+			max = tmpbuf->size - 1 - tmpbuf->data;
+
+		if (max) {
+			memcpy(tmpbuf->area + tmpbuf->data, tmp.data.u.str.area, max);
+			tmpbuf->data += max;
+			tmpbuf->area[tmpbuf->data] = 0;
+		}
+	}
+
+	/* Append optional suffix. */
+	max = arg_p[2].data.str.data;
+	if (max > tmpbuf->size - 1 - tmpbuf->data)
+		max = tmpbuf->size - 1 - tmpbuf->data;
+
+	if (max) {
+		memcpy(tmpbuf->area + tmpbuf->data, arg_p[2].data.str.area, max);
+		tmpbuf->data += max;
+		tmpbuf->area[tmpbuf->data] = 0;
+	}
+
+	smp->data.u.str = *tmpbuf;
+	smp->data.type = SMP_T_STR;
+	smp_dup(smp);
+	free_trash_chunk(tmpbuf);
+	return 1;
+}
+
+/* Check the "add_item" converter's arguments and extracts the
+ * variable name and its scope.
+ */
+static int smp_check_add_item(struct arg *args, struct sample_conv *conv,
+                           const char *file, int line, char **err)
+{
+	/* Try to decode a variable. */
+	if (args[1].data.str.data > 0 && !vars_check_arg(&args[1], NULL)) {
+		memprintf(err, "failed to register variable name '%s'",
+			  args[1].data.str.area);
+		return 0;
+	}
+
+	if (args[1].data.str.data == 0 && args[2].data.str.data == 0) {
+		memprintf(err, "one of the optional arguments has to be nonempty");
+		return 0;
+	}
+
+	return 1;
+}
+
 /* Compares string with a variable containing a string. Return value
  * is compatible with strcmp(3)'s return value.
  */
@@ -4206,9 +4303,11 @@ INITCALL1(STG_REGISTER, sample_register_fetches, &smp_kws);
 
 /* Note: must not be declared <const> as its list will be overwritten */
 static struct sample_conv_kw_list sample_conv_kws = {ILH, {
+	{ "add_item",sample_conv_add_item,     ARG3(2,STR,STR,STR),   smp_check_add_item,       SMP_T_STR,  SMP_T_STR  },
 	{ "debug",   sample_conv_debug,        ARG2(0,STR,STR),       smp_check_debug,          SMP_T_ANY,  SMP_T_ANY  },
 	{ "b64dec",  sample_conv_base642bin,   0,                     NULL,                     SMP_T_STR,  SMP_T_BIN  },
 	{ "base64",  sample_conv_bin2base64,   0,                     NULL,                     SMP_T_BIN,  SMP_T_STR  },
+	{ "concat",  sample_conv_concat,       ARG3(1,STR,STR,STR),   smp_check_concat,         SMP_T_STR,  SMP_T_STR  },
 	{ "ub64enc", sample_conv_bin2base64url,0,                     NULL,                     SMP_T_BIN,  SMP_T_STR  },
 	{ "ub64dec", sample_conv_base64url2bin,0,                     NULL,                     SMP_T_STR,  SMP_T_BIN  },
 	{ "upper",   sample_conv_str2upper,    0,                     NULL,                     SMP_T_STR,  SMP_T_STR  },
@@ -4235,7 +4334,6 @@ static struct sample_conv_kw_list sample_conv_kws = {ILH, {
 	{ "word",    sample_conv_word,         ARG3(2,SINT,STR,SINT), sample_conv_field_check,  SMP_T_STR,  SMP_T_STR  },
 	{ "regsub",  sample_conv_regsub,       ARG3(2,REG,STR,STR),   sample_conv_regsub_check, SMP_T_STR,  SMP_T_STR  },
 	{ "sha1",    sample_conv_sha1,         0,                     NULL,                     SMP_T_BIN,  SMP_T_BIN  },
-	{ "concat",  sample_conv_concat,       ARG3(1,STR,STR,STR),   smp_check_concat,         SMP_T_STR,  SMP_T_STR  },
 	{ "strcmp",  sample_conv_strcmp,       ARG1(1,STR),           smp_check_strcmp,         SMP_T_STR,  SMP_T_SINT },
 
 	/* gRPC converters. */
-- 
2.17.1

Reply via email to