Re: Inconsistent reading of txn vars from Lua script

2021-05-13 Thread Willy Tarreau
On Thu, May 13, 2021 at 12:24:02PM +0200, Tim Düsterhus wrote:
> > So what I'm proposing is to simply change vars_get_by_name() to call
> > register_name() with alloc=0 in order to fix this mess. We can then
> > check during 2.5 how to refine this to also consider the scope with
> > the variable's name. It's just this, and fixes Joao's test case to
> > always return 403:
> > 
> > diff --git a/src/vars.c b/src/vars.c
> > index 996141f5d..15dcb3c3d 100644
> > --- a/src/vars.c
> > +++ b/src/vars.c
> > @@ -583,7 +583,7 @@ int vars_get_by_name(const char *name, size_t len, 
> > struct sample *smp)
> >  enum vars_scope scope;
> >  /* Resolve name and scope. */
> > -   name = register_name(name, len, , 1, NULL);
> > +   name = register_name(name, len, , 0, NULL);
> >  if (!name)
> >  return 0;
> > Tim, do you agree with this analysis ?
> > 
> 
> Yes, that change makes sense to me.

Great, thanks for the fast response, I'm going to do that and mark it as
a bugfix so that after some observation we can consider backporting it.

> If you'd see my full use case then I
> recommend taking a look at haproxy-auth-request. It's super simple and even
> comes with VTest tests:
> 
> https://github.com/TimWolla/haproxy-auth-request#usage
> https://github.com/TimWolla/haproxy-auth-request/blob/main/auth-request.lua#L50
> https://github.com/TimWolla/haproxy-auth-request/tree/main/test

Thanks. I vaguely remembered it was something simple but I really can't
remember every use case that end up as a patch :-)

Cheers,
Willy



Re: Inconsistent reading of txn vars from Lua script

2021-05-13 Thread Tim Düsterhus

Willy,

On 5/13/21 11:40 AM, Willy Tarreau wrote:

Yes, and it's still unclear to me how this storage is currently arranged,
(i.e. why only store names?) I should have a look for 2.5 probably.


OK now I got a better view of it and there is some misunderstanding in
the way the names are being used to detect if a variable exists. For
example, calling this before calling the Lua code will make it always
succeed:

   http-request set-var(proc.code) int(12)

Note that the variable name here is "proc.code", not "txn.code".

What happens is that there are unified names which are independent of
the variables themselves. The principle of the names is that when
looking for a variable, we need to compare its name, and instead of
storing the name into each and every copy of a variable, there's only
a pointer to a unified location storing names that have been encountered
at least one in the process so that a single word is used in combination
with millions of variables if needed. For this, only the suffix of the
variable's name is stored, its scope is not since we already know it
when looking up a variable.

This means that the "ifexist" option shouldn't be seen as "if this
variable exists anywhere else", but as "if any variable known to the
process already caused the same suffix to be allocated".

What happens here is that while the set_var("txn.code", true) call takes
care of *not* allocating a new entry to store "code" in the names table,
get_var("txn.code") isn't as careful and will finally create it, and
notice that the variable doesn't yet exist, so it returns nil. On a
subsequent call, set_var() will find a matching suffix name and will
then store the variable, which get_var() will then find.

If we had had a get_var("sess.code"), it would also have unblocked the
situation.

In my opinion we have multiple problems here.

The first one is that if the intent of the "ifexist" was to avoid
allocating variables that are not known to the config, it doesn't work
well due to the fact that it doesn't consider the scope, so it should
be stricter and check that the variable exists with the same scope.
But how? I don't know for now.


I introduced it to not allocate the *variable name*, because they were 
never cleaned up. Not allocating the variable would be a nice side effect.


My use case is haproxy-auth-request which uses variables to communicate 
back the auth request's response headers:


https://github.com/TimWolla/haproxy-auth-request/blob/e7b6385b3f1f34e0090968464f19369b2b8d117c/auth-request.lua#L106-L108


Second, the fact that get_var() does automatically cause the creation
of that variable is by far the biggest problem, because in order to
verify if it has been filled, this will cause an allocation which will
later ensure it is always filled. So we must make it support an "ifexist"
option as well so that it is possible to perform an existence lookup
without allocating.

I suspect the set_var() modification was done for a config which uses
Lua to set a variable and where the variable was read from the config,
but that this other case where get_var() is called from Lua was
overlooked.


Yes, this is correct. I'd using Lua to set the variable and the 
variables are expected to be read from the config for further processing.



Last point, overall I think that the "ifexist" mechanism remains of
very limited use due to the automaticity of some of the allocations,
which were initially designed only for referencement from the config
parser. Originally, I remember that Thierry introduced a "declare var"
directive in the global section, which we found painful to use and
unnecessary due to the fact that during the parsing we already get an
exhaustive list of the variables names. But maybe for variables known
to Lua only, we should use an explicit declaration (probably from the
Lua code itself).

Thinking about it, this could correspond to just a single call to
set_var(name, "") in the init code to declare that a variable will be
used by Lua. In this case the only missing part will be taking into
consideration the scope (we could later improve that by prepending
the enum to the string in the storage for example).

So in the end I think that for 2.4 we should simply change the Lua's
get_var() so that it always uses the ifexist variant. It will at
least stop creating random names on the fly and will continue to
work with variables that have been already created by the config or
with set_var(). I don't see a single case where it makes sense to
have get_var() create a variable in your back and return NIL because
set_var() wasn't called so that next time set_var() works.


I agree. If a variable never was created in the first place then 
obviously any read will result in nothing being found. The implicit 
creation sounds like a bug, because it will result in inconsistent behavior.



Looking more closely at vars_get_by_name(), it's only used by Lua's
various get_var() and by the CLI's "get var" that I recently added
without 

Re: Inconsistent reading of txn vars from Lua script

2021-05-13 Thread Willy Tarreau
Hi guys,

On Wed, May 12, 2021 at 09:44:28AM +0200, Willy Tarreau wrote:
> On Wed, May 12, 2021 at 09:12:25AM +0200, Tim Düsterhus wrote:
> > Willy,
> > 
> > On 5/12/21 7:47 AM, Willy Tarreau wrote:
> > > Interestingly, the code for variables was initially made for the config,
> > > so it doesn't seem to destroy variable names when they're released since
> > > that was pointless with the config. I think that code should be revisited
> > > in 2.5 to improve the situation (e.g. by marking that the variable was
> > > dynamically allocated maybe), but I don't know this part well so I'll
> > > probably stop before starting to suggest stupidities :-)
> > > 
> > 
> > There's also this related issue from back when I implemented this additional
> > parameter:
> > 
> > https://github.com/haproxy/haproxy/issues/624
> 
> Yes, and it's still unclear to me how this storage is currently arranged,
> (i.e. why only store names?) I should have a look for 2.5 probably.

OK now I got a better view of it and there is some misunderstanding in
the way the names are being used to detect if a variable exists. For
example, calling this before calling the Lua code will make it always
succeed:

  http-request set-var(proc.code) int(12)

Note that the variable name here is "proc.code", not "txn.code".

What happens is that there are unified names which are independent of
the variables themselves. The principle of the names is that when
looking for a variable, we need to compare its name, and instead of
storing the name into each and every copy of a variable, there's only
a pointer to a unified location storing names that have been encountered
at least one in the process so that a single word is used in combination
with millions of variables if needed. For this, only the suffix of the
variable's name is stored, its scope is not since we already know it
when looking up a variable.

This means that the "ifexist" option shouldn't be seen as "if this
variable exists anywhere else", but as "if any variable known to the
process already caused the same suffix to be allocated".

What happens here is that while the set_var("txn.code", true) call takes
care of *not* allocating a new entry to store "code" in the names table,
get_var("txn.code") isn't as careful and will finally create it, and
notice that the variable doesn't yet exist, so it returns nil. On a
subsequent call, set_var() will find a matching suffix name and will
then store the variable, which get_var() will then find.

If we had had a get_var("sess.code"), it would also have unblocked the
situation.

In my opinion we have multiple problems here.

The first one is that if the intent of the "ifexist" was to avoid
allocating variables that are not known to the config, it doesn't work
well due to the fact that it doesn't consider the scope, so it should
be stricter and check that the variable exists with the same scope.
But how? I don't know for now.

Second, the fact that get_var() does automatically cause the creation
of that variable is by far the biggest problem, because in order to
verify if it has been filled, this will cause an allocation which will
later ensure it is always filled. So we must make it support an "ifexist"
option as well so that it is possible to perform an existence lookup
without allocating.

I suspect the set_var() modification was done for a config which uses
Lua to set a variable and where the variable was read from the config,
but that this other case where get_var() is called from Lua was
overlooked.

Last point, overall I think that the "ifexist" mechanism remains of
very limited use due to the automaticity of some of the allocations,
which were initially designed only for referencement from the config
parser. Originally, I remember that Thierry introduced a "declare var"
directive in the global section, which we found painful to use and
unnecessary due to the fact that during the parsing we already get an
exhaustive list of the variables names. But maybe for variables known
to Lua only, we should use an explicit declaration (probably from the
Lua code itself).

Thinking about it, this could correspond to just a single call to
set_var(name, "") in the init code to declare that a variable will be
used by Lua. In this case the only missing part will be taking into
consideration the scope (we could later improve that by prepending
the enum to the string in the storage for example).

So in the end I think that for 2.4 we should simply change the Lua's
get_var() so that it always uses the ifexist variant. It will at
least stop creating random names on the fly and will continue to
work with variables that have been already created by the config or
with set_var(). I don't see a single case where it makes sense to
have get_var() create a variable in your back and return NIL because
set_var() wasn't called so that next time set_var() works.

Looking more closely at vars_get_by_name(), it's only used by Lua's
various get_var() and by the CLI's "get var" that I 

Re: Inconsistent reading of txn vars from Lua script

2021-05-12 Thread Willy Tarreau
On Wed, May 12, 2021 at 02:12:44PM -0300, Joao Morais wrote:
> > It's not a matter of first or second access. It's that the function
> > you used initially resulted in always allocating an entry for the
> > variable's name, causing some huge memory usage for those who were
> > using them like maps and performing random lookups there. In order
> > to avoid this, Tim added an extra argument saying that we're just
> > performing an opportunistic lookup and that the variable must not
> > be created if it does not exist.
> 
> Afaics this is only an option for set_var()? Here is the doc:
> 
> TXN.set_var(TXN, var, value[, ifexist])
> TXN.get_var(TXN, var)
> http://www.arpalert.org/src/haproxy-lua-api/2.3/index.html
> (link is broken from haproxy site, is the source versioned
> somewhere so I can send a patch?)
> 
> So lookups would still create variables, hence the "in the second run it
> works" - my second script (which reads) create the entry and my first script
> (which writes with ifexists true) can find it and update in the second run
> and beyond.

That looks really odd, but you're probably right :-(
I'll need to have a look at the code to check. Maybe get_var() also
needs that argument, or maybe there is something to check if a
variable exists, I don't know. But I don't like the way all of this
works in general. I think the creation of a variable on a lookup
only results from an internal API limitation that was quickly worked
around when variables were exposed to Lua, and that it's not too late
to fix it.

Willy



Re: Inconsistent reading of txn vars from Lua script

2021-05-12 Thread Joao Morais



> Em 12 de mai. de 2021, à(s) 02:47, Willy Tarreau  escreveu:
> 
> On Tue, May 11, 2021 at 05:41:28PM -0300, Joao Morais wrote:
> 
>> Just to confirm how it works, I created the snippet below:
>> 
>>http-request lua.auth ## assigning txn.core
>>http-request return lf-string %[var(txn.code)] content-type text/plain
>> 
>> It worked since the first run and this is the only place I declared txn.code.
>> Does this mean that a var is created in the following conditions?
>> Any change in the sentences below?
>> 
>> - after the first read from a  Lua script
>> - after the first write from a Lua script provided that ifexists parameter 
>> is set to false
>> - always exists, if used anywhere in the configuration file
> 
> It's not a matter of first or second access. It's that the function
> you used initially resulted in always allocating an entry for the
> variable's name, causing some huge memory usage for those who were
> using them like maps and performing random lookups there. In order
> to avoid this, Tim added an extra argument saying that we're just
> performing an opportunistic lookup and that the variable must not
> be created if it does not exist.

Afaics this is only an option for set_var()? Here is the doc:

TXN.set_var(TXN, var, value[, ifexist])
TXN.get_var(TXN, var)
http://www.arpalert.org/src/haproxy-lua-api/2.3/index.html
(link is broken from haproxy site, is the source versioned
somewhere so I can send a patch?)

So lookups would still create variables, hence the “in the second run it works” 
- my second script (which reads) create the entry and my first script (which 
writes with ifexists true) can find it and update in the second run and beyond.

> Other parts of the code (the native config parts I mean) which use
> variables always result in a creation because these names are static.
> So my understanding is that it can be simplified to this:
>  - a variable declared in the config always exists
>  - a variable accessed from Lua with ifexists set to true will not
>be created but will be found if it exists
>  - a vraiable accessed from Lua with ifexists set to false or not
>present will always be created during the lookup.

Taking the sentence above I’d update this to:

- s/variable accessed/variable updated, set_var(),/
- variable accessed from Lua, get_var(), will always be created

Does this make any sense?




Re: Inconsistent reading of txn vars from Lua script

2021-05-12 Thread Willy Tarreau
On Wed, May 12, 2021 at 09:12:25AM +0200, Tim Düsterhus wrote:
> Willy,
> 
> On 5/12/21 7:47 AM, Willy Tarreau wrote:
> > Interestingly, the code for variables was initially made for the config,
> > so it doesn't seem to destroy variable names when they're released since
> > that was pointless with the config. I think that code should be revisited
> > in 2.5 to improve the situation (e.g. by marking that the variable was
> > dynamically allocated maybe), but I don't know this part well so I'll
> > probably stop before starting to suggest stupidities :-)
> > 
> 
> There's also this related issue from back when I implemented this additional
> parameter:
> 
> https://github.com/haproxy/haproxy/issues/624

Yes, and it's still unclear to me how this storage is currently arranged,
(i.e. why only store names?) I should have a look for 2.5 probably.

Willy



Re: Inconsistent reading of txn vars from Lua script

2021-05-12 Thread Tim Düsterhus

Willy,

On 5/12/21 7:47 AM, Willy Tarreau wrote:

Interestingly, the code for variables was initially made for the config,
so it doesn't seem to destroy variable names when they're released since
that was pointless with the config. I think that code should be revisited
in 2.5 to improve the situation (e.g. by marking that the variable was
dynamically allocated maybe), but I don't know this part well so I'll
probably stop before starting to suggest stupidities :-)



There's also this related issue from back when I implemented this 
additional parameter:


https://github.com/haproxy/haproxy/issues/624

Best regards
Tim Düsterhus



Re: Inconsistent reading of txn vars from Lua script

2021-05-11 Thread Willy Tarreau
On Tue, May 11, 2021 at 05:41:28PM -0300, Joao Morais wrote:
> 
> 
> > Em 10 de mai. de 2021, à(s) 18:04, Willy Tarreau  escreveu:
> > 
> > On Mon, May 10, 2021 at 10:41:36PM +0200, Willy Tarreau wrote:
> >>> core.register_action("auth", { "http-req" }, function(txn)
> >>>   txn:set_var("txn.code", 401, true)
> > 
> > So the problem is exactly here and it works as designed. This
> > argument "ifexist" was added a year ago to avoid Lua allocating
> > random variable names:
> > 
> >  4e172c93f ("MEDIUM: lua: Add `ifexist` parameter to `set_var`")
> > 
> > What the "true" argument does here is to refrain from creating
> > the variable if it does not exist. After you look it up from the
> > service, the variable gets created and it exists, hence why it
> > then works next times.
> > 
> > If you want it to always be created (which I assume you want
> > to), just drop this argument or explicitly set it to false.
> 
> Thanks Willy for the explanation and sorry about the false alarm, I didn't
> see the whole picture here.

No problem, it made me search and (re)discover this area :-)

> Just to confirm how it works, I created the snippet below:
> 
> http-request lua.auth ## assigning txn.core
> http-request return lf-string %[var(txn.code)] content-type text/plain
> 
> It worked since the first run and this is the only place I declared txn.code.
> Does this mean that a var is created in the following conditions?
> Any change in the sentences below?
> 
> - after the first read from a  Lua script
> - after the first write from a Lua script provided that ifexists parameter is 
> set to false
> - always exists, if used anywhere in the configuration file

It's not a matter of first or second access. It's that the function
you used initially resulted in always allocating an entry for the
variable's name, causing some huge memory usage for those who were
using them like maps and performing random lookups there. In order
to avoid this, Tim added an extra argument saying that we're just
performing an opportunistic lookup and that the variable must not
be created if it does not exist.

Other parts of the code (the native config parts I mean) which use
variables always result in a creation because these names are static.
So my understanding is that it can be simplified to this:
  - a variable declared in the config always exists
  - a variable accessed from Lua with ifexists set to true will not
be created but will be found if it exists
  - a vraiable accessed from Lua with ifexists set to false or not
present will always be created during the lookup.

Interestingly, the code for variables was initially made for the config,
so it doesn't seem to destroy variable names when they're released since
that was pointless with the config. I think that code should be revisited
in 2.5 to improve the situation (e.g. by marking that the variable was
dynamically allocated maybe), but I don't know this part well so I'll
probably stop before starting to suggest stupidities :-)

Willy



Re: Inconsistent reading of txn vars from Lua script

2021-05-11 Thread Joao Morais



> Em 10 de mai. de 2021, à(s) 18:04, Willy Tarreau  escreveu:
> 
> On Mon, May 10, 2021 at 10:41:36PM +0200, Willy Tarreau wrote:
>>> core.register_action("auth", { "http-req" }, function(txn)
>>> txn:set_var("txn.code", 401, true)
> 
> So the problem is exactly here and it works as designed. This
> argument "ifexist" was added a year ago to avoid Lua allocating
> random variable names:
> 
>  4e172c93f ("MEDIUM: lua: Add `ifexist` parameter to `set_var`")
> 
> What the "true" argument does here is to refrain from creating
> the variable if it does not exist. After you look it up from the
> service, the variable gets created and it exists, hence why it
> then works next times.
> 
> If you want it to always be created (which I assume you want
> to), just drop this argument or explicitly set it to false.

Thanks Willy for the explanation and sorry about the false alarm, I didn’t see 
the whole picture here.

Just to confirm how it works, I created the snippet below:

http-request lua.auth ## assigning txn.core
http-request return lf-string %[var(txn.code)] content-type text/plain

It worked since the first run and this is the only place I declared txn.code. 
Does this mean that a var is created in the following conditions? Any change in 
the sentences below?

- after the first read from a  Lua script
- after the first write from a Lua script provided that ifexists parameter is 
set to false
- always exists, if used anywhere in the configuration file

Thanks,




Re: Inconsistent reading of txn vars from Lua script

2021-05-10 Thread Willy Tarreau
On Mon, May 10, 2021 at 10:41:36PM +0200, Willy Tarreau wrote:
> > core.register_action("auth", { "http-req" }, function(txn)
> > txn:set_var("txn.code", 401, true)
 
So the problem is exactly here and it works as designed. This
argument "ifexist" was added a year ago to avoid Lua allocating
random variable names:

  4e172c93f ("MEDIUM: lua: Add `ifexist` parameter to `set_var`")

What the "true" argument does here is to refrain from creating
the variable if it does not exist. After you look it up from the
service, the variable gets created and it exists, hence why it
then works next times.

If you want it to always be created (which I assume you want
to), just drop this argument or explicitly set it to false.

Willy



Re: Inconsistent reading of txn vars from Lua script

2021-05-10 Thread Willy Tarreau
On Mon, May 10, 2021 at 07:59:56AM -0300, Joao Morais wrote:
> Hello again! Here are the snippets running with 2.4-dev18 - docker image 
> haproxy:2.4-dev18-alpine:
> 
> $ cat h.cfg
> global
>   log stdout format raw local0
>   lua-load /tmp/h/svc1.lua
>   lua-load /tmp/h/svc2.lua
> defaults
>   timeout server 1m
>   timeout client 1m
>   timeout connect 5s
>   log global
> listen l
>   mode http
>   bind :8000
>   option httplog
>   http-request lua.auth
>   http-request use-service lua.send-failure
> 
> $ cat svc1.lua
> core.register_action("auth", { "http-req" }, function(txn)
>   txn:set_var("txn.code", 401, true)
> end, 0)
> 
> $ cat svc2.lua
> core.register_service("send-failure", "http", function(applet)
> response = applet:get_var("txn.code")
> if response ~= nil then
> applet:set_status(response)
> else
> applet:set_status(403)
> end
> applet:add_header("Content-Length", 0)
> applet:add_header("Content-Type", "text/plain")
> applet:start_response()
> end)
> 
> Now curl'ing the config above:
> 
> $ curl -i localhost:8000
> HTTP/1.1 403 Forbidden
> content-type: text/plain
> content-length: 0
> 
> $ curl -i localhost:8000
> HTTP/1.1 401 Unauthorized
> content-type: text/plain
> content-length: 0
> 
> The first run is always a 403 which means that the reading of the txn.code
> retuned nil, all the next attempts correctly returns 401. Maybe I'm missing
> some kind of initialization here? Otherwise I'm happy to provide this as a
> GitHub issue.

I can reproduce it. I agree there's something odd, as it means that
there is some random matching or that something is not properly
initialized. I suspect that a vars field isn't properly initialized
somewhere. I'm investigating, thanks for the report!

Willy