Re: I just broke opentracing :-(
Hello Willy, On 09/09/2021 08:43 AM, Willy Tarreau wrote: Yes, that's where I found it but didn't know what its use was. A few points however: - sess.ot.uuid will be shared by all requests on the same session, which is probably not what you want (e.g. for requests coming in H2, the last one setting it will replace all other ones' value while they're working with it). I think you should change its scope to be limited to the transaction only (txn.ot.uuid). It's also used in debugging outputs where it will be per-request and may not match what the requests will log because of this. Thanks, i'll change that. - I finally found its definition here in filter.h: #define FTL_OT_VAR_UUID "sess", "ot", "uuid" Please note the typo in the macro name (FTL instead of FLT) Strange that there is only one such typo. :) - the comment in struct flt_ot_runtime_context for uuid says "Randomly generated UUID". We already have a function to produce a random UUID (ha_generate_uuid()) but flt_ot_runtime_context_init() makes its own from elements that at first glance look like they're filled from other locations (uuid.node, uuid.clock_seq etc), but in fact are simply aliased on the random ints. In this case better use the existing function and avoid duplicating the painful output format generation. If you need to keep a copy of the two randoms, just modify the existing function to pass an optional array to store the randoms. At first glance they're not used though. When ot filter source was created, I didn't know what would be used and what wouldn't. At the time I didn't even know what opentracing was, what it was used for and how it was done. So, adding 'useful' things to the source, there are also those members of the structure that were not used anywhere later (because I thought at the time that they might be used somewhere). Not that they bother anything, but I will remove them. I will also use the haproxy function to generate uuid, it's a small change in ot source. Then I didn't use that function because it doesn't set those two uuid 64-bit numbers and it was a bit of a weird way for me to generate uuid because at the end of the haproxy function the value of v[2] rotates to the right by 14 bits and the value of v[3] to the left by 18 bits which for uint64_t[2] input gave me mixed bits. For example, if both random numbers are equal to 0x0123456789abcdef, the result is 89abcdef-4567-4123-8def-048d159e26af, while to me the logical result would be 89abcdef-4567-4123-8def-0123456789ab. This result is obtained if v[2] rotates to the right by 16 bits and v[3] to the left by 16 bits. If all the bits are equally good random, this is somehow a more logical result for me, because they give the correct order of the bytes of the input data (I'm not going into the correctness of the big/little endian way of writing data now). - there's quite a bunch of functions with no description at all of their purpose, input values, return values etc. .. Yes, writing comments always takes useful time, but they are necessary; at least for weirder things.. I will fill in those function descriptions over time and send patches. -- Zaga What can change the nature of a man?
Re: I just broke opentracing :-(
Hi Miroslav, On Wed, Sep 08, 2021 at 08:02:35PM +0200, Miroslav Zagorac wrote: > On 09/08/2021 07:57 PM, Miroslav Zagorac wrote: > > On 09/08/2021 07:42 PM, Willy Tarreau wrote: > > > No rush on this one, I'll let you think about it, just let me know if we > > > need to temporarily disable it from the CI, or if you just want a day or > > > two of checking before knowing if you can get a quick solution or if it > > > will take more time. > > > > > > > Thank you for understanding. For a start it would be good to exclude > > ot filter from the compilation process so as not to interfere with the > > development and testing of other code. OK I can do that. > > Next, I could throw out the part related to context transfer via > > variables with #ifdef/#endif blocks, so that the ot filter code is > > compiled without it. > > > > After that the problem with variables could be solved without > > interfering with the rest of the coding/compilation process and the > > ot filter will be functional (except for the part with context transfer > > via variables, but the question is whether anyone uses it). OK. Given that I really have no idea what it's used for (from the user's perspective), I cannot judge. But maybe we can start with this and ask for testers. > > The variable 'sess.ot.uuid' is just set, it is not used anywhere (as > > far as memory serves me). I imagined that this could be used if the > > user needs it, but it is not necessary. > > > > OK, it is still used but again only as an example that it can be used > and not that it should be used. > > In our examples uuid is set as one of the data from the opentracing > 'baggage' data group: > > test/ctx/ot.cfg: > --- >baggage "haproxy_id" var(sess.ot.uuid) > --- Yes, that's where I found it but didn't know what its use was. A few points however: - sess.ot.uuid will be shared by all requests on the same session, which is probably not what you want (e.g. for requests coming in H2, the last one setting it will replace all other ones' value while they're working with it). I think you should change its scope to be limited to the transaction only (txn.ot.uuid). It's also used in debugging outputs where it will be per-request and may not match what the requests will log because of this. - I finally found its definition here in filter.h: #define FTL_OT_VAR_UUID "sess", "ot", "uuid" Please note the typo in the macro name (FTL instead of FLT) - the comment in struct flt_ot_runtime_context for uuid says "Randomly generated UUID". We already have a function to produce a random UUID (ha_generate_uuid()) but flt_ot_runtime_context_init() makes its own from elements that at first glance look like they're filled from other locations (uuid.node, uuid.clock_seq etc), but in fact are simply aliased on the random ints. In this case better use the existing function and avoid duplicating the painful output format generation. If you need to keep a copy of the two randoms, just modify the existing function to pass an optional array to store the randoms. At first glance they're not used though. - there's quite a bunch of functions with no description at all of their purpose, input values, return values etc. I know pretty well how painful it is to document that, especially since every single time we have to go through this we find bugs due to the caller and callee disagreeing on input domain or output values (typically the type of thing Coverity loves to look at). This is also why it's important. Hint: at least fill the description entry and explain limits/restrictions if any. The locations of empty descriptions can be found with this: $ pcregrep -nMo1 '^ \* DESCRIPTION\n \* -\n' addons/ot/src/*.c I'm not asking you to add all of them immediately, but I'm willing to apply cleanup patches that routinely add a few of them when you have a bit of time to work on this. And of course bugfixes as well if you find bugs while documenting. Thanks, Willy
Re: I just broke opentracing :-(
On 09/08/2021 07:57 PM, Miroslav Zagorac wrote: On 09/08/2021 07:42 PM, Willy Tarreau wrote: No rush on this one, I'll let you think about it, just let me know if we need to temporarily disable it from the CI, or if you just want a day or two of checking before knowing if you can get a quick solution or if it will take more time. Thank you for understanding. For a start it would be good to exclude ot filter from the compilation process so as not to interfere with the development and testing of other code. Next, I could throw out the part related to context transfer via variables with #ifdef/#endif blocks, so that the ot filter code is compiled without it. After that the problem with variables could be solved without interfering with the rest of the coding/compilation process and the ot filter will be functional (except for the part with context transfer via variables, but the question is whether anyone uses it). The variable 'sess.ot.uuid' is just set, it is not used anywhere (as far as memory serves me). I imagined that this could be used if the user needs it, but it is not necessary. OK, it is still used but again only as an example that it can be used and not that it should be used. In our examples uuid is set as one of the data from the opentracing 'baggage' data group: test/ctx/ot.cfg: --- baggage "haproxy_id" var(sess.ot.uuid) --- -- Zaga What can change the nature of a man?
Re: I just broke opentracing :-(
On 09/08/2021 07:42 PM, Willy Tarreau wrote: No rush on this one, I'll let you think about it, just let me know if we need to temporarily disable it from the CI, or if you just want a day or two of checking before knowing if you can get a quick solution or if it will take more time. Thank you for understanding. For a start it would be good to exclude ot filter from the compilation process so as not to interfere with the development and testing of other code. Next, I could throw out the part related to context transfer via variables with #ifdef/#endif blocks, so that the ot filter code is compiled without it. After that the problem with variables could be solved without interfering with the rest of the coding/compilation process and the ot filter will be functional (except for the part with context transfer via variables, but the question is whether anyone uses it). The variable 'sess.ot.uuid' is just set, it is not used anywhere (as far as memory serves me). I imagined that this could be used if the user needs it, but it is not necessary. -- Zaga What can change the nature of a man?
Re: I just broke opentracing :-(
On Wed, Sep 08, 2021 at 07:23:59PM +0200, Miroslav Zagorac wrote: > On 09/08/2021 06:46 PM, Willy Tarreau wrote: > > I have no doubt about this, what I mean is that once such needs are > > identified, as much as possible we should try to move these parts into > > the proper location (vars), or if really needed, that must be clearly > > explained around the code so that someone else may try to figure what > > to change there. For example I've seen a function that parses scope > > names and turns them list heads, that's the perfect example of something > > that ought to have been adapted into vars.c if what was provided was not > > sufficient, because since then new scopes were added without anyone > > noticing they were listed there, so that function was not updated (there > > I doubt anyone cares but conceptually it's a problem). > > It is clear to me.. at the time when the code for the ot filter was > being developed, it was easier for me to write the functions I was > missing (ie the content I was missing in the existing functions) and > then see what to do with it .. and it remained like that after > completion. At the time, I didn't think that something more > significant could change in working with variables one day. I know how we end up like this, I'm not saying that in bad terms, just as something to be careful about in the future. The codebase is huge nowadays and no single person knows even 10% of it anymore, so if we continue to create hacks or temporary stuff in various corners it's unmaintainable and breaks all over every time someone tries to fix a design problem. And that's already what happened with these shitty variables that have been causing us some gray hair for a few years now. > I was wrong, obviously. No worries. > > > The moment we write the variables we know their > > > names but at the moment we read them we don't know how many there are > > > and their complete name, rather we take all the variables of a > > > particular prefix. > > > > Hmmm that may become quite problematic then since we don't have that > > name anymore :-/ Are these variables used only for internal stuff in > > OT or are they used to exchange with the rest of the configuration ? > > In opentracing there is a context of data, it is a group of data that > is transferred from one place where they are used to another place. > > This data is defined as a key:value, where the use of haproxy variables > suited me perfectly. The problem is that it is not known in advance > what the keys of this data are and how much of this data is, this is > set by the user in the opentracing configuration (which has nothing to > do with haproxy). OK thanks for explaining! Now I understand indeed why you reused variables for this. > The operations that handle this data are 'inject' and 'extract', the > first writes all the data from the opentracing context to the haproxy > variable and the second extracts the data from the haproxy variable to > the opentracing context. > > During the 'inject' operation, it is known which are the variables (or > their names) because they are generated from the 'key' part of the > opentracing data. > > During the 'extract' operation this is not known, ie only the prefix of > the variable under which one of the 'inject' operations of that > variable is written is known. Got it. It's just as if you'd serialize a blob that it extracted on the other side, in some sort. > > Do you have a concrete example of a use case for such prefixes on > > receipt ? For example if you're dumping all variables from your current > > request, maybe that's enough. Or if you need prefixes for dumping all > > variables of a particular scope, we could proceed differently. > > Context transfer is conceived in two ways, one is via variables (if > transferred within the same haproxy process) and the other is via http > header (if transferred between different processes, there does not have > to be haproxy as one of the processes). > > An example of this is the configuration from the test/ctx directory, > more precisely the test/ctx/ot.cfg file. The only explicit variable I'm seeing there is sess.ot.uuid, so I guess that the rest is totally transparent, and that if needed we can arrange so that sess.ot.uuid is always created on new sessions. > An example of context transfer via http header is in the test/fe and > test/be directories where data is transferred from the frontend process > via the http header to the backend process, which can then add > opentracing data to the trace initiated in the frontend process. > > In principle, context transfer via haproxy variables can be omitted > because this data can also be transferred via opentracing process > without interfering with its context, because this process is started > and completed in the same haproxy program and does not need to be > exposed to the outside. OK. > But maybe someone needs it; I implemented it because it can be done and > not because someo
Re: I just broke opentracing :-(
On 09/08/2021 06:46 PM, Willy Tarreau wrote: I have no doubt about this, what I mean is that once such needs are identified, as much as possible we should try to move these parts into the proper location (vars), or if really needed, that must be clearly explained around the code so that someone else may try to figure what to change there. For example I've seen a function that parses scope names and turns them list heads, that's the perfect example of something that ought to have been adapted into vars.c if what was provided was not sufficient, because since then new scopes were added without anyone noticing they were listed there, so that function was not updated (there I doubt anyone cares but conceptually it's a problem). It is clear to me.. at the time when the code for the ot filter was being developed, it was easier for me to write the functions I was missing (ie the content I was missing in the existing functions) and then see what to do with it .. and it remained like that after completion. At the time, I didn’t think that something more significant could change in working with variables one day. I was wrong, obviously. The moment we write the variables we know their names but at the moment we read them we don't know how many there are and their complete name, rather we take all the variables of a particular prefix. Hmmm that may become quite problematic then since we don't have that name anymore :-/ Are these variables used only for internal stuff in OT or are they used to exchange with the rest of the configuration ? In opentracing there is a context of data, it is a group of data that is transferred from one place where they are used to another place. This data is defined as a key:value, where the use of haproxy variables suited me perfectly. The problem is that it is not known in advance what the keys of this data are and how much of this data is, this is set by the user in the opentracing configuration (which has nothing to do with haproxy). The operations that handle this data are 'inject' and 'extract', the first writes all the data from the opentracing context to the haproxy variable and the second extracts the data from the haproxy variable to the opentracing context. During the 'inject' operation, it is known which are the variables (or their names) because they are generated from the 'key' part of the opentracing data. During the 'extract' operation this is not known, ie only the prefix of the variable under which one of the 'inject' operations of that variable is written is known. Do you have a concrete example of a use case for such prefixes on receipt ? For example if you're dumping all variables from your current request, maybe that's enough. Or if you need prefixes for dumping all variables of a particular scope, we could proceed differently. Context transfer is conceived in two ways, one is via variables (if transferred within the same haproxy process) and the other is via http header (if transferred between different processes, there does not have to be haproxy as one of the processes). An example of this is the configuration from the test/ctx directory, more precisely the test/ctx/ot.cfg file. An example of context transfer via http header is in the test/fe and test/be directories where data is transferred from the frontend process via the http header to the backend process, which can then add opentracing data to the trace initiated in the frontend process. In principle, context transfer via haproxy variables can be omitted because this data can also be transferred via opentracing process without interfering with its context, because this process is started and completed in the same haproxy program and does not need to be exposed to the outside. But maybe someone needs it; I implemented it because it can be done and not because someone might use it. On the other hand, the transfer of context between processes via http headers is mandatory. I don't know at all if i was able to clarify some things related to the use of variables and the use of opentracing context. I need to take a closer look at how ot filter source can be redesigned to retain functionality and use the new structure definition to store variables. Just thinking loud, if you absolutely need to enumerate a list of variables when setting them, maybe it could be possible to enumerate their names into another well-known one and look up that one first on the other side? It is possible, it is not a bad idea. I'll think about it, thanks for the advice. I'm sorry that ot filter code caused these compilation problems. -- Zaga What can change the nature of a man?
Re: I just broke opentracing :-(
On Wed, Sep 08, 2021 at 06:30:15PM +0200, Miroslav Zagorac wrote: > there is a reason why i used some functions related to > reading/setting/searching variables. If I could use the original haproxy > functions, I wouldn't write these because duplicating the > code doesn't make sense. I have no doubt about this, what I mean is that once such needs are identified, as much as possible we should try to move these parts into the proper location (vars), or if really needed, that must be clearly explained around the code so that someone else may try to figure what to change there. For example I've seen a function that parses scope names and turns them list heads, that's the perfect example of something that ought to have been adapted into vars.c if what was provided was not sufficient, because since then new scopes were added without anyone noticing they were listed there, so that function was not updated (there I doubt anyone cares but conceptually it's a problem). > What I have seen so far is that the 'name' member of the vars structure > has actually been deleted and that the var_clear() function has been > given another parameter. Yep, previously, there was a "name" member pointing to a pre-allocated entry into the names table, that contained the part after the scope. The goal was to allocate names only once at boot time when variables couldn't be added at runtime. Now instead we don't keep this name and we only keep a 64-bit hash of that same part. > Generally, in the ot filter it is possible to transfer data via a group > of variables with a common prefix, so that in one place we set all the > variables and their values and in another we read all the variables > with a given prefix. OK I see, thanks for explaining! > The moment we write the variables we know their > names but at the moment we read them we don't know how many there are > and their complete name, rather we take all the variables of a > particular prefix. Hmmm that may become quite problematic then since we don't have that name anymore :-/ Are these variables used only for internal stuff in OT or are they used to exchange with the rest of the configuration ? Do you have a concrete example of a use case for such prefixes on receipt ? For example if you're dumping all variables from your current request, maybe that's enough. Or if you need prefixes for dumping all variables of a particular scope, we could proceed differently. > I need to take a closer look at how ot filter source can be redesigned > to retain functionality and use the new structure definition to store > variables. Just thinking loud, if you absolutely need to enumerate a list of variables when setting them, maybe it could be possible to enumerate their names into another well-known one and look up that one first on the other side? Thanks! Willy
Re: I just broke opentracing :-(
On 09/08/2021 04:41 PM, Willy Tarreau wrote: Hi Miroslav, I just discovered that the changes I've made to the totally broken variables API now broke opentracing because apparently it's using some of the variable code's internals for its own use: https://github.com/haproxy/haproxy/runs/3545475810 That's annoying because unbreaking the API was already extremely complicated due to the global name table limitation that became a feature on its own, but now I see that some upper level code outside of vars/ is also making intimate use of that broken API (e.g. in flt_ot_vars_unset), and I have no idea what it tries to do nor why. Most of the breakage comes from all the undocumented functions flt_ot_vars_* but I don't even understand why this does not use the vars_* API that Lua, CLI and SPOE already use. Note that it doesn't seem that the breakage is large, so most likely we can figure together a few changes to perform to fix it, but reversing all that is quite a pain, and even if I went through the process of trying to build it I wouldn't know if what I'm doing works or not :-/ The changes that broke are in the following commit range which tries hard to minimize the visible changes: 10080716b..55f8a830d Hello Willy, there is a reason why i used some functions related to reading/setting/searching variables. If I could use the original haproxy functions, I wouldn't write these because duplicating the code doesn't make sense. What I have seen so far is that the 'name' member of the vars structure has actually been deleted and that the var_clear() function has been given another parameter. Generally, in the ot filter it is possible to transfer data via a group of variables with a common prefix, so that in one place we set all the variables and their values and in another we read all the variables with a given prefix. The moment we write the variables we know their names but at the moment we read them we don't know how many there are and their complete name, rather we take all the variables of a particular prefix. I need to take a closer look at how ot filter source can be redesigned to retain functionality and use the new structure definition to store variables. -- Zaga What can change the nature of a man?
I just broke opentracing :-(
Hi Miroslav, I just discovered that the changes I've made to the totally broken variables API now broke opentracing because apparently it's using some of the variable code's internals for its own use: https://github.com/haproxy/haproxy/runs/3545475810 That's annoying because unbreaking the API was already extremely complicated due to the global name table limitation that became a feature on its own, but now I see that some upper level code outside of vars/ is also making intimate use of that broken API (e.g. in flt_ot_vars_unset), and I have no idea what it tries to do nor why. Most of the breakage comes from all the undocumented functions flt_ot_vars_* but I don't even understand why this does not use the vars_* API that Lua, CLI and SPOE already use. Note that it doesn't seem that the breakage is large, so most likely we can figure together a few changes to perform to fix it, but reversing all that is quite a pain, and even if I went through the process of trying to build it I wouldn't know if what I'm doing works or not :-/ The changes that broke are in the following commit range which tries hard to minimize the visible changes: 10080716b..55f8a830d I would highly appreciate it if you could have a look, figure what needs to be done, if possible using the public vars API, otherwise that we discuss what needs to be extended, and if you could document these functions, because, to be honest, the following doesn't help me much: /*** * NAME * flt_ot_vars_get - * * ARGUMENTS * s - * scope - * prefix - * opt- * err- * * DESCRIPTION * - * * RETURN VALUE * - */ struct otc_text_map *flt_ot_vars_get(struct stream *s, const char *scope, const char *prefix, uint opt, char **err) Thanks, Willy