Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
If there are no objections I will apply the latest patch to trunk. Regards, Christos On 06/26/2014 05:45 PM, Tsantilas Christos wrote: A new patch. Changes: - clt_conn_id renamed to clt_conn_tag - Amos requested fixes. I hope it is OK. Regards, Christos On 06/22/2014 08:43 PM, Tsantilas Christos wrote: Hi all, The attached patch replaces existin annotation values with the new one received from helpers. Just one question. We are documenting key-value pairs in cf.data.pre only for url-rewriter helpers, but annotations works for all squid helpers. Should we move the related url-rewriter section to a global section? If yes where? For example something like the following in a global section should be enough: The interface for all helpers has been extended to support arbitrary lists of key=value pairs, with the syntax key=value . Some keys have special meaning to Squid, as documented here. Currently Squid understands the following optional key=value pairs received from URL rewriters: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... On 06/19/2014 09:07 PM, Tsantilas Christos wrote: On 06/16/2014 06:36 PM, Alex Rousskov wrote: On 06/15/2014 12:07 AM, Amos Jeffries wrote: On 15/06/2014 4:58 a.m., Alex Rousskov wrote: On 06/11/2014 08:52 AM, Tsantilas Christos wrote: I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. I suggest making all annotations unique (i.e., new values overwrite old ones) because helpers that want to accumulate annotation values can do that by returning old values along with new ones: received by helper: name=v1 returned by helper: name=v1,v2 Please note that the opposite does not work: If annotation values are always accumulated, a helper cannot overwrite/remove the old value. Doing that would mean passing all existing annotations to every helper lookup. Why would that mean the above? AFAICT, the helper should get only the annotations it needs. That need is helper-specific and, hence, is configurable via the various _extras and equivalent directives. That is already supported and does not need to change. Here is the overall sketch for supporting unique annotations: 1. Send the helper the annotations it is configured to get (no changes here). 2. For each unique annotation key received from the helper, remove any old annotation(s) with the same key. 3. Store annotations received from the helper (no changes here). To support explicit annotation deletion, we can adjust #3 to skip key-value pairs with the value equal to '-'. If there is not any objection I will implement this scenario. Looks that this approach is the best and cover more cases than the accumulated Notes values. If someones need to accumulate Note values he can configure squid to send old note value to helper and helper include it in its response. This is simple. If required in the future we can implement a configuration parameter which configures one or more notes as always accumulated. For example: AccumulateHelperNotes status message HTH, Alex.
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
A new patch. Changes: - clt_conn_id renamed to clt_conn_tag - Amos requested fixes. I hope it is OK. Regards, Christos On 06/22/2014 08:43 PM, Tsantilas Christos wrote: Hi all, The attached patch replaces existin annotation values with the new one received from helpers. Just one question. We are documenting key-value pairs in cf.data.pre only for url-rewriter helpers, but annotations works for all squid helpers. Should we move the related url-rewriter section to a global section? If yes where? For example something like the following in a global section should be enough: The interface for all helpers has been extended to support arbitrary lists of key=value pairs, with the syntax key=value . Some keys have special meaning to Squid, as documented here. Currently Squid understands the following optional key=value pairs received from URL rewriters: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... On 06/19/2014 09:07 PM, Tsantilas Christos wrote: On 06/16/2014 06:36 PM, Alex Rousskov wrote: On 06/15/2014 12:07 AM, Amos Jeffries wrote: On 15/06/2014 4:58 a.m., Alex Rousskov wrote: On 06/11/2014 08:52 AM, Tsantilas Christos wrote: I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. I suggest making all annotations unique (i.e., new values overwrite old ones) because helpers that want to accumulate annotation values can do that by returning old values along with new ones: received by helper: name=v1 returned by helper: name=v1,v2 Please note that the opposite does not work: If annotation values are always accumulated, a helper cannot overwrite/remove the old value. Doing that would mean passing all existing annotations to every helper lookup. Why would that mean the above? AFAICT, the helper should get only the annotations it needs. That need is helper-specific and, hence, is configurable via the various _extras and equivalent directives. That is already supported and does not need to change. Here is the overall sketch for supporting unique annotations: 1. Send the helper the annotations it is configured to get (no changes here). 2. For each unique annotation key received from the helper, remove any old annotation(s) with the same key. 3. Store annotations received from the helper (no changes here). To support explicit annotation deletion, we can adjust #3 to skip key-value pairs with the value equal to '-'. If there is not any objection I will implement this scenario. Looks that this approach is the best and cover more cases than the accumulated Notes values. If someones need to accumulate Note values he can configure squid to send old note value to helper and helper include it in its response. This is simple. If required in the future we can implement a configuration parameter which configures one or more notes as always accumulated. For example: AccumulateHelperNotes status message HTH, Alex. Support client connection annotation by helpers via clt_conn_tag=TAG. TCP client connections tagging is useful for faking various forms of connection-based authentication when standard HTTP authentication cannot be used. A URL rewriter or, external ACL helper may mark the authenticated client connection to avoid going through authentication steps during subsequent requests on the same connection and to share connection authentication information with Squid ACLs, other helpers, and logs. After this change, Squid accepts optional clt_conn_tag=TAG pair from a helper and associates the received TAG with the client TCP connection. Squid treats the received clt_conn_tag=TAG pair as a regular annotation, but also keeps it across all requests on the same client connection. A helper may update the client connection TAG value during subsequent requests. Also after this patch the notes comming from helpers replaces any existing note values. This is a Measurement Factory project === modified file 'src/Notes.cc' --- src/Notes.cc 2014-04-30 09:41:25 + +++ src/Notes.cc 2014-06-26 10:55:50 + @@ -14,40 +14,41 @@ * This program is free software; you can redistribute it and/or
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 06/24/2014 08:21 AM, Amos Jeffries wrote: On 24/06/2014 1:44 a.m., Tsantilas Christos wrote: On 06/23/2014 09:50 AM, Amos Jeffries wrote: On 23/06/2014 5:43 a.m., Tsantilas Christos wrote: in src/Notes.cc: * replaceOrAdd iterates over the notes list twice in the event of removals (due to remove() iterating over it from scratch). Please fix this to iterate over the list only once and do an in-place replace if the key is found already there, add if end() is reached. * remove() appears to be unnecessary after the above correction. I can fix a litle replaceOrAdd, but the remove() method still is needed. By what code? I see it only being used by replaceOrAdd(), and only to perform the unnecessary loop. It is fine as a remove() implementation, but to me looks needless at this time. Which loop is unnecessary? You need one loop to iterate over src NotePairs and one loop to iterate over this-entries to find and remove the existing item. The second loop done by remove() method. Am I missing something? The replaceOrAdd() method method will be: void NotePairs::replaceOrAdd(const NotePairs *src) { for (std::vectorNotePairs::Entry *::const_iterator i = src-entries.begin(); i != src-entries.end(); ++i) { remove((*i)-name.termedBuf()); entries.push_back(new NotePairs::Entry((*i)-name.termedBuf(), (*i)-value.termedBuf())); } } in Notes.*: * findFirst() no longer makes sense. There being no second now. - Please rename to find(). This is not true. We can still support multiple values per key name. I am suggesting to not touch this one. Every helper callback is now using Update*() which uses replaceOrAddd() to ensure uniqueness. Not exactly uniqueness, just that the new key/value pairs will replace any existing. I see that the list output directly from the helper may contain duplicates and might be used by the handler instead of the HttpRequest merged list. We may want to change that to make handler behaviour match behaviour of the ACL testing the HttpRequest list. My understanding from discussion is that in future we may want to support multiple values per key note. Maybe using a configuration parameter, or other mechanism. Moreover the helper may return for the same key multiple values: ... key=value1 key=value2 key=value3 Just if already exist the key 'key' in notes list will be replaced by the new key/values pairs. We can change the above and support only one value per key. My opinion is that there is not any strong reason for denying multiple values per key. It may be useful feature. But I have not strong opinion on this. in src/client_side_request.cc: * ClientRequestContext::clientRedirectDone was previously calling SyncNotes to ensure the AccessLogEntry and HttpRequest shared a notes list before updating the request. Order is important here because Sync will kill Squid with an assert if the objects have different notes list objects. Until that is fixed please retain the order of: - sync first, then update the request. * same in ClientRequestContext::clientStoreIdDone It is the same, it is not make difference. UpdateRequestNotes will just copy helper notes to the request notes. The SyncNotes will check if AccessLogEntry and HttpRequest object points to the same NotePairs object. If you are going to retain it the way it is please add a note to the AccessLogEntry::notes member indicating that it must NEVER be set other than by calling SyncNotes() with the current transactions HttpRequest object. Doing so will guarantee crashes after this patch goes in. The AccessLogEntry::notes must never set. This is does not have to do with the order the UpdataRequestNotes and SyncNotes called. The squid will crash even if we call SyncNotes before UpdateRequestNotes. However I have not any problem to change the order. Maybe make more sense and also I can remove the if (!request.notes) block from UpdateRequestNotes. The burden will also be placed on auditors to check this with every notes related patch. (I would rather avoid that extra work TBH). OK, if I will avoid the extra work that way I will call SyncNotes before UpdateRequestNotes :-) Regards, Christos Amos
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 06/23/2014 11:00 PM, Amos Jeffries wrote: On 24/06/2014 8:19 a.m., Alex Rousskov wrote: On 06/23/2014 07:44 AM, Tsantilas Christos wrote: On 06/23/2014 09:50 AM, Amos Jeffries wrote: On 23/06/2014 5:43 a.m., Tsantilas Christos wrote: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... * can we call this client-tag= or connection-tag= or connection-label=. We have ID used internally via InstanceId template members, which are very different to this ID being user-assigned. The clt_conn_id is not a client tag, but it can be a connection-tag or connection-label. client-tag does not work well indeed because the tag is specific to the connection, not the client, for some popular definitions of a client. For example, many differently authenticated clients may share the same tagged connection (when that connection comes from a peer that aggregates traffic from many end-users). My understanding was that a major use-case for this feature was for faking various forms of connection-based authentication when standard HTTP authentication cannot be used. Correct. There is a certain danger in conflating a use case with a feature that resulted from that use case, but you have stated the original motivation correctly. Allowing client A to login client B, C, ...Z etc only makes sense if the client and user are detached concepts at this level. Client being strictly the client-server model type of client for whom any number of users may exist, or proxying for other further away clients in a multi-hop setup. In the scope of this particular feature, only the former interpretation of client applies: The agent making a connection to Squid. Distant users indirectly delivered or proxied to Squid do not apply. Either way by using this tag the helper is assuming that it is relevant to all future requests on the whole connection. ie that there is either one client, or one user which the note represents. Not quite: One user or client may open multiple connections to Squid, each possibly having its own ID/tag/label. The feature is about [client] connection tagging, not client or user tagging. Amos, does clt_conn_tag, clt_conn_mark, clt_conn_label, or clt_conn_note (or their longer, partially spelled out versions) sound better than clt_conn_id to you? Yes. Any of them are better. tag has existing history from exetrnal_acl_type interface that may want to be considered. tag usage in external_acl_type is semantically similar to what we are doing here. Unfortunately, tag is not called clt_request_tag, and it has a different never-reset semantics. mark is used in QoS (TOS/DSCP) connection marking context, a semantically related concept. Note that some supported connection markings can be reset/changed during a single connection lifetime (AFAICT), just like the connection ID being proposed. On the other hand, those markings alter bytes on the wire while the proposed feature does not do that [directly]. note is used for various annotations with a different multi-value semantics (unless we change that ASAP). Also, the proposed feature is setting the _value_ of that annotation without controlling the note key so it is somewhat wrong to call it clt_conn_note. It should be clt_conn_note_value instead, which is probably too long/awkward. None of these suffixes are a perfect fit. I do not have a strong preference, but can suggest clt_conn_tag if nobody has a strong preference or better ideas. HTH, Alex.
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 23/06/2014 5:43 a.m., Tsantilas Christos wrote: Hi all, The attached patch replaces existin annotation values with the new one received from helpers. Just one question. We are documenting key-value pairs in cf.data.pre only for url-rewriter helpers, but annotations works for all squid helpers. Should we move the related url-rewriter section to a global section? If yes where? For example something like the following in a global section should be enough: The interface for all helpers has been extended to support arbitrary lists of key=value pairs, with the syntax key=value . Some keys have special meaning to Squid, as documented here. Currently Squid understands the following optional key=value pairs received from URL rewriters: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... Design issue: * can we call this client-tag= or connection-tag= or connection-label=. We have ID used internally via InstanceId template members, which are very different to this ID being user-assigned. in src/Notes.cc: * replaceOrAdd iterates over the notes list twice in the event of removals (due to remove() iterating over it from scratch). Please fix this to iterate over the list only once and do an in-place replace if the key is found already there, add if end() is reached. * remove() appears to be unnecessary after the above correction. in Notes.*: * findFirst() no longer makes sense. There being no second now. - Please rename to find(). in src/cf.data.pre: * kv-pairs is a format syntax definition used throughout the documentation. Please retain it. If the term needs re-defining that is an issue outside this patch scope. * Please reword Squid understands the following optional key=value pairs received from URL rewriters:. Since there are optional kv-pairs already documented in the rewrite vs redirect vs no-change syntax we need to specify that these are optional extensions. Perhapse this: In addition to the above kv-pairs Squid also understands the following optional kv-pairs received from URL rewriters: * Please update the kv-pair description like so (where FOO is the term tag or label selected to replace ID by the above mentioned design issue) Associates a FOO with the client TCP connection. The FOO is treated as a regular annotation but persists across future requests on the client connection rather than just the current request. A helper may update the FOO during subsequent requests by returning a new kv-pair. * Reference to url_rewrite_extras should be directy after the helper input syntax line. Like so: [channel-ID SP] URL [SP extras]NL see url_rewrite_extras on how to send extras with optional values to the helper. in src/client_side_request.cc: * ClientRequestContext::clientRedirectDone was previously calling SyncNotes to ensure the AccessLogEntry and HttpRequest shared a notes list before updating the request. Order is important here because Sync will kill Squid with an assert if the objects have different notes list objects. Until that is fixed please retain the order of: - sync first, then update the request. * same in ClientRequestContext::clientStoreIdDone * in ClientHttpRequest::doCallouts() please use SBuf::isEmpty() instead of length() to check for existence. Amos
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 06/23/2014 09:50 AM, Amos Jeffries wrote: On 23/06/2014 5:43 a.m., Tsantilas Christos wrote: Hi all, The attached patch replaces existin annotation values with the new one received from helpers. Just one question. We are documenting key-value pairs in cf.data.pre only for url-rewriter helpers, but annotations works for all squid helpers. Should we move the related url-rewriter section to a global section? If yes where? For example something like the following in a global section should be enough: The interface for all helpers has been extended to support arbitrary lists of key=value pairs, with the syntax key=value . Some keys have special meaning to Squid, as documented here. Currently Squid understands the following optional key=value pairs received from URL rewriters: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... Design issue: * can we call this client-tag= or connection-tag= or connection-label=. We have ID used internally via InstanceId template members, which are very different to this ID being user-assigned. The clt_conn_id is not a client tag, but it can be a connection-tag or connection-label. Lets wait Alex to answer on this. in src/Notes.cc: * replaceOrAdd iterates over the notes list twice in the event of removals (due to remove() iterating over it from scratch). Please fix this to iterate over the list only once and do an in-place replace if the key is found already there, add if end() is reached. * remove() appears to be unnecessary after the above correction. I can fix a litle replaceOrAdd, but the remove() method still is needed. The replaceOrAdd() method method will be: void NotePairs::replaceOrAdd(const NotePairs *src) { for (std::vectorNotePairs::Entry *::const_iterator i = src-entries.begin(); i != src-entries.end(); ++i) { remove((*i)-name.termedBuf()); entries.push_back(new NotePairs::Entry((*i)-name.termedBuf(), (*i)-value.termedBuf())); } } in Notes.*: * findFirst() no longer makes sense. There being no second now. - Please rename to find(). This is not true. We can still support multiple values per key name. I am suggesting to not touch this one. in src/cf.data.pre: * kv-pairs is a format syntax definition used throughout the documentation. Please retain it. If the term needs re-defining that is an issue outside this patch scope. * Please reword Squid understands the following optional key=value pairs received from URL rewriters:. Since there are optional kv-pairs already documented in the rewrite vs redirect vs no-change syntax we need to specify that these are optional extensions. Perhapse this: In addition to the above kv-pairs Squid also understands the following optional kv-pairs received from URL rewriters: * Please update the kv-pair description like so (where FOO is the term tag or label selected to replace ID by the above mentioned design issue) Associates a FOO with the client TCP connection. The FOO is treated as a regular annotation but persists across future requests on the client connection rather than just the current request. A helper may update the FOO during subsequent requests by returning a new kv-pair. * Reference to url_rewrite_extras should be directy after the helper input syntax line. Like so: [channel-ID SP] URL [SP extras]NL see url_rewrite_extras on how to send extras with optional values to the helper. Ok for these changes in src/client_side_request.cc: * ClientRequestContext::clientRedirectDone was previously calling SyncNotes to ensure the AccessLogEntry and HttpRequest shared a notes list before updating the request. Order is important here because Sync will kill Squid with an assert if the objects have different notes list objects. Until that is fixed please retain the order of: - sync first, then update the request. * same in ClientRequestContext::clientStoreIdDone It is the same, it is not make difference. UpdateRequestNotes will just copy helper notes to the request notes. The SyncNotes will check if AccessLogEntry and HttpRequest object points to the same NotePairs object. * in ClientHttpRequest::doCallouts() please use SBuf::isEmpty() instead of length() to check for existence. OK. Lets wait others comments before I post a new patch. Amos
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 06/23/2014 07:44 AM, Tsantilas Christos wrote: On 06/23/2014 09:50 AM, Amos Jeffries wrote: On 23/06/2014 5:43 a.m., Tsantilas Christos wrote: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... * can we call this client-tag= or connection-tag= or connection-label=. We have ID used internally via InstanceId template members, which are very different to this ID being user-assigned. The clt_conn_id is not a client tag, but it can be a connection-tag or connection-label. client-tag does not work well indeed because the tag is specific to the connection, not the client, for some popular definitions of a client. For example, many differently authenticated clients may share the same tagged connection (when that connection comes from a peer that aggregates traffic from many end-users). connection-tag does not work well because the proposed feature is specific to the connection coming from an HTTP client. We may decide to tag connections to servers later. The internal instance IDs are not particularly relevant to this discussion about an external interface name IMO, but yes, there are many IDs used in Squid, for various needs. The connectionId_ name of the new data member avoids confusion with the existing InstanceId members IMO, but if the proposed option is renamed, we may need to rename the data member as well. Please note that there is no client in the data member name because the class manages a connection to a client so the client scope should be assumed by default. Amos, does clt_conn_tag, clt_conn_mark, clt_conn_label, or clt_conn_note (or their longer, partially spelled out versions) sound better than clt_conn_id to you? HTH, Alex.
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 24/06/2014 8:19 a.m., Alex Rousskov wrote: On 06/23/2014 07:44 AM, Tsantilas Christos wrote: On 06/23/2014 09:50 AM, Amos Jeffries wrote: On 23/06/2014 5:43 a.m., Tsantilas Christos wrote: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... * can we call this client-tag= or connection-tag= or connection-label=. We have ID used internally via InstanceId template members, which are very different to this ID being user-assigned. The clt_conn_id is not a client tag, but it can be a connection-tag or connection-label. client-tag does not work well indeed because the tag is specific to the connection, not the client, for some popular definitions of a client. For example, many differently authenticated clients may share the same tagged connection (when that connection comes from a peer that aggregates traffic from many end-users). My understanding was that a major use-case for this feature was for faking various forms of connection-based authentication when standard HTTP authentication cannot be used. Allowing client A to login client B, C, ...Z etc only makes sense if the client and user are detached concepts at this level. Client being strictly the client-server model type of client for whom any number of users may exist, or proxying for other further away clients in a multi-hop setup. Either way by using this tag the helper is assuming that it is relevant to all future requests on the whole connection. ie that there is either one client, or one user which the note represents. connection-tag does not work well because the proposed feature is specific to the connection coming from an HTTP client. We may decide to tag connections to servers later. The internal instance IDs are not particularly relevant to this discussion about an external interface name IMO, but yes, there are many IDs used in Squid, for various needs. The connectionId_ name of the new data member avoids confusion with the existing InstanceId members IMO, but if the proposed option is renamed, we may need to rename the data member as well. Please note that there is no client in the data member name because the class manages a connection to a client so the client scope should be assumed by default. Amos, does clt_conn_tag, clt_conn_mark, clt_conn_label, or clt_conn_note (or their longer, partially spelled out versions) sound better than clt_conn_id to you? Yes. Any of them are better. tag has existing history from exetrnal_acl_type interface that may want to be considered. Amos
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 24/06/2014 1:44 a.m., Tsantilas Christos wrote: On 06/23/2014 09:50 AM, Amos Jeffries wrote: On 23/06/2014 5:43 a.m., Tsantilas Christos wrote: Hi all, The attached patch replaces existin annotation values with the new one received from helpers. Just one question. We are documenting key-value pairs in cf.data.pre only for url-rewriter helpers, but annotations works for all squid helpers. Should we move the related url-rewriter section to a global section? If yes where? For example something like the following in a global section should be enough: The interface for all helpers has been extended to support arbitrary lists of key=value pairs, with the syntax key=value . Some keys have special meaning to Squid, as documented here. Currently Squid understands the following optional key=value pairs received from URL rewriters: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... Design issue: * can we call this client-tag= or connection-tag= or connection-label=. We have ID used internally via InstanceId template members, which are very different to this ID being user-assigned. The clt_conn_id is not a client tag, but it can be a connection-tag or connection-label. Lets wait Alex to answer on this. See my answer to Alex. By using this note the helper is explicitly defining that connection == 1 client. in src/Notes.cc: * replaceOrAdd iterates over the notes list twice in the event of removals (due to remove() iterating over it from scratch). Please fix this to iterate over the list only once and do an in-place replace if the key is found already there, add if end() is reached. * remove() appears to be unnecessary after the above correction. I can fix a litle replaceOrAdd, but the remove() method still is needed. By what code? I see it only being used by replaceOrAdd(), and only to perform the unnecessary loop. It is fine as a remove() implementation, but to me looks needless at this time. The replaceOrAdd() method method will be: void NotePairs::replaceOrAdd(const NotePairs *src) { for (std::vectorNotePairs::Entry *::const_iterator i = src-entries.begin(); i != src-entries.end(); ++i) { remove((*i)-name.termedBuf()); entries.push_back(new NotePairs::Entry((*i)-name.termedBuf(), (*i)-value.termedBuf())); } } in Notes.*: * findFirst() no longer makes sense. There being no second now. - Please rename to find(). This is not true. We can still support multiple values per key name. I am suggesting to not touch this one. Every helper callback is now using Update*() which uses replaceOrAddd() to ensure uniqueness. I see that the list output directly from the helper may contain duplicates and might be used by the handler instead of the HttpRequest merged list. We may want to change that to make handler behaviour match behaviour of the ACL testing the HttpRequest list. in src/client_side_request.cc: * ClientRequestContext::clientRedirectDone was previously calling SyncNotes to ensure the AccessLogEntry and HttpRequest shared a notes list before updating the request. Order is important here because Sync will kill Squid with an assert if the objects have different notes list objects. Until that is fixed please retain the order of: - sync first, then update the request. * same in ClientRequestContext::clientStoreIdDone It is the same, it is not make difference. UpdateRequestNotes will just copy helper notes to the request notes. The SyncNotes will check if AccessLogEntry and HttpRequest object points to the same NotePairs object. If you are going to retain it the way it is please add a note to the AccessLogEntry::notes member indicating that it must NEVER be set other than by calling SyncNotes() with the current transactions HttpRequest object. Doing so will guarantee crashes after this patch goes in. The burden will also be placed on auditors to check this with every notes related patch. (I would rather avoid that extra work TBH). Amos
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
Hi all, The attached patch replaces existin annotation values with the new one received from helpers. Just one question. We are documenting key-value pairs in cf.data.pre only for url-rewriter helpers, but annotations works for all squid helpers. Should we move the related url-rewriter section to a global section? If yes where? For example something like the following in a global section should be enough: The interface for all helpers has been extended to support arbitrary lists of key=value pairs, with the syntax key=value . Some keys have special meaning to Squid, as documented here. Currently Squid understands the following optional key=value pairs received from URL rewriters: clt_conn_id=ID Associates the received ID with the client TCP connection. The clt_conn_id=ID pair is treated as a regular annotation but it persists across all transactions on the client connection rather than disappearing after the current request. A helper may update the client connection ID value during subsequent requests by returning a new ID value. To send the connection ID to the URL rewriter, use url_rewrite_extras: url_rewrite_extras clt_conn_id=%{clt_conn_id}note ... On 06/19/2014 09:07 PM, Tsantilas Christos wrote: On 06/16/2014 06:36 PM, Alex Rousskov wrote: On 06/15/2014 12:07 AM, Amos Jeffries wrote: On 15/06/2014 4:58 a.m., Alex Rousskov wrote: On 06/11/2014 08:52 AM, Tsantilas Christos wrote: I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. I suggest making all annotations unique (i.e., new values overwrite old ones) because helpers that want to accumulate annotation values can do that by returning old values along with new ones: received by helper: name=v1 returned by helper: name=v1,v2 Please note that the opposite does not work: If annotation values are always accumulated, a helper cannot overwrite/remove the old value. Doing that would mean passing all existing annotations to every helper lookup. Why would that mean the above? AFAICT, the helper should get only the annotations it needs. That need is helper-specific and, hence, is configurable via the various _extras and equivalent directives. That is already supported and does not need to change. Here is the overall sketch for supporting unique annotations: 1. Send the helper the annotations it is configured to get (no changes here). 2. For each unique annotation key received from the helper, remove any old annotation(s) with the same key. 3. Store annotations received from the helper (no changes here). To support explicit annotation deletion, we can adjust #3 to skip key-value pairs with the value equal to '-'. If there is not any objection I will implement this scenario. Looks that this approach is the best and cover more cases than the accumulated Notes values. If someones need to accumulate Note values he can configure squid to send old note value to helper and helper include it in its response. This is simple. If required in the future we can implement a configuration parameter which configures one or more notes as always accumulated. For example: AccumulateHelperNotes status message HTH, Alex. Support client connection annotation by helpers via clt_conn_id=ID. TCP client connections tagging is useful for faking various forms of connection-based authentication when standard HTTP authentication cannot be used. A URL rewriter or, external ACL helper may mark the authenticated client connection to avoid going through authentication steps during subsequent requests on the same connection and to share connection authentication information with Squid ACLs, other helpers, and logs. After this change, Squid accepts optional clt_conn_id=ID pair from a helper and associates the received ID with the client TCP connection. Squid treats the received clt_conn_id=ID pair as a regular annotation, but also keeps it across all requests on the same client connection. A helper may update the client connection ID value during subsequent requests. To send clt_conn_id=ID pair to a URL rewriter, use url_rewrite_extras with a %{clt_conn_id}note macro. Also after this patch the notes comming from helpers replaces any existing note values. This is a Measurement Factory project === modified file 'src/Notes.cc' --- src/Notes.cc 2014-04-30 09:41:25 + +++ src/Notes.cc 2014-06-22 17:15:26 + @@ -14,40 +14,41 @@ * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by *
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 23/06/2014 5:43 a.m., Tsantilas Christos wrote: Hi all, The attached patch replaces existin annotation values with the new one received from helpers. Just one question. We are documenting key-value pairs in cf.data.pre only for url-rewriter helpers, but annotations works for all squid helpers. Should we move the related url-rewriter section to a global section? If yes where? The external_acl_type and store_id_program document their protocol and keys as well. Those interfaces are one where users commonly are required to create custom helpers so having it in these docs is helpful if a bit odd. The other interfaces (auth, logging, unlinkd, diskd, pinger, etc) which do not document the protocol in detail generally the Squid bundled helpers are used and the docs reference the wiki for the more advanced users. Amos
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 06/16/2014 06:36 PM, Alex Rousskov wrote: On 06/15/2014 12:07 AM, Amos Jeffries wrote: On 15/06/2014 4:58 a.m., Alex Rousskov wrote: On 06/11/2014 08:52 AM, Tsantilas Christos wrote: I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. I suggest making all annotations unique (i.e., new values overwrite old ones) because helpers that want to accumulate annotation values can do that by returning old values along with new ones: received by helper: name=v1 returned by helper: name=v1,v2 Please note that the opposite does not work: If annotation values are always accumulated, a helper cannot overwrite/remove the old value. Doing that would mean passing all existing annotations to every helper lookup. Why would that mean the above? AFAICT, the helper should get only the annotations it needs. That need is helper-specific and, hence, is configurable via the various _extras and equivalent directives. That is already supported and does not need to change. Here is the overall sketch for supporting unique annotations: 1. Send the helper the annotations it is configured to get (no changes here). 2. For each unique annotation key received from the helper, remove any old annotation(s) with the same key. 3. Store annotations received from the helper (no changes here). To support explicit annotation deletion, we can adjust #3 to skip key-value pairs with the value equal to '-'. If there is not any objection I will implement this scenario. Looks that this approach is the best and cover more cases than the accumulated Notes values. If someones need to accumulate Note values he can configure squid to send old note value to helper and helper include it in its response. This is simple. If required in the future we can implement a configuration parameter which configures one or more notes as always accumulated. For example: AccumulateHelperNotes status message HTH, Alex.
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 20/06/2014 6:07 a.m., Tsantilas Christos wrote: On 06/16/2014 06:36 PM, Alex Rousskov wrote: On 06/15/2014 12:07 AM, Amos Jeffries wrote: On 15/06/2014 4:58 a.m., Alex Rousskov wrote: On 06/11/2014 08:52 AM, Tsantilas Christos wrote: I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. I suggest making all annotations unique (i.e., new values overwrite old ones) because helpers that want to accumulate annotation values can do that by returning old values along with new ones: received by helper: name=v1 returned by helper: name=v1,v2 Please note that the opposite does not work: If annotation values are always accumulated, a helper cannot overwrite/remove the old value. Doing that would mean passing all existing annotations to every helper lookup. Why would that mean the above? All helpers may return message= or log=. Replacing existing notes means we have to pass at least those to every helper explicitly. AFAICT, the helper should get only the annotations it needs. That need is helper-specific and, hence, is configurable via the various _extras and equivalent directives. That is already supported and does not need to change. Ideally yes. However a helper requiring previous helpers annotations because it *might* replace them is a new detail created by this proposed replacment model. In the existing model of appending notes the secondary helper only requires what it needs for its core action logics. With the message= example, any 1-byte variance on first helpers message= output will result in extra churn on the second helpers response cache and at least an extra lookup loading on the helper. Here is the overall sketch for supporting unique annotations: 1. Send the helper the annotations it is configured to get (no changes here). 2. For each unique annotation key received from the helper, remove any old annotation(s) with the same key. 3. Store annotations received from the helper (no changes here). To support explicit annotation deletion, we can adjust #3 to skip key-value pairs with the value equal to '-'. If there is not any objection I will implement this scenario. Looks that this approach is the best and cover more cases than the accumulated Notes values. If someones need to accumulate Note values he can configure squid to send old note value to helper and helper include it in its response. This is simple. The only case it adds coverage for is giving helpers ability to erase previous helpers annotations. While reducing the helper response cache HIT ratio. That said, I am okay with the design change. AFAIK there is nobody making much complicated use of helper notes (yet). Amos
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 06/15/2014 12:07 AM, Amos Jeffries wrote: On 15/06/2014 4:58 a.m., Alex Rousskov wrote: On 06/11/2014 08:52 AM, Tsantilas Christos wrote: I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. I suggest making all annotations unique (i.e., new values overwrite old ones) because helpers that want to accumulate annotation values can do that by returning old values along with new ones: received by helper: name=v1 returned by helper: name=v1,v2 Please note that the opposite does not work: If annotation values are always accumulated, a helper cannot overwrite/remove the old value. Doing that would mean passing all existing annotations to every helper lookup. Why would that mean the above? AFAICT, the helper should get only the annotations it needs. That need is helper-specific and, hence, is configurable via the various _extras and equivalent directives. That is already supported and does not need to change. Here is the overall sketch for supporting unique annotations: 1. Send the helper the annotations it is configured to get (no changes here). 2. For each unique annotation key received from the helper, remove any old annotation(s) with the same key. 3. Store annotations received from the helper (no changes here). To support explicit annotation deletion, we can adjust #3 to skip key-value pairs with the value equal to '-'. HTH, Alex.
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 15/06/2014 4:58 a.m., Alex Rousskov wrote: On 06/11/2014 08:52 AM, Tsantilas Christos wrote: I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. I suggest making all annotations unique (i.e., new values overwrite old ones) because helpers that want to accumulate annotation values can do that by returning old values along with new ones: received by helper: name=v1 returned by helper: name=v1,v2 Please note that the opposite does not work: If annotation values are always accumulated, a helper cannot overwrite/remove the old value. Doing that would mean passing all existing annotations to every helper lookup. Which would seriously degrade the helper result caching. Amos
Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID
On 06/11/2014 08:52 AM, Tsantilas Christos wrote: I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. I suggest making all annotations unique (i.e., new values overwrite old ones) because helpers that want to accumulate annotation values can do that by returning old values along with new ones: received by helper: name=v1 returned by helper: name=v1,v2 Please note that the opposite does not work: If annotation values are always accumulated, a helper cannot overwrite/remove the old value. If, in the future, we discover a need to support accumulating annotations, we can add a squid.conf option to enumerate annotation keys that should be accumulated. IMO, if others agree that annotations should not automatically accumulate, that change can be done as a part of this patch (because it needs the uniqueness exception/code anyway). If others disagree, the patch can go in as is while we wait for the need to add a configuration option to control uniqueness. Cheers, Alex.
[PATCH] Support client connection annotation by helpers via clt_conn_id=ID
TCP client connections tagging is useful for faking various forms of connection-based authentication when standard HTTP authentication cannot be used. A URL rewriter or, external ACL helper may mark the authenticated client connection to avoid going through authentication steps during subsequent requests on the same connection and to share connection authentication information with Squid ACLs, other helpers, and logs. After this change, Squid accepts optional clt_conn_id=ID pair from a helper and associates the received ID with the client TCP connection. Squid treats the received clt_conn_id=ID pair as a regular annotation, but also keeps it across all requests on the same client connection. A helper may update the client connection ID value during subsequent requests. This patch documents the clt_conn_id key=value pair in cf.data.pre file only for url rewriters. Because annotations are common to all helpers we may want to make a special section at the beginning of cf.data.per for all helpers. Suggestions are welcome. I must also note that this patch adds an inconsistency. All annotation key=values pairs received from helpers, accumulated to the existing key notes values. The clt_conn_id=Id pair is always unique and replaces the existing clt_conn_id=Id annotation pair. We may want to make all annotations unique, or maybe implement a configuration mechanism to define which annotations are overwriting their previous values and which appending the new values. This is a Measurement Factory project Regards, Christos Support client connection annotation by helpers via clt_conn_id=ID. TCP client connections tagging is useful for faking various forms of connection-based authentication when standard HTTP authentication cannot be used. A URL rewriter or, external ACL helper may mark the authenticated client connection to avoid going through authentication steps during subsequent requests on the same connection and to share connection authentication information with Squid ACLs, other helpers, and logs. After this change, Squid accepts optional clt_conn_id=ID pair from a helper and associates the received ID with the client TCP connection. Squid treats the received clt_conn_id=ID pair as a regular annotation, but also keeps it across all requests on the same client connection. A helper may update the client connection ID value during subsequent requests. To send clt_conn_id=ID pair to a URL rewriter, use url_rewrite_extras with a %{clt_conn_id}note macro. This is a Measurement Factory project === modified file 'src/Notes.cc' --- src/Notes.cc 2014-04-30 09:41:25 + +++ src/Notes.cc 2014-06-10 15:20:30 + @@ -14,40 +14,41 @@ * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. * */ #include squid.h #include AccessLogEntry.h #include acl/FilledChecklist.h #include acl/Gadgets.h +#include client_side.h #include ConfigParser.h #include globals.h #include HttpReply.h #include HttpRequest.h #include SquidConfig.h #include Store.h #include StrList.h #include algorithm #include string Note::Value::~Value() { aclDestroyAclList(aclList); } Note::Value::Pointer Note::addValue(const String value) { Value::Pointer v = new Value(value); @@ -186,40 +187,54 @@ return value.size() ? value.termedBuf() : NULL; } const char * NotePairs::findFirst(const char *noteKey) const { for (std::vectorNotePairs::Entry *::const_iterator i = entries.begin(); i != entries.end(); ++i) { if ((*i)-name.cmp(noteKey) == 0) return (*i)-value.termedBuf(); } return NULL; } void NotePairs::add(const char *key, const char *note) { entries.push_back(new NotePairs::Entry(key, note)); } void +NotePairs::remove(const char *key) +{ +std::vectorNotePairs::Entry *::iterator i = entries.begin(); +while(i != entries.end()) { +if ((*i)-name.cmp(key) == 0) { +delete *i; +i = entries.erase(i); +} else { +++i; +} +} +} + +void NotePairs::addStrList(const char *key, const char *values) { String strValues(values); const char *item; const char *pos = NULL; int ilen = 0; while (strListGetItem(strValues, ',', item, ilen, pos)) { String v;