Re: [RFC] connections logging
On 24/06/2014 12:26 p.m., Alex Rousskov wrote: On 06/20/2014 03:04 AM, Amos Jeffries wrote: I am playing with the idea of adding a new log file to record just the connections handled by Squid (rather than the HTTP request/reply transactions). This log would include connections opened but never used, port details, connection lifetimes, re-use counts on persistent connections, SSL/TLS presence/version (or not), and other connection details we find a need for later. The driver behind this is to help resolve a growing amount of user queries regarding happy eyeballs idle connections and broken TLS connections. We are also adding other potentially never-used connections ourselves with the standby pools. A couple of days ago, I came across another use case for logging connections unrelated to Happy Eyeballs. It sounds like this is going to be a generally useful feature. My use case is still being clarified, but here are the already known non-obvious requirements: 1. Logging when the connection is received (unlike transactions that are logged when the transaction is completed). 2. Logging client-to-Squid connections. 3. Introduction of some kind of unique connection ID that can be associated with HTTP/etc transactions on that connection, correlating access.log and connection.log entries. [ uniqueness scope is being clarified, but I hope it would just be a single worker lifetime ] For IDs I was thinking of just emitting the MasterXaction::id value sent in by TcpAcceptor. Assumign this is defined as the TCP transaction (see below). Let's coordinate our efforts! Sure. The Squid native access.log appears unsuitable for adjustment given its widely used format and the number of details to be logged. Agreed. We should support a similarly configurable log format and other existing logging knobs but use a different, dedicated log (or logs if we want to record from-Squid connections and to-Squid connections differently). Reusable logformat codes such as %la and %lp should be reused, but new ones will probably be needed as well. Ack. There is also a useful side effect of MasterXaction. At present the TcpAcceptor generates a MasterXaction object to record initial connection details for the handler, this creates a bit of complexity in the recipient regarding whether the MasterXactino object applies to a future HTTP request or one which is currently being parsed. This could be simplified if the connection was a transaction and the HTTP requests on it each a different transaction spawned from the connection MasterXaction when parsing a new requst. I am not sure what you mean by if the connection was a transaction. I mean defining the TCP actions of opening a socket/connnection, its entire lifetime, and close() as being one transaction. With the TcpAcceptor produced MasterXaction representing that cycle. HTTP requests arriving on the connection being separate transactions spawned by the parsing actions. (unfortunately the read(2) does not always nicely match a parse event identifying the exact HTTP request start). So we have two new transaction types there. TCP transaction and HTTP transaction, to go alongside ICAP and eCAP transaction etc. The MasterXaction is usually associated with at least one connection (via the small t transaction using that connection), and one connection is usually associated with at least one MasterTransaction, but transactions and connections are completely different concepts at Squid level IMO. I doubt we should try to merge the concept of a transaction and a connection somehow. Generating the HTTP transaction MasterXaction object (child of MasterXaction?) by cloning it from the TCP connection one represents that relationship. The clone could be seeded with the parents id value to persist that info for later use (eg your start of HTTP transaction log entry). MasterTransaction is created at connection acceptance time because that is when the first HTTP/etc. transaction starts (parsing headers is not the first step) and, hence, that is when the master transaction starts. Hmm. We have some issue here satisfying the people who want HTTP transation only to represent the time of request+reply versus browser happy eyeballs (and now Squid) opening standby connections with long durations before first byte. That is one reason supporting separate transaction MasterXaction for TCP and HTTP. We could make the connection acceptance event itself a transaction (an object of some new AcceptXaction class inherited from a generic Transaction class and associated with the corresponding Master Transaction object just like HTTP and ICAP transactions should be). I do not yet see what that would actually buy us except more classes to worry about, but, at the very least, it would not be conceptually wrong. I agree accept by itself is probably useless. The longer TCP lifecycle on the other hand (see above) has some use for
Re: [RFC] connections logging
On 24/06/2014 4:41 p.m., Kinkie wrote: On Tue, Jun 24, 2014 at 2:26 AM, Alex Rousskov rouss...@measurement-factory.com wrote: On 06/20/2014 03:04 AM, Amos Jeffries wrote: I am playing with the idea of adding a new log file to record just the connections handled by Squid (rather than the HTTP request/reply transactions). This log would include connections opened but never used, port details, connection lifetimes, re-use counts on persistent connections, SSL/TLS presence/version (or not), and other connection details we find a need for later. The driver behind this is to help resolve a growing amount of user queries regarding happy eyeballs idle connections and broken TLS connections. We are also adding other potentially never-used connections ourselves with the standby pools. A couple of days ago, I came across another use case for logging connections unrelated to Happy Eyeballs. It sounds like this is going to be a generally useful feature. My use case is still being clarified, but here are the already known non-obvious requirements: I have an use case, as well: timing logging. It would be useful to have a chance to log the timing of certain key moments in a request's processing path. Things like accept, end of slow acl matching, end of dns resolution(s), connected to uplink, and so on. This could help administrators identify congestion issues in their infrastructure. Kinkie And that use case gives us a good potential name for it. event.log ? Amos
Re: [RFC] connections logging
On 06/24/2014 12:55 AM, Amos Jeffries wrote: On 24/06/2014 4:41 p.m., Kinkie wrote: On Tue, Jun 24, 2014 at 2:26 AM, Alex Rousskov wrote: On 06/20/2014 03:04 AM, Amos Jeffries wrote: The driver behind this is to help resolve a growing amount of user queries regarding happy eyeballs idle connections and broken TLS connections. We are also adding other potentially never-used connections ourselves with the standby pools. A couple of days ago, I came across another use case for logging connections unrelated to Happy Eyeballs. It sounds like this is going to be a generally useful feature. My use case is still being clarified, but here are the already known non-obvious requirements: I have an use case, as well: timing logging. It would be useful to have a chance to log the timing of certain key moments in a request's processing path. Things like accept, end of slow acl matching, end of dns resolution(s), connected to uplink, and so on. This could help administrators identify congestion issues in their infrastructure. And that use case gives us a good potential name for it. event.log ? I was also going to say that if we want to log more than just connection-specific events, then we should expand the proposal to accommodate arbitrary events. Needless to say, the initial implementation may only support one or two event kinds, but the design should allow for other future event kinds. Other known use cases include logging initial HIT or MISS detection: I had to patch Squid recently to enable that event logging (for legacy reasons the admin could not use ToS marks). This kind of circles back to the debugging discussion we are avoiding on another thread -- in both cases, we need an internal API to signal useful events and the corresponding configuration knobs to control their logging. The line between developer-useful (i.e., debugging) events and admin-useful events is blurry, but we do not have to define it if we just focus on logging admin-useful stuff. I would not be surprised if this eventually morphs into a better debugging interface as well. Cheers, Alex.
Re: [RFC] connections logging
On 06/24/2014 12:33 AM, Amos Jeffries wrote: On 24/06/2014 12:26 p.m., Alex Rousskov wrote: On 06/20/2014 03:04 AM, Amos Jeffries wrote: I am playing with the idea of adding a new log file to record just the connections handled by Squid (rather than the HTTP request/reply transactions). This log would include connections opened but never used, port details, connection lifetimes, re-use counts on persistent connections, SSL/TLS presence/version (or not), and other connection details we find a need for later. The driver behind this is to help resolve a growing amount of user queries regarding happy eyeballs idle connections and broken TLS connections. We are also adding other potentially never-used connections ourselves with the standby pools. A couple of days ago, I came across another use case for logging connections unrelated to Happy Eyeballs. It sounds like this is going to be a generally useful feature. My use case is still being clarified, but here are the already known non-obvious requirements: 1. Logging when the connection is received (unlike transactions that are logged when the transaction is completed). 2. Logging client-to-Squid connections. 3. Introduction of some kind of unique connection ID that can be associated with HTTP/etc transactions on that connection, correlating access.log and connection.log entries. [ uniqueness scope is being clarified, but I hope it would just be a single worker lifetime ] For IDs I was thinking of just emitting the MasterXaction::id value sent in by TcpAcceptor. Assumign this is defined as the TCP transaction (see below). Yes, MasterXaction ID can be copied to be used as a connection ID, but there is no good reason for such a reuse IMO. We need a Connection ID. IMO, it would be better to just add a Connection-dedicated InstanceId (especially if it satisfies the yet-unknown uniqueness scope requirements). Yes, an InstanceId-based ID would work for the worker lifetime uniqueness scope, and I hope that is all we need to support. I am not sure what you mean by if the connection was a transaction. I mean defining the TCP actions of opening a socket/connnection, its entire lifetime, and close() as being one transaction. With the TcpAcceptor produced MasterXaction representing that cycle. I strongly disagree that we should change the MasterXaction definition like that. We already have a Connection class to represent what you want, I think. Why drag Master Transaction into this, abandoning its current role of aggregating information about transactions caused by a single HTTP request (or equivalent)? HTTP transactions need a Master Transaction to correlate an HTTP request received from a client with the corresponding HTTP request(s) sent to the server, with the corresponding HTTP response(s) received from the server, with the corresponding ICAP messages, DNS activity, etc. A Master Transaction usually relates to several Connections as well, but Connection is not a Master Transaction. So we have two new transaction types there. TCP transaction and HTTP transaction, to go alongside ICAP and eCAP transaction etc. It is a bad idea to use the same terminology and class hierarchy for low-level transport concepts such as TCP and for high level HTTP transactions executed on top of a transport connection. We should just use Connection instead of TCP transaction. If we need TcpConnection, we can add that. The Connection hierarchy is separate from the Xaction hierarchy, but Xactions may use Connections, of course. The MasterXaction is usually associated with at least one connection (via the small t transaction using that connection), and one connection is usually associated with at least one MasterTransaction, but transactions and connections are completely different concepts at Squid level IMO. I doubt we should try to merge the concept of a transaction and a connection somehow. Generating the HTTP transaction MasterXaction object (child of MasterXaction?) by cloning it from the TCP connection one represents that relationship. The clone could be seeded with the parents id value to persist that info for later use (eg your start of HTTP transaction log entry). I am afraid we have a completely different view of what MasterXaction is and what protocol transactions are (or should be). It is possible that I am completely misinterpreting your statement about cloning and seeding, but if you are calling for creation of some abstract MasterXaction class that will be a parent for various specific FooMasterXaction classes such as HttpMasterXaction and TcpMasterXaction, then I see no need for (and a lot of harm from) that level of complexity. I do not know how to avoid another prolonged discussion here, but perhaps we can start with definitions of the proposed classes (instead of discussing cloning and seeding which are just low-level details)? The classes I am talking are, roughly: * MasterXaction: [Historical/aggregated
Re: [RFC] connections logging
On 06/20/2014 03:04 AM, Amos Jeffries wrote: I am playing with the idea of adding a new log file to record just the connections handled by Squid (rather than the HTTP request/reply transactions). This log would include connections opened but never used, port details, connection lifetimes, re-use counts on persistent connections, SSL/TLS presence/version (or not), and other connection details we find a need for later. The driver behind this is to help resolve a growing amount of user queries regarding happy eyeballs idle connections and broken TLS connections. We are also adding other potentially never-used connections ourselves with the standby pools. A couple of days ago, I came across another use case for logging connections unrelated to Happy Eyeballs. It sounds like this is going to be a generally useful feature. My use case is still being clarified, but here are the already known non-obvious requirements: 1. Logging when the connection is received (unlike transactions that are logged when the transaction is completed). 2. Logging client-to-Squid connections. 3. Introduction of some kind of unique connection ID that can be associated with HTTP/etc transactions on that connection, correlating access.log and connection.log entries. [ uniqueness scope is being clarified, but I hope it would just be a single worker lifetime ] Let's coordinate our efforts! The Squid native access.log appears unsuitable for adjustment given its widely used format and the number of details to be logged. Agreed. We should support a similarly configurable log format and other existing logging knobs but use a different, dedicated log (or logs if we want to record from-Squid connections and to-Squid connections differently). Reusable logformat codes such as %la and %lp should be reused, but new ones will probably be needed as well. There is also a useful side effect of MasterXaction. At present the TcpAcceptor generates a MasterXaction object to record initial connection details for the handler, this creates a bit of complexity in the recipient regarding whether the MasterXactino object applies to a future HTTP request or one which is currently being parsed. This could be simplified if the connection was a transaction and the HTTP requests on it each a different transaction spawned from the connection MasterXaction when parsing a new requst. I am not sure what you mean by if the connection was a transaction. The MasterXaction is usually associated with at least one connection (via the small t transaction using that connection), and one connection is usually associated with at least one MasterTransaction, but transactions and connections are completely different concepts at Squid level IMO. I doubt we should try to merge the concept of a transaction and a connection somehow. MasterTransaction is created at connection acceptance time because that is when the first HTTP/etc. transaction starts (parsing headers is not the first step) and, hence, that is when the master transaction starts. We could make the connection acceptance event itself a transaction (an object of some new AcceptXaction class inherited from a generic Transaction class and associated with the corresponding Master Transaction object just like HTTP and ICAP transactions should be). I do not yet see what that would actually buy us except more classes to worry about, but, at the very least, it would not be conceptually wrong. HTH, Alex.
Re: [RFC] connections logging
On Tue, Jun 24, 2014 at 2:26 AM, Alex Rousskov rouss...@measurement-factory.com wrote: On 06/20/2014 03:04 AM, Amos Jeffries wrote: I am playing with the idea of adding a new log file to record just the connections handled by Squid (rather than the HTTP request/reply transactions). This log would include connections opened but never used, port details, connection lifetimes, re-use counts on persistent connections, SSL/TLS presence/version (or not), and other connection details we find a need for later. The driver behind this is to help resolve a growing amount of user queries regarding happy eyeballs idle connections and broken TLS connections. We are also adding other potentially never-used connections ourselves with the standby pools. A couple of days ago, I came across another use case for logging connections unrelated to Happy Eyeballs. It sounds like this is going to be a generally useful feature. My use case is still being clarified, but here are the already known non-obvious requirements: I have an use case, as well: timing logging. It would be useful to have a chance to log the timing of certain key moments in a request's processing path. Things like accept, end of slow acl matching, end of dns resolution(s), connected to uplink, and so on. This could help administrators identify congestion issues in their infrastructure. Kinkie