[RFC] transaction state

Amos Jeffries Fri, 20 May 2011 00:31:31 -0700

For starters. This is probably 3.3. We can continue to hack our way around data passing limitations in 3.2. Although with Alex's emphasis on optimization going into 3.2, some of this may help that.


THE RANT:

The more I've been looking at the client-side the more I see duplicated copies and re-copies of transaction details. I know you all agree there is too much of it.


What I've seen myself...

  ACLChecklist - storing a copy of as many currently known transaction details as people have asked to be checked so far.

  HttpRequest - storing a copy of *almost* all transaction details.

  ClientHttpRequest - copy of the transaction details needed by client-side.

  ConnStateData - copy of the transaction TCP details and some HttpRequest details needed by pinning, 1xx and other weird state handling.

  AccessLogEntry - copy of almost all known transaction details.

... and by "copy" I mean complete duplicate. xstrdup() galore etc.

With a bunch of code doing nothing but checking that the local copy is up to date with whatever other copy or function parameter it sourced the detail from. A bunch of other code *assuming* that it's getting the right details (sometimes wrongly).

* I have not yet had a good look at the reply handling path in detail. Overall it seems to be using the request-path objects in reverse. So no worse or better.




IDEAS:

Note that ClientHttpRequest has a member copy of AccessLogEntry. This is *already* available and unique on a per-request basis from the very start of the HTTP request arrival and parsing. It persists across the whole transaction lifetime and is used for logging at the end.

I propose that the first thing we do is clean up its internal structure design, to make sure it has all the fields we will need in the next step.

I propose then to rename it as a general-purpose transaction storage area (TransactionDetails?), to avoid people ignoring it as a "logging-only" thing.

I propose then to roll each step/object along the transaction pathway over to using it as their primary storage area for transaction details and history.

 - incremental, so it can be done in the background for low impact, starting immediately.

 - will soon lead to removal of several useless copies.

 - will mean components/Jobs that have been updated are guaranteed to have *all* details for the current state of the transaction available should they need it.
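To make the "primary storage area" idea concrete, here is a minimal sketch of two steps on the pathway reading the same object instead of each keeping an xstrdup()ed copy. Everything in it is an assumption for illustration only: the cut-down TransactionDetails, the AclCheckSketch and LoggerSketch classes, and the use of std::shared_ptr where Squid would more likely use its own RefCount or cbdata pointers.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical, cut-down stand-in for the proposed shared object.
struct TransactionDetails {
    std::string clientIp; // written once by the connection code
    std::string ident;    // filled in later, if an ident lookup runs
};
typedef std::shared_ptr<TransactionDetails> TransactionDetailsPointer;

// Stand-in for an ACL check: it reads the shared object and keeps no copy.
class AclCheckSketch {
public:
    explicit AclCheckSketch(const TransactionDetailsPointer &d) : details(d) {
        assert(details != nullptr);
    }
    bool identKnown() const { return !details->ident.empty(); }
private:
    TransactionDetailsPointer details;
};

// Stand-in for the logging step at the end of the transaction.
class LoggerSketch {
public:
    explicit LoggerSketch(const TransactionDetailsPointer &d) : details(d) {
        assert(details != nullptr);
    }
    std::string line() const {
        return details->clientIp + " " +
               (details->ident.empty() ? "-" : details->ident);
    }
private:
    TransactionDetailsPointer details;
};
```

Because both objects hold a pointer to the same storage, a detail written by one step is visible to every later step, and there is no local copy left to check for staleness.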

NOTE: little fine-detail processing pathways like ident will only need a selected refcount/cbdata/locked sub-child of the whole slab object. This is fine and will help drop dependencies. Thus the proposed modular hierarchy structure below.
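One possible way to hand a pathway like ident only its sub-child while still locking the whole object is sketched here. It assumes std::shared_ptr and its aliasing constructor rather than Squid's RefCount/cbdata, and the names TcpDetailsSketch, TransactionDetailsSketch, tcpPartOf() and recordIdent() are made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical sub-child: the only part an ident lookup needs to know about.
struct TcpDetailsSketch {
    std::string clientIp;
    std::string ident;
};

// Hypothetical whole object; http, adapt, etc. would sit alongside tcp.
struct TransactionDetailsSketch {
    TcpDetailsSketch tcp;
};

// Hand out only the TCP part. The aliasing constructor shares ownership
// with 'whole', so the full object stays alive while the part is in use.
std::shared_ptr<TcpDetailsSketch>
tcpPartOf(const std::shared_ptr<TransactionDetailsSketch> &whole)
{
    assert(whole != nullptr);
    return std::shared_ptr<TcpDetailsSketch>(whole, &whole->tcp);
}

// Stand-in for the ident pathway: it depends on TcpDetailsSketch only.
void recordIdent(const std::shared_ptr<TcpDetailsSketch> &tcp,
                 const std::string &user)
{
    tcp->ident = user;
}
```

The ident code never sees the definition of the full transaction object, which is the dependency drop the note is after.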

To kick-start things this is what I've been thinking we need its structure to look like:


class TransactionDetails {

 class TimeDetails {
   // all the timing and wait stats we can dream up.
   // for the transaction as a whole.
   // specific times stay in their own component.
 } time;

 // Details about the TCP links used by this transaction.
 class TcpDetails {
   struct { FD, ip, port, eui, ident } client;
   struct s_ { FD, ip, port, eui, ident } server;
   vector<s_> serverHistory;
   // NP: not sure if we want a server-side history
   // if so it would go here listing all outbound attempts.
 } tcp;

 class SquidInternalDetails {
    // which worker/disker served this request?
    ext_acl; // details from external ACL tested
    auth; // details from proxy-auth helpers
    status; // status flags hit/miss/peer/aborted/timeout etc
    hier; // hierarchy details, HierarchyLogEntry
 } squid;

 // Details about the ICP used by this transaction.
 class IcpDetails {
    icp_opcode opcode;
 } icp;

 // Details about the HTCP used by this transaction.
 class HtcpDetails {
    htcp_opcode? opcode;
 } htcp;

 // Details about the HTTP used by this transaction.
 class HttpDetails {
   // currently to be used request/reply.
   // points to the later specific objects
   HttpRequestPointer request;
   HttpReplyPointer reply;

   // specific state objects
   HttpRequestPointer original_request; // original received
   HttpRequestPointer adapted_request; // after adaptation

   HttpReplyPointer original_reply; // original received
   HttpReplyPointer adapted_reply; // after adaptation
   // NP: original reply may be nil if non-HTTP source.
   //     in which case...
   HttpReplyPointer generated_reply; // pre-adaptation.
 } http;

 // Details about the adaptation used by this transaction.
 class AdaptationDetails {
    { ...} icap; // icap state and history, pretty much as-is
    {...} ecap; // ecap state if we find anything to log.
 } adapt;

 // Details about the FTP used by this transaction.
 class FtpDetails {
   vector<String> protoLog; // FTP msgs used in this fetch.
 } ftp;

 // ... other entries similar to FTP for gopher, wais, etc.
}
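As a rough illustration of how the "currently to be used" pointer in HttpDetails could track the specific state objects, here is a small compilable sketch. It is an assumption about intended behaviour, not existing Squid code: RequestSketch stands in for HttpRequest, std::shared_ptr stands in for HttpRequestPointer, and setOriginal()/setAdapted() are hypothetical helpers.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for Squid's HttpRequest.
struct RequestSketch {
    std::string url;
};
typedef std::shared_ptr<RequestSketch> RequestSketchPointer;

struct HttpDetailsSketch {
    RequestSketchPointer request;          // the one to use right now
    RequestSketchPointer original_request; // original received
    RequestSketchPointer adapted_request;  // after adaptation, may stay nil

    // Called when the request has been parsed.
    void setOriginal(const RequestSketchPointer &r) {
        assert(r != nullptr);
        original_request = r;
        request = r;
    }

    // Called if adaptation produced a new request.
    void setAdapted(const RequestSketchPointer &r) {
        assert(r != nullptr);
        adapted_request = r;
        request = r; // later stages now see the adapted version
    }
};
```

Code that only cares about "the request" uses the first pointer, while logging can still report both the original and the adapted version without anybody having copied either.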

NOTE that "headers", "private" and "cache" are gone.
 - "headers" blobs are part of HttpRequest (or should be)
 - "private" is duplicate of HttpRequest details

 - "cache" is split into whichever component is actually relevant for the particular field.


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.12
  Beta testers wanted for 3.2.0.7 and 3.1.12.1
