Re: [PATCH v2] Do not release entries that may be kept in local memory cache.

2012-07-06 Thread Dmitry Kurochkin
Hi Amos.

Amos Jeffries squ...@treenet.co.nz writes:

 On 6/07/2012 6:29 a.m., Alex Rousskov wrote:
 On 07/05/2012 11:38 AM, Dmitry Kurochkin wrote:
 Alex Rousskov writes:
 I suggest adding StoreController::keepInLocalMemory(e) that will
 correctly check whether memory caching is enabled at all and then call
 e.memoryCachable() to arrive at the final answer. The new method will be
 called from StoreEntry::trimMemory(). Will that work better?

 Yes, I think it will.

 But perhaps keepInLocalMemory() is not the best name
 It is not a good one indeed, but I cannot think of a better one. I did try!


 given that it will check both local and shared memory cache?
 That aspect is fine because the purpose of the method is to determine
 whether the entry should be kept in _local_ memory for now because a
 local or shared memory caches will need it there later.


 Also, keepInLocalMemory()
 sounds like it should make a final decision whether to cache an entry
 which would not be true in our case.
 That is the aspect I could not fix without resorting to something very
 long like mayBeNeededInLocalMemoryLater(). Feel free to use that one if
 you prefer!


 How about adding StoreController::memoryCacheEnabled() method?  It would
 have a clear meaning.  StoreEntry::trimMemory() would then check for
 (StoreController::memoryCacheEnabled()  StoreEntry::memoryCachable()).
 What do you think?
 I like my approach better because

 a) These decisions should not be made by StoreEntry, in
 StoreEntry::trimMemory() code. They should be made by Store [modules].
 They will become increasingly complex and module-dependent if we start
 optimizing this.

 b) Even today, the decision should be more complex than memory cache is
 enabled. For shared memory cache, it should involve checking whether
 the entry may fit. See MemStore::considerKeeping() and
 MemStore::willFit() call in particular.  That check (as is or adjusted)
 should be applied here today by adding MemStore::keepInLocalMemory() and
 calling that from StoreController::keepInLocalMemory().

 (b) is necessary to avoid keeping huge entries in RAM when they have no
 chance of being stored in a shared memory cache later. I should have
 mentioned (b) earlier, sorry.

 It sounds like another analysis of the places calling trimMemory() is 
 needed. So that trimMemory is isolated to just trim like its name 
 suggests. The maybeTrim() decisions being outside functionality (in 
 StoreController?) and which is the only place calling e.trimMemory().


What is the benefit given that maybeTrim() is not used separately?

 Looking at this v2 patch the Bug comment and IN_MEMORY test looks like 
 it should be inside the keepInLocalMemory() instead. Since things 
 already *in* local memory are obviously allowed there somehow. That is 
 probably wrong, but is clearly implied by the symbol names around it.


IN_MEMORY indicates that entry is cached in local memory, so it probably
belongs to SoreControlle::keepInLocalMemoryCache().  On the other hand,
I am not sure this check is needed at all.  I am ok with moving it.  But
keeping it where it is now with the comment makes it clear where it came
from in case we want to remove it later.

Regards,
  Dmitry

 Amos


[RFC] Drop --enable-ntlm-fail-open

2012-07-06 Thread Amos Jeffries
While building the helper changes it has become clear that the 
--enable-ntlm-fail-open feature has been half missing for quite some time.
The SMB helper sends the LD code back to Squid when the directive is 
compiled in, but there is no Squid code handling such responses, back as 
far as squid-2.


Given that in all the time this appears to be broken we have only had 
one complaint, form someone who was only experimenting with NTLM 
options. I think its overdue for complete removal.


Amos



Re: [RFC] Drop --enable-ntlm-fail-open

2012-07-06 Thread Kinkie
On Fri, Jul 6, 2012 at 2:53 PM, Amos Jeffries squ...@treenet.co.nz wrote:
 While building the helper changes it has become clear that the
 --enable-ntlm-fail-open feature has been half missing for quite some time.
 The SMB helper sends the LD code back to Squid when the directive is
 compiled in, but there is no Squid code handling such responses, back as far
 as squid-2.

 Given that in all the time this appears to be broken we have only had one
 complaint, form someone who was only experimenting with NTLM options. I
 think its overdue for complete removal.

I agree.
Almost 10 years have passed since it ceased being useful.


-- 
/kinkie


Re: [PATCH v2] Do not release entries that may be kept in local memory cache.

2012-07-06 Thread Alex Rousskov
On 07/05/2012 09:02 PM, Amos Jeffries wrote:

 It sounds like another analysis of the places calling trimMemory() is
 needed.

StoreEntry::trimMemory() is only called from StoreEntry::swapOut(),
before we are starting to swap data out (or skipping swap because the
entry is not swappable). Ideally, it should also be called from places
where we unlock entry's memory as well, but that is the way it was
written. Changing that is outside this bug scope and may be difficult.


 So that trimMemory is isolated to just trim like its name
 suggests. The maybeTrim() decisions being outside functionality (in
 StoreController?) and which is the only place calling e.trimMemory().

First of all, I would ignore the maybe aspect. The caller does not
rely on any trimming actually happening and some computations of how
much to trim (if anything!) happen deeper inside MemObject. We can add
maybe to a bunch of methods so that their names reflect the
possibility that nothing would be trimmed. Those old methods are present
in all Squid versions. I think renaming them just to polish the names
would cause more porting headaches than it would solve problems.


If I get your primary idea correctly, then instead of
StoreEntry::swapOut() calling e.trimMemory() it will call new
StoreController::trimMemory(e) and the controller would decide whether
to proceed with trimming and, if yes, will call e.trimMemory().
StoreEntry::trimMemory() will not have any new conditions added then.

If that is what you are suggesting, it sounds good to me, and I can do
that during commit of v3 patch (it would not change the functionality or
the order of the checks; just move a couple of checks to a different
class/method).


 Looking at this v2 patch the Bug comment and IN_MEMORY test looks like
 it should be inside the keepInLocalMemory() instead. Since things
 already *in* local memory are obviously allowed there somehow. That is
 probably wrong, but is clearly implied by the symbol names around it.

They were allowed into local memory because Squid cannot do I/O without
using local memory :-). Every byte read is placed in local memory first.

The issue here is whether those local memory bytes can be forgotten
(i.e., the entry memory can be trimmed). The answer depends on the entry
state and on the memory caching module state/functionality. The latter
aspect was ignored for a while, until this patch.


HTH,

Alex.


Re: [PATCH] Add request_header_add option and [request|reply]_header_* manglers fixes

2012-07-06 Thread Tsantilas Christos
On 07/01/2012 02:25 PM, Amos Jeffries wrote:
 Thank you. In most places the problem is that AccessLogEntry is the
 closest thing we have to a XactionData object for the whole HTTP
 transaction. (I still have patches waiting around to polish up and
 submit for audit creating that master transaction object).
 AccessLogEntry is available from the ConnStateData / conn() almost
 everywhere in the code. So it *is* actually available everywhere through
 the transaction, just not filled out.  We have a lot of code cleaning to
 make it consistent, (but outside this patches scope).

The AccessLogEntry is not available from ConnStateData, it is only
available from ClientHttpRequest, and this is correct because an
accessLogEntry is Http request related, not connection related.

 
 For now it is available via request-conn()-ale. Or a new temporary
 one created in local scope for the function doing display output. It may
 need filling out with pointers to things required (is SSL content, HTTP
 request, client connection), the code before logging is very
 inconsistent/non-existent.

The only way to access AccessLogEntry is using the following path:
 request-conn()-getCurrentContext()-http-al

I am not feeling comfortable using the request-conn() although it looks
that works well in the current squid code. Moreover the
request-conn()-getCurrentContext()-http path, I think does not
guarantee that the ClientHttpRequest is the object we need.

Sometimes I am felling the initial design was that a (server side)
HttpRequest corresponds to more than one connections (more than one
connections accessing the same object) and of course to more than one
ClientHttpRequest objects.

Also I believe that the request-conn()-getCurrentContext() may not
return the ClientSocketContext corresponding to our HttpRequest.
We can go from ClientSocketContext to HttpRequest
(ClientSocketContext-http-request) but no the reverse.

Any opinions here?


Re: [PATCH] Add request_header_add option and [request|reply]_header_* manglers fixes

2012-07-06 Thread Alex Rousskov
On 07/06/2012 08:41 AM, Tsantilas Christos wrote:
 On 07/01/2012 02:25 PM, Amos Jeffries wrote:
 Thank you. In most places the problem is that AccessLogEntry is the
 closest thing we have to a XactionData object for the whole HTTP
 transaction. (I still have patches waiting around to polish up and
 submit for audit creating that master transaction object).
 AccessLogEntry is available from the ConnStateData / conn() almost
 everywhere in the code. So it *is* actually available everywhere through
 the transaction, just not filled out.  We have a lot of code cleaning to
 make it consistent, (but outside this patches scope).
 
 The AccessLogEntry is not available from ConnStateData, it is only
 available from ClientHttpRequest, and this is correct because an
 accessLogEntry is Http request related, not connection related.
 

 For now it is available via request-conn()-ale. Or a new temporary
 one created in local scope for the function doing display output. It may
 need filling out with pointers to things required (is SSL content, HTTP
 request, client connection), the code before logging is very
 inconsistent/non-existent.
 
 The only way to access AccessLogEntry is using the following path:
  request-conn()-getCurrentContext()-http-al
 
 I am not feeling comfortable using the request-conn() although it looks
 that works well in the current squid code. Moreover the
 request-conn()-getCurrentContext()-http path, I think does not
 guarantee that the ClientHttpRequest is the object we need.
 
 Sometimes I am felling the initial design was that a (server side)
 HttpRequest corresponds to more than one connections (more than one
 connections accessing the same object) and of course to more than one
 ClientHttpRequest objects.
 
 Also I believe that the request-conn()-getCurrentContext() may not
 return the ClientSocketContext corresponding to our HttpRequest.
 We can go from ClientSocketContext to HttpRequest
 (ClientSocketContext-http-request) but no the reverse.
 
 Any opinions here?

I agree that getting AccessLogEntry via getCurrentContext() is wrong,
even if it works in most cases. A pipelining connection may have many
requests associated with it and, at the time we log the transaction, our
request may no longer be current.

Can we add a link from HttpRequest to the corresponding ClientHttpRequest?

Or is it better to add a link from HttpRequest to AccessLogEntry?

Seriously malformed and special requests may not have HttpRequest
assciated with them, so I understand why ClientHttpRequest stores
AccessLogEntry and not HttpRequest, but we need to make a way for the
rest of the code to find AccessLogEntry without relying on
getCurrentContext(). And HttpRequest is available nearly everywhere.


HTH,

Alex.


Re: [PATCH] Add request_header_add option and [request|reply]_header_* manglers fixes

2012-07-06 Thread Tsantilas Christos
On 07/06/2012 05:55 PM, Alex Rousskov wrote:
 On 07/06/2012 08:41 AM, Tsantilas Christos wrote:
 On 07/01/2012 02:25 PM, Amos Jeffries wrote:
 Thank you. In most places the problem is that AccessLogEntry is the
 closest thing we have to a XactionData object for the whole HTTP
 transaction. (I still have patches waiting around to polish up and
 submit for audit creating that master transaction object).
 AccessLogEntry is available from the ConnStateData / conn() almost
 everywhere in the code. So it *is* actually available everywhere through
 the transaction, just not filled out.  We have a lot of code cleaning to
 make it consistent, (but outside this patches scope).

 The AccessLogEntry is not available from ConnStateData, it is only
 available from ClientHttpRequest, and this is correct because an
 accessLogEntry is Http request related, not connection related.


 For now it is available via request-conn()-ale. Or a new temporary
 one created in local scope for the function doing display output. It may
 need filling out with pointers to things required (is SSL content, HTTP
 request, client connection), the code before logging is very
 inconsistent/non-existent.

 The only way to access AccessLogEntry is using the following path:
  request-conn()-getCurrentContext()-http-al

 I am not feeling comfortable using the request-conn() although it looks
 that works well in the current squid code. Moreover the
 request-conn()-getCurrentContext()-http path, I think does not
 guarantee that the ClientHttpRequest is the object we need.

 Sometimes I am felling the initial design was that a (server side)
 HttpRequest corresponds to more than one connections (more than one
 connections accessing the same object) and of course to more than one
 ClientHttpRequest objects.

 Also I believe that the request-conn()-getCurrentContext() may not
 return the ClientSocketContext corresponding to our HttpRequest.
 We can go from ClientSocketContext to HttpRequest
 (ClientSocketContext-http-request) but no the reverse.

 Any opinions here?
 
 I agree that getting AccessLogEntry via getCurrentContext() is wrong,
 even if it works in most cases. A pipelining connection may have many
 requests associated with it and, at the time we log the transaction, our
 request may no longer be current.
 
 Can we add a link from HttpRequest to the corresponding ClientHttpRequest?

Probably yes. We are creating the HttpRequest inside
clientProcessRequest function (client_side.cc file). We can link it here.
We may not be able to link it in some cases where we are creating fake
HttpRequest objects (httpsEstablish function)

 
 Or is it better to add a link from HttpRequest to AccessLogEntry?

We need to convert AccessLogEntry to cbdata class or use RefCounts to
use link to AccessLogEntry.

 
 Seriously malformed and special requests may not have HttpRequest
 assciated with them, so I understand why ClientHttpRequest stores
 AccessLogEntry and not HttpRequest, but we need to make a way for the
 rest of the code to find AccessLogEntry without relying on
 getCurrentContext(). And HttpRequest is available nearly everywhere.

OK.


 
 
 HTH,
 
 Alex.
 



Re: Generic helper I/O format

2012-07-06 Thread Henrik Nordström
tor 2012-07-05 klockan 16:00 +1200 skrev Amos Jeffries:

 The blob only exists in this discussion for two reasons; the old helpers 
 backward compatibility requires it, and  you wanted to discuss a body 
 field for the responses. Even not understanding properly the specifics 
 of why you wanted to discuss it I dont think its a good idea since we 
 can key-pair and encode anything new.

Agreed. And please use URL-encoding requiring encoding of at minimum %
and any whitespace. It's much less error prone than the quoting
mechanism.

 If we can agree on making it key-pair across the board for all the 
 details which might be passed back we can move on.

Yes.

With blob only for legacy helpers.

Regards
Henrik



Re: Generic helper I/O format

2012-07-06 Thread Amos Jeffries

On 5/07/2012 4:16 p.m., Robert Collins wrote:

On Thu, Jul 5, 2012 at 4:00 PM, Amos Jeffries wrote:

Why do we need backwards compat in the new protocol?

As an alternative, consider setting a protocol= option on the helpers,
making the default our latest-and-greatest,a nd folk running
third-party helpers can set protocl=v1 or whatever to get backwards
compat.

This lets us also warn when we start deprecating the old protocol,
that its going away.

-Rob


I'm trying to avoid adding more options and complexity with this. To be 
done properly that would mean adding whole new config design bits to set 
protocol= on several of the more hidden helper lookups. But with one 
shared parser and a fallback per-helper parser handling the blob cases 
for backward compatibility we can migrate people over without any config 
change.


Amos



Re: [PATCH v2] Do not release entries that may be kept in local memory cache.

2012-07-06 Thread Alex Rousskov
On 07/06/2012 07:11 PM, Amos Jeffries wrote:
 On 7/07/2012 2:36 a.m., Alex Rousskov wrote:
 On 07/05/2012 09:02 PM, Amos Jeffries wrote:

 It sounds like another analysis of the places calling trimMemory() is
 needed.
 StoreEntry::trimMemory() is only called from StoreEntry::swapOut(),
 before we are starting to swap data out (or skipping swap because the
 entry is not swappable). Ideally, it should also be called from places
 where we unlock entry's memory as well, but that is the way it was
 written. Changing that is outside this bug scope and may be difficult.
 
 I thought that was the point of r11969. To cause trimming of no longer
 necessary non-cacheable/swappable object buffers. With this patch being
 additional fixes for the unexpected side effect(s).

The point of r11969 was to trim unswappable objects. The bug in that
patch was that we also started trimming objects that should have been
preserved for the memory cache. This patch fixes that bug.


 So that trimMemory is isolated to just trim like its name
 suggests. The maybeTrim() decisions being outside functionality (in
 StoreController?) and which is the only place calling e.trimMemory().

 First of all, I would ignore the maybe aspect. The caller does not
 rely on any trimming actually happening and some computations of how
 much to trim (if anything!) happen deeper inside MemObject. We can add
 maybe to a bunch of methods so that their names reflect the
 possibility that nothing would be trimmed. Those old methods are present
 in all Squid versions. I think renaming them just to polish the names
 would cause more porting headaches than it would solve problems.


 If I get your primary idea correctly, then instead of
 StoreEntry::swapOut() calling e.trimMemory() it will call new
 StoreController::trimMemory(e) and the controller would decide whether
 to proceed with trimming and, if yes, will call e.trimMemory().
 StoreEntry::trimMemory() will not have any new conditions added then.

 If that is what you are suggesting, it sounds good to me, and I can do
 that during commit of v3 patch (it would not change the functionality or
 the order of the checks; just move a couple of checks to a different
 class/method).

 Yes that was what I meant. And yes it is a design change, the renamed
 functions just make it abundantly clearer that portage is either not
 possible or needs more than usual careful checking.

Since we are not changing the lower-level MemObject methods that do the
actual trimming, I will not rename them. I will add a maybe prefix to
StoreEntry::trimMemory(). I hope this is a reasonable compromise.


 Looking at this v2 patch the Bug comment and IN_MEMORY test looks like
 it should be inside the keepInLocalMemory() instead. Since things
 already *in* local memory are obviously allowed there somehow. That is
 probably wrong, but is clearly implied by the symbol names around it.
 They were allowed into local memory because Squid cannot do I/O without
 using local memory :-). Every byte read is placed in local memory first.
 
 Um. Lets be clear on the three memory usages now. What I see you talking
 about is:

 0) transit memory - non-cacheables and unknown-cacheables, etc

yes, everything.

 1) local memory cache (non-shared cache_mem)

yes, by default (one can enable shared memory cache in non-SMP mode)

 2) shared memory cache (cache_mem with workers)

yes, by default (one can disable to use local memory caches in SMP)



 (0) and (1) being overlapping right now due to the incomplete transition
 away from global object index.

Yes, they use the same storage and the same index for now.


 keepInLocalMemory() being the test to decide if an object is needed to
 be kept whole for moving (0) - (1) or (0)-(2) at some later point.

Yes, for _possibly_ moving into a memory cache at some later point.


 When its put that way the keep*() is more of a cacheability test. With
 yoru focus being just on whether teh caches can accept the object, as
 opposed to including whether the object is allowed to be stored by HTTP.

The keep*() method checks whether the memory cache may accept the object
now or in the future. To avoid code duplication, the check is split into
general memory cachability (in any memory cache) and the admission
policy of the memory cache (a specific cache). The general checks are in
StoreEntry::memoryCachable().


Hope this clarifies,

Alex.