Re: R: /bzr/squid3/trunk/ r9713: MFC: Back out unintended md5 - squid_md5 substituions from the md5.h - squid_md5.h name change

2009-06-29 Thread Henrik Nordstrom
tis 2009-06-02 klockan 22:21 +1200 skrev Amos Jeffries:

 AFAICT the config.test are only used on windows builds so it is indeed 
 likely not to have been built in a long time.

config.test is for any build specifying to build helpers of a specific
kind without specifying which ones. It then builds the one where
config.test gives a positive execution response.

Regards
Henrik



Re: [PATCH] Truncate HTTP response bodies to match clen

2009-06-29 Thread Henrik Nordstrom
fre 2009-06-26 klockan 12:05 -0600 skrev Alex Rousskov:

 TODO: simply truncating read content would not work for pipelined
 responses. We should preserve extra content for the next transaction on
 a pconn.

Correct, and is a major reason NOT to do pipelining as it then becomes
impossible to protect from the response splitting attack you just
closed...

Regards
Henrik



Re: [PATCH] Enhanced access logging, added ICAP logging, chunked requests, bug #2495 fix.

2009-06-29 Thread Alex Rousskov
On 06/28/2009 12:45 AM, Amos Jeffries wrote:

 I note that there are a lot of different fixes going on here. Is it easy
 to pull out these ones for separate merges?
 
  - Bug 2495
  - chunked requests
  - DNS timers and logging
  - ICAP logging changes

Chunked requests are relatively easy to pull.

Bug 2495 can probably be pulled out, but DNS logging stuff is
essentially a part of it because the new DnsLookupDetail class caries
the wait time info which gets updated in the right places, all for
logging. The two were developed together, at the same time. It would
take a relatively long time to fully separate the two but it can be done.

Most logging is too tightly integrated to be separated (and it probably
makes no sense to do that anyway because it should be reviewed and used
together).

Please note that if this patch is approved, separately developed
features will be merged (by bzr) separately. My understanding is that
bzr will replay the original commit sequence so we will have separate
commit messages for separate (but dependent) features.

With the bzr magic caveat in mind, do you want me to spend time on
separating chunked requests hack and possibly bug 2495 fix?


  I'm taking a closer look at this one in light of the logging alterations
 that I have coming up.
 
 
 Your comments suggest ICAP/eCAP are going to be done outside access.log
 so the bits for them do not belong in AccessLogEntry. I know ALE is a
 mix of uses already, but no need to add even more garbage there before
 we fix that bug. Is it easy to make IcapLogEntry a main object in its
 own files, possibly inheriting from AccessLogEntry if you need the ALE
 fields in the icap.log? If not we shall have to do it after the fact in
 the logging updates, but I would rather have it clean beforehand.

There are two groups of logging features, some go into access.log and
some go into icap.log. Moreover, icap.log can log almost everything
access.log can (but in a different scope).

I doubt I can pull out IcapLogEntry and make it an ALE kid without
redesigning old ALE-dependent code that assumes everything is in ALE. I
think somebody already looked at that and the conclusion was that either
we have to redesign the whole logging mess or keep adding to it. I
wanted to save resources for more important stuff so I said add to the
mess. This is also related to the namespace question below.

Please note that ALE is currently organized as a collection of
scope-specific subclasses. The \todo I added was specifically to
emphasize that I know that this is the wrong style, but I am not going
to change it for now. Note the plural in Inner class declarations :-)


 When splitting an existing code (Hs in this case) for inbound and
 outbound. Please update them to use both directional  and  format
 codes. 

OK.

 The non-directional then needs to default to the old behavior
 with a configu parse WARNING: about the change.
 
 We appear to be fallign into a predictable pattern for LFT. May as well
 keep that up...

Are you sure about the warnings? Every squid.conf that used a custom (or
copied default) format will generate these warnings. I think this may
annoy too many folks. I am not against adding support for both * and
* versions if you think I must, but I wonder if warnings should wait
until most folks switch or start using to the documented/recommended fields?


 These don't fit however. What is the difference now?
 
 +LFT_SENT_HTTP_CODE,
 +LFT_RECEIVED_HTTP_CODE,

One is sent by Squid to the client. The other one is received by Squid
from the server. Or are you asking about some other difference?

 
 These are ICAP headers right? but not with names 'Last-Matched-Icap:'
 etc.?  So can you make the ICAP ones all LFT_ICAP_*
 -LFT_LAST_MATCHED_ICAP_HEADER,
 -LFT_LAST_MATCHED_ICAP_HEADER_ELEM,
 -LFT_LAST_MATCHED_ICAP_ALL_HEADERS,
 +LFT_ICAP_LAST_MATCHED_HEADER,
 +LFT_ICAP_LAST_MATCHED_HEADER_ELEM,
 +LFT_ICAP_LAST_MATCHED_ALL_HEADERS,

There are ICAP headers logged to access.log. I think this is why they do
not start with LFT_ICAP_. I am happy to change the enums if you do not
think it would be better the other way around.

I am not sure what you mean by with names 'Last-Matched-Icap: but
these will log actual ICAP header fields. There could be many header
fields with the same name across multiple ICAP transactions (e.g., in a
chain) so we use the last matched trick.


 Is the tag form: %adaptation_ and %icap_* set in stone? There was a very
 nice 'namespacing' proposition a while back which never seems to appear.
 Using the ':' separator syntax like so --   %icap::field, or
 %adapt::icap::field
 The HTTP tags are all reserved char-codes. So if we are adding a new tag
 format on top it might as well be a nice extensible one.

AFAIK, the person who was going to implement
http://wiki.squid-cache.org/Features/AclNamespaces
is no longer working on Squid. I still stand behind that proposal in
principle, but will not have the time to 

[Success!] [MERGE] Changed select loops counter from int to unsigned long int.

2009-06-29 Thread Bundle Buggy

This change has been merged.
Previous status: Semi-approved

For details, see: 
http://bundlebuggy.aaronbentley.com/project/squid/request/%3C1238087502.27909.3.camel%40longhorn%3E

Project: Squid


Re: [PATCH] ICAP service chains and sets

2009-06-29 Thread Amos Jeffries

Amos Jeffries has voted approve.
Status is now: Approved


For details, see: 
http://bundlebuggy.aaronbentley.com/project/squid/request/%3C4A46A0D5.6040901%40measurement-factory.com%3E

Project: Squid


Re: [PATCH] Enhanced access logging, added ICAP logging, chunked requests, bug #2495 fix.

2009-06-29 Thread Amos Jeffries

Alex Rousskov wrote:

Hello,

Please consider the following changes for Squid3 trunk inclusion.
Most of them have been accumulating on the 3p0-plus lp branch and have
been tested in production. All have been tested in the lab. I compressed
the 150KB patch, but the attached text file shows squid.conf
documentation describing most of the features listed below.

Many thanks to Christos Tsantilas for help with coding and testing.

If approved, I will try to do a bzr merge instead of a raw patch,
provided Robert helps me with convincing bzr to obey :-).


I note that there are a lot of different fixes going on here. Is it easy 
to pull out these ones for separate merges?


 - Bug 2495
 - chunked requests
 - DNS timers and logging
 - ICAP logging changes




Thank you,

Alex.
--
Enhanced access logging, added ICAP logging, chunked requests, bug #2495
fix:

Ported features accumulated on 3p0-plus branch as of r8937:

  - A hack to support chunked requests
(see chunked_request_body_max_size in squid.conf).

  - Enhanced access logging
(Hs, sh, sh, pt, tt, icap_total_time, and icap_last_h)

  - ICAP logging
(see icap_log and log_icap in squid.conf as well as
http://wiki.squid-cache.org/Features/AdaptationLog).

  - ICAP retries based on the ICAP responses status code
(see icap_retry and icap_retry_limit in squid.conf).


New v3.1-only changes:

  - Support logging resp. times of adaptation transactions to access.log
(%adaptation_sum_trs and %adaptation_all_trs)

  - Support logging of total DNS wait time to access.log
(%dt)

  - Bug #2495 fix


Squid v3.1.0.9-based
  Contains official Squid v3.1 branch changes up to r9605.



I'm taking a closer look at this one in light of the logging alterations 
that I have coming up.



Your comments suggest ICAP/eCAP are going to be done outside access.log 
so the bits for them do not belong in AccessLogEntry. I know ALE is a 
mix of uses already, but no need to add even more garbage there before 
we fix that bug. Is it easy to make IcapLogEntry a main object in its 
own files, possibly inheriting from AccessLogEntry if you need the ALE 
fields in the icap.log? If not we shall have to do it after the fact in 
the logging updates, but I would rather have it clean beforehand.



When splitting an existing code (Hs in this case) for inbound and 
outbound. Please update them to use both directional  and  format 
codes. The non-directional then needs to default to the old behavior 
with a configu parse WARNING: about the change.


We appear to be fallign into a predictable pattern for LFT. May as well 
keep that up...


These don't fit however. What is the difference now?

+LFT_SENT_HTTP_CODE,
+LFT_RECEIVED_HTTP_CODE,


These are ICAP headers right? but not with names 'Last-Matched-Icap:' 
etc.?  So can you make the ICAP ones all LFT_ICAP_*

-LFT_LAST_MATCHED_ICAP_HEADER,
-LFT_LAST_MATCHED_ICAP_HEADER_ELEM,
-LFT_LAST_MATCHED_ICAP_ALL_HEADERS,
+LFT_ICAP_LAST_MATCHED_HEADER,
+LFT_ICAP_LAST_MATCHED_HEADER_ELEM,
+LFT_ICAP_LAST_MATCHED_ALL_HEADERS,


Is the tag form: %adaptation_ and %icap_* set in stone? There was a very 
nice 'namespacing' proposition a while back which never seems to appear. 
Using the ':' separator syntax like so --   %icap::field, or 
%adapt::icap::field
The HTTP tags are all reserved char-codes. So if we are adding a new tag 
format on top it might as well be a nice extensible one.


.. okay thats me to line 1400. More later.

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE16
  Current Beta Squid 3.1.0.9


Re: [PATCH] delay pools 64-bit buckets and IPv6-polish

2009-06-29 Thread Amos Jeffries

Amos Jeffries has voted approve.
Status is now: Semi-approved
Comment:
This is mine.

For details, see: 
http://bundlebuggy.aaronbentley.com/project/squid/request/%3C4A331D05.5060601%40treenetnz.com%3E

Project: Squid


Re: [PATCH] ICAP service chains and sets

2009-06-29 Thread Amos Jeffries

Alex Rousskov wrote:

On 06/27/2009 06:28 AM, Amos Jeffries wrote:

Limit adaptation iterations to adaptation_service_iteration_limit to
protect
Squid from infinite adaptation loops caused by ICAP services constantly
including themselves in the dynamic adaptation chain they request.
When the
limit is exceeded, the master transaction fails. The default limit of 16
should be large enough to not require an explicit configuration in most
environments yet may be small enough to limit side-effects of loops.



I see your comment at src/adaptation/AccessCheck.cc: ~156
  TODO: should we shift and checkCandidates if and only if (!g) ?!

If the TODO is about the case of self-reference you mention above then I
would say yes we must. To drop such loops into error condition as
quickly as possible.  I understand the limit needs to be kept anyway for
multi-stage loops. But losing the loop early saves a lot of resources.


No, this is different. I have not modified AccessCheck much. Its overall
logic is as follows:

  1. AccessCheck::check: Identify service groups that may be applicable
to the virgin message. Start checking candidates by calling
checkCandidates().

  2. AccessCheck::checkCandidates: Get the top candidate and spawn a
non-blocking ACL check for it. This ends this call. The non-blocking
check will call us back in step #3.

  3. AccessCheck::noteAnswer: If there is no match, remove the top
candidate and go back to step #2. If there is a match, go to step #4.

  4. AccessCheck::do_callback: Check that the top candidate service is
still there and send it (or a NULL answer) to whoever requested the
adaptation. Remove the top candidate from the list. Stop the job.

You may have noticed that step #4 always terminates the search. Yet, we
may not have found a valid service (e.g., due to a reconfiguration) and
still have candidate services left to check. I added a TODO because it
smelled fishy to me when I looked at the code, but my focus was not on
changing how adaptation ACLs are processed so I left the code as is.

Your question prompted me to investigate this further and I am convinced
there is a bug. If not everything is peachy in step #4, we must go back
to step #2 again. I have now fixed this in my sources and polished
surrounding code a little (there were other small problems). The TODO is
done. The patch is attached.

This bug is difficult to reproduce because the service must disappear
while an ACL check is pending and it will only have an effect for checks
pending before reconfigure; future checks will work OK again. It is
probably possible to time an appropriate reconfiguration just right, but
I have not tried to do that.



Also, should we prevent circular loops by way of preventing a filter
being run twice over the same request? There have to be few cases where
a single adaptation step needs to be repeated exactly.


It would be nice to warn about the same service appearing twice in a set
or chain (a patch for that is attached and I added it to my sources).

I would not try to forcefully prevent duplicates though because it is
not illegal and might be used in practice for try me twice poor-man
backup implementations or for services that do different things
depending on the message state but cannot do it all at once. I do agree
that these situations are unusual (but very handy for testing! :-).



Okay. Fair enough.




IMO we are finding more and more reasons to add an ERR_*_CONFIG page :)


In most cases, I would just quit (with an error message) if a
configuration error is discovered during startup. However, I was
overruled on that by the start at any cost crowd.

Errors that are discovered at runtime should probably generate a fairly
generic response with an error code or similar information for admins
to match to cache.log entries and documentation. Users do not benefit
from knowing that a Squid cache has a misconfigured ICAP service and
many admins would not want to disclose that fact to end users anyway.



They have a point. We don't _have_ to make every config error fatal. But 
we do need some way to make the admin aware of it such that they have 
incentive to fix fast. Erros in cache.log are doing a lot but not enough.
Errors in their users faces saying they did wrong is a great incentive 
to get it right, and a useful debug tool during pre-rollout tests.


But this is outside your patch scope. I should be starting a different 
thread on this...





Fixed canonical Request URL maintenance when ICAP clones requests.
TODO: The urlCanonical() must become HttpRequest::canonical(), hiding
the
often out-of-sync canonical data member.

IMO: TODO: url cleanup bug needs closing to handle canonical and all
other URL display cases properly behind a single URL object. Requires
decent string code though or memory duplicates will blow out of
proportion to usefulness.


Yes, those are two separate issues: cache consistency (my TODO) and
cache storage implementation (TODO: add a URL or URI class). My TODO 

Re: [PATCH] ICAP service chains and sets

2009-06-29 Thread Alex Rousskov
On 06/27/2009 11:41 PM, Amos Jeffries wrote:
 Alex Rousskov wrote:
 in src/adaptation/AccessCheck.cc you create a ServiceFilter via:
 new AccessCheck( ServiceFilter(method, vp, req, rep), cb, cbdata);
 which takes a copy by  reference. The ServiceFilter locks the given
 request and reply but I can't see where the ServiceFilter is deleted (
 references cannot be and the object is not RefCounted.)
  - I expect this to lead to memory leaks. Possibly big ones.

 ServiceFilter instances in this context are not allocated on the heap.
 Thus, we do not (and cannot) delete them. The compiler generates code to
 destruct the temporary variable and the class data member. Did I
 misunderstood your concern?
 
 Ah right. I had forgotten that C++ trick you seem to like using.
 This works then despite the stack changes during AccessCheck callbacks?
 
 Remember we still have not gotten to the bottom of that 10MB/minute
 memory leak near the last one of these I noticed. Personally I'd rather
 go the long way with new/delete and RefCount to be sure of the copies
 than use this trick.

There is no trick. Well, no more of a trick than calling a function with
an integer argument. It is just simple copying of values (which can be
optimized away by the compiler in some cases) or, if references are
used, temporary sharing of values.

In this specific context, we could replace:

  A a;
  foo(a);

with

  A *a = new A;
  foo(a);
  ...
  delete a; // somewhere

But it seems like the top version is less likely to leak and any leaks
are easier to fix, especially when you consider that foo() would also
need to allocate/deallocate.

I did check that the Adaptation::ServiceFilter class has the right set
of methods to deal with message locking/unlocking during creation,
copying, assignment, and destruction.

We could replace everything with refcounting pointers, but it would be
slightly slower, may not save us if the ServiceFilter class is indeed
buggy, and would simply hide the true problem better -- we need to
replace HTTPMSG*LOCK with refcounting pointers. *That* is the true
source of many leaks and fears. But that nice project is outside this
patch scope.

Thank you,

Alex.


 in adaptation/icap/ModXact.cc when testing for HDR_X_NEXT_SERVICES  I
 think it would be worth warning if a non-routable service attempted to
 hand back a routing header.  How to do it efficiently may be a problem
 though.

 I do not think there would be a critical performance impact if we check
 using the current code. The difference between a hash and a simple array
 with linear search ought to be small when you are dealing with less than
 10 integers.

 Another problem, IMO, is how not to warn on every response (which is far
 more expensive operation than searching). I wanted to port a
 warn-once-in-a-while facility from Polygraph to Squid for a long time...

 Added TODO.
 
 Good point.
 


 This TODO needs to be enacted rather soon. We are aiming for performance
 in 3.2...
   // TODO: use HttpMsg::clone()!

 Agreed. A usable clone() method has been available at least since the
 eCAP addition. This is not a new TODO or a new problem. This is a nice,
 small-context project for volunteers :-).

 
 :-)
 

 On the side;  this has brought to my attention the adaptation/forward.h.
 Can we not call that adaptation/Adaptation.h instead? the forward.h
 breaks CamelCase and will clash with src/forward.h on those recently
 vocal compilers.

 Forward is kind of standard (e.g., iosfwd) and more intuitive, IMO:
 We are providing forward declarations to avoid too many unimportant
 dependencies via header files. I would rather move or rename
 src/forward.h :-)
 
 Fair point. It's not currently causing issue for most places. So its out
 of scope here if we go that way.
 

 notes.dox ***
   \section label takes two parameters: an ID then the text title.

 Fixed.

   mixing \b and the b/b equivalents on alternate lines looks a bit
 mucky. Can we at least do it consistently within one document?

 I have no idea where I copied that ugly mix from :-). Replaced all with
 b/b.


 Thats it from me. Just ... no more huge project for a few decades,
 yes? ;)

 You call this huge?! I spent more time on StringNg. ;-)


 Please look at the enhanced logging patch as well. This work depends on
 that patch. If the changes are approved, the enhanced logging stuff must
 go into the trunk first.

 
 I'm up to that now. This updated patch is a vote yes from me.
 
 Amos



Re: [PATCH] Truncate HTTP response bodies to match clen

2009-06-29 Thread Amos Jeffries

Amos Jeffries has voted approve.
Status is now: Approved


For details, see: 
http://bundlebuggy.aaronbentley.com/project/squid/request/%3C4A450DFE.6040604%40measurement-factory.com%3E

Project: Squid


Re: [PATCH] Uninstall configuration files when RM has spaces

2009-06-29 Thread Amos Jeffries

Amos Jeffries has voted approve.
Status is now: Approved


For details, see: 
http://bundlebuggy.aaronbentley.com/project/squid/request/%3C4A327FC0.7090709%40measurement-factory.com%3E

Project: Squid


Re: [PATCH] Truncate HTTP response bodies to match clen

2009-06-29 Thread Amos Jeffries

Alex Rousskov wrote:

On 06/27/2009 06:32 AM, Amos Jeffries wrote:

Alex Rousskov wrote:

Truncate too-long HTTP response bodies to match their Content-Length
header.

Sometimes a broken server sends more than Content-Length bytes in the
response.  For example, a 302 redirect message with Content-Length: 0
header may include an HTML body. Squid used to send everything it read
to the client, even if it read more than the Content-Length bytes. That
may have helped in some cases, but I think we should be more
conservative when dealing with broken servers to combat message
smuggling attacks and other bad side-effects for clients.

We now do not forward more than the advertised content length and
declare the connection with a broken server non-persistent.

Chunked responses (that Squid should not receive and that must not have
a Content-Length header) are not truncated because RFC 2616 says we MUST
ignore their Content-Length header.

TODO: simply truncating read content would not work for pipelined
responses. We should preserve extra content for the next transaction on
a pconn.

---

The attached patch is against Squid 3.0 and has been tested in
production. More testing is welcomed. I will port to trunk if needed if
the change is accepted.

Thank you,

Alex.


Trouble...

I was of the understanding that Squid already was doing these truncation
+ non-pconn flagging :(   I've written off many bugs in the last few
years to this behavior...


Is this some side case you found internal to Squid? or are the current
Squid-3 actually not protecting people from message-appending?


The problem was triggered at the production site running Squid 3p0-plus
that was accelerating a broken server application. The application was
including bodies in 302 redirects that had Content-Length set to zero.
That, in turn, broke some clients. Neither the application nor the
clients could be fixed, of course.

Besides, it is probably an HTTP violation to forward trailing garbage
anyway, unless we switch to some kind of tunneling mode and become a
TCP proxy, but then you get problems with message smuggling and such.

Do you think this change should go in?



I do. I'm not the best for RFC compliance issues, but its a stuffing 
attack. And we should be protecting from those.


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE16
  Current Beta Squid 3.1.0.9


Re: A little help with cbdata and requestLink

2009-06-29 Thread Mark Nottingham

OK.

WRT requestlink, I was unsure what would happen if the request didn't  
go forward; will moving requestLink at the bottom up to here:

  fwdState-request = r; /* requestLink? */
do it?

Thanks,


On 27/06/2009, at 3:05 AM, Alex Rousskov wrote:


On 06/16/2009 09:57 PM, Mark Nottingham wrote:

Thanks. Anybody else have a second to look?


Please s/fwdStartFoo/fwdContinue/ and document what it is. Since  
this is

Squid2 you do not have to do it, of course.

Your cbdata and request manipulations appear technically correct to  
me.

IMHO, the temporary lack of requestLink is poor style that will be
dangerous for future code modifications.

Cheers,

Alex.




On 11/06/2009, at 11:28 PM, Amos Jeffries wrote:


Mark Nottingham wrote:

Would someone mind taking a quick look at this patch:
http://www.squid-cache.org/bugs/attachment.cgi?id=1989
and telling me if I've royally stuffed up with managing fwdState  
and

request linking?
It's to make miss_access a slow lookup...


Looks okay to these uneducated eyes.  Probably best to wait for
someone else to double-check before a HEAD commit, but IMO it looks
good enough for a patching.

This one is long-awaited by many. Thanks.

Amos
--
Please be using
Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1


--
Mark Nottingham   m...@yahoo-inc.com





--
Mark Nottingham   m...@yahoo-inc.com




squid-2.HEAD hanging with 304 not modified responses

2009-06-29 Thread Adrian Chadd
G'day guys,

I've fixed a bug in Lusca which was introduced with Benno's method_t
stuff. The specific bug is revalidation replies 'hanging' until the
upstream socket closes, forcing an end of message to occur.

The history and patch are here:
http://code.google.com/p/lusca-cache/source/detail?r=14103

Those of you toying with Squid-2.HEAD (eg Mark) - would you mind
verifying that you can reproduce it on Squid-2.HEAD and comment on the
fix?

Thanks,


adrian


Re: [squid-users] NONE/411 Length Required

2009-06-29 Thread Henrik Nordstrom
tor 2009-06-18 klockan 10:20 +1000 skrev Mark Nottingham:

 because the request method is POST. However, the request headers don't  
 have Transfer-Encoding...
 
 What am I missing? I know Bijayant is using Squid-3, but I'm observing  
 the same behaviour in my build of 2...

I don't. POST without Content-Length gets forwarded just fine here using
Squid-2.HEAD but not 2.7.

That change sneaked in as part of the unknown methods patch by Benno..

I have now backported this change to 2.7 as well.

Regards
Henrik



Re: about https support for transparent proxy

2009-06-29 Thread Henrik Nordstrom
fre 2009-06-26 klockan 13:00 -0600 skrev Alex Rousskov:

 It looks like you are working on a useful feature, but can you
 explain in more detail what your patch does? Why is the feature called
 SslConnect? Is it specific to tproxy environments or can it work with
 any transparent Squid? Does it work in combination with SslBump or are
 they mutually exclusive?

My answers from reading the patch:

Not specific to TPROXY, works with normal interception as well.

Do not work with SslBump I think. SslBump requires the CONNECT right?

 What kind of magic is going on in tunnelProxyConnectedWriteDummyDone and
 tunnelProxyConnectedReadDone?

Not entirely sure either. But it's somehow about forwarding an
intercepted SSL connection to a parent proxy where the response from the
parent needs to be discarded.

I guess much of this is actually from our rather poor proxy CONNECT
response processing in the tunnel code.. iirc we normally don't even try
to touch those responses but instead relay them as part of the tunneled
data, resulting in a 200 response logged even if the parent responded
with an error, and no Via header added..

  Why do we not care about certain
 tunnelStart errors if SslConnect is enabled? Perhaps you can add source
 code comments to explain your intent?

Because it's relaying a SSL connection, not HTTP. So it's not possible
to return an HTTP error message like we normally do.

I guess it could be extended to respond with an SSL level error
notification in these cases, but not sure it's worth the effort.

Regards
Henrik



Re: [PATCH] Truncate HTTP response bodies to match clen

2009-06-29 Thread Henrik Nordstrom
lör 2009-06-27 klockan 08:39 -0600 skrev Alex Rousskov:

 Do you think this change should go in?

A strong + from me.

For what it's worth Squid-2 was fixed long ago. But it took quite some
iterations before all cases were covered.. When this condition is
detected it not only discards excess data and closes the connection, it
also throws out the current response from the cache as it's malformed.

Regards
Henrik




Re: Problem with cached entries w/ETag and request without If-None-Match header

2009-06-29 Thread Henrik Nordstrom
fre 2009-06-12 klockan 13:02 -0400 skrev Jason Noble:
 I recently ran into a bug on Squid 2.7 regarding cached content with 
 ETags.  Currently, if all cached entries for a URL include ETags, and a 
 request is received for said URL with no If-None-Match header, Squid 
 will serve a cached entry.  This behavior does not follow RFC 2616.  I 
 have attached a patch that prevents Squid from serving the cached 
 entries in said case here:  
 http://www.squid-cache.org/bugs/show_bug.cgi?id=2677
 
 I would appreciate any feedback regarding this patch.

As others have already pointed out it's plain wrong.

Please read RFC2616 13.6 Caching Negotiated Responses again for more
details on how this works in HTTP. It's a quite simple and effective
scheme when used right.

In short

Selecting request headers (as indicated by Vary) select which response
may be returned in response to the request. If there is multiple
matching responses then the most recent among them should be used.

Conditional request headers determine if that response is to be a 200,
304 or in case of origin servers only 412.

An unconditional request is just that, and will always respond with the
matching 200 response.

Clients can not select which response to return based on etag, only by
using negotiation request headers as indicated by Vary, such as Accept,
Accept-Language, Accept-Encoding, Cookie etc..

Regards
Henrik



Re: A little help with cbdata and requestLink

2009-06-29 Thread Henrik Nordstrom
fre 2009-06-26 klockan 11:05 -0600 skrev Alex Rousskov:
 On 06/16/2009 09:57 PM, Mark Nottingham wrote:
  Thanks. Anybody else have a second to look?
 
 Please s/fwdStartFoo/fwdContinue/ and document what it is. Since this is
 Squid2 you do not have to do it, of course.
 
 Your cbdata and request manipulations appear technically correct to me.
 IMHO, the temporary lack of requestLink is poor style that will be
 dangerous for future code modifications.

Agreed on all points.

In addition i would like the internal  cacheobj checks to be moved up
before miss_access, but that is a separate change.

Regards
Henrik



Re: about https support for transparent proxy

2009-06-29 Thread Alex Rousskov
On 06/28/2009 02:06 PM, Henrik Nordstrom wrote:
 fre 2009-06-26 klockan 13:00 -0600 skrev Alex Rousskov:
 
 It looks like you are working on a useful feature, but can you
 explain in more detail what your patch does? Why is the feature called
 SslConnect? Is it specific to tproxy environments or can it work with
 any transparent Squid? Does it work in combination with SslBump or are
 they mutually exclusive?
 
 My answers from reading the patch:
 
 Not specific to TPROXY, works with normal interception as well.

Ok, but can you tell what the patch does? Forwards raw SSL connections
to the next hop, as if Squid was a TCP proxy? Something else?

 Do not work with SslBump I think. SslBump requires the CONNECT right?

I do not think so. In my tests, SslBump worked for WCCP-intercepted SSL
connections.

 What kind of magic is going on in tunnelProxyConnectedWriteDummyDone and
 tunnelProxyConnectedReadDone?
 
 Not entirely sure either. But it's somehow about forwarding an
 intercepted SSL connection to a parent proxy where the response from the
 parent needs to be discarded.
 
 I guess much of this is actually from our rather poor proxy CONNECT
 response processing in the tunnel code.. iirc we normally don't even try
 to touch those responses but instead relay them as part of the tunneled
 data, resulting in a 200 response logged even if the parent responded
 with an error, and no Via header added..
 
  Why do we not care about certain
 tunnelStart errors if SslConnect is enabled? Perhaps you can add source
 code comments to explain your intent?
 
 Because it's relaying a SSL connection, not HTTP. So it's not possible
 to return an HTTP error message like we normally do.
 
 I guess it could be extended to respond with an SSL level error
 notification in these cases, but not sure it's worth the effort.

We should probably add a few comments such as above to the sources if
this patch is accepted.

Thank you,

Alex.




Re: A little help with cbdata and requestLink

2009-06-29 Thread Alex Rousskov
On 06/29/2009 12:30 AM, Mark Nottingham wrote:
 OK.
 
 WRT requestlink, I was unsure what would happen if the request didn't go
 forward; will moving requestLink at the bottom up to here:
   fwdState-request = r; /* requestLink? */
 do it?

Yes, provided you adjust the code to call unlink if the request did go
forward but not all the way through to where fwdState currently unlinks.

Alex.


 On 27/06/2009, at 3:05 AM, Alex Rousskov wrote:
 
 On 06/16/2009 09:57 PM, Mark Nottingham wrote:
 Thanks. Anybody else have a second to look?

 Please s/fwdStartFoo/fwdContinue/ and document what it is. Since this is
 Squid2 you do not have to do it, of course.

 Your cbdata and request manipulations appear technically correct to me.
 IMHO, the temporary lack of requestLink is poor style that will be
 dangerous for future code modifications.

 Cheers,

 Alex.



 On 11/06/2009, at 11:28 PM, Amos Jeffries wrote:

 Mark Nottingham wrote:
 Would someone mind taking a quick look at this patch:
 http://www.squid-cache.org/bugs/attachment.cgi?id=1989
 and telling me if I've royally stuffed up with managing fwdState and
 request linking?
 It's to make miss_access a slow lookup...

 Looks okay to these uneducated eyes.  Probably best to wait for
 someone else to double-check before a HEAD commit, but IMO it looks
 good enough for a patching.

 This one is long-awaited by many. Thanks.

 Amos
 -- 
 Please be using
 Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
 Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1

 -- 
 Mark Nottingham   m...@yahoo-inc.com


 
 -- 
 Mark Nottingham   m...@yahoo-inc.com
 



Re: about https support for transparent proxy

2009-06-29 Thread Henrik Nordstrom
sön 2009-06-28 klockan 14:18 -0600 skrev Alex Rousskov:

 Ok, but can you tell what the patch does? Forwards raw SSL connections
 to the next hop, as if Squid was a TCP proxy?

Yes.

  Something else?

Not really. But supports both forwarded mode and standalone (connecting
direct, or via a parent proxy).

  Do not work with SslBump I think. SslBump requires the CONNECT right?
 
 I do not think so. In my tests, SslBump worked for WCCP-intercepted SSL
 connections.

Are you sure that's SslBump, and not just https_port?

https_port works kind of in interception mode, if the certificate
warnings/errors is ignored.. has always been like that just not
documented very well.

Note: SslBump (long term) could be made to work in interception mode
with modern browsers sending the requested hostname in the initial SSL
hello message.

Regards
Henrik



Re: about https support for transparent proxy

2009-06-29 Thread Alex Rousskov
On 06/29/2009 01:32 PM, Henrik Nordstrom wrote:
 sön 2009-06-28 klockan 14:18 -0600 skrev Alex Rousskov:
 
 Ok, but can you tell what the patch does? Forwards raw SSL connections
 to the next hop, as if Squid was a TCP proxy?
 
 Yes.
 
  Something else?
 
 Not really. But supports both forwarded mode and standalone (connecting
 direct, or via a parent proxy).

OK, I see.


 Do not work with SslBump I think. SslBump requires the CONNECT right?
 I do not think so. In my tests, SslBump worked for WCCP-intercepted SSL
 connections.
 
 Are you sure that's SslBump, and not just https_port?

You may be right. I thought I did change something in https_port when
working on SslBump but I may be misremembering. If I did, it was
certainly very little because most of the work was on the CONNECT
requests handling. I do remember that I tested WCCP redirection of port
443 traffic and it worked (with certificate warnings).

 https_port works kind of in interception mode, if the certificate
 warnings/errors is ignored.. has always been like that just not
 documented very well.

Agreed.

 Note: SslBump (long term) could be made to work in interception mode
 with modern browsers sending the requested hostname in the initial SSL
 hello message.

Thank you,

Alex.




Re: [PATCH] Enhanced access logging, added ICAP logging, chunked requests, bug #2495 fix.

2009-06-29 Thread Amos Jeffries
On Sun, 28 Jun 2009 01:46:12 -0600, Alex Rousskov
rouss...@measurement-factory.com wrote:
 On 06/28/2009 12:45 AM, Amos Jeffries wrote:
 
 I note that there are a lot of different fixes going on here. Is it easy
 to pull out these ones for separate merges?
 
  - Bug 2495
  - chunked requests
  - DNS timers and logging
  - ICAP logging changes
 
 Chunked requests are relatively easy to pull.
 
 Bug 2495 can probably be pulled out, but DNS logging stuff is
 essentially a part of it because the new DnsLookupDetail class caries
 the wait time info which gets updated in the right places, all for
 logging. The two were developed together, at the same time. It would
 take a relatively long time to fully separate the two but it can be done.
 
 Most logging is too tightly integrated to be separated (and it probably
 makes no sense to do that anyway because it should be reviewed and used
 together).
 
 Please note that if this patch is approved, separately developed
 features will be merged (by bzr) separately. My understanding is that
 bzr will replay the original commit sequence so we will have separate
 commit messages for separate (but dependent) features.
 
 With the bzr magic caveat in mind, do you want me to spend time on
 separating chunked requests hack and possibly bug 2495 fix?

It would probably be worth it for the chunking break-out. Though the rest
sounds less simple. Lets see how big the 'other' patch is when chunking is
not part of it.

FYI I'm asking for two reasons:
 1) easier audit. I got through ~10% of the lines before falling asleep the
other night.

 2) My experience with the merge features is that bzr does its 'replay'
only internally for consustency.

In the public changesets we keep for patches the merge shows up as one
commit with the entire branch log as the changeset comment. see
http://www.squid-cache.org/Versions/v3/3.0/changesets/b8726.patch for an
example of what happened when I tried it for maintenance updates to
3.0.STABLE3.
 These days I only use it to sync devel branch to trunk as a Merged from
trunk change.
 For such big and important stuff as these breaking it out allows us to
keep at least a minor separating in the public changesets. That said, the
'breaking out' can be done using the -r features to cherry-pick for both
the merge and the audit diff rather than creating whole new branches.

 
   I'm taking a closer look at this one in light of the logging
alterations
 that I have coming up.
 
 
 Your comments suggest ICAP/eCAP are going to be done outside access.log
 so the bits for them do not belong in AccessLogEntry. I know ALE is a
 mix of uses already, but no need to add even more garbage there before
 we fix that bug. Is it easy to make IcapLogEntry a main object in its
 own files, possibly inheriting from AccessLogEntry if you need the ALE
 fields in the icap.log? If not we shall have to do it after the fact in
 the logging updates, but I would rather have it clean beforehand.
 
 There are two groups of logging features, some go into access.log and
 some go into icap.log. Moreover, icap.log can log almost everything
 access.log can (but in a different scope).
 
 I doubt I can pull out IcapLogEntry and make it an ALE kid without
 redesigning old ALE-dependent code that assumes everything is in ALE. I
 think somebody already looked at that and the conclusion was that either
 we have to redesign the whole logging mess or keep adding to it. I
 wanted to save resources for more important stuff so I said add to the
 mess. This is also related to the namespace question below.
 
 Please note that ALE is currently organized as a collection of
 scope-specific subclasses. The \todo I added was specifically to
 emphasize that I know that this is the wrong style, but I am not going
 to change it for now. Note the plural in Inner class declarations :-)

Okay. Understood. I'll add this as yet another task on the redesign then.

 
 When splitting an existing code (Hs in this case) for inbound and
 outbound. Please update them to use both directional  and  format
 codes. 
 
 OK.
 
 The non-directional then needs to default to the old behavior
 with a configu parse WARNING: about the change.
 
 We appear to be fallign into a predictable pattern for LFT. May as well
 keep that up...
 
 Are you sure about the warnings? Every squid.conf that used a custom (or
 copied default) format will generate these warnings. I think this may
 annoy too many folks. I am not against adding support for both * and
 * versions if you think I must, but I wonder if warnings should wait
 until most folks switch or start using to the documented/recommended
 fields?

No, we have other stuff such as the reply/request header split to be
identical situation. When the split happens the old format is deprecated
immediately. We need to start warning as early as possible so that when it
does get remove people have less excuses for not being ready.
The deprecation gets notified in four places: config parsing, release 

Re: [PATCH] Enhanced access logging, added ICAP logging, chunked requests, bug #2495 fix.

2009-06-29 Thread Amos Jeffries
On Sun, 28 Jun 2009 01:46:12 -0600, Alex Rousskov
rouss...@measurement-factory.com wrote:
 On 06/28/2009 12:45 AM, Amos Jeffries wrote:
 
snip
 
 Is the tag form: %adaptation_ and %icap_* set in stone? There was a very
 nice 'namespacing' proposition a while back which never seems to appear.
 Using the ':' separator syntax like so --   %icap::field, or
 %adapt::icap::field
 The HTTP tags are all reserved char-codes. So if we are adding a new tag
 format on top it might as well be a nice extensible one.
 
 AFAIK, the person who was going to implement
 http://wiki.squid-cache.org/Features/AclNamespaces
 is no longer working on Squid. I still stand behind that proposal in
 principle, but will not have the time to implement it myself in the
 foreseeable future :-(.

Oops, forgot to answer this one.
I'm not asking for a full implementation, but we can open the door by using
the namespace documented delimiters '::' instead of '_'. Nothing other than
tag characters need change at this point.

Amos



Re: [PATCH] Truncate HTTP response bodies to match clen

2009-06-29 Thread Kinkie
 Besides, it is probably an HTTP violation to forward trailing garbage
 anyway, unless we switch to some kind of tunneling mode and become a
 TCP proxy, but then you get problems with message smuggling and such.

 Do you think this change should go in?

Trunk and 3.1, I think so, 3.0 I'm uncertain.
If the time allows, maybe protecting it with a fast-acl-backed
selector (defaulting to 'apply to all') would be the most flexible
solution, allowing (especially rproxy) administrators to fine-tune the
behaviour.


-- 
/kinkie