Re: [PATCH] Support PROXY protocol

2014-09-02 Thread Amos Jeffries

On 1/09/2014 2:48 a.m., Amos Jeffries wrote:
> On 19/08/2014 10:12 p.m., Amos Jeffries wrote:
>> Updated patch. I believe this covers everything so far,
>> including the 16-bit alignment and segmented TCP packet issues.
> 
>> Amos
> 
> 
> If there are no objections I will apply this soon.
> 

Merged to trunk as rev.1370

Amos


Re: [PATCH] pconn_lifetime

2014-09-02 Thread Amos Jeffries

On 2/09/2014 7:38 p.m., Tsantilas Christos wrote:
> On 09/02/2014 03:51 AM, Amos Jeffries wrote:
>> 
>> On 2/09/2014 4:49 a.m., Tsantilas Christos wrote:
>>> Hi all,
>>> 
>>> This patch adds a new configuration option, 'pconn_lifetime',
>>> to allow users to set the desired maximum lifetime of a
>>> persistent connection.
>>> 
>>> When set, Squid will close a now-idle persistent connection
>>> that exceeded the configured lifetime instead of moving the
>>> connection into the idle connection pool (or equivalent). It has
>>> no effect on ongoing/active transactions. Connection lifetime is
>>> the time period from the connection acceptance or opening time
>>> until "now".
>>> 
>>> This limit is useful in environments with long-lived
>>> connections where Squid configuration or environmental factors
>>> change during a single connection lifetime. If unrestricted,
>>> some connections may last for hours and even days, ignoring
>>> those changes that should have affected their behavior or their
>>> existence.
>>> 
>>> This is a Measurement Factory project
>> 
>> Two problems with this.
>> 
>> * the directive name does not indicate whether it applies to
>> client, server, or ICAP connections.
> 
> It applies to all connections: server, client, and ICAP.

Those are very distinct groups of connections, with very different
lifetime properties.

> 
>> 
>> * the rationale for its need is a bit screwy. Any directive
>> which affects Squid's current running state needs to ensure in its
>> post-reconfigure logic that state is consistent. Preferably
>> create a runner, or use configDoConfigure if that is too much.
> 
> Do you mean to run over the persistent connections list and change
> the timeout?

I mean:

1) for the client connections list, set the flag forcing keep-alive to
terminate after the current request.
 - only if the ports config changed.

2) For the server pconn idle pools call close() on all existing
connections.
 - but why? what condition requires this?

3) for the cache_peer standby pools, calling close()
   - only if the peer config changed.

4) for the ICAP idle pools calling close().
 - but why? what condition needs this other than ICAP config changes,
and thus parent object holding them open being destructed?

5) for the ICAP currently active connections, set a flag to close after
the current request.
 - but again why? what condition has changed?

> 
>> 
>> To support that we need some API updates to globally access the 
>> currently active connection lists for servers/client/icap. But
>> there are other things like stats and reporting also needing that
>> API, so we should look at adding it instead of yet another
>> "temporary" workaround.
> 
> We do not need to make any change to support ICAP and server-side
> connections. This is because we are storing these connections in
> IdleConnList or PconnPool structures.
> 
> Currently we cannot change the timeout for client-side keep-alive
> connections. We do not have a list of these connections
> available.
> 

That is what I refer to as the needed API. Stats reporting and delay pool
reassignment also need this std::list of active ConnStateData
connections.
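
As a rough illustration of that missing API, a sketch follows; every name
in it is hypothetical, not an existing Squid identifier:

  // hypothetical: a per-worker registry of active client-side
  // connections that reconfigure handling, stats reporting, and
  // delay pool reassignment could all iterate over
  #include <list>

  class ConnStateData; // Squid's client-side connection state

  typedef std::list<ConnStateData*> ClientConnectionList;
  extern ClientConnectionList ActiveClientConnections; // one per worker

  // example post-reconfigure pass: mark each connection so keep-alive
  // stops after its current request (only when the ports config changed)
  void
  stopKeepAliveOnAll(ClientConnectionList &conns)
  {
      for (ClientConnectionList::iterator i = conns.begin();
           i != conns.end(); ++i) {
          // the per-connection action is the open question above,
          // e.g. flipping a keep-alive flag on (*i)
      }
  }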

Amos


Re: [PATCH] %

2014-09-01 Thread Amos Jeffries

On 27/08/2014 11:08 p.m., Tsantilas Christos wrote:
> Hi all,
> 
> The total server time is not computed in some cases, for example
> for CONNECT requests. Another example is when server-first
> bumping mode is used and Squid connects to an SSL peer, but the
> connection terminates before the SSL handshake completes.
> 
> The attached patch tries to fix these cases.
> 
> This is a Measurement Factory project

+1, looks good. Some (optional) polishing though if you please.

* Since you are moving and touching most of the logic for
first_conn_start and total_response_time, please take the opportunity
to upgrade their naming to camelCase and the _ suffix.
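
i.e. something like this (the types shown are placeholders; the real
members may differ):

  struct timeval firstConnStart_;   // was: first_conn_start
  time_t totalResponseTime_;        // was: total_response_time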

Amos


Re: [PATCH] pconn_lifetime

2014-09-01 Thread Amos Jeffries

On 2/09/2014 4:49 a.m., Tsantilas Christos wrote:
> Hi all,
> 
> This patch adds a new configuration option, 'pconn_lifetime', to
> allow users to set the desired maximum lifetime of a persistent
> connection.
> 
> When set, Squid will close a now-idle persistent connection that
> exceeded the configured lifetime instead of moving the connection into
> the idle connection pool (or equivalent). It has no effect on
> ongoing/active transactions. Connection lifetime is the time period
> from the connection acceptance or opening time until "now".
> 
> This limit is useful in environments with long-lived connections
> where Squid configuration or environmental factors change during a
> single connection lifetime. If unrestricted, some connections may
> last for hours and even days, ignoring those changes that should
> have affected their behavior or their existence.
> 
> This is a Measurement Factory project
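
For reference, a minimal squid.conf sketch of the proposed directive (the
time value shown is only illustrative):

  # close persistent connections older than 12 hours instead of
  # returning them to the idle pool
  pconn_lifetime 12 hours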

Two problems with this.

* the directive name does not indicate whether it applies to client,
server, or ICAP connections.

* the rationale for its need is a bit screwy.
 Any directive which affects Squid's current running state needs to
ensure in its post-reconfigure logic that state is consistent.
Preferably create a runner, or use configDoConfigure if that is too much.

To support that we need some API updates to globally access the
currently active connection lists for servers/client/icap. But there
are other things like stats and reporting also needing that API, so we
should look at adding it instead of yet another "temporary" workaround.

Amos



Re: [PATCH] Support PROXY protocol

2014-08-31 Thread Amos Jeffries

On 19/08/2014 10:12 p.m., Amos Jeffries wrote:
> Updated patch. I believe this covers everything so far, including
> the 16-bit alignment and segmented TCP packet issues.
> 
> Amos
> 

If there are no objections I will apply this soon.

Amos



Re: Naming clashes post src/clients, src/servers restructuring

2014-08-28 Thread Amos Jeffries

On 27/08/2014 9:00 a.m., Alex Rousskov wrote:
> On 08/07/2014 Alex Rousskov wrote on [PATCH] Native FTP Relay
> thread:
> 
>> Changes to the general code used by the Native FTP Relay code:
>> 
 * The user- and origin-facing code restructured as agreed
   previously on this mailing list. I will email some thoughts
   about the results separately, but here is the executive
   summary:

 src/servers/FtpServer.*  # new FTP server, relaying FTP
 src/servers/HttpServer.* # old ConnStateData parts conflicting with FtpServer
 src/clients/FtpClient.*  # code shared by old and new FTP clients
 src/clients/FtpGateway.* # old FTP client, translating back to HTTP
 src/clients/FtpRelay.*   # new FTP client, relaying FTP
 src/ftp/*                # FTP stuff shared by clients and servers
> 
> 
> Hello,
> 
> After native FTP Relay changes were merged we got even more naming 
> clashes and confusion in Squid code than before. Specifically, the 
> following terms and related terminology conflict with each other
> and/or with the newly agreed upon scheme:
> 
> * server-side(50) means the opposite of src/servers/
> * client-side(60) means the opposite of src/clients/
> * src/Server(150) class should be src/clients/Client
> * ConnStateData(100) class should be src/servers/{HttpServer,Server}
> * "bar server" means a server we connect to, not src/servers/bar
> * "foo client" means a client connected to us, not src/clients/foo
> 
> The numbers in (parentheses) represent a very approximate number
> of occurrences of the corresponding terms in trunk src/.
> 
> For example, the phrase "HTTP server" may mean either 
> src/servers/HttpServer (Squid class implementing an HTTP server) or
> an HTTP origin server that our src/clients/HttpClient is talking
> to. Very confusing!
> 
> Current developers should be able to successfully navigate this
> mess in most cases because we can usually distinguish new code from
> the old one and, hence, correctly guess the terminology flavor
> used. New developers will be really puzzled. And even for current
> developers, deciding whether to be consistent with the surrounding
> code or with the current/new terminology is going to be exceedingly
> tricky!
> 
> 
> I cannot propose a good solution that we can afford. I assume
> nobody would volunteer to go through sources to fix all the
> inconsistencies and violations in one sweep. If I am wrong, please
> speak up now so that we can go through this procedure before v3.5
> is branched!
> 
> I can only suggest the following imperfect measures to reduce the
> number of clashes (using HTTP as an example/placeholder for other
> protocols):
> 
> 0. Move src/Server.* to client/Client.* now(##).

Agreed.


> Unlike ConnStateData, the existing Server class was scoped and
> named correctly (but based on the old terminology!) so it clashes
> with the new terminology more than any other class and it is easier
> to rename/move (we do not need to split HTTP code out of it -- that
> has been done already).

ConnStateData is a non-issue for now. see #3.

> 
> 1. If you want to talk about an agent connecting to Squid using
> HTTP, say "HTTP user". Yes, "user" clashes with a human "user" that
> may control that connecting HTTP agent, but that distinction is
> usually not important. Use "human HTTP user" and "HTTP user agent"
> to differentiate where needed. Yes, "user" also often implies "end
> user" rather than just the previous HTTP hop. As I said, this
> suggestion is far from perfect.
> 

If we are to do #3 below there should be no confusion when talking
about clients and servers (the non-class names) as external to Squid.
Developers with a basic understanding of client-server networking will
not be confused that a client and FooServer interact, or a FooClient
and server. Those who are confused we can refer to Wikipedia or a tech
dictionary.


> 2. If you want to talk about an agent Squid connects to using HTTP,
> say "HTTP peer" (in general) and "HTTP origin" when talking about
> the end BAR server specifically. Yes, this suffers from problems
> similar to #1 above.
> 
> 3. Use future class names in new comments, even if those classes do
> not exist yet, especially when working outside the corresponding
> class code. For example, when documenting something about
> ConnStateData inside FwdState, call it Server instead of
> ConnStateData. This will reduce future changes as the classes get
> renamed.

Sounds like a good plan. It may be worth referencing RFC 7230 for
terminology definition of the naming for client, server, gateway,
relay, tunnel classes.

Or better, once Server is renamed, referencing the existing class names
should also be fine. The "FooStateData" names, for example, are not in
themselves part of the "client"/"server" actor-vs-side confusion.


> 
> 4. Keep ConnStateData as is for now, until somebody volunteers to 
> extract HTTP code from it and place the extracted HTTP

Re: [PATCH] SSL Peek and Splice

2014-08-26 Thread Amos Jeffries
On 26/08/2014 5:18 a.m., Tsantilas Christos wrote:
> I am attaching a new patch.
> I hope this patch fixes most of the issues.
> 

Looks fine to me now. +1.

Amos



Re: Squid 3.5 release timetable

2014-08-25 Thread Amos Jeffries
On 23/08/2014 9:34 a.m., Amos Jeffries wrote:
> On 23/08/2014 6:02 a.m., Alex Rousskov wrote:
>> On 08/21/2014 11:54 PM, Amos Jeffries wrote:
>>> Update:
>>>
>>>  * FTP and Kerberos patches are in and building fine.
>>>  (build farm has some internal issues clouding the state)
>>>
>>>  * I am punting Bearer and Parser-NG off into 3.6 series. Some design
>>> decisions may take a while to resolve and re-audit.
>>>
>>>  * PROXY protocol and Splice/Peek appear to be almost done with audit.
>>>
>>> So if those last two can be pushed onward this week I would like to
>>> branch once they are merged and the build farm gives them a mostly-blue
>>> pass rate.
>>
>> What about the boilerplate updates? I think it would be best to do them
>> before branching as they are another example of simple changes affecting
>> a lot of files. I will try to find the time to revisit that project next
>> week.
> 
> Ah, right. If you could at least check and update the list of
> authors whose blurbs we have permission to move, I can throw small bits
> of time at the incremental updates.
>  Would it be doable to add a comment in the script listing the dates when
> each author gave permission, to make this less of a big ask?
> 
> Amos
> 


Sorry, another followup. I'm seeing "Claim: Duane Wessels" showing up
already and I've only checked a few bits. It might be worth chasing him
up for permission as well.

Amos



Re: [RFC] squid-3.6 unit tests

2014-08-24 Thread Amos Jeffries
On 24/08/2014 12:01 p.m., Amos Jeffries wrote:
> On 24/08/2014 3:59 a.m., Alex Rousskov wrote:
>>
>> If the proposed changes take a few months to implement, then yes, I
>> agree, we should not wait. If it is a matter of a week or two, I suggest
>> doing it now. This is your call though.
>>
> 
> It should be a relatively quick job. I will give it a shot while waiting
> on boilerplate.
> 

Spoke too soon. We will have to sort out the automake subdir-objects
issues first.

Amos



Re: [RFC] squid-3.6 unit tests

2014-08-23 Thread Amos Jeffries
On 24/08/2014 3:59 a.m., Alex Rousskov wrote:
> 
> If the proposed changes take a few months to implement, then yes, I
> agree, we should not wait. If it is a matter of a week or two, I suggest
> doing it now. This is your call though.
> 

It should be a relatively quick job. I will give it a shot while waiting
on boilerplate.

Amos



Re: [RFC] squid-3.6 unit tests

2014-08-23 Thread Amos Jeffries
On 23/08/2014 5:41 a.m., Alex Rousskov wrote:
> On 08/20/2014 09:16 PM, Amos Jeffries wrote:
>> Future versions of autoconf/automake will be auto-enabling their
>> subdir-objects feature. This impacts Squid in a few ways, the largest
>> being how we perform unit testing.
>>
>> At present our unit tests link to objects in $(top_srcdir)/src/tests/
>> and sometimes from other src/foo/ directories. The current automake will
>> build a .o file in the current working directory, but with
>> subdir-objects the .o would be built in the src/tests/ etc directory and
>> shared between any other units using the base .c file.
>>  This will naturally cause build failures when the source sub-directory
>> has not yet been created by the build process; subdir-objects does not
>> automatically create paths.
>>
>> I am proposing that:
>>
>> 1) we reverse the decision to build any unit tests in src/
>> subdirectories. Returning to the original design of using top level
>> test-suite/ directory to build and run them. This will ensure that all
>> necessary .la objects and src/ directories already exist before
>> unit tests get built.
>>
>> We are already facing some problems from that plan. For example when
>> unit tests for src/fs/ need to link against src/http/*.la we currently
>> have to build them in src/Makefile.am after all src/*/ sub-directories
>> have been recursed and built. Some possible compat/ unit tests are also
>> blocked by lib/* not being built at the time.
>>
>>
>> 2) the STUB .cc files are moved from src/tests/. With subdir-objects we can
>>
>>  a) place them in the relevant src/foo/ directory, or
>>  b) place them in src/stubs/.
>>
>> With a rename from stub_foo.cc to foo.stub.cc we can generically automate
>> their build and dist operations from Common.am (in case of 2a) or
>> src/stubs/Makefile.am (in case of 2b).
> 
> 
> No objections from me. While it would be better to localize unit tests,
> our unit tests are apparently not local enough to be easily localized.
> In many cases, this is probably due to poor quality of the primary code
> (too many dependencies), and will take a while to improve.
> 
> 
>> If agreed this would take effect after 3.5 branching.
> 
> This is your call, but I am surprised you want to make such a relatively
> big (and yet essentially simple) change after branching. I would
> recommend to do it before you branch to minimize long-term maintenance
> overheads.

Changes big enough to affect the unit tests' operations or results are
(small?) feature or interface updates, which are not really candidates
for backporting under the current release policy. So the difference
between branches caused by this should not matter for porting bug fixes
around in the main code.

Amos


Re: Squid 3.5 release timetable

2014-08-22 Thread Amos Jeffries
On 23/08/2014 6:02 a.m., Alex Rousskov wrote:
> On 08/21/2014 11:54 PM, Amos Jeffries wrote:
>> Update:
>>
>>  * FTP and Kerberos patches are in and building fine.
>>  (build farm has some internal issues clouding the state)
>>
>>  * I am punting Bearer and Parser-NG off into 3.6 series. Some design
>> decisions may take a while to resolve and re-audit.
>>
>>  * PROXY protocol and Splice/Peek appear to be almost done with audit.
>>
>> So if those last two can be pushed onward this week I would like to
>> branch once they are merged and the build farm gives them a mostly-blue
>> pass rate.
> 
> What about the boilerplate updates? I think it would be best to do them
> before branching as they are another example of simple changes affecting
> a lot of files. I will try to find the time to revisit that project next
> week.

Ah, right. If you could at least check and update the list of
authors whose blurbs we have permission to move, I can throw small bits
of time at the incremental updates.
 Would it be doable to add a comment in the script listing the dates when
each author gave permission, to make this less of a big ask?

Amos



Re: Squid 3.5 release timetable

2014-08-21 Thread Amos Jeffries
Update:

 * FTP and Kerberos patches are in and building fine.
 (build farm has some internal issues clouding the state)

 * I am punting Bearer and Parser-NG off into 3.6 series. Some design
decisions may take a while to resolve and re-audit.

 * PROXY protocol and Splice/Peek appear to be almost done with audit.

So if those last two can be pushed onward this week I would like to
branch once they are merged and the build farm gives them a mostly-blue
pass rate.

Amos



Re: [PATCH] Kerberos configure patch + some cleanup

2014-08-21 Thread Amos Jeffries
On 19/08/2014 10:08 p.m., Amos Jeffries wrote:
> On 10/08/2014 10:37 p.m., Markus Moeller wrote:
>> Apologies. I must have overlooked it. Here is the updated patch
>>
> 
> This one looks much better. If there are no objections I will apply it
> shortly.
> 
> Amos
> 


Applied as trunk rev.13538

Amos


[RFC] squid-3.6 unit tests

2014-08-20 Thread Amos Jeffries
Future versions of autoconf/automake will be auto-enabling their
subdir-objects feature. This impacts Squid in a few ways, the largest
being how we perform unit testing.

At present our unit tests link to objects in $(top_srcdir)/src/tests/
and sometimes from other src/foo/ directories. The current automake will
build a .o file in the current working directory, but with
subdir-objects the .o would be built in the src/tests/ etc directory and
shared between any other units using the base .c file.
 This will naturally cause build failures when the source sub-directory
has not yet been created by the build process; subdir-objects does not
automatically create paths.

I am proposing that:

1) we reverse the decision to build any unit tests in src/
subdirectories. Returning to the original design of using top level
test-suite/ directory to build and run them. This will ensure that all
necessary .la objects and src/ directories already exist before
unit tests get built.

We are already facing some problems from that plan. For example when
unit tests for src/fs/ need to link against src/http/*.la we currently
have to build them in src/Makefile.am after all src/*/ sub-directories
have been recursed and built. Some possible compat/ unit tests are also
blocked by lib/* not being built at the time.


2) the STUB .cc files are moved from src/tests/. With subdir-objects we can

 a) place them in the relevant src/foo/ directory, or
 b) place them in src/stubs/.

With a rename from stub_foo.cc to foo.stub.cc we can generically automate
their build and dist operations from Common.am (in case of 2a) or
src/stubs/Makefile.am (in case of 2b).


If agreed this would take effect after 3.5 branching.

Amos


Re: [Patch] ssl_bump X.509 version mismatch

2014-08-20 Thread Amos Jeffries
On 21/08/2014 3:19 a.m., Steve Hill wrote:
> On 20/08/14 15:41, Alex Rousskov wrote:
> 
>> This is probably fixed in trunk r13533. The problem may not be limited
>> to self-signed certificates. See the change log for details.
> 
> Ahh damn, I didn't check the trunk! :)
> Yes, it looks like it will solve the problem.
> 
>> *v4: I am worried that Squid might generate certificates that Squid
>> itself cannot use if we just blindly copy the version value like that. I
>> have seen posts discussing v4 certificates... On the other hand, I do
>> not know whether Squid can successfully negotiate a secure connection
>> with a server using x509 v4. Perhaps Squid should mimic the original
>> version after lowering it to v3 if needed?
> 
> I'm not especially familiar with what new stuff v4 brings to the table.
>  Capping the version at 3 (until the rest of Squid can support 4) seems
> reasonable, although we obviously have to be careful not to include v4
> extensions in a v3 certificate.
> 
>> *v3 where v2 would suffice: There are cases where Squid is correctly
>> generating a v2 certificate while mimicking a v3 certificate (because
>> Squid does not mimic any of the extensions in the true certificate). Is
>> it really a good idea to increase/mimic the version in this case? I am
>> not sure. What do you think?
> 
>> A) "mimic everything except for the stuff we know is unsafe" and
>> B) "mimic only the stuff we know is safe to mimic".
>>
>> We started with (A) but, based on the initial SslBump experience, we now
>> think that (B) works better in most (but not all!) use cases. Your patch
>> (if applied literally) follows (A). The current code uses (B). Do you
>> think we should replace trunk r13533 with your patch or some adjusted
>> version of it as discussed in the yellow flags above?
> 
> Unfortunately I don't think I'm really knowledgeable enough about SSL to
> make that judgment.
> 

X.509 v3 was standardized in 1996, and the current RFC5280 is dated 2008.

The basic design of TLS (not SSL) AIUI is to advertise the latest version
and downgrade to the remote end's capabilities only if necessary.

So are we not in a position to always use a v3(+) certificate?
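
For concreteness, the capping approach discussed above is nearly a
one-liner at certificate-generation time. A sketch using the OpenSSL API
(the function name and surrounding mimic code are assumed, not Squid's
actual code):

  #include <openssl/x509.h>

  // X509 encodes versions zero-based, so v3 is stored as 2.
  // Mimic the original version, but never emit anything above v3.
  static void
  mimicCertVersionCapped(X509 *fake, X509 *original)
  {
      long v = X509_get_version(original);
      if (v > 2)
          v = 2; // cap at v3 until the rest of Squid supports more
      X509_set_version(fake, v);
  }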

Amos


Re: [PATCH 6/8] reconfiguration leaks: SSL certificate context cache

2014-08-20 Thread Amos Jeffries
On 20/08/2014 9:27 a.m., Alex Rousskov wrote:
> On 06/15/2014 05:00 AM, Tsantilas Christos wrote:
>> On 06/13/2014 10:46 PM, Alex Rousskov wrote:
>>> On 04/25/2014 01:46 AM, Amos Jeffries wrote:
>>>> On 25/04/2014 12:56 p.m., Alex Rousskov wrote:
>>>>> Do not leak fake SSL certificate context cache when reconfigure
>>>>> changes port addresses.
> 
>>>> This requires the guarantee that all connections using the storage are
>>>> closed right?
> 
> 
>>> Hi Christos,
>>>
>>>My understanding is that deleting a cached LocalContextStorage object
>>> does not actually affect connections that use the corresponding SSL_CTX
>>> and certificate because any SSL object using those things increments
>>> their sharing counter and deleting LocalContextStorage only decrements
>>> that counter. The [cached] SSL_CTX object is not destroyed by
>>> SSL_CTX_free until that sharing counter reaches zero. Is my
>>> understanding flawed?
> 
> 
>> This is true. The SSL_CTX objects are not destroyed.
> 
> 
> 
>>> Do we have any code that stores SSL_CTX pointers for asyncrhonous use
>>> (i.e., across many main loop iterations) but does not increment the
>>> sharing counter?
> 
> 
>> Nope.
>> I hope I am not losing anything. In any case, if such a case is found it
>> should be considered a bug and must be fixed...
> 
> 
> Hi Amos,
> 
> Does the above exchange resolve your concerns regarding that 6/8
> leak patch? I have re-attached the same patch here for your convenience.

It does, yes.  +1.

Amos

> 
> 
> Thank you,
> 
> Alex.
> 
> 



Re: [PATCH] SSL Peek and Splice

2014-08-19 Thread Amos Jeffries
On 13/08/2014 11:20 p.m., Tsantilas Christos wrote:
> Hi all,
> 
> This is a first patch which implements the Peek-and-Splice feature
> described in wiki:
>   http://wiki.squid-cache.org/Features/SslPeekAndSplice
> 
> The goal of this patch is to make SSL bumping decision after the origin
> server name is known.
> 
> Short description
> 
> 
> Peek and Splice peeks at the SSL client Hello message and SNI info if
> any (bumping step 1), sends an identical or similar Hello message to the
> SSL server and peeks at the SSL server Hello message (bumping step 2),
> and finally decides to proceed with splicing or bumping the connection
> (bumping step 3).
> 
> After the step 1 bumping step completes the SNI information is available
> and after the step 2 bumping step completes the server certificate is
> available.
> 
> The ssl_bump access list is evaluated on every bumping step to select the
> bumping mode to use. The new acl "at_step" can be used to match the
> current bumping step.
> 
> In most cases:
>  - if the user selects the "peek" bumping mode at step 2, then at step 3
>    one of the "splice" or "terminate" modes can be selected.
>  - if the user selects the "stare" bumping mode at step 2, then at step 3
>    one of the "bump" or "terminate" modes can be selected.
> 
> If Squid is built with SQUID_USE_OPENSSL_HELLO_OVERWRITE_HACK and
> the client uses an OpenSSL library similar to the library used by Squid,
> then "bump" is possible after selecting the "peek" bumping mode and
> "splice" after selecting the "stare" bumping mode.
> 
> Bump, terminate, and splice are final decisions.
> 
> Example configurations:
> 
> acl step1 at_step  SslBump1
> acl step2 at_step  SslBump2
> acl step3 at_step  SslBump3
> 
> ssl_bump peek step1 all
> ssl_bump splice step2 BANKS
> ssl_bump peek step2 all
> ssl_bump terminate step3 BLACKLIST
> ssl_bump splice step3 all
> 
> This is a Measurement Factory project


in configure.ac:
- please perform all the AC_CHECK_HEADERS for openssl/*.h inside the
--with-openssl block. Some of them are already being checked there.


in acl/AtBumpStep.h and .cc:
 - please wrap the file contents with USE_OPENSSL internally.
 - please also rename either the files or the classes according to the
coding guideline that filenames match the class name. The classes seem
to be "AtStep" instead of "AtBumpStep".


in src/acl/AtBumpStepData.h:
- missing gap between the "" and <> include lists. This also happens elsewhere.


in src/acl/AtBumpStep.h:
 - this does not need to be a doxygen comment, nor so big:
+/**
+ * Not implemented to prevent copies of the instance.
+ */


in src/client_side.cc:
- "debugs(33, 5, "httpsCreate: " shuffled but not cleaned up. Should
have the function name prefix removed.

- httpsSslBumpStep2AccessCheckDone() debugs message "Bio for  " please
display the whole connection detail, or at least prefix the fd with "FD ".
  - similar problems in clientPeekAndSpliceSSL() fd debugs output.
  - later on in src/ssl/PeerConnector.cc you are using lower case "fd "
to prefix the FD. The log grep pattern is "FD n" where n is the numeric
FD number.
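
A sketch of the requested forms (assuming the usual clientConnection
member is in scope; the section/level values are only illustrative):

  // preferred: print the whole connection detail
  debugs(33, 5, "SSL bump step2 done on " << clientConnection);
  // or at minimum keep the standard "FD n" grep pattern working
  debugs(33, 5, "Bio for FD " << fd);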


in src/format/Token.cc:
- please add a reservation for the ssl:: token namespace
- the <string> include does not need HAVE_STRING
- is there a particular reason you are using std::string for members of
class sslFeatures instead of SBuf?
- please use mb_size_t for class ServerBio::helloMsgSize. It may be
capable of holding 64-bit sizes even when int is only 32-bit.


in src/tunnel.cc:
- switchToTunnel()
"+tunnelState->server.size_ptr = NULL;//" should probably be set
to one of the ConnStateData::al.http.clientReplySz counters.


Also,
 - please do make fd.h (not fde.h) export its
default_read_method/default_write_method functions instead of
re-defining them all over the place.


+1. That's all cosmetic. So I think this can go in after fixing.

Amos


Re: [PATCH] Support PROXY protocol

2014-08-19 Thread Amos Jeffries
Updated patch. I believe this covers everything so far, including the
16-bit alignment and segmented TCP packet issues.

Amos

=== modified file 'doc/release-notes/release-3.5.sgml'
--- doc/release-notes/release-3.5.sgml  2014-08-11 16:09:06 +
+++ doc/release-notes/release-3.5.sgml  2014-08-12 14:53:59 +
@@ -27,40 +27,41 @@
 
Although this release is deemed good enough for use in many setups, please
note the existence of <url url="http://bugs.squid-cache.org/buglist.cgi?query_format=advanced&product=Squid&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&version=3.5" name="open bugs against Squid-3.5">.

Changes since earlier releases of Squid-3.5

The 3.5 change history can be <url url="http://www.squid-cache.org/Versions/v3/3.5/changesets/" name="viewed here">.
 
 
 Major new features since Squid-3.4
 Squid 3.5 represents a new feature release above 3.4.
 
 The most important of these new features are:
 
Support libecap v1.0
Authentication helper query extensions
Support named services
Upgraded squidclient tool
Helper support for concurrency channels
Native FTP Relay
+   Receive PROXY protocol, Versions 1 & 2
 
 
 Most user-facing changes are reflected in squid.conf (see below).
 
 
 Support libecap v1.0
Details at <url url="http://wiki.squid-cache.org/Features/BLAH">.
 
 The new libecap version allows Squid to better check the version of
   the eCAP adapter being loaded as well as the version of the eCAP library
   being used.
 
 Squid-3.5 can support eCAP adapters built with libecap v1.0,
but no longer supports adapters built with earlier libecap versions
due to API changes.
 
 
 Authentication helper query extensions
Details at <url url="http://www.squid-cache.org/Doc/config/auth_param/">.
 
@@ -188,72 +189,111 @@
 
 FTP Relay highlights:
 
 
 Added ftp_port directive telling Squid to relay native FTP commands.
 Active and passive FTP support on the user-facing side; require
  passive connections to come from the control connection source IP
  address.
 IPv6 support (EPSV and, on the user-facing side, EPRT).
 Intelligent adaptation of relayed FTP FEAT responses.
 Relaying of multi-line FTP control responses using various formats.
 Support relaying of FTP MLSD and MLST commands (RFC 3659).
 Several Microsoft FTP server compatibility features.
 ICAP/eCAP support (at individual FTP command/response level).
 Optional "current FTP directory" tracking with the assistance of
  injected (by Squid) PWD commands (cannot be 100% reliable due to
  symbolic links and such, but is helpful in some common use cases).
 No caching support -- no reliable Request URIs for that (see above).
 
 
+Receive PROXY protocol, Versions 1 & 2
+More info at <url url="http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt">
+
+PROXY protocol provides a simple way for proxies and tunnels of any kind to
+   relay the original client source details without having to alter or
+   understand the protocol being relayed on the connection.
+
+Squid currently supports receiving HTTP traffic from a client proxy using this protocol.
+   An http_port which has been configured to receive this protocol may only be used to
+   receive traffic from client software sending in this protocol.
+   HTTP traffic without the PROXY header is not accepted on such a port.
+
+The accel and intercept options are still used to identify the
+   traffic syntax being delivered by the client proxy.
+
+Squid can be configured by adding an http_port
+   with the require-proxy-header mode flag. The proxy_protocol_access
+   must also be configured with src ACLs to whitelist proxies which are
+   trusted to send correct client details.
+
+Forward-proxy traffic from a client proxy:
+
+ http_port 3128 require-proxy-header
+ proxy_protocol_access allow localhost
+
+
+Intercepted traffic from a client proxy or tunnel:
+
+ http_port 3128 intercept require-proxy-header
+ proxy_protocol_access allow localhost
+
+
+Known Issue:
+   Use of require-proxy-header on https_port is not supported.
+
 
 Changes to squid.conf since Squid-3.4
 
 There have been changes to Squid's configuration file since Squid-3.4.
 
 Squid supports reading configuration option parameters from external
files using the syntax parameters("/path/filename"). For example:
 
 acl whitelist dstdomain parameters("/etc/squid/whitelist.txt")
 
 
 The squid.conf macro ${service_name} is added to provide the service name
of the process parsing the config.
 
 There have also been changes to individual directives in the config file.
 
 This section gives a thorough account of those changes in three categories:
 
 



 
 
 
 New tags
 
 
collapsed_forwarding
Ported from Squid-2 with no configuration or visible behaviour changes.
Collapsing of requests is performed across SMP workers.
 
+   proxy_protocol_access

Re: [PATCH] Kerberos configure patch + some cleanup

2014-08-19 Thread Amos Jeffries
On 10/08/2014 10:37 p.m., Markus Moeller wrote:
> Apologies. I must have overlooked it. Here is the updated patch
> 

This one looks much better. If there are no objections I will apply it
shortly.

Amos



Re: [PATCH] Better handling of huge native FTP requests

2014-08-19 Thread Amos Jeffries
On 13/08/2014 11:46 a.m., Alex Rousskov wrote:
> Hello,
> 
> The attached patch avoids assertions on large FTP commands and
> cleans up both general and FTP-specific error handling code during
> request parsing. Please see the patch preamble for technical details.
> 
> 
> Thank you,
> 
> Alex.
> 

I've not had time for much detail checking. From a quick scan all I
could see was that in Ftp::Server::parseOneRequest flags.readMore is set
to false twice. The one at the method end can now be removed.

+1. If this works for you I think it can go in.

Amos



Re: [PATCH] Support PROXY protocol

2014-08-12 Thread Amos Jeffries
On 11/08/2014 4:32 p.m., Alex Rousskov wrote:
> On 08/05/2014 08:31 PM, Amos Jeffries wrote:
> 
>> I am adding proxy_protocol_access as the first access control, reverting
>> follow_x_forwarded_for for the second. 
> 
> Great. I think this is a much simpler/cleaner design.
> 
> 
>> +} else if (strcmp(token, "require-proxy-header") == 0) {
>> +s->flags.proxySurrogate = true;
>> +debugs(3, DBG_IMPORTANT, "Disabling TPROXY Spoofing on port " << 
>> s->s << " (require-proxy-header enabled)");
>> +
> 
> The IMPORTANT message should probably be printed only if TPROXY spoofing
> on that port was actually configured/requested.
> 
> And, at that point, I would actually make the apparently illegal
> combination a fatal misconfiguration error, but I suspect you do not
> like that approach, and I will not insist on it.
> 

Done.

>> But retaining the new description texts.
> 
>> -DEFAULT_DOC: X-Forwarded-For header will be ignored.
>> +DEFAULT_DOC: indirect client IP will not be accepted.
> 
> The old documentation line was much more clear IMO. If it was incorrect,
> can you rephrase the correction please? Perhaps you meant to say
> something like "Any client IP specified in X-Forwarded-For header will
> be ignored"? If yes, the current version still sounds better to me.
> Adding Forwarded header to those description is a good idea if correct,
> of course.
> 

Done.

>> +DEFAULT_DOC: all TCP connections will be denied
> 
> I would clarify that with either
> 
> * "all TCP connections to ports with require-proxy-header will be denied" or
> 
> * "proxy_protocol_access deny all"
> 

Done

>> +Any host for which we accept client IP details can place
>> +incorrect information in the relevant header, and Squid
> 
> I would s/for/from/ but perhaps I misunderstand how this works.
> 
> 
>> +tok.skip(Proxy1p0magic);
> 
> We already know the magic is there. If you want to optimize this, then
> skip() in ConnStateData::parseProxyProtocolHeader() and pass the
> Tokenizer object to the version-specific parsers. I do not insist on
> this change, but you may want to add a corresponding TODO if you do not
> want to optimize this.
> 

This is done with a Tokenizer inside the parser method so that, if we
happen not to receive the whole header in one read for any reason, we
can easily undo and retry on the following read(2).
 Only if the Tokenizer parses completely and successfully is the I/O
buffer updated.

We could generate a temporary SBuf and set up the Tokenizer with that.
But it's simpler just to set up the Tokenizer with the existing buffer
and unconditionally skip the magic we already know exists.
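
In outline, the parse-then-commit pattern looks like this (a sketch; only
the skip() call is from the actual patch):

  Tokenizer tok(in.buf);      // cheap SBuf copy; no octets are copied
  tok.skip(Proxy1p0magic);    // magic already known to be present
  // ... parse the remaining PROXY header fields via tok ...
  // on incomplete input: return without touching in.buf, so the next
  // read(2) appends more bytes and parsing restarts from scratch
  // only on full success: consume the parsed bytes from in.buf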

> 
>> +if (!tok.prefix(tcpVersion, CharacterSet::ALPHA+CharacterSet::DIGIT))
> 
> That dynamic creation, addition, and destruction of a 256-element vector
> is a significant performance penalty. Please create a static version of
> that alphanumeric(?) CharacterSet instead of computing it at least once
> for every PROXY connection.
> 
> 
>> +const CharacterSet ipChars =  CharacterSet("IP Address",".:") + 
>> CharacterSet::HEXDIG;
> 
> Same here, made worse by adding creation of an intermediate set.
> 
> Please check for other occurrences.

Fixed, and checked.
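
The fix amounts to hoisting the set construction out of the
per-connection path, roughly (the variable names are illustrative; the
CharacterSet operations are those already used in the patch):

  // built once at static-initialization time instead of per connection
  static const CharacterSet AlphaNum =
      CharacterSet::ALPHA + CharacterSet::DIGIT;
  static const CharacterSet IpChars =
      CharacterSet("IP Address", ".:") + CharacterSet::HEXDIG;

  // ... later, in the per-connection parser:
  //   if (!tok.prefix(tcpVersion, AlphaNum)) ...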


>> +if (!tok.prefix(tcpVersion, CharacterSet::ALPHA+CharacterSet::DIGIT))
>> +return proxyProtocolError(tok.atEnd()?"PROXY/1.0 error: invalid 
>> protocol family":NULL);
> 
> I recommend removing that code and using
> 
>   if (tok.skip("UNKNOWN")) { ... your UNKNOWN code here ...}
>   else if (tok.skip("TCP")) { ... your TCP code here ...}
> 
> instead.
> 
> 
>> +if (!tcpVersion.cmp("UNKNOWN")) {
>> +} else if (!tcpVersion.cmp("TCP",3)) {
> 
> Please test for the more likely/common condition first.
> 

Agreed. Done.

>> +// skip to first LF (assumes it is part of CRLF)
>> +const SBuf::size_type pos = in.buf.findFirstOf(CharacterSet::LF);
>> +if (pos != SBuf::npos) {
>> +if (in.buf[pos-1] != '\r')
> 
> It would be better to use Tokenizer for this IMO. If you insist on
> manual parsing, consider at least skipping the magic header in the
> findFirstOf() call.
> 

Waiting on the FTP change that makes it easy to settle in trunk.

> 
>> + return proxyProtocolError(tok.atEnd()?"PROXY/1.0 error: missing SP":NULL);
> 
> Please surround the ? and : parts of the operator with spaces for
> readability, especially since the string itself contains 

Re: [PATCH] Native FTP Relay

2014-08-11 Thread Amos Jeffries
On 11/08/2014 11:30 a.m., Alex Rousskov wrote:
> On 08/10/2014 06:11 AM, Amos Jeffries wrote:

>> * please use tok.prefix(CharacterSet::ALPHA) to parse the FTP command
>> instead of BlackSpace. Then explicit delimiter check to validate the input.
>>  - RFC 959 is clear:
>>   "The command codes themselves are alphabetic characters terminated
>>by the character  (Space) if parameters follow and Telnet-EOL
>>otherwise."
> 
> Not changed. I do not see a compelling reason for a general purpose
> relay to police traffic by default. Squid itself does not care if there
> is some other character present in some command name. Why increase the
> chances of breaking what was working without Squid by rejecting commands
> by default?
> 
> Yes, it is possible that some command that Squid does care about would
> have invalid trailing characters in it, preventing Squid from
> recognizing it, but, with your approach, Squid would have to reject that
> command anyway, so we would not break something that would work in a
> policing Squid. In the worst case, we would just make triage more difficult.
> 
> If you insist on Squid rejecting RFC-invalid command names, I will
> change the code, but I suggest leaving it permissive for now.
> 

I disagree we should be quite so tolerant on the current Internet
without an explicit reason. But that is not a blocker on this patch; we
can fix it later.

RFC 959 is quite explicit. Section 5.3 documents what constitutes a valid
command (alphabetic characters only, case insensitive, 4 or fewer).
Also, the basic FTP commands are a fairly well-known set. Everything
else must be listed in FEAT, which we can filter for offerings of
commands not fitting the supported syntax.
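
Concretely, the stricter parse being suggested would look something like
this (a sketch; the delimiter handling is simplified):

  // RFC 959 section 5.3: a command code is alphabetic characters only,
  // terminated by SP when parameters follow, or by Telnet EOL otherwise
  SBuf commandCode;
  if (!tok.prefix(commandCode, CharacterSet::ALPHA))
      return false; // reject non-alphabetic command codes
  // an explicit delimiter check follows: the next octet must be SP or
  // the start of CRLF for the command to be well-formed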



+1. Branch looks good enough for merge now. Please do, so I can re-work
the other concurrent submissions on the new
Tokenizer/CharacterSet/ConnStateData APIs.

Amos



Re: RELEASENOTES.html outdated

2014-08-11 Thread Amos Jeffries
On 11/08/2014 6:59 a.m., Christian wrote:
> Hi,
>
> the link for 'squid-3.4.6' points to the outdated
> 'http://www.squid-cache.org/Versions/v3/3.4/RELEASENOTES.html', which is
> version 3.4.5
>
> it seems that RELEASENOTES.html did not get updated to reflect 3.4.6.
>
> Cheers
>

Thank you. This should be fixed with 3.4.7 when it comes out.

Amos



Re: [PATCH] Native FTP Relay

2014-08-10 Thread Amos Jeffries
On 10/08/2014 2:44 p.m., Alex Rousskov wrote:
> On 08/08/2014 11:13 AM, Amos Jeffries wrote:
>> On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
>>> On 08/08/2014 09:48 AM, Amos Jeffries wrote:
>>>> Audit results (part 1):
>>>> * Please apply CharacterSet updates separately.
>>>> * Please apply Tokenizer API updates separately.
>>>> * please apply src/HttpHdrCc.h changes separately.
> 
>>> Why? If the branch is merged, they will be applied with their own/
>>> existing commit messages.
> 
>> They are updates on previous feature changesets unrelated to FTP which
>> will block future parser alterations (also unrelated to FTP) being
>> backported if this project takes long to stabilize.
>>
>> If they have their own commit messages then cherrypicking out of the
>> branch should be relatively easy.
> 
> Thank you for explaining your rationale. To minimize overheads, let's
> come back to this issue if the branch is not merged soon.
> 
> 
>>>> * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
>>>> AnyP::ProtocolVersion parameters)
> 
>> Okay, leave as-is but please document that as the reason for using that
>> version number
> 
> Done. It was actually documented in the corresponding commit message. To
> provide _source code_ documentation in one place, I added
> Ftp::ProtocolVersion() and used that everywhere instead of hard-coded
> "1,1". It will be easier to track down all "1,1" users that way if
> somebody wants to change or polish this further later. See ftp/Elements.*

Thank you.
NP: I found two related spots that need converting as well. Highlighted
below in this round of audit notes.

> 
> The resulting code required more changes than one may expect and is
> still a little awkward because Http::ProtocolVersion should have been
> just a function (not a class/type), and because PortCfg::setTransport
> internals did not belong to AnyP where they were placed. I did my best
> to avoid cascading changes required to fix all that by leaving
> Http::ProtocolVersion as is and moving PortCfg::setTransport() internals
> into cache_cf.cc where they can access FTP-specific code.
> 
> 
>> and a TODO about cleaning up the message processing to
>> stop assuming HTTP versions.
> 
> Done in one of the many client_side.cc places where we still assume
> HTTP. Here is the new TODO:
> 
>> /* RFC 2616 section 10.5.6 : handle unsupported HTTP major versions 
>> cleanly. */
>> /* We currently only support 0.9, 1.0, 1.1 properly */
>> /* TODO: move HTTP-specific processing into servers/HttpServer and such 
>> */
>> if ( (http_ver.major == 0 && http_ver.minor != 9) ||
>> (http_ver.major > 1) ) {
> 
> 
> This part of the problem will be gone when HTTP-specific code will be
> moved to HttpServer, but the ICAP compatibility part will not go away
> regardless of any cleanup.

The ICAP compatibility issue is a bug in ICAP or specific ICAP servers
that needs fixing there. HTTP itself can already (or shortly will) have
present "HTTP/2.0", "h2" or "h2-NN" message versions to be encapsulated.


> 
>> in src/Server.cc:
>> * this change looks semantically wrong:
>>-if (!doneWithServer()) {
>>+if (mayReadVirginReplyBody()) {
>>  closeServer();
> 
> The change itself does not make the code look semantically wrong. On the
> semantics level, the code looks about the same before and after the change:
> 
> Was: If we are not done with the server, close it.
> Now: If we are still reading from the server, close it.

It is now *apparently* skipping all the possible non-reply-reading
reasons for closing the server connection. That is a semantic change...

> 
>> in src/Server.h:
>> * I'm not sure the call order of doneWithServer() and
>> mayReadVirginReplyBody() is the right way around. 
> 
> Sorry, I am not sure what you mean by "call order" in src/Server.h
> context. That file does not determine the order of those two calls.

... but mayReadVirginReplyBody() is calling doneWithServer() in the HTTP
case. You clarified why it's done below and I am certain now that it is
indeed wrong, although necessary as a temporary workaround until the
non-reply-reading cases are checked and fixed.

> 
>> It seems like "being
>> done completely with a server" includes "whether more reply is coming"
>> as one sub-condition of several (see above), not the reverse.
> 
> Correct. As the commit message quoted above tries to explain, FTP needed
> to distinguish two cases that the old code merged together

Re: [PATCH] OAuth 2.0 Bearer authentication

2014-08-09 Thread Amos Jeffries
On 5/08/2014 3:22 a.m., Alex Rousskov wrote:
> On 07/31/2014 03:29 AM, Amos Jeffries wrote:
> 
>>  A garbage collection TTL "cleanup_interval" is configurable and removes
>> cache entries which have been stale for at least 1 hr.
> 
> While some old code still uses periodic cleanup, I think we should avoid
> adding more code like that. Periodic cleanup leads to cache overflows,
> memory usage surprises, and hot idle. Please remove old unneeded cache
> entries when adding a new cache entry instead of checking the cache
> state every hour.

I agree. However, this is a common algorithm across all of Squid's
authentication types. Updating it should be done as a separate action
and cover more than just this auth scheme. In particular, the core cache
code is shared by Basic and Digest.


>> +// only clear tokens out of cache after 1 hour stale
>> +// could make this configurable
>> +time_t stalenessLimit = current_time.tv_sec - 60*60;
> 
> Cache size should be(**) regulated in terms of memory usage (bytes), not
> entry age (seconds). I believe estimating memory footprint of a cache
> entry would be fairly straightforward in this case. Would you be willing
> to convert the code to use size limits?

I agree with the idea. If/when we can move to LruMap or equivalent that
will happen. For now this is consistent with the other auth schemes, and
sufficient as a temporary state.


The memory calculation is also not as simple as it seems. The SBuf
content accounts for the majority of memory usage and is dynamic at
run-time. What is actually needed to account for the memory size of a
TokenCache node is:

 sizeof(TokenPointer)
  +
 sizeof(Token)
  +
 (*X)->b64encoded.store_->capacity

Note that:
 1) store_ is correctly private, and SBuf::length() is not sufficient to
account for how much of the allocated MemBlob capacity is consumed.
  The Token may be holding a 64KB I/O buffer MemBlob for access to 10
bytes of token string.

 2) the store_->capacity may be shared between Tokens, so freeing just this
entry may not reduce memory usage by the amount it "uses". Accounting for it
may cause the cache size to be over-estimated.

 3) the whole calculation is hard-coded at compile time in LruMap. So we
cannot call the SBuf size methods anyway.


I notice that the current use of LruMap for storing
CertValidationResponse objects also suffers from this problem. It
fails to account for the data size used by a whole vector of error
strings, and all of an X509 certificate's memory allocation (we know
those are huge, with lots of linked details).
  Instead it assumes the cert is sizeof(X509*) and the string content is
sizeof(std::vector), which is a KB or more away from reality
per entry.


>> +/*
>> + * For big caches we could consider stepping through
>> + * 100/200 entries at a time. Lets see how it flies
>> + * first.
>> + */
>> +Auth::Bearer::TokenCache *cache = &Auth::Bearer::Token::Cache;
>> +for (Auth::Bearer::TokenCache::iterator itr = cache->begin(); itr != 
>> cache->end();) {
> 
> I do not think we should do linear search like that. It is going to be
> too slow for large caches and adding arbitrary "maximum number of
> iterations" limits just adds more tuning headaches. Why not use an LRU
> cache instead? IIRC, we have even added LruMap class for that recently.
> 

No less efficient than the linear walk the LruMap does. A full scan is
required for the time-based removal since entries are mapped by their
key value not by expiry time.


LruMap also uses raw pointers for storing its entries. These Token
objects are ref-counted, and if we store them as raw pointers they will
either leak or be freed while still linked by the cache.
 The only way around that seems to be storing Pointers in the list and
making every access a double dereference, or re-implementing the entire
LruMap template using Pointer and a dynamic size() calculation.

The existing cache model used by both Basic and Digest is already
sufficient for the needs of this feature. Note that the periodic cleanup
is an optimization for large caches, to reduce the impact of this linear
search for stale objects instead of


> 
>> +For Basic and Bearer the default is "Squid proxy-caching web server".
> 
> Please add a comma after Bearer.

Done.

Amos



Re: [PATCH] Kerberos configure patch + some cleanup

2014-08-08 Thread Amos Jeffries
On 9/08/2014 8:32 a.m., Markus Moeller wrote:
> It should be in there or did I miss some ?

The original bits are still there in the patch copy mailed to the list.

Specifically in helpers/external_acl/kerberos_ldap_group/support_ldap.cc
get_bin_attributes(), the bits I can see:

* redux function setup:

+LDAPMessage *msg;
+char **attr_value = NULL;
+int *attr_len=NULL;
+size_t max_attr = 0;
+
+attr_value = *ret_value;
+attr_len = *ret_len;

should be:
+char **attr_value = *ret_value;
+int *attr_len = *ret_len;
+size_t max_attr = 0;


* main for loop:
  - for (msg = ldap_first_entry
+ for (LDAPMessage *msg = ldap_first_entry

* drop these:
  BerElement *b;
  char *attr;


* switch case should be:
case LDAP_RES_SEARCH_ENTRY:
{
  BerElement *b = NULL;
  ...
  ber_free(b, 0);
} break;


* for loops inside that switch case should be:

 - for (attr = ldap_first_attribute...
+ for (char *attr = ldap_first_attribute...

 - int il; for (il = 0; ...
+ for (int il = 0; ...


Otherwise it looks okay.

Amos

> 
> Markus
> 
> -----Original Message-----
> From: Amos Jeffries
> Sent: Friday, August 08, 2014 1:28 PM
> To: squid-dev@squid-cache.org; Markus Moeller
> Subject: Re: [PATCH] Kerberos configure patch + some cleanup
> On 8/08/2014 8:02 a.m., Markus Moeller wrote:
>> Are there any objections to this patch ?
> 
> I accidentally sent my audit results in private.
> Do you have an updated patch with those fixes?
> 
> Amos
> 
> 
> 



Re: [PATCH] Native FTP Relay

2014-08-08 Thread Amos Jeffries
On 9/08/2014 4:57 a.m., Alex Rousskov wrote:
> On 08/08/2014 09:48 AM, Amos Jeffries wrote:

>>
>> Audit results (part 1):
>>
>> * Please apply CharacterSet updates separately.
>>
>> * Please apply Tokenizer API updates separately.
>>
>> * please apply src/HttpHdrCc.h changes separately.
> 
> Why? If the branch is merged, they will be applied with their own/
> existing commit messages.
> 

They are updates on previous feature changesets unrelated to FTP which
will block future parser alterations (also unrelated to FTP) being
backported if this project takes long to stabilize.

If they have their own commit messages then cherrypicking out of the
branch should be relatively easy.

> 
>> * is FTP/1.1 really the protocol version of FTP parsed by Squid? (the
>> AnyP::ProtocolVersion parameters)
> ...
>>  - NP: 0,0 version should be usable and probably best since there does
>> not appear to be any actual official version numbering for FTP.
> 
> Since there is no "any actual official version numbering for FTP", we
> are using what works best. Version 1.1 works best because it causes
> fewer FTP exceptions in the general code because FTP control connections
> are persistent by default, just like HTTP/1.1 connections. I think there
> is a comment about that.
> 
> Using 0.0 would probably create more exceptions. 0.0 will most likely
> also screw up some ICAP services that expect /1.0 or /1.1 when parsing
> headers.
> 
> After reading the above arguments, do you still want us to switch to 0.0?
> 

Yuck. Messy.

Okay, leave as-is, but please document that as the reason for using that
version number, and add a TODO about cleaning up the message processing
to stop assuming HTTP versions.

> 
>> in src/cf.data.pre:
>> * please document tproxy, name=, protocol=FTP as supported on ftp_port
> 
> We do not know whether it is supported and do not plan on testing/fixing
> that support right now. Do you still want us to document it as supported?
> 

Yes. You have not changed the TcpAcceptor code for TPROXY lookups or the
ACL code using the port name, so they are supported. For protocol= you
explicitly added support in the parser.

> 
>> in src/client_side.h:
>> * what is "waiting for future HttpsServer home" meant to mean on
>> postHttpsAccept()
> 
> Unlike HTTP and FTP, there is currently no object dedicated to serving
> HTTPS connections. HttpServer is misused for that on one occasion, but a
> lot more work is needed to properly move HTTPS code from ConnStateData
> into severs/HttpsServer.*  That work is unrelated to FTP.

Okay. Thanks.

Amos


Re: [PATCH] Native FTP Relay

2014-08-08 Thread Amos Jeffries
On 8/08/2014 3:29 p.m., Alex Rousskov wrote:
> Hello,
> 
> Native FTP Relay support is finally ready: Squid receives native FTP
> commands on ftp_port and relays FTP messages between FTP clients and FTP
> origin servers through the existing Squid access control, adaptation,
> and logging layers. A compressed 70KB patch attached to this email is
> also temporary available at [1]. I would like to merge the corresponding
> lp branch[2] into Squid trunk.
> 
> This is a large, complex, experimental feature. I am sure there are
> cases where it does not work well or does not support some existing
> features. I am not aware of any regressions specific to old FTP gateway
> code, and I hope there are no regressions specific to HTTP code, but I
> do not recommend deployment without testing.
> 
> The branch code worked quite well in limited production deployment, but
> the final version (with code restructuring and polishing for the
> official submission) has only seen simple lab testing. AFAIK, the FTP
> interception path has not seen any production deployment at all.
> 
> This code represents more than a year worth of development, started by
> Dmitry Kurochkin in the beginning of 2013. Christos Tsantilas and I
> finished Dmitry's work. I bear responsibility for the bugs, but there
> are probably areas of Dmitry's code that appear to work well and that
> neither Christos nor I have studied.
> 
> Overall, we tried to focus on the new FTP code and leave the old
> FTP/HTTP code alone, except where restructuring was necessary to avoid
> code duplication. For example, the server-facing side reuses a lot of
> the old FTP code, while the relayed FTP commands are passed around in
> HttpMsg structures using the old ClientStreams, doCallout(), and Store
> interfaces.
> 
> If you review the patch, please note that bzr is incapable of tracking
> code across file splits (e.g., old ftp.cc becoming ftp/Parsing.cc plus
> three clients/Ftp*.cc files). Many of the old XXXs and TODOs will appear
> as newly added code in the patch. Usually, one can figure it out by
> searching for the similar but deleted code in the patch.
> 
> Please see the above referenced lp branch for a detailed change log.
> 
> 
> Some Native FTP Relay highlights:
> 
> * Added ftp_port directive telling Squid to relay native FTP commands.
> * Active and passive FTP support on the user-facing side; require
>   passive connections to come from the control connection src IP.
> * IPv6 support (EPSV and, on the user-facing side, EPRT).
> * Intelligent adaptation of relayed FTP FEAT responses.
> * Relaying of multi-line FTP control responses using various formats.
> * Support relaying of FTP MLSD and MLST commands (RFC 3659).
> * Several Microsoft FTP server compatibility features.
> * ICAP/eCAP support (at individual FTP command/response level).
> * Optional "current FTP directory" tracking (cannot be 100% reliable due
>   to symbolic links and such, but is helpful in some common use cases).
> * FTP origin control connection is pinned to the FTP user connection.
> * No caching support -- no reliable Request URIs for that (see above).
> * Significant FTP code restructuring on the server-facing side.
> * Initial steps towards HTTP code restructuring on the client-facing
>   side.
> 
> 
> Changes to the general code used by the Native FTP Relay code:
> 
> 
>> * The user- and origin-facing code restructured as agreed previously on
>>   this mailing list. I will email some thoughts about the results separately,
>>   but here is the executive summary:
>>
>> src/servers/FtpServer.*  # new FTP server, relaying FTP
>> src/servers/HttpServer.* # old ConnStateData parts conflicting with FtpServer
>> src/clients/FtpClient.*  # code shared by old and new FTP clients
>> src/clients/FtpGateway.* # old FTP client, translating back to HTTP
>> src/clients/FtpRelay.*   # new FTP client, relaying FTP
>> src/ftp/*                # FTP stuff shared by clients and servers
>>
>>
>> * Fixed HttpHdr::Private/NoCache(v) implementations and optimized their API.
>>
>> * Tokenizer fixes and API improvements:
>>
>>   Taught Tokenizer to keep track of the number of parsed bytes. Many callers
>>   need to know that because they need to adjust/consume I/O offsets/buffers.
>>   
>>   Adjusted unused Parser::Tokenizer::token() to not treat NUL delimiter
>>   specially. Besides the fact that not all grammars can treat NUL that
>>   way, the special NUL treatment resulted in some token() calls returning
>>   true for empty tokens, which was confusing parsers. Callers that do not
>>   require trailing delimiters should use prefix() instead. This change is
>>   based on experience writing the Tokenizer-based FTP parser, although
>>   the final parser code uses prefix() instead of token(), for unrelated
>>   reasons.
>>   
>>   Split Parser::Tokenizer::skip(set) into skipOne() and skipAll(). All other
>>   skip() methods skip exactly one thing (either a given character or a gi
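As a rough sketch of the Tokenizer behaviour described above (method names come from the quoted text; the exact signatures and the parsed-bytes accessor name are assumptions):

    Parser::Tokenizer tok(inputBuf);                 // inputBuf: an SBuf to parse
    SBuf word;
    if (tok.prefix(word, CharacterSet::ALPHA)) {     // no trailing delimiter required
        tok.skipAll(CharacterSet::WSP);              // consume a whole run of whitespace
        tok.skipOne(CharacterSet::DIGIT);            // consume exactly one digit
        inputBuf.consume(tok.parsedSize());          // adjust the I/O buffer by parsed bytes
    }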

Re: [PATCH] Kerberos configure patch + some cleanup

2014-08-08 Thread Amos Jeffries
On 8/08/2014 8:02 a.m., Markus Moeller wrote:
> Are there any objections to this patch ?

The audit results from me were accidentally sent in private.
 Do you have an updated patch with those fixes?

Amos



Re: squid on debian (dst and dstdomain)

2014-08-06 Thread Amos Jeffries
On 7/08/2014 2:30 a.m., Fred Maranhão wrote:
> 2014-08-06 10:36 GMT-03:00 Kinkie wrote:
>> On Wed, Aug 6, 2014 at 2:10 PM, Fred Maranhão wrote:
>>> Hi,
>>>
>>> I just updated my squid on a Debian stable box and after the update
>>> everything works fine. But then I restarted my squid and it did not start.
>>>
>>> I detected a (very old) line in /etc/squid3/squid.conf with an error;
>>> it was probably only a warning until the last version but is now an error.
>>>
>>> that was the old line:
>>>
>>> acl site_exemplo dst exemplo.com
>>>
>>> and even though it was wrong, squid worked normally.
>>>
>>> now I changed this to
>>>
>>> acl site_exemplo dstdomain exemplo.com
>>>
>>> and everything is OK. Is this a bug?
>>
>> You slightly changed the semantics of your configuration.
>>
>> acl site_exemplo dst exemplo.com
>>
>> This is a per-destination-IP-address ACL: it would resolve
>> exemplo.com to its ip address, and then anything matching that
>> destination ip address would be a positive for that ACL, even if it
>> was www.exemplo.com or www.microsoft.com.
>>
>> acl site_exemplo dstdomain exemplo.com
>>
>> This is a per-destination-host-name ACL: it matches URLs which contain
>> _exactly_ "exemplo.com" as the hostname part, for instance
>> http://exemplo.com/, http://exemplo.com/index.html and so on.
>> It will _not_ match www.exemplo.com, exemplo.com's IP address or
>> www.microsoft.com even if it is on the same IP address.
>>
>> Only you can know what you want to achieve; please refer to the
>> documentation of the "acl" directive for further details and
>> variations.
> 
> But this is not the important issue. The important issue is that, in
> my installation, squid doesn't start with the dst syntax. Is there
> anyone here using (updated) Debian stable? Is the problem just with
> me?
> 

I am using the latest pre-release while Luigi sorts out packaging
upgrades. There are also several thousand others like yourself using the
released .deb package.

I don't think it's the package particularly. However, the dst ACL requires
working DNS at startup in order to do that resolving Kinkie mentioned.
Check that Squid is able to resolve that domain; the OS gethostbyname()
call is used.
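For example, a minimal squid.conf sketch of the two forms being contrasted here (using the domain from the report):

    # dst: matches by resolved destination IP; needs working DNS at startup
    acl site_exemplo_ip dst exemplo.com

    # dstdomain: matches by the exact hostname in the URL; no startup lookup
    acl site_exemplo dstdomain exemplo.com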

Amos


Re: [PATCH] Support PROXY protocol

2014-08-05 Thread Amos Jeffries
On 5/08/2014 2:47 a.m., Alex Rousskov wrote:
> On 07/30/2014 09:02 AM, Amos Jeffries wrote:
> 
>> +NAME: proxy_forwarded_access follow_x_forwarded_for
> 
>>  Requests may pass through a chain of several other proxies
>> +before reaching us. The original source details may be sent in:
>> +* HTTP message Forwarded header, or
>> +* HTTP message X-Forwarded-For header, or
>> +* PROXY protocol connection header.
> 
> ...
> 
>> +For proxy-surrogate ports an allow match is required for Squid to
>> +permit the corresponding TCP connection, before Squid even looks for
>> +HTTP request headers. If there is an allow match, Squid starts using
>> +PROXY header information to determine the source address of the
>> +connection for all future ACL checks.
> 
>> +On each HTTP request Squid checks for X-Forwarded-For header fields.
> 
> 
> Given the "first evaluate proxy_forwarded_access for connection, then
> evaluate proxy_forwarded_access for each HTTP request header"
> functionality described above I wonder whether it is a good idea to
> merge both evaluations into one directive. Would not it be easier for
> admin to write correct ACL rules if we keep them separate?
> 
> For example, let's assume an admin wants to trust a PROXY client device
> at (and only at) address A and XFF-setting child proxy at (and only at)
> address B. If we split the functionality into proxy_forwarded_access and
> follow_x_forwarded_for, then the admin can do:
> 
>   proxy_forwarded_access allow A
>   #proxy_forwarded_access deny all
> 
>   follow_x_forwarded_for allow B
>   #follow_x_forwarded_for deny B
> 
> Right? What about the merged implementation proposed in the patch? How
> can the admin express the above logic? AFAICT, the following will _not_
> work:
> 
>   proxy_forwarded_access allow A
>   proxy_forwarded_access allow B
>   #proxy_forwarded_access deny all
> 
> because it will allow PROXY clients at B and XFF setting by A.
> 

Good point. I have been considering these lookups as stateless. However,
XFF does offer the unusual property that regex can check for specific
ordering of IPs. We do not have that with PROXY+XFF.

I am adding proxy_protocol_access as the first access control, reverting
follow_x_forwarded_for for the second. But retaining the new description
texts.

Also for lack of anything better in the last week I am using
require-proxy-header for the port option.
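A hedged squid.conf sketch of how the split directives could express Alex's example above (A and B stand for ACLs naming the trusted addresses):

    http_port 3129 require-proxy-header
    proxy_protocol_access allow A
    proxy_protocol_access deny all
    follow_x_forwarded_for allow B
    follow_x_forwarded_for deny all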


New patch attached.

Amos
=== modified file 'doc/release-notes/release-3.5.sgml'
--- doc/release-notes/release-3.5.sgml  2014-08-02 14:03:21 +
+++ doc/release-notes/release-3.5.sgml  2014-08-05 15:23:59 +
@@ -26,40 +26,41 @@
 Known issues
 
 Although this release is deemed good enough for use in many setups, please 
note the existence of <url
url="http://bugs.squid-cache.org/buglist.cgi?query_format=advanced&product=Squid&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&version=3.5"
name="open bugs against Squid-3.5">.
 
 Changes since earlier releases of Squid-3.5
 
The 3.5 change history can be <url
url="http://www.squid-cache.org/Versions/v3/3.5/changesets/" name="viewed here">.
 
 
 Major new features since Squid-3.4
 Squid 3.5 represents a new feature release above 3.4.
 
 The most important of these new features are:
 
Support libecap v1.0
Authentication helper query extensions
Support named services
Upgraded squidclient tool
Helper support for concurrency channels
+   Receive PROXY protocol, Versions 1 & 2
 
 
 Most user-facing changes are reflected in squid.conf (see below).
 
 
 Support libecap v1.0
Details at <url url="http://wiki.squid-cache.org/Features/BLAH">.
 
 The new libecap version allows Squid to better check the version of
   the eCAP adapter being loaded as well as the version of the eCAP library
   being used.
 
 Squid-3.5 can support eCAP adapters built with libecap v1.0,
but no longer supports adapters built with earlier libecap versions
due to API changes.
 
 
 Authentication helper query extensions
Details at <url url="http://www.squid-cache.org/Doc/config/auth_param/">.
 
@@ -146,71 +147,111 @@
The default is to use X.509 certificate encryption instead.
 
When performing TLS/SSL, server certificates are always verified and the
results shown at debug level 3. The encrypted type is displayed at debug
level 2 and the connection is used to send and receive the messages
regardless of verification results.
 
 
 Helper support for concurrency channels
 Helper concurrency greatly reduces the communication lag between Squid
and its helpers allowing faster transaction speeds even on sequential
helpers.
 
 The Digest au

[PATCH] OAuth 2.0 Bearer authentication

2014-07-31 Thread Amos Jeffries
RFC 6750 OAuth 2.0 Authorization Framework: Bearer Token Usage

The attached patch adds a minimal implementation of Bearer
authentication scheme to Squid. It consists of three components:

1) Squid build system infrastructure for building Bearer authentication

2) A testing fake-auth helper (bearer_fake_auth).

A helper which takes Bearer helper input and always returns OK (a minimal
sketch appears below, after the feature list).

3) Bearer authentication library ("module") for Squid.

 * implements the logics for squid.conf "Bearer" auth_param scheme and
necessary configuration options.

 * implements the helper management and API for Bearer helpers.

 * implements logics for www-auth and proxy-auth header parsing and
generating.

At present no restriction between HTTP and HTTPS is defined by Squid.
Challenges will be made for both. It is left to the client to ensure
adequate security on the connection over which it sends Bearer tokens.

 * implements helper driven TTL for token caching.

Due to significant security risks with Bearer tokens the TTL is not
configurable from squid.conf. Instead the helper is expected to provide
a ttl= parameter from the auth backend explicitly determining the time
in seconds for which each response may be cached and re-used. In the absence
of a ttl= value the helper response is treated as already expired (a nonce).
 A garbage collection TTL "cleanup_interval" is configurable and removes
cache entries which have been stale for at least 1 hr.


 * uses a default token scope of "proxy:HTTP" for generic HTTP proxies
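A minimal sketch of the fake helper described in item 2 above, in the usual line-based helper style (the ttl= value and the validation comment are illustrative only):

    #include <iostream>
    #include <string>

    int main()
    {
        std::string line;
        // one Bearer token arrives per line; always approve it and attach
        // a helper-driven ttl= as described above
        while (std::getline(std::cin, line)) {
            // a real helper would validate the token against the auth backend here
            std::cout << "OK ttl=300" << std::endl; // endl flushes; helpers talk over pipes
        }
        return 0;
    }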



NOTES:
 * At present no web browsers implement Bearer authentication in
response to a proxy-authenticate challenge.
  - However some of the common browsers should support Bearer
authentication with reverse proxies over HTTPS (Firefox and IE
apparently, not Chrome).
  - command line tools and AJAX / XHR implementations which allow header
customisation can be scripted to support Bearer.

 * This is only a minimal implementation, emitting only the realm= and
scope= parameters to clients.
 - The key_extras mechanism can be used to pass extension client request
parameters to the Bearer helper.
 - Extension parameters in Squid responses is not supported.

 * Bearer authentication to cache_peers is not supported explicitly.
  - implicit support exists with login=PASSTHRU, which may be used to
relay Bearer tokens for SSO to multiple proxies.

Amos
=== modified file 'configure.ac'
--- configure.ac2014-07-13 08:49:42 +
+++ configure.ac2014-07-30 23:59:58 +
@@ -1765,40 +1765,53 @@
 SQUID_YESNO([$enableval],
 [unrecognized argument to --enable-auth: $enableval])
 ])
 AC_MSG_NOTICE([Authentication support enabled: ${enable_auth:=yes}])
 SQUID_DEFINE_BOOL(USE_AUTH,$enable_auth,[Enable support for authentication])
 AM_CONDITIONAL(ENABLE_AUTH, test "x$enable_auth" != "xno")
 AUTH_MODULES=""
 
 AC_ARG_ENABLE(auth-basic,
   AS_HELP_STRING([--enable-auth-basic="list of helpers"],
  [Enable the basic authentication scheme, and build the specified helpers.
   Not providing an explicit list of helpers will attempt build of
   all possible helpers. Default is to do so.
   To disable the basic authentication scheme, use --disable-auth-basic.
   To enable but build no helpers, specify "none".
   To see available helpers, see the helpers/basic_auth directory. ]),[
 #nothing to do really
 ])
 m4_include([helpers/basic_auth/modules.m4])
 
+AC_ARG_ENABLE(auth-bearer,
+  AS_HELP_STRING([--enable-auth-bearer="list of helpers"],
+ [Enable the OAuth 2.0 Bearer authentication scheme, and build the
+  specified helpers.
+  Not providing an explicit list of helpers will attempt build of
+  all possible helpers. Default is to do so.
+  To disable the Bearer authentication scheme, use --disable-auth-bearer.
+  To enable but build no helpers, specify "none".
+  To see available helpers, see the helpers/bearer_auth directory. ]),[
+#nothing to do really
+])
+m4_include([helpers/bearer_auth/modules.m4])
+
 AC_ARG_ENABLE(auth-ntlm,
   AS_HELP_STRING([--enable-auth-ntlm="list of helpers"],
  [Enable the NTLM authentication scheme, and build the specified helpers.
   Not providing an explicit list of helpers will attempt build of
   all possible helpers. Default is to do so.
   To disable the NTLM authentication scheme, use --disable-auth-ntlm.
   To enable but build no helpers, specify "none".
   To see available helpers, see the helpers/ntlm_auth directory. ]),[
 ])
 m4_include([helpers/ntlm_auth/modules.m4])
 
 AC_ARG_ENABLE(auth-negotiate,
   AS_HELP_STRING([--enable-auth-negotiate="list of helpers"],
  [Enable the Negotiate authentication scheme, and build the specified 
   helpers.
   Not providing an explicit list of helpers will attempt build of
   all possible helpers. Default is to do so.
   To disable the Negotiate authentication scheme, 
   use --disable-auth-negotiate.
   To enable but build no helpers, specify "none".
@@ -3446,40 

Re: /bzr/squid3/trunk/ r13517: Fix %USER_CA_CERT_* and %CA_CERT_ external_acl formatting codes

2014-07-30 Thread Amos Jeffries
Hi Christos,

Can you confirm or deny for me that these %USER_CERT_* macros map to the
%ssl::>cert_* logformat codes?

Their existence is one of the outstanding issues with external_acl_type
upgrade to logformat.

Cheers
Amos

On 31/07/2014 3:31 a.m., Christos Tsantilas wrote:
> 
> revno: 13517
> committer: Christos Tsantilas 
> branch nick: trunk
> timestamp: Wed 2014-07-30 18:31:10 +0300
> message:
>   Fix %USER_CA_CERT_* and %CA_CERT_ external_acl formatting codes
>   
> * The attribute part of the %USER_CA_CERT_xx and %CA_CERT_xx formatting
>   codes is not parsed correctly, making these formatting codes useless.
> * The %USER_CA_CERT_xx was documented wrongly
> modified:
>   src/cf.data.pre
>   src/external_acl.cc
> 



Re: [PATCH] Support PROXY protocol

2014-07-30 Thread Amos Jeffries
On 27/07/2014 6:18 a.m., Alex Rousskov wrote:
> On 07/25/2014 08:27 PM, Amos Jeffries wrote:
> 
>> +// detect and parse PROXY protocol version 1 header
>> +if (in.buf.length() > Proxy10magic.length() && 
>> in.buf.startsWith(Proxy10magic)) {
>> + return parseProxy10();
>> +
>> +// detect and parse PROXY protocol version 2 header
>> +} else if (in.buf.length() > Proxy20magic.length() && 
>> in.buf.startsWith(Proxy20magic)) {
>> +return parseProxy20();
>> +
>> +// detect and terminate other protocols
>> +} else if (in.buf.length() >= Proxy20magic.length()) {
>> +// input other than the PROXY header is a protocol error
>> +return proxyProtocolError("PROXY protocol error: invalid header");
>> +}
> 
> AFAICT, the above code declares an error on a valid PROXY header if
> Squid happens to read exactly Proxy20magic bytes first. I recommend
> removing the "optimization" from the first two if conditions, engaging
> the parser as soon as we get enough info to determine the protocol:
> 
> {
> if (in.buf.startsWith(Proxy20magic))
> return parseProxy20();
> 
> if (in.buf.startsWith(Proxy10magic))
> return parseProxy10();
> 
> if (in.buf.length() >= Proxy20magic.length()) {
> // 1.0 magic is shorter, so we know that
> // the input does not start with any PROXY magic
> return proxyProtocolError("PROXY protocol error:...");
> }
> 
> // need more data
> }
> 
> The above avoids checking length twice (in common cases) and also places
> more common(?) case of a 2.0 protocol version first, so it should work
> faster than the patch code, on average :-).

Done.
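For reference, a standalone sketch of that detection order, using std::string in place of Squid's SBuf (the magic byte values come from the PROXY protocol specification linked in this thread):

    #include <string>

    static const std::string magicV1("PROXY ");                   // v1: ASCII header line
    static const std::string magicV2("\r\n\r\n\0\r\nQUIT\n", 12); // v2: 12-byte binary signature

    // 2 or 1: version detected; 0: need more data; -1: not PROXY at all
    int detectProxyVersion(const std::string &buf)
    {
        if (buf.compare(0, magicV2.size(), magicV2) == 0)
            return 2;
        if (buf.compare(0, magicV1.size(), magicV1) == 0)
            return 1;  // e.g. "PROXY TCP4 203.0.113.7 198.51.100.1 56324 443\r\n"
        if (buf.size() >= magicV2.size())
            return -1; // long enough to hold either magic, yet matches neither
        return 0;      // too short to decide yet
    }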

>> +Squid currently supports receiving HTTP traffic from a client proxy 
>> using this protocol.
>> +   An http_port which has been configured to receive this protocol may only 
>> be used to
>> +   receive traffic from client software sending in this protocol.
>> +   Regular forward-proxy HTTP traffic is not accepted.
> 
> I suggest rephrasing the last sentence to avoid a misinterpretation that
> the port cannot be used for forward-proxy traffic at all. Something
> along these lines:
> 
> ...
> + receive traffic from clients sending the PROXY protocol header.
> + HTTP traffic without the PROXY header is not accepted on such a port.
> 

Done.

>> +/// parse the PROXY/2.0 protocol header from the connection read buffer
>> +bool
>> +ConnStateData::parseProxy20()
> 
> Consider using parseProxy2p0() and similar names in case the PROXY
> protocol goes to version ten :-)
> 

Shrug. Done. The '0' is an addition by me to match standard protocol
version ABNF.

> 
>> On 15/07/2014 4:25 a.m., Alex Rousskov wrote:
>>> On 07/12/2014 10:45 PM, Amos Jeffries wrote:
>>>>>> -NAME: follow_x_forwarded_for
>>>>>> +NAME: proxy_forwarded_access follow_x_forwarded_for
> 
>>>>> The new name sounds worse than the old one. Hopefully this can be left
>>>>> as is or renamed to something better after the "proxy-surrogate" issue
>>>>> is resolved.
> 
>>>> Is "forwarded_access" obectionable ?
> 
>>> IMO, it is misleading/awkward. Follow_x_forwarded_for was pretty good.
>>> We can use something like follow_forwarded_client_info or
>>> trust_relayed_client_info, but let's wait for the primary naming issue
>>> to be resolved first. That resolution might help us here.
> 
> 
> Any news on the renaming front? Does it make sense for us to wait for a
> better protocol name or is it hopeless?

Still might happen at RFC creation time, but that is not in sight. So
hopeless to wait.

So, I've branched that discussion in the other half of this thread.

> 
>>> I see two correct ways to address the https_port problem:
>>>
>>> 1. Refuse PROXY support on https_port for now. As any support
>>> limitation, this is unfortunate, but I think it should be accepted. The
>>> configuration code should be adjusted to reject Squid configurations
>>> that use proxy-surrogate with an https_port.
> 
>> Doing this one. I have no desire for re-writing all the SSL negotiation
>> logics right now.
> 
>> +debugs(3,DBG_CRITICAL, "FATAL: https_port: proxy-surrogate 
>> option cannot be used on HTTPS ports.");
> 
> Please s/cannot be used/is not supported/ to emphasize that we just lack
> the code necessary to make it possible (i.e., this is not some kind of
> fundamenta

Renaming to Support PROXY protocol

2014-07-30 Thread Amos Jeffries
Renaming the sub-thread to keep discussion going on this aspect.

On 26/07/2014 8:57 p.m., Amos Jeffries wrote:
> On 22/06/2014 5:15 p.m., Amos Jeffries wrote:
>> Support receiving PROXY protocol version 1 and 2.
>>
>> PROXY protocol has been developed by Willy Tarreau of HAProxy for
>> communicating original src and dst IP:port details between proxies and
>> load balancers in a protocol-agnostic way.
>>
>> stunnel, HAProxy and some other HTTP proxying software are already
>> enabled and by adding support to Squid we can effectively chain these
>> proxies without having to rely on X-Forwarded-For headers.
>>
>> This patch adds http(s)_port mode flag (proxy-surrogate) to signal the
>> protocol is in use, parsing and processing logics for the PROXY protocol
>> headers on new connections, and extends the follow_x_forwarded_for
>> (renamed proxy_forwarded_access) access control to manage inbound
>> connections.
>>  The indirect client security/trust model remains unchanged. As do all
>> HTTP related logics on the connection once PROXY protocol header has
>> been received.
>>
>>
>> Future Work:
>>  * support sending PROXY protocol to cache_peers
>>  * rework the PROXY parse logics as a Parser-NG child parser.
>>
>> Amos
>>
> 
> 
> So, on the table is the question of the http_port option name (and,
> derived from that, the *_access control name).
> 
> The contenders so far:
> 
>   proxy
>   surrogate [1]
>   proxy-surrogate [1]
>   require-PROXY
>   expect-PROXY [2]
>   require-PROXY-header
>   expect-PROXY-header [2]
>   forwarded [3]
>   proxy-forwarded [3]
>   haproxy-protocol [4]
>   indirect-client
> 
> 
> [1] potential naming confusion with Surrogate protocol HTTP extension.
> And Alex objects that it means "nothing" in this squid context.
> 
> [2] potential naming confusion with "explicit proxy" terminology
> 
> [3] potential naming confusion with "forward proxy" terminology
> 
> [4] free advertising for the competition
> 
> At this stage it looks like Alexs' "require-proxy-header" is front
> runner for relevance. Probably with "indirect_client" for the access
> control.
> 
> Does anyone else have option names or even just words to throw into the mix?
> 
> Amos
> 



Re: [PATCH] StoreID via eCAP and ICAP

2014-07-29 Thread Amos Jeffries
On 30/07/2014 1:09 a.m., Eliezer Croitoru wrote:
> Any examples of a header description would help to understand what is
> needed for IANA registration.

Any of the documents linked from the registry can be used as an example.

It would be worth checking the ICAP protocol spec for anything about new
header requirements or recommendations.

I know HTTP, mail and netnews have some messaging details that need to
be considered and mentioned for new headers. But this is an ICAP header
so things are different.


PS. though I am happy with Archived-At if you want to avoid work.

Amos



Re: [PATCH] StoreID via eCAP and ICAP

2014-07-29 Thread Amos Jeffries
On 28/06/2014 11:54 a.m., Alex Rousskov wrote:
> Hello,
> 
> The attached patch allows eCAP and ICAP services to set StoreID
> values, just like many existing store_id_program helpers do. To avoid
> surprises from rogue services, the optional behavior must be enabled in
> squid.conf by adding "store-id=1" to the e/icap_service directive.
> Adaptation services are executed before the StoreID helpers but they may
> co-exist. The latest StoreID value wins.
> 
> The patch also merged eCAP and ICAP code that updates adaptation
> history, resolving an old TODO.
> 
> Disclaimer: After these changes, options returned by broken eCAP
> adapters that implemented adapter::Xaction::option() without also
> implementing adapter::Xaction::visitEachOption() will stop working.
> Similarly, options returned by broken eCAP adapters that implemented
> visitEachOption() without also implementing option() may start working.
> I do not think this is a big deal because the loss of backward
> compatibility is limited to broken adapters that would have to be
> rebuilt for newer Squid/libecap anyway.
> 

Audit:


in src/adaptation/History.h:
* store_id member is not following the camelCase member naming for this
class.


in src/cf.data.pre:
* please omit the 0|1 values from the documentation. on|off is
sufficient for new option parameters and allows non-numeric
implementation in future.


in src/client_side_request.cc:
* s/emtpy/empty/


+1. The above can be done and the patch applied without another audit
review IMO.


> 
> I used "Store-ID" for the name of the meta header (an ICAP response
> header or an eCAP option name; not an HTTP header!). Other alternatives
> considered but rejected:
> 
> Store-Id: Most registered MIME headers use -ID, not -Id.
> http://www.iana.org/assignments/message-headers/message-headers.xhtml
> 
> X-Store-ID: Trying to avoid adding X-Store-ID now and then renaming it
> to Store-ID ten years later, while adding more code for backward
> compatibility reasons.

Use of X- prefix for experimental headers is officially frowned upon by
IETF and IANA for the same reasons. I agree with not going there.


> The Store-ID header is not registered. I am not
> going to write an Internet Draft to describe it, but somebody could try
> to submit a _provisional_ Store-ID IANA registration that does not
> require a Draft or an RFC AFAICT. However, the header would still need a
> description on the web somewhere. Perhaps somebody can volunteer to take
> a lead on that?
> 
> Archived-At: This is a registered header that is almost perfect for our
> needs. I was worried that other developers would object to using a
> completely different name compared to StoreID helpers (and the feature
> name itself).

I'm not worried about the naming. The semantics fit. If you want to use
this it makes a lot of sense and saves all the work of documenting an
identical header.
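For illustration, a hypothetical ICAP response fragment from a service using this feature (enabled with store-id=1 on the icap_service line) might look like the following, with the header name still subject to the naming discussion above:

    ICAP/1.0 200 OK
    Store-ID: http://example.com/canonical/object
    Encapsulated: req-hdr=0, null-body=170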

Amos


Re: Squid 3.5 release timetable

2014-07-28 Thread Amos Jeffries
On 29/07/2014 6:22 a.m., Alex Rousskov wrote:
> On 07/24/2014 10:37 PM, Amos Jeffries wrote:
> 
>> If any of you have patches lined up for commit or
>> about to be, please reply to this mail with details so we can triage
>> what gets in and what can hold off in audit.
> 
> 
> Here are some of the bigger items on the Factory plate:
> 
> * Native FTP support (works great; still working on class renaming)

Can you get that to audit this week?


> * Peek and Splice (now with SNI; probably ready but may need polishing)
> * StoreID via eCAP/ICAP (waiting for audit or Store-ID naming comments)
> * on-disk header updates or Bug #7 (probably more than two weeks out)

Bug #7 is a squid-2.7 parity fix, so a candidate for backporting.

> * Store API polishing (no ETA; depends on bug #7)
> * post-cache REQMOD support (may be available late August 2014)
> 
> 
> Judging by the trunk items listed on the RoadMap wiki page, the only
> major v3.5 improvement already available is Large Rock, so getting at
> least the first two items in would be necessary to justify the
> maintenance/porting overheads of a new Squid version IMO.

Thank you for that list.


FYI there are several goal metrics informing the decision to branch:

* major user visible features/changes
 - Going by user visibility are: Large Rock, Collapsed Forwarding, and
helper *_extras. Possibly also eCAP 1.0.
 - This has been the main criterion previously; however, after board
discussions I am convinced to give this lower priority, and I am no
longer waiting on a particular feature-count threshold. What we have now
seems good enough combined with the criteria below.

* annual stable cycle, (aka. ensuring sponsors wait no more than 1 year
for their features to get into a stable production release)
 - initial 3.5 features are almost at 10 months now, even if they are minor.

* significant LOC change between stable and beta
  - just over 33% of squid right now (91K/271K LOC).

* bugs and general stability
  - we have a record low number in both 3.HEAD and 3.4 to work on right now.

Amos


Re: [PATCH] Support PROXY protocol

2014-07-26 Thread Amos Jeffries
On 22/06/2014 5:15 p.m., Amos Jeffries wrote:
> Support receiving PROXY protocol version 1 and 2.
> 
> PROXY protocol has been developed by Willy Tarreau of HAProxy for
> communicating original src and dst IP:port details between proxies and
> load balancers in a protocol-agnostic way.
> 
> stunnel, HAProxy and some other HTTP proxying software are already
> enabled and by adding support to Squid we can effectively chain these
> proxies without having to rely on X-Forwarded-For headers.
> 
> This patch adds http(s)_port mode flag (proxy-surrogate) to signal the
> protocol is in use, parsing and processing logics for the PROXY protocol
> headers on new connections, and extends the follow_x_forwarded_for
> (renamed proxy_forwarded_access) access control to manage inbound
> connections.
>  The indirect client security/trust model remains unchanged. As do all
> HTTP related logics on the connection once PROXY protocol header has
> been received.
> 
> 
> Future Work:
>  * support sending PROXY protocol to cache_peers
>  * rework the PROXY parse logics as a Parser-NG child parser.
> 
> Amos
> 


So, on the table is the question of the http_port option name (and,
derived from that, the *_access control name).

The contenders so far:

  proxy
  surrogate [1]
  proxy-surrogate [1]
  require-PROXY
  expect-PROXY [2]
  require-PROXY-header
  expect-PROXY-header [2]
  forwarded [3]
  proxy-forwarded [3]
  haproxy-protocol [4]
  indirect-client


[1] potential naming confusion with Surrogate protocol HTTP extension.
And Alex objects that it means "nothing" in this squid context.

[2] potential naming confusion with "explicit proxy" terminology

[3] potential naming confusion with "forward proxy" terminology

[4] free advertising for the competition

At this stage it looks like Alexs' "require-proxy-header" is front
runner for relevance. Probably with "indirect_client" for the access
control.

Does anyone else have option names or even just words to throw into the mix?

Amos


Re: [PATCH] Support PROXY protocol

2014-07-25 Thread Amos Jeffries
On 15/07/2014 4:25 a.m., Alex Rousskov wrote:
> On 07/12/2014 10:45 PM, Amos Jeffries wrote:
> 
>> +bool
>> +ConnStateData::findProxyProtocolMagic()
>> +{
>> +// http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt
>> +
>> +// detect and parse PROXY protocol version 1 header
>> +if (in.buf.length() > Proxy10magic.length() && 
>> in.buf.startsWith(Proxy10magic)) {
>> + return parseProxy10();
>> +
>> +// detect and parse PROXY protocol version 2 header
>> +} else if (in.buf.length() > Proxy20magic.length() && 
>> in.buf.startsWith(Proxy20magic)) {
>> +return parseProxy20();
>> +
>> +// detect and terminate other protocols
>> +} else if (in.buf.length() >= Proxy20magic.length()) {
>> +// input other than the PROXY header is a protocol error
>> +return proxyProtocolError("PROXY protocol error: invalid header");
>> +}
>> +
> 
> I know you disagree, but the above looks much clearer than the earlier
> code to me. Thank you.
> 
> The "// detect and ..." comments are pretty much not needed now because
> the code is self-documenting! The "else"s are also not needed. Your call
> on whether to remove that redundancy.
> 
> Consider adding
> 
>   // TODO: detect short non-magic prefixes earlier to avoid
>   // waiting for more data which may never come
> 
> but this is not a big deal because most non-malicious clients will not
> send valid non-PROXY requests that are 12 bytes or less, I guess.
> 

Added.

> 
>> +// XXX: should do this in start(), but SSL/TLS operations begin before 
>> start() is called
> 
> I agree that this should be done in start(). Fortunately, the code this
> XXX refers to does not rely on job protection (AFAICT) so it is not a
> big deal. The XXX is really about the SSL code that makes the move
> difficult.
> 

Where should socket MTU discovery be configured if not at the point
before where the connection actually starts being used?

The constructor where this code block exists is well after accept(), but
also well before socket usage (other than TLS).

> 
>>>> -NAME: follow_x_forwarded_for
>>>> +NAME: proxy_forwarded_access follow_x_forwarded_for
> 
>>> The new name sounds worse than the old one. Hopefully this can be left
>>> as is or renamed to something better after the "proxy-surrogate" issue
>>> is resolved.
> 
>> Is "forwarded_access" obectionable ?
> 
> IMO, it is misleading/awkward. Follow_x_forwarded_for was pretty good.
> We can use something like follow_forwarded_client_info or
> trust_relayed_client_info, but let's wait for the primary naming issue
> to be resolved first. That resolution might help us here.
> 
> 
>>>> +bool
>>>> +ConnStateData::parseProxyProtocolMagic()
>>>
>>> This appears to parse a lot more than just the magic characters. Please
>>> rename to parseProxyProtocolHeader() or similar.
> 
>> The entire PROXY protocol is "magic" connection header. 
> 
> Not really. In this context, "magic" is, roughly, a "rare" string
> constant typically used as a prefix to detect/identify the following
> structure or message. The PROXY protocol header contains both "magical"
> and "regular" parts.
> 
> 
>> +/**
>> + * Test the connection read buffer for PROXY protocol header.
>> + * Version 1 and 2 header currently supported.
>> + */
>> +bool
>> +ConnStateData::findProxyProtocolMagic()
> 
> This method does not just "test". It actually parses. IMO,
> findProxyProtocolMagic() should be called parseProxyProtocolHeader().
> 
> No, it does not matter that the actual non-magic parsing code is in
> other parsing methods. The method description and name should reflect
> what the method does from the caller point of view. The method internals
> are not that important when you are naming and describing the interface.
> 

Okay, okay. Done.

> 
>>>> +needProxyProtocolHeader_ = xact->squidPort->flags.proxySurrogate;
>>>> +if (needProxyProtocolHeader_)
>>>> +proxyProtocolValidateClient(); // will close the connection on 
>>>> failure
>>>
>>> Please do not place things that require job protection in a class
>>> constructor. Calling things like stopSending() and stopReceiving() does
>>> not (or will not) work well when we are just constructing a
>>> ConnStateData object. Nobody may notice (at the right time) that you
>>> &

Squid 3.5 release timetable

2014-07-24 Thread Amos Jeffries
As luck would have it, today is exactly 1 year since the first patch was
held in trunk for the 3.5 series release.

Below is my current plan. Any objections please speak up.


1) Branching:

I hope this can be done next weekend. August 1-3, maybe the week after
if there are delays.

We have enough features to make it useful even though some of the larger
projects have not made it in.

However, to minimize work in stage 2, trunk needs to be relatively stable
before this happens. If any of you have patches lined up for commit or
about to be, please reply to this mail with details so we can triage
what gets in and what can hold off in audit.

Note that patches applied after branching may still get to 3.5, but will
have to be stable in trunk first.

Patches that are welcome any time:
 - documentation updates
 - security bug fixes


2) Documentation and stability testing

After branching we need to do as much testing as we can throw at the new
branch and update any missing documentation.

Most of the documentation is already done. So it's just a scan through to
check for missing or incorrect bits.


3) 3.5.0.1

I am hoping this can be done by the end of August. Will happen whenever
step-2 is completed.


Stable) as usual depends on bugs

 2 bugs explicitly on 3.HEAD within the 3.5 development period are
blocking stable release until fixed or determined non-critical.

 12 major or higher bugs in 3.4 seem worthy of blocking 3.5 stable for a
bit to get resolved.

 We have 60 other major bugs outstanding across all Squid versions that
should be resolved ASAP if at all possible.




Projects I am aware of that are potentially coming up for commit over
this period:

* Kerberos autoconf updates
  - assuming the current rewrite patch

* PROXY protocol
  - assuming the current partial support patch

* Peek-n-Splice
  - assuming it is relatively isolated changes and well tested already

* StoreID via eCAP/ICAP
  - I'm not sure what this is waiting on.


Amos


Re: [RFC] bandwidth savings via header eliding

2014-07-18 Thread Amos Jeffries
On 19/07/2014 2:55 a.m., Alex Rousskov wrote:
> On 07/18/2014 01:32 AM, Amos Jeffries wrote:
>> Some of the statistics being brought up in the IETF HTTP/2 discussions
>> are highlighting certain garbage headers which are unfortunately quite
>> common.
> 
> I join Eliezer in begging for pointers to relevant posts or pages.
> 
> 
>> I have wondered about creating a registry of known garbage and simply
>> dropping those headers on arrival in the parser. This would be in
>> addition to the header registry lookup and masking process we have for
>> hop-by-hop headers.
>>
>> Any other thoughts on this?
> 
> We already have squid.conf options to drop headers. Folks that want to
> focus on saving bandwidth may use them. We can publish the corresponding
> configuration excerpts on the wiki.
> 
> If those options are not enough, let's add more. If those options slow
> Squid down too much, let's discuss optimizations (keeping in mind that
> much better optimizations can probably be obtained by preserving header
> blobs during forwarding).
> 
> However, please do not hard-code policing of messages Squid can grok,
> especially in the parser.


See my post in reply to Eliezer. The general garbage ones we could leave
to the admin. But the Connection: and Content-Length header mangling, and
some of the other security bypasses, have deeper implications and special
processing may be needed to clean up properly. ie drop a cneonction:
header and also drop any headers it lists just to be safe, or reject
requests with a cteonnt-length: header in self defense.
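For the general garbage, the existing squid.conf controls Alex mentions can already do the dropping; a hedged sketch:

    # strip advertising/debug noise from server replies
    reply_header_access X-Powered-By deny all
    reply_header_access X-AspNet-Version deny all
    reply_header_access X-Cache deny all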

Amos


Re: [RFC] bandwidth savings via header eliding

2014-07-18 Thread Amos Jeffries
On 19/07/2014 2:07 a.m., Eliezer Croitoru wrote:
> This caught my eye, but I am not reading all IETF httpbis mails and I
> would like to get a reference for this thread please?
> 

There are two types of removable headers:
 a) headers which exist purely to bypass security
 b) headers which exist due to intermediaries breaking them

The post describing why the (b) group occur is here:
  http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/0132.html

One of the posts which is making me think we could benefit from doing
something is:
  http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1220.html

This lists the existing headers found in the data sets being analysed by
the IETF as representative of HTTP web traffic, ie what the HTTP/2
compression improvements are measured against.

What I can see in that listing are the following headers (by type, above).

(A) group:

 x-powered-by / x-aspnet-version / x-aspnetmvc-version / x-pb-mii -
exist to bypass server security measures applied on the Server: header.

 x-served-by - same as X-powered-by but also crossing over to contain
X-forwarded-for: and Forwarded: header contents (but without the
security protections applied for them).

 x-host / x-forwarded-host - exist to bypass Browser same-origin
security measures.

 x-li-uuid - tracking cookie created to bypass Cookie header security
and legislative restrictions.

 x-fs-uuid - header for distributing the UUID of the server hard drive
out to the public network (seriously, what could go wrong with that huh?)

 x-radid - seems to be another disk drive tracking ID method.


The (b) group worries me for the reasons given below:

 nncoection / cneonction / x-cnection - reason described in the above
email. I am a little bit worried that in HTTP/1.1 these may have
actually contained lists of headers which were to be dropped by the
earlier intermediary. By obscuring the "Connection:" name we are
potentially transmitting headers like Upgrade:, or headers with private
details that should be elided.

 ntcoent-length / cteonnt-length - Given that the point of the 16-bit
rotate on the header name is to leave checksums unchanged, any of the
mandatory HTTP/1.1->1.0 and Connection:close additions required to make
this safe will alter the checksum anyway. So will content adaptation, if
that was the point.
  I am left assuming that this is done to smuggle messages in a
pipeline through the receiving server as a single request/reply.



There are also a bunch of other headers which can best be called
"garbage". Relatively harmless though.

Old HTTP features and mechanisms which are now not supposed to be sent:

 pragma:close - dead HTTP/1.0 feature. Not to be emitted by HTTP/1.1
software.
 p3p - dead standard, removed from service due to privacy violations.
 x-pad  - supposedly an HTTPS-only feature for "fixing"
 proxy-connection - dead non-standard. We already drop this one.


debug headers that are mostly useless (we could help clean this up by
only enabling our x-cache headers based on a debug config option)

 x-cache / x-cache-lookup / x-cache-action / x-cache-hits / x-cache-age
/ x-fb-debug / x-mii-cache-hit / bk-server


Amos

> Thanks,
> Eliezer
> 
> On 07/18/2014 10:32 AM, Amos Jeffries wrote:
>> Some of the statistics being brought up in the IETF HTTP/2 discussions
>> are highlighting certain garbage headers which are unfortunately quite
>> common.
>>
>> I have wondered about creating a registry of known garbage and simply
>> dropping those headers on arrival in the parser. This would be in
>> addition to the header registry lookup and masking process we have for
>> hop-by-hop headers.
>>
>> Any other thoughts on this?
>>
>> Amos
> 



[RFC] remove JobCallback macro

2014-07-18 Thread Amos Jeffries
I have once again been foiled by brokenness in the JobCallback macro
provided apparently for convenience when generating Job-based AsyncCalls.

This time I seem to have found the problem though. It stems from
JobCallback requiring a Dialer+Job+method - with zero parameter objects
accepted. If one attempts to use it with UnaryMemFunT<> the compile
breaks with some fairly cryptic errors.

The resolution to this is to use the asyncCall() template function
directly which JobCallback was supposedly making easier. However, the
asyncCall is not particularly hard to use in the first place and has
lots of code examples to look at now.
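A hedged sketch of that direct asyncCall() style, with a one-argument dialer that JobCallback cannot express (the job class, method, and debug section/level are hypothetical):

    typedef UnaryMemFunT<MyJob, int> MyDialer;   // dialer carrying one parameter
    AsyncCall::Pointer call = asyncCall(33, 5, "MyJob::doStep",
                                        MyDialer(myJob, &MyJob::doStep, 42));
    ScheduleCallHere(call);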

Given that JobCallback() is only usable (and used) by the
NullaryMemFunT<>, and that the CommCalls parameter hack/workaround means
the parameters variable need not be provided to the dialer constructor,
I believe this particular macro provides almost no benefit outside of
CommCalls, so we should do one of the following:

 a) move it to the CommCalls.h header and rename it specific to that API
to avoid others hitting the same confusion in future.

 b) remove JobCallback macro completely in favour of asyncCall.

 c) replace existing JobCallback with overloaded inline template
equivalents to asyncCall() so that a UnaryMemFunT parameter can be provided.

I favour b or c.
 * b makes the code cleaner and more consistent regarding Call creation.
 * c offers the chance to hide lots of Unary/Nullary-MemFunT definitions
inside the inline function via overloading.

Opinions?

Amos


Re: [PATCH] validate -n parameter value

2014-07-18 Thread Amos Jeffries
On 16/07/2014 10:53 a.m., Alex Rousskov wrote:
> On 07/14/2014 07:02 AM, Amos Jeffries wrote:
>> On 23/06/2014 12:35 a.m., Amos Jeffries wrote:
>>> On 17/06/2014 4:06 a.m., Alex Rousskov wrote:
>>>> On 06/15/2014 12:05 AM, Amos Jeffries wrote:
>>>>> +if (optarg) {
>>>>> +SBuf t(optarg);
>>>>> +::Parser::Tokenizer tok(t);
>>>>
>> Please remove "t" as used-once and lacking a descriptive name. If you
>>>> insist on keeping it, please make it "const" and try to give it a more
>>>> descriptive name such as rawName or serviceName.
> 
> 
>> No can do on removal. Without it we get: 
>> error: request for member 'prefix' in 'tok', which is of non-class type
>> 'Parser::Tokenizer(SBuf)'
> 
> 
> Yeah, we appear to hit some hairy C++ concept here that I cannot fully
> reconstruct. I could not find an explanation by quick googling (all
> results point to trivial cases of declaring a function instead of
> calling the default constructor).
> 
> The following works for me:
> 
>   ::Parser::Tokenizer tok(SBuf(optarg, SBuf::npos));
> 
> However, at this point, it is not clear whether the extra "t" temporary
> is actually worse than the extra SBuf argument! If you keep "t", a more
> descriptive name for t would be nice but obviously not required.
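For reference, the quoted error ("of non-class type 'Parser::Tokenizer(SBuf)'", a function type) matches C++'s "most vexing parse": the one-argument form is read as a function declaration, which would explain why the extra SBuf argument helps. A sketch:

    ::Parser::Tokenizer tok(SBuf(optarg));              // declares a function named tok!
    ::Parser::Tokenizer tok2(SBuf(optarg, SBuf::npos)); // two arguments cannot form a declarator
    ::Parser::Tokenizer tok3((SBuf(optarg)));           // extra parentheses force an expression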
> 
> 
>>>>> +if (!tok.prefix(service_name, chr) || !tok.atEnd())
>>>>
>>>> Please note that the above also rejects empty service names (-n "").
>>>> That may be good, but the error messages do not reflect that restriction.
>>>>
>>
>> Moved this to an explicit check on *optarg content so it is obviously
>> intentional and gets the right "missing service name" error.
> 
> The adjusted if condition in the committed code appears to dereference a
> nil pointer when the service name is missing:
> 
>> if (optarg || *optarg == '\0') {
> 
> The code actually does not get that far because getopt() does not allow
> a missing service name (at least in my tests), but the condition should
> be fixed.

Ouch. Fixed now.
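For reference, a sketch of the corrected check assembled from the fragments in this thread (the fatalf() messages are hypothetical):

    if (!optarg || *optarg == '\0')  // note the '!': reject a missing or empty name
        fatalf("FATAL: missing service name for -n option\n");
    const SBuf rawName(optarg);
    ::Parser::Tokenizer tok(rawName);
    const CharacterSet chr = CharacterSet::ALPHA + CharacterSet::DIGIT;
    SBuf service_name;
    if (!tok.prefix(service_name, chr) || !tok.atEnd())
        fatalf("FATAL: invalid service name for -n option\n");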


Amos


[RFC] bandwidth savings via header eliding

2014-07-18 Thread Amos Jeffries
Some of the statistics being brought up in the IETF HTTP/2 discussions
are highlighting certain garbage headers which are unfortunately quite
common.

I have wondered about creating a registry of known garbage and simply
dropping those headers on arrival in the parser. This would be in
addition to the header registry lookup and masking process we have for
hop-by-hop headers.

Any other thoughts on this?

Amos


Re: TOS values

2014-07-17 Thread Amos Jeffries
On 17/07/2014 10:25 p.m., Tsantilas Christos wrote:
> On 07/17/2014 02:51 AM, Amos Jeffries wrote:
>> On 17/07/2014 8:01 a.m., Alex Rousskov wrote:
>>> On 07/16/2014 11:39 AM, Tsantilas Christos wrote:
>>>> Hi all,
>>>>   Squid currently does not make a check for the TOS values used in the
>>>> squid configuration file. Squid will accept 8-bit numbers as TOS
>>>> values, however:
>>>>   1) TOS values with the 1st and 2nd bits set cannot be used. These
>>>> bits are used by ECN. On Linux, if someone tries to set 0x23 as the TOS
>>>> value (which sets the 1st and 2nd bits), 0x20 will be used instead,
>>>> without any warning for the user.
>>>>
>>>>   2) Some of the TOS values are already reserved, for example those
>>>> which are reserved by RFC 2597.
>>>>
>>>> The above may confuse squid users.
>>>>
>>>> Maybe it is not a bad idea to:
>>>>   - Warn users when they try to use a TOS value which uses the ECN bits
>>>>   - Warn users when they try to use TOS values which are not reserved;
>>>> the user will know that such a value is free for use.
>>>>
>>>> Opinions?
>>>
>>>
>>> This is not my area of expertise, but
>>>
>>> * the first proposed warning sounds good to me, and
>>>
>>> * it is not clear to me whether Squid should avoid using ToS values from
>>> RFC 2597. It feels like Squid could, in some cases, _set_ those ToS
>>> values to use RFC 2597 features provided by its network.
>>>
>>
>> For now Squid still follows RFC 2474 and has the documented comment
>> about ECN problems for somewhat loose RFC 3168 (ECN) support.
>>
>> RFC 3260 section 4 "Definition of the DS Field" explicitly obsoletes the
>> names IPv4 "TOS" and IPv6 "TCLASS". They are both now defined as a 6-bit
>> "DS" value followed by separate ECN bits.
>>
>>
>> IMO, we should update Squid to RFC3260 support - ie mask out the ECN
>> bits and prevent configuring them.  Like so:
>>   1) replace all config details named "TOS" with "DS" ones that only
>> take a hex bytecode so that,
>>   2) DS values always be masked with 0xFC and,
> 
> At least for Linux this is done by the OS. If you try to use 0x23 as the
> TOS value, 0x20 will be used instead.
> The problem is that this is done silently, without any warning to the user.

Yes, but not reliably for other OSes. So we may as well do it ourselves
and get it right in regards to that warning.

> 
>>   3) when TOS named options are found display an upgrade warning and mask
>> out the ECN bits.
> 
> What do you mean by "TOS named options"? Are they the AF1x, AF2x
> referred to in RFC 2597?

No, I mean our squid.conf directive names, parameters, etc.

> 
>>
>> PS. need to add RFC3260 to the doc/rfcs/1-index.txt listing after this.
>>
>> Amos
>>
> 



Re: [PATCH] Comm::TcpAcceptor::doAccept fd limit handling is broken

2014-07-16 Thread Amos Jeffries
On 17/07/2014 11:10 a.m., Alex Rousskov wrote:
> On 07/16/2014 02:38 PM, Rainer Weikusat wrote:
>> Alex Rousskov  writes:
>>> On 07/16/2014 11:11 AM, Rainer Weikusat wrote:
 This is broken because it re-enables readable notifications even if no
 connection was accepted. 
> 
>>> In other words, it re-enables a check for new connections when
>>>
>>> * we know that we currently cannot accept any new connections,
>>> * Squid is under stress already, and
>>> * our code does not handle multiple deferences of the same TcpAcceptor
>>> object well.
> 
> 
>> There's an important 4th point: The check is re-enabled when it is known
>> that it will immediately return true again because the connection which
>> was not accepted is still pending (hence, the socket is still readable),
>> more details below.
> 
> Agreed!
> 
> 
>> I changed the TcpAcceptor/AcceptLimiter code to act as if
>> only a single file descriptor was available for client connections and
>> created two connections via netcat. What it should do in this case is to
>> ignore the second connection until the first one was closed. Instead, it
>> went into an endless 'defer this connection' loop, as mentioned above,
>> because the pending 2nd connection meant the socket remained readable.
> 
> OK, I think I now better understand what you meant by "one connection at
> the same time". And even if I do not, we seem to agree on the fix to
> solve the problem, which is even more important :-).
> 
> 
>> updated patch:
>>
>> --- mad-squid/src/comm/TcpAcceptor.cc   8 Jan 2013 17:36:44 -0000   1.1.1.2
>> +++ mad-squid/src/comm/TcpAcceptor.cc   16 Jul 2014 20:36:35 -0000
>> @@ -204,7 +204,6 @@
>>  } else {
>>  afd->acceptNext();
>>  }
>> -SetSelect(fd, COMM_SELECT_READ, Comm::TcpAcceptor::doAccept, afd, 
>> 0);
>>  
>>  } catch (const std::exception &e) {
>>  fatalf("FATAL: error while accepting new client connection: %s\n", 
>> e.what());
>> @@ -262,6 +261,8 @@
>>  debugs(5, 5, HERE << "Listener: " << conn <<
>> " accepted new connection " << newConnDetails <<
>> " handler Subscription: " << theCallSub);
>> +
>> +if (!isLimited) SetSelect(conn->fd, COMM_SELECT_READ, 
>> Comm::TcpAcceptor::doAccept, this, 0);
>>  notify(flag, newConnDetails);
>>  }
> 
> 
> I am happy to commit the above (after splitting the new if-statement
> into two lines), but since this is not an emergency, I would prefer to
> wait a little for any objections or second opinions.

Looks like a reasonable temporary solution to me for the stable
releases. I have no problems with it going in if we continue to work on
the queue fixes for trunk.

Regarding that queue I think it can be cleaned up once and for all by
making doAccept() a proper Call and using the new comm/Read.h API. The
queue becomes a std::queue and the nasty
removeDead() drops in favour of a TcpAcceptor internal Call::cancel().
No more Job raw pointers!
 NP that this change will not fix the stable releases due to missing API.
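A rough sketch of that queue shape (type and member names assumed from the discussion):

    #include <queue>

    // deferred acceptors held as cancellable AsyncCalls:
    std::queue<AsyncCall::Pointer> deferred;

    // a TcpAcceptor going away cancels its own queued call, so no
    // removeDead() scan over raw Job pointers is needed:
    if (queuedCall != NULL)
        queuedCall->cancel("Comm::TcpAcceptor gone");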

Amos


Re: TOS values

2014-07-16 Thread Amos Jeffries
On 17/07/2014 8:01 a.m., Alex Rousskov wrote:
> On 07/16/2014 11:39 AM, Tsantilas Christos wrote:
>> Hi all,
>>  Squid currently does not make a check for the TOS values used in the squid
>> configuration file. Squid will accept 8-bit numbers as TOS values, however:
>>  1) TOS values with the 1st and 2nd bits set cannot be used. These bits are
>> used by ECN. On Linux, if someone tries to set 0x23 as the TOS value
>> (which sets the 1st and 2nd bits), 0x20 will be used instead, without
>> any warning for the user.
>>
>>  2) Some of the TOS values are already reserved, for example those which
>> are reserved by RFC 2597.
>>
>>  The above may confuse squid users.
>>
>>  Maybe it is not a bad idea to:
>>  - Warn users when they try to use a TOS value which uses the ECN bits
>>  - Warn users when they try to use TOS values which are not reserved;
>> the user will know that such a value is free for use.
>>
>> Opinions?
> 
> 
> This is not my area of expertise, but
> 
> * the first proposed warning sounds good to me, and
> 
> * it is not clear to me whether Squid should avoid using ToS values from
> RFC 2597. It feels like Squid could, in some cases, _set_ those ToS
> values to use RFC 2597 features provided by its network.
> 

For now Squid still follows RFC 2474 and has the documented comment
about ECN problems for somewhat loose RFC 3168 (ECN) support.

RFC 3260 section 4 "Definition of the DS Field" explicitly obsoletes the
names IPv4 "TOS" and IPv6 "TCLASS". They are both now defined as a 6-bit
"DS" value followed by separate ECN bits.


IMO, we should update Squid to RFC3260 support - ie mask out the ECN
bits and prevent configuring them.  Like so:
 1) replace all config details named "TOS" with "DS" ones that only
take a hex bytecode so that,
 2) DS values always be masked with 0xFC and,
 3) when TOS named options are found display an upgrade warning and mask
out the ECN bits.
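A sketch of the masking step being proposed (the input byte is an assumed admin-supplied value; the debugs() details are illustrative):

    const unsigned char configured = 0x23;      // example value from squid.conf
    const unsigned char ds = configured & 0xFC; // 0x20: keep only the 6-bit DS field
    if (ds != configured)
        debugs(3, DBG_IMPORTANT, "WARNING: ECN bits stripped from configured TOS/DS value");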

PS. need to add RFC3260 to the doc/rfcs/1-index.txt listing after this.

Amos


Re: r13497: Converts the PortCfgPointer to reference counted

2014-07-16 Thread Amos Jeffries
On 16/07/2014 9:31 a.m., Alex Rousskov wrote:
> On 07/14/2014 11:59 PM, Amos Jeffries wrote:
>> On 15/07/2014 5:02 a.m., Alex Rousskov wrote:
>>> On 07/14/2014 03:48 AM, Amos Jeffries wrote:
>>>> -if (!s.valid()) {
>>>> +if (!s) {
>>>>  // it is possible the call or accept() was still queued when the 
>>>> port was reconfigured
>>>>  debugs(33, 2, "HTTP accept failure: port reconfigured.");
>>>>  return;
>>>
>>> While the port pointer could get invalidated in the [fixed] old code, it
>>> cannot become nil in the new code. The above comment is no longer valid
>>> and the code itself no longer works as intended. Please fix both
>>> instances of the above logic: one in httpAccept() and one in httpsAccept().
>>>
>>> AFAICT, if you simply replace the above code with a comment warning that
>>> the port may no longer be in the current configuration, we will continue
>>> to start transactions previously accepted on no longer configured ports.
>>> Hopefully, this will not cause any new problems, but there may be a
>>> better solution.
>>
>> Good catch. I think this is actually dead code now and can be removed
>> completely now that it is a ref-counted port. Do you agree on that?
> 
> Instead of simply removing the now-dead code, I recommend replacing it
> with a comment which warns us that the port may no longer be in the
> current Config. Preserving that knowledge may be important for triaging
> obscure cases. I do not know of any specific case where this caveat is
> important, but that does not mean there are no such cases.
> 
> Dead code is not a big deal, of course. Breaking trunk build when SSL is
> enabled is a bigger deal.
> 

Okay. Done that replacement for both ports.

Amos



Re: r13497: Converts the PortCfgPointer to reference counted

2014-07-14 Thread Amos Jeffries
On 15/07/2014 5:02 a.m., Alex Rousskov wrote:
> On 07/14/2014 03:48 AM, Amos Jeffries wrote:
>>   Converts the PortCfgPointer to reference counted
> 
> This change should have gone through an audit IMO, but I appreciate
> having an entire Sunday to react to the "will commit shortly" notice for
> a not yet posted patch.
> 

The patch was posted on 23rd June (3 weeks ago). The only changes since
then were library link dependencies on unit tests. Sorry that the
reminder turned out to be on a Sunday. I keep forgetting the weekend
overlap lag.

> 
>> -if (!s.valid()) {
>> +if (!s) {
>>  // it is possible the call or accept() was still queued when the 
>> port was reconfigured
>>  debugs(33, 2, "HTTP accept failure: port reconfigured.");
>>  return;
> 
> While the port pointer could get invalidated in the [fixed] old code, it
> cannot become nil in the new code. The above comment is no longer valid
> and the code itself no longer works as intended. Please fix both
> instances of the above logic: one in httpAccept() and one in httpsAccept().
> 
> AFAICT, if you simply replace the above code with a comment warning that
> the port may no longer be in the current configuration, we will continue
> to start transactions previously accepted on no longer configured ports.
> Hopefully, this will not cause any new problems, but there may be a
> better solution.

Good catch. I think this is actually dead code now and can be removed
completely now that it is a ref-counted port. Do you agree on that?

Amos



Re: [PATCH] validate -n parameter value

2014-07-14 Thread Amos Jeffries
On 23/06/2014 12:35 a.m., Amos Jeffries wrote:
> On 17/06/2014 4:06 a.m., Alex Rousskov wrote:
>> On 06/15/2014 12:05 AM, Amos Jeffries wrote:
>>> +if (optarg) {
>>> +SBuf t(optarg);
>>> +::Parser::Tokenizer tok(t);
>>
>> Please remove "t" as used-once and lacking a descriptive name. If you
>> insist on keeping it, please make it "const" and try to give it a more
>> descriptive name such as rawName or serviceName.
> 

No can do on removal. Without it we get:

error: request for member 'prefix' in 'tok', which is of non-class type
'Parser::Tokenizer(SBuf)'
 if (!tok.prefix(service_name, chr) || !tok.atEnd())
  ^
error: request for member 'atEnd' in 'tok', which is of non-class type
'Parser::Tokenizer(SBuf)'
 if (!tok.prefix(service_name, chr) || !tok.atEnd())
^

const seems to be fine though.
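
(For the record, that error is C++'s "most vexing parse": without the
named variable the declaration is read as a function prototype. A toy
reproduction with stand-in types, not the real SBuf/Tokenizer:)

struct SBuf {
    explicit SBuf(const char *) {}
};

struct Tokenizer {
    explicit Tokenizer(const SBuf &) {}
    bool atEnd() const { return true; }
};

int main(int, char **argv)
{
    char *optarg = argv[0];
    // Tokenizer tok(SBuf(optarg)); // most vexing parse: declares a function
    //                              // 'tok' taking a parameter 'optarg' of
    //                              // type SBuf, so tok.atEnd() cannot compile
    const SBuf t(optarg);           // the named variable removes the ambiguity
    Tokenizer tok(t);
    return tok.atEnd() ? 0 : 1;
}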

>>
>>> +const CharacterSet chr = 
>>> CharacterSet::ALPHA+CharacterSet::DIGIT;
>>
>> Is there a document stating that only those characters are valid? My
>> quick search resulted in [1] that seems to imply many other characters
>> are accepted. However, [2] lists more prohibited characters. Both seem
>> to allow "-" and "_" though. It would be good to provide a reference in
>> the code or commit message to explain why we are restricting its value.
>>
> 
> Arbitrary design choice for guaranteed portability. Other characters can
> be added if necessary, but most lack cross-platform usability in all
> the situations where the service name string is being used.
> 
> 
>> Do you want to validate the total service name length? [1] says the
>> limit is 256 characters.
>>
>> None of the above applies to -n used on Unix. Do we need to validate the
>> service name (whatever that is) there?
> 
> IMO portable consistency is better for this type of thing than being
> pedantically accepting of the OS-specific character sets.
> 
> Yes, length needs to be checked. Thank you.
> 
> Does an arbitrary 32 bytes seem acceptable since this is a prefix on the
> UDS sockets and paths whose final total still needs to fit within
> the 256 or so for some OS?

Taking absence of a response as a yes. Fixed.
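
For reference, the combined rules (alphanumeric-only, non-empty, at most
32 bytes) amount to something like this standalone sketch, written with
plain C++ in place of Squid's SBuf/Tokenizer:

#include <cctype>
#include <cstring>

// Sketch only: validate a -n service name per the rules discussed above.
static bool validServiceName(const char *name)
{
    if (!name || !*name)
        return false; // missing service name
    const std::size_t len = std::strlen(name);
    if (len > 32)
        return false; // the prefixed UDS socket paths must fit in ~256 bytes
    for (std::size_t i = 0; i < len; ++i) {
        if (!std::isalnum(static_cast<unsigned char>(name[i])))
            return false; // ALPHA+DIGIT only, for portability
    }
    return true;
}

int main(int argc, char **argv)
{
    return (argc > 1 && validServiceName(argv[1])) ? 0 : 1;
}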


>>
>>> +if (!tok.prefix(service_name, chr) || !tok.atEnd())
>>
>> Please note that the above also rejects empty service names (-n "").
>> That may be good, but the error messages do not reflect that restriction.
>>

Moved this to an explicit check on *optarg content so it is obviously
intentional and gets the right "missing service name" error.

>>
>>> +fatal("Need to provide alphanumeric service name when 
>>> using -n option");
>>> +opt_signal_service = true;
>>> +} else {
>>> +fatal("Need to provide alphanumeric service name when 
>>> using -n option");
>>> +}
>>
>> I recommend adjusting the fatal() messages to distinguish the two errors
>> and avoid a possible misunderstanding that the service name was provided
>> but was not alphanumeric in the second case:
>>
>>   * Expected alphanumeric service name for the -n option but got: ...
>>   * A service name is required for the -n option.
>>
> 
> "" is not an alphanumeric string. But okay.
> 
>>
>> I continue to dislike the undefined -n option meaning on non-Windows
>> boxes, but I have no objection to this patch and believe the above
>> changes can be done without a new review cycle.

-n means the same on all boxes. Windows default is "squid" just like the
rest.

Updates made and applied to trunk.

Amos


Re: [PATCH 5/8] reconfiguration leaks: objects tied to http_port

2014-07-14 Thread Amos Jeffries
On 13/07/2014 4:59 p.m., Amos Jeffries wrote:
> On 23/06/2014 12:30 a.m., Amos Jeffries wrote:
>> On 17/06/2014 9:15 a.m., Alex Rousskov wrote:
>>>
>>> PortCfgPointer is not a reference counting pointer.
>>
>> There is no remaining reason for that since we converted the TcpAcceptor
>> to emitting MasterXaction. The PortCfg pointer is not actually passed as
>> a parameter anywhere. Just one buggy piece of code which should have
>> been implemented differently.
>>
>>
>> 
>>>
>>> Most likely, we should use the refcounting API for port pointers. Until
>>> that (or a better) solution is implemented, we should either
>>>
>>
>> The attached (rough) patch converts the PortCfgPointer to reference
>> counted and fixes all parsing errors resulting from the change. Most of
>> the issues were due to use of raw-pointers and explicit
>> cbdataReference*() API.
>>
>> Still have to add stubs to fix make check linkage errors and do some run
>> testing.
>>
> 
> If there are no objections I will commit the slightly more polished
> version of this patch shortly.
> 
> Amos
> 

Applied.

SSL state details may still be leaking, but I no longer see any issue
with SSL cleanup patches being applied so long as the relevant cleanup
is performed via the AnyP::PortCfg destructor.

Amos



Re: [PATCH 5/8] reconfiguration leaks: objects tied to http_port

2014-07-12 Thread Amos Jeffries
On 23/06/2014 12:30 a.m., Amos Jeffries wrote:
> On 17/06/2014 9:15 a.m., Alex Rousskov wrote:
>>
>> PortCfgPointer is not a reference counting pointer.
> 
> There is no remaining reason for that since we converted the TcpAcceptor
> to emitting MasterXaction. The PortCfg pointer is not actually passed as
> a parameter anywhere. Just one buggy piece of code which should have
> been implemented differently.
> 
> 
> 
>>
>> Most likely, we should use the refcounting API for port pointers. Until
>> that (or a better) solution is implemented, we should either
>>
> 
> The attached (rough) patch converts the PortCfgPointer to reference
> counted and fixes all parsing errors resulting from the change. Most of
> the issues were due to use of raw-pointers and explicit
> cbdataReference*() API.
> 
> Still have to add stubs to fix make check linkage errors and do some run
> testing.
> 

If there are no objections I will commit the slightly more polished
version of this patch shortly.

Amos



Re: [PATCH] Support PROXY protocol

2014-07-12 Thread Amos Jeffries
Attached patch contains all updates except the config option renaming,
which remains to be sorted out.


On 26/06/2014 7:41 a.m., Alex Rousskov wrote:
> On 06/21/2014 11:15 PM, Amos Jeffries wrote:
>> Support receiving PROXY protocol version 1 and 2.
> 
> 
>> +   proxy-surrogate
>> +Support for PROXY protocol version 1 or 2 connections.
>> +The proxy_forwarded_access is required to whitelist
>> +downstream proxies which can be trusted.
> 
> 
> Could you suggest some alternative names to "proxy-surrogate", which
> sounds like nothing-on-top-of-nothing to me in Squid context?

We are a surrogate for the PROXY protocol. I know it's a bit nasty.

> The
> terrible name for the PROXY protocol itself is clearly not your fault,
> but perhaps we can find a more intuitive way to call this new option?
> Here are some suggestions:
> 
> require-PROXY
> expect-PROXY
> require-PROXY-header
> expect-PROXY-header
> 
> 
> All-CAPS in option names are unfortunate as it goes against Squid style,
> but the poor choice of the protocol name essentially forces us to do that.
> 

In all seriousness "haproxy-protocol" is probably the most correct
descriptive name right now. But I am trying hard to avoid naming a
competitor in our most visible documentation.

"forwarded" would be wonderful, but I can forsee some confusion with
forward-proxy mode.

How about referred, referral, or proxy-referral ?

> 
>> -NAME: follow_x_forwarded_for
>> +NAME: proxy_forwarded_access follow_x_forwarded_for
> 
> The new name sounds worse than the old one. Hopefully this can be left
> as is or renamed to something better after the "proxy-surrogate" issue
> is resolved.
> 

Is "forwarded_access" obectionable ?
The XFF header has been deprecated by the Forwarded: header now, so a
name-changing was in the cards whatever this patch does.

> 
>> +bool
>> +ConnStateData::proxyProtocolError(bool fatalError)
>> +{
>> +if (fatalError) {
>> +// terminate the connection. invalid input.
>> +stopReceiving("PROXY protocol error");
>> +stopSending("PROXY protocol error");
>> +}
>> +return false;
>> +}
> 
> I recommend using a "const char *" argument type so that you can log a
> meaningful fatalError. Helps with caller code self-documentation too.
> Nil argument would mean non-fatal error, of course.
> 

Done. And calling mustStop() now.
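
Roughly this shape (a simplified standalone mock of the idea, not the
committed code):

#include <cstdio>

// Stand-in for ConnStateData, reduced to the error-handling shape.
struct ConnState {
    void mustStop(const char *why) { std::printf("stopping: %s\n", why); }

    // nil 'detail' == non-fatal error; non-nil terminates the connection.
    bool proxyProtocolError(const char *detail) {
        if (detail)
            mustStop(detail);
        return false; // convenient for 'return proxyProtocolError(...)' callers
    }
};

int main()
{
    ConnState c;
    c.proxyProtocolError(nullptr);                           // non-fatal: need more data
    c.proxyProtocolError("PROXY protocol error: bad magic"); // fatal
    return 0;
}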

> 
>> +bool
>> +ConnStateData::parseProxyProtocolMagic()
> 
> This appears to parse a lot more than just the magic characters. Please
> rename to parseProxyProtocolHeader() or similar.
> 


The entire PROXY protocol is a "magic" connection header. I get the point
though; I am splitting it into three functions, the main one being
findProxyProtocolMagic().
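
The wire prefixes being looked for, in a hedged standalone sketch (the
real code works on Squid's I/O buffers, but the 6-byte v1 text magic and
12-byte v2 binary signature come from the PROXY protocol specification):

#include <algorithm>
#include <string>

enum class Magic { needMoreData, v1, v2, none };

// Classify the first bytes of a freshly accepted connection.
static Magic findProxyProtocolMagic(const std::string &buf)
{
    static const std::string v1("PROXY ");                   // 6 bytes, text
    static const std::string v2("\r\n\r\n\0\r\nQUIT\n", 12); // 12 bytes, binary

    const std::size_t n2 = std::min(buf.size(), v2.size());
    if (buf.compare(0, n2, v2, 0, n2) == 0)
        return buf.size() >= v2.size() ? Magic::v2 : Magic::needMoreData;

    const std::size_t n1 = std::min(buf.size(), v1.size());
    if (buf.compare(0, n1, v1, 0, n1) == 0)
        return buf.size() >= v1.size() ? Magic::v1 : Magic::needMoreData;

    return Magic::none;
}

int main()
{
    return findProxyProtocolMagic("PROXY TCP4 192.0.2.1 ...") == Magic::v1 ? 0 : 1;
}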

> 
>> +
>> +needProxyProtocolHeader_ = xact->squidPort->flags.proxySurrogate;
>> +if (needProxyProtocolHeader_)
>> +proxyProtocolValidateClient(); // will close the connection on 
>> failure
> 
> Please do not place things that require job protection in a class
> constructor. Calling things like stopSending() and stopReceiving() does
> not (or will not) work well when we are just constructing a
> ConnStateData object. Nobody may notice (at the right time) that you
> "stopped" something because nothing has been started yet.

It did not require job protection until converting
stopSending()/stopReceiving() into mustStop() calls.

Also, note that ConnStateData is not a true AsyncJob. It is never started
with AsyncJob::Start() and its cleanup is equally as nasty as this setup
constructor (prior to any changes I am adding).

I have done as suggested and created the AsyncJob::Start() functionality
for it.

However, this means that the PROXY protocol is no longer capable of
being used on https_port. The PROXY protocol header comes before TLS
negotiation on the wire, but ConnStateData::start() is only called by
our code after SSL/TLS negotiation and SSL-bump operations complete.


> 
>> +// TODO: we should also pass the port details for myportname here.
> 
> Is there a good reason not to pass the port details now? Are they not
> available to ConnStateData?
> 

No reason. Done.

> 
>> +if (ch.fastCheck() != ACCESS_ALLOWED) {
>> +// terminate the connection. invalid input.
>> +stopReceiving("PROXY client not permitted by ACLs");
>> +stopSending("PROXY client not permitted by ACLs");
>> +}
> 
> The copied(?) comment is wrong in this context. It is not the input that
> is invalid in this case. However,

Re: [RFC] post-cache REQMOD

2014-07-11 Thread Amos Jeffries
On 12/07/2014 2:47 a.m., Alex Rousskov wrote:
> On 07/11/2014 05:27 AM, Tsantilas Christos wrote:
> 
>> The PageSpeed example fits better with a post-cache RESPMOD feature. 
> 
> I do not think so. Post-cache RESPMOD does not allow Squid to cache the
> adapted variants. Please let me know if I missed how post-cache RESPMOD
> can do that.

post-cache RESPMOD should be caching the large unfiltered object, with
the Squid cache supplying it to the adaptation module for each adaptation
task.
 Adaptation operations happen on each request, but no upstream contact is
necessary.

With post-cache REQMOD the cache stores the shrunk version of objects.
Adaptors at this point cannot pull from cache, so they request the larger
object from upstream on each MISS in order to adapt before caching.
 Adaptation operations happen only on MISS, but there is an upstream
fetch of the unfiltered large object on each adaptation.

I guess you are assuming that the ICAP service stores the unfiltered
object in its own cache and delivers the shrunk objects to Squid as
replies to REQMOD. This is the only case in which post-cache
REQMOD is more efficient overall than pre-cache REQMOD.



> 
> The key here is that PageSpeed and similar services want to create (and
> cache) many adapted responses out of a single virgin response. Neither
> HTTP itself nor the Squid architecture support that well. Post-cache
> REQMOD allows basic PageSpeed support (the first request for "small"
> adapted content gets "large" virgin content, but the second request for
> small content fetches it from the PageSpeed cache, storing it in Squid
> cache). To optimize PageSpeed support further (so that the first request
> can get small content), we will need to add another generally useful
> feature, but I would rather not bring it into this discussion (there
> will be a separate RFC if we get that far).
> 
> The alternative is to create a completely new interface (not a true
> vectoring point) that allows an adaptation service to push multiple
> adapted responses into the Squid cache _and_ tell Squid which of those
> responses to use for the current request. While I have considered
> proposing that, I still think we would be better off supporting
> "standard" and "well understood" building blocks (such as standard
> adaptation vectoring points) rather than such highly-specialized
> interfaces. Please let me know if you disagree.
> 

IMHO if this feature is provided the persons requesting it will find
that PageSpeed works no faster than pre-created shrunk variants served
over standard HTTP with working Vary caching. They are saving cheap disk
storage by spending expensive CPU and network latency. Probably in an
effort to speed up all those very old proxies that incorrectly implement
Cache-Control:no-cache.

Vary caching is after all the design for providing client-specific
variants without all the work of realtime adaptation.

(That's just me getting old though, as I look despairingly on an ever
more inefficient network - "back in my day...").

> 
>> Is
>> the post-cache REQMOD just a first step to support all post-cache
>> vectoring points?
> 
> You can certainly view it that way, but I do not propose or promise
> adding post-cache RESPMOD :-).
> 
> 
> Thank you,
> 
> Alex.


Amos




Re: [RFC] post-cache REQMOD

2014-07-11 Thread Amos Jeffries
On 12/07/2014 3:24 a.m., Alex Rousskov wrote:
> On 07/11/2014 03:16 AM, Amos Jeffries wrote:
>> On 11/07/2014 4:27 p.m., Alex Rousskov wrote:
>>> This is unrelated to caching rules. HTTP does not have a notion of
>>> creating multiple responses from a single response, and that is exactly
>>> what PageSpeed and similar adaptations do: For example, they convert a
>>> large origin server image or page into several variants, one for each
>>> class of clients.
> 
> 
>> Indeed. So this implementation of PageSpeed by requiring HTTP agents to
>> transform traffic mid-transit from single to multiple responses is a
>> violation.
> 
> I do not think so. This is just a content adaptation outside of the HTTP
> domain. If this is a "violation", then dreaming of unicorns is an HTTP
> violation because HTTP does not have a notion of a unicorn.
> 
> 
>> Or, PageSpeed is completely unnecessary mid-transit and has nothing to
>> do with Squid and caching. 
> 
> Evidently, folks deploying proxies say otherwise. We can tell them that
> they are "doing it wrong", but I do not think we should in this case.
> For some use cases, PageSpeed adaptation is the right thing to do. Also,
> this is not necessarily used "mid-transit" as discussed below.
> 

We've been advising alternatives to post-cache vector points, as you say,
for some time. This does not appear to be much different to the other
requests. There is an alternative. The question is whether the benefit
gained by one more piece of software in the installed chain outweighs
the development effort and maintenance of a new vector point in Squid code.
 IMHO they are pretty evenly balanced at present. So it's up to you
whether you want to take on that workload. My opinion on where and how
to integrate was mentioned in the initial email.


> 
>> Which can cache either the small shrunk
>> objects, or the single large one just fine. It is instead the attempt to
>> perform end-server operations in a proxy/gateway which is driving this
>> change.
> 
> The need to better serve a diverse population of web clients is driving
> this change. An attempt to confine useful content adaptations to
> "end-servers" is futile at best. BTW, many want to deploy PageSpeed at
> the reverse proxy, which is an end-server from the HTTP client point of
> view so even if we adopt the "only end-servers can do that!" law, Squid
> should still support this kind of adaptation.
> 

Perhaps. Don't let my doubts there block you though.

Amos


Re: [PATCH] Support PROXY protocol

2014-07-11 Thread Amos Jeffries
On 12/07/2014 3:04 a.m., Alex Rousskov wrote:
> On 07/11/2014 02:27 AM, Amos Jeffries wrote:
>>  - supports non-TCP protocols.
>>  - security section says it could be full of lies. So the A[ctually] is 
>> incorrect.
> 
> IMHO, you are being too literate with the words in the protocol name
> while being very permissive with the protocol specs. Most of the ones
> you rejected can be used without harm IMHO.
> 
> 
>> I was thinking you had something funny along the lines of:
>>
>> * Traffic Envelope Annex protocol (TEA p'ot)
> 
> I did not have anything like that in mind. Personally, I would not call
> it "envelop" because the protocol does not envelop the message, it only
> provides a prefix, header, or "top line". This is why I suggested
> "letterhead" rather than "envelop".

It is not message (Layer 3) related in the slightest. This is a Layer
2.1 envelope for the entire connection, for delivery between software.
Below even TLS (Layer 2.5).

Amos

> 
> Other related variants are:
> 
>   * Client Letterhead Protocol
>   * Client Annex Protocol
> 
> Pretty much anything would be better than PROXY though :-).
> 
> 
> Cheers,
> 
> Alex.
> 



Re: [RFC] post-cache REQMOD

2014-07-11 Thread Amos Jeffries
On 11/07/2014 4:27 p.m., Alex Rousskov wrote:
> On 07/10/2014 09:12 PM, Amos Jeffries wrote:
>> On 11/07/2014 10:15 a.m., Alex Rousskov wrote:
>>> I propose adding support for a third adaptation vectoring point:
>>> post-cache REQMOD. Services at this new point receive cache miss
>>> requests and may adapt them as usual. If a service satisfies the
>>> request, the service response may get cached by Squid. As you know,
>>> Squid currently supports pre-cache REQMOD and pre-cache RESPMOD.
>>
>> Just to clarify you mean this to be the vectoring point which receives
>> MISS-only traffic, as the existing one(s) receive HIT+MISS traffic?
> 
> + pre-cache REQMOD receives HIT+MISS request traffic.
> + pre-cache RESPMOD receives MISS response traffic.
> * post-cache REQMOD receives MISS request traffic.
> - post-cache RESPMOD receives HIT+MISS response traffic.
> 
> All four vectoring points are documented in RFC 3507. Squid currently
> supports the first two. I propose adding support for the third. Each of
> the four points (and probably other!) is useful for some use cases.
> 
> Besides getting different HIT/MISS traffic mixture, there are other
> differences between these vectoring points. For example, pre-cache
> REQMOD responses go straight to the HTTP client while post-cache REQMOD
> responses may be cached by Squid (this example is about request
> satisfaction mode of REQMOD).
> 
> 
>>> We have received many requests for post-cache adaptation support
>>> throughout the years, and I personally resisted the temptation of adding
>>> another layer of complexity (albeit an optional one) because it is a lot
>>> of work and because many use cases could be addressed without post-cache
>>> adaptation support.
>>>
>>> The last straw (and the motivation for this RFC) was PageSpeed[1]
>>> integration. With PageSpeed, one can generate various variants of
>>> "optimized" content. For example, mobile users may receive smaller
>>> images. Apache and Nginx support PageSpeed modules. It is possible to
>>> integrate Squid with PageSpeed (and similar services) today, but it is
>>> not possible for Squid to _cache_ those generated variants unless one is
>>> willing to pay for another round trip to the origin server to get
>>> exactly the same unoptimized content.
>>
>> Can you show how they are violating standard HTTP variant caching? The
>> HTTPbis WG should probably be informed of the problem.
>> If it is actually within the standard then it would seem to be a missing
>> feature of Squid to cache them properly. We could do better by
>> fixing Squid to cache more compliant traffic.
> 
> This is unrelated to caching rules. HTTP does not have a notion of
> creating multiple responses from a single response, and that is exactly
> what PageSpeed and similar adaptations do: For example, they convert a
> large origin server image or page into several variants, one for each
> class of clients.
> 


Indeed. So this implementation of PageSpeed, by requiring HTTP agents to
transform traffic mid-transit from single to multiple responses, is a
violation.

Or, PageSpeed is completely unnecessary mid-transit and has nothing to
do with Squid and caching, which can cache either the small shrunk
objects or the single large one just fine. It is instead the attempt to
perform end-server operations in a proxy/gateway which is driving this
change.

Amos


Re: [PATCH] Support PROXY protocol

2014-07-11 Thread Amos Jeffries
cc'ing Willy so he can get in on this.

On 11/07/2014 5:03 p.m., Alex Rousskov wrote:
> On 06/25/2014 01:41 PM, Alex Rousskov wrote:
>> On 06/21/2014 11:15 PM, Amos Jeffries wrote:
>>>> Support receiving PROXY protocol version 1 and 2.
> 
>> sounds like nothing-on-top-of-nothing to me in Squid context? The
>> terrible name for the PROXY protocol itself is clearly not your fault
> 
> 
> Per Amos request on IRC, here are some suggestions for renaming the new
> protocol itself (in case the protocol author, Willy Tarreau, might be
> open to suggestions):
> 
>   * On Behalf Of Protocol
> 
>   * True Client Information Disclosure Protocol

  - truth remains unknown despite this protocol.

>   * TCP Client Information Disclosure Protocol

 - supports non-TCP protocols.

> 
>   * The Right To Be Remembered Protocol
>   * Remember The Right Client Protocol
> 
>   * From and To Protocol
>   * Actually From and To Protocol

  - security section says it could be full of lies. So the A is incorrect.

> 
>   * Letterhead Protocol
>   * True Client Letterhead Protocol

  - ditto on the truth issue.

>   * TCP Client Letterhead Protocol

 - ditto on the TCP issue.

> 
>   * Top Line Protocol
>   * Top Header Protocol
>   * Top Shelf Protocol

  - taken.


I was thinking you had something funny along the lines of:

* Traffic Envelope Annex protocol (TEA p'ot)

We could also reply with HTTP 418 and close the connection on protocol
failures.

The semantics actually fit the dictionary definitions of the words too :-)

Amos

> 
> 
> Most can be creatively "acronymized" for brevity, of course.
> 
> I have not validated any of these suggestions against prior art and
> cultural clashes (although some puns were intentional).
> 
> 
> The protocol renaming can be combined with renaming the PROXY magic word
> in the top line as well, but they do not have to.
> 
> 
> HTH,
> 
> Alex.
> 



Re: Possible memory leak.

2014-07-10 Thread Amos Jeffries
On 11/07/2014 6:10 a.m., Eliezer Croitoru wrote:
> OK so I started this reverse proxy for a bandwidth testing site and it
> seems odd that it is using more than 400MB when the only differences in
> the config are maximum_object_size_in_memory at 150MB and StoreID
> 
> I have extracted mgr:mem and mgr:info at these urls:
> http://www1.ngtech.co.il/paste/1169/raw/
> http://www1.ngtech.co.il/paste/1168/raw/
> 
> A top snapshot:
> http://www1.ngtech.co.il/paste/1170/raw/
> 
> The default settings are 256MB for ram cache and this instance is ram
> only..
> squid.conf at:
> http://www1.ngtech.co.il/paste/1171/raw/
> 
> I started the machine 29 days ago while squid is up for less than that.
> 
> Any direction is welcomed but test cannot be done on this machine
> directly for now.
> 
> Eliezer

This may be what Martin Sperl is reporting in the squid-users thread
"squid: Memory utilization higher than expected since moving from 3.3 to
3.4 and Vary: working"

What I'm trying to get from him there is a series of mgr:mem reports
over time, to see if any particular object type is growing unusually. And
mgr:filedescriptors in case it's a side effect of the hung connections
Christos identified recently.

If we are lucky enough that the squid was built with valgrind support
there should be a valgrind leak trace available in one of the info and
mem reports. This will only catch real leaks though, not ref-counting
holding things active.

Amos



Re: [RFC] post-cache REQMOD

2014-07-10 Thread Amos Jeffries
On 11/07/2014 10:15 a.m., Alex Rousskov wrote:
> Hello,
> 
> I propose adding support for a third adaptation vectoring point:
> post-cache REQMOD. Services at this new point receive cache miss
> requests and may adapt them as usual. If a service satisfies the
> request, the service response may get cached by Squid. As you know,
> Squid currently supports pre-cache REQMOD and pre-cache RESPMOD.

Just to clarify you mean this to be the vectoring point which receives
MISS-only traffic, as the existing one(s) receive HIT+MISS traffic?


> We have received many requests for post-cache adaptation support
> throughput the years, and I personally resisted the temptation of adding
> another layer of complexity (albeit an optional one) because it is a lot
> of work and because many use cases could be addressed without post-cache
> adaptation support.
> 
> The last straw (and the motivation for this RFC) was PageSpeed[1]
> integration. With PageSpeed, one can generate various variants of
> "optimized" content. For example, mobile users may receive smaller
> images. Apache and Nginx support PageSpeed modules. It is possible to
> integrate Squid with PageSpeed (and similar services) today, but it is
> not possible for Squid to _cache_ those generated variants unless one is
> willing to pay for another round trip to the origin server to get
> exactly the same unoptimized content.

Can you show how they are violating standard HTTP variant caching? The
HTTPbis WG should probably be informed of the problem.
If it is actually within the standard then it would seem to be a missing
feature of Squid to cache them properly. We could do better by
fixing Squid to cache more compliant traffic.

> 
> The only way to support Squid caching of PageSpeed variants without
> repeated round trips to the origin server is using two Squids. The
> parent Squid would cache origin server responses while the child Squid
> would adapt parent's responses and cache adapted content. Needless to
> say, running two Squids (each with its own cache) instead of one adds
> significant performance/administrative overheads and complexity.
> 
> 
> As far as internals are concerned, I am currently thinking of launching
> adaptation job for this vectoring point from FwdState::Start(). This
> way, its impact on the rest of Squid would be minimal and some adapters
> might even affect FwdState routing decisions. The initial code name for
> the new class is MissReqFilter, but that may change.
> 

Given that FwdState is the global selector to determine where MISS
content comes from, this sounds reasonable.

I think after the miss_access tests is the best position. We need to split
the miss_access lookup off into an async step to be a slow lookup anyway.


> 
> The other candidate location for plugging in the new vectoring point is
> the Server class. However, that class is already complex. It handles
> communication with the next hop (with child classes doing
> protocol-specific work and confusing things further) as well as
> pre-cache RESPMOD vectoring point with caching initiation on top of
> that. The Server code already has trouble distinguishing various content
> streams it has to juggle. I am worried that adding another vectoring
> point there would make that complexity significantly worse.

Agreed. Bad idea.

> 
> It is possible that we would be able to refactor/encapsulate some of the
> code so that it can be reused in both the existing Server and the new
> MissReqFilter classes. I will look out for such opportunities while
> trying to keep the overall complexity in check.
> 
> 
> Any objections to adding post-cache REQMOD or better implementation ideas?

Just the above details about variant caching.

Amos



Re: [PATCH] SSL Server connect I/O timeout

2014-07-10 Thread Amos Jeffries
On 28/06/2014 3:38 a.m., Tsantilas Christos wrote:
> Hi all,
> 
> Currently FwdState::negotiateSSL() operates on a TCP connection without
> a timeout. If, for example, the server never responds to the Squid SSL
> Hello, the connection gets stuck forever. This happens in the real world
> when, for example, a client is trying to establish an SSL connection
> through a bumping Squid to an HTTP server that does not speak SSL and
> does not detect the initial request garbage (from an HTTP point of view).
> 
> Moreover, if the client closes the connection while Squid is fruitlessly
> waiting for server SSL negotiation, the client connection will get into
> the CLOSE_WAIT state with a 1 day client_lifetime timeout.  This patch
> does not address that CLOSE_WAIT problem directly.
> 
> This patch adds an SSL negotiation timeout for the server SSL connection
> and tries not to exceed forward_timeout or peer_connect_timeout while
> connecting to an SSL server.
> 
> Some notes:
>  - In this patch the timeouts used for Ssl::PeerConnector are still not
> accurate; they may be 5 secs more than the forward timeout or 1 second
> more than the peer_connect timeout, but I think they are reasonable enough.
> 
>  - Please check and comment the new
> Comm::Connection::startTime()/::noteStart() mechanism.
> Now Comm::Connection::startTime_ is computed in the Comm::Connection
> constructor and reset in Comm::ConnOpener::start() and
> Comm::TcpAcceptor::start()
> 
> 
> This is a Measurement Factory project.


+1. Please apply ASAP.

Amos


Re: What would be the right path for bug 3221? ICAP date header related issue.

2014-07-07 Thread Amos Jeffries

On 2014-07-08 08:33, Eliezer Croitoru wrote:

http://bugs.squid-cache.org/show_bug.cgi?id=3221

On the bug report there is a description of squid denying the ICAP
server due to some logic around a wrong Date header.
Since squid and ICAP server times are too far apart, the service cannot
be considered reliable for use.



That depends on where the Date header is.

 * Date header in the ICAP headers is a bit of a problem. But I think we 
can work around it by noting the offset and reporting an error for the 
admin to fix.


 * Date header in the produced HTTP request/reply headers is kind of 
bad. It will screw up the HTTP caching algorithms - BUT we already have 
a prescribed way of handling this in HTTP.
  Simply be pessimistic and use the oldest of the available Date values 
in the cache algorithm. The worst case, regardless of a future or past 
Date value, is early expiry.
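
In code terms that pessimistic rule is tiny. A sketch, assuming both Date
values have already been parsed into time_t:

#include <algorithm>
#include <ctime>

// Use the oldest of the available Date values; the worst case then
// degrades to early expiry rather than objects outliving their
// intended freshness.
static std::time_t pessimisticDate(const std::time_t a, const std::time_t b)
{
    return std::min(a, b);
}

int main()
{
    return pessimisticDate(1000, 2000) == 1000 ? 0 : 1;
}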




The effect can be seen, for example, on an ICAP service that provides
time-based service, which would not make any sense if squid has the wrong
time or the ICAP service has the wrong time.
Another effect is that an ICAP service often helps to alter the response
and also can affect the Date headers of the object.
In any case squid time should be as accurate as it can be, to provide an
accurate and valid cache.


When being pessimistic with time/Date this does not matter. The worst 
case is early expiry and the request gets passed to the backend server 
earlier than expected.
 If ICAP is presenting reply headers with a wrong Date the same 
pessimistic output causes the downstream recipient to act pessimistically 
as well, or handle the timing error properly if it can.





At this point I am looking for other opinions and options\scenarios
with which this issue's effect can be understood properly.

(One example of a service that doesn't tolerate time "glitches" is
dovecot which recognizes that the server is too far behind or too far
ahead and stops functioning in couple scenarios.)

Leaving the report by itself:
Can we define what a good behavior from squid's side would be?
If for example the ICAP server is 20 years ahead of the squid
server, should it be considered right?

The basic scenario of ICAP servers with different times is, for now,
within the 24-hour timezone range, due to the fact that an ICAP service
can sit in one place on earth while the proxy is in another, or the two
servers' local time is configured using a local NTP source with the
local time rather than UTC\GMT.

I have not reviewed the relevant code yet and hence I do not know the
exact reason for it, but once we have a basic idea of what is
right and wrong we can decide on the right path for a solution.

Thanks,
Eliezer


Re: [PATCH] Support PROXY protocol

2014-06-25 Thread Amos Jeffries
On 26/06/2014 4:53 a.m., Eliezer Croitoru wrote:
> I was not expecting this patch due to old emails about the proxy
> protocol implementation.
> I understand from the email that after this patch we can use STUNNEL and
> HAPROXY in front of squid, right?

Right. stunnel, HAProxy and any other gateway software which supports
sending the protocol.

I was also not expecting it to happen for a version or two either, but
Willy and I got talking about it the other day and when I looked closer
the work already done on the parser and client-side cleanup happened to
be enough to make it a relatively clean and simple addition.

Amos

> +1 (for the idea and looked a bit at the code itself)
> 
> Eliezer
> 
> On 06/22/2014 08:15 AM, Amos Jeffries wrote:
>> Support receiving PROXY protocol version 1 and 2.
>>
>> PROXY protocol has been developed by Willy Tarreau of HAProxy for
>> communicating original src and dst IP:port details between proxies and
>> load balancers in a protocol-agnostic way.
>>
>> stunnel, HAProxy and some other HTTP proxying software are already
>> enabled and by adding support to Squid we can effectively chain these
>> proxies without having to rely on X-Forwarded-For headers.
>>
>> This patch adds http(s)_port mode flag (proxy-surrogate) to signal the
>> protocol is in use, parsing and processing logics for the PROXY protocol
>> headers on new connections, and extends the follow_x_forwarded_for
>> (renamed proxy_forwarded_access) access control to manage inbound
>> connections.
>>   The indirect client security/trust model remains unchanged. As do all
>> HTTP related logics on the connection once PROXY protocol header has
>> been received.
>>
>>
>> Future Work:
>>   * support sending PROXY protocol to cache_peers
>>   * rework the PROXY parse logics as a Parser-NG child parser.
>>
>> Amos
> 



Re: [RFC] connections logging

2014-06-23 Thread Amos Jeffries
On 24/06/2014 4:41 p.m., Kinkie wrote:
> On Tue, Jun 24, 2014 at 2:26 AM, Alex Rousskov wrote:
>> On 06/20/2014 03:04 AM, Amos Jeffries wrote:
>>> I am playing with the idea of adding a new log file to record just the
>>> connections handled by Squid (rather than the HTTP request/reply
>>> transactions).
>>>
>>> This log would include connections opened but never used, port details,
>>> connection lifetimes, re-use counts on persistent connections, SSL/TLS
>>> presence/version (or not), and other connection details we find a need
>>> for later.
>>
>>
>>> The driver behind this is to help resolve a growing amount of user
>>> queries regarding "happy eyeballs" idle connections and broken TLS
>>> connections. We are also adding other potentially never-used connections
>>> ourselves with the standby pools.
>>
>> A couple of days ago, I came across another use case for logging
>> connections unrelated to Happy Eyeballs. It sounds like this is going to
>> be a generally useful feature. My use case is still being clarified, but
>> here are the already known non-obvious requirements:
> 
> I have a use case, as well: timing logging.
> It would be useful to have a chance to log the timing of certain key
> moments in a request's processing path. Things like accept, end of
> slow acl matching, end of dns resolution(s), connected to uplink, and
> so on. This could help administrators identify congestion issues in
> their infrastructure.
> 
>Kinkie
> 


And that use case gives us a good potential name for it: event.log?

Amos


Re: [RFC] connections logging

2014-06-23 Thread Amos Jeffries
On 24/06/2014 12:26 p.m., Alex Rousskov wrote:
> On 06/20/2014 03:04 AM, Amos Jeffries wrote:
>> I am playing with the idea of adding a new log file to record just the
>> connections handled by Squid (rather than the HTTP request/reply
>> transactions).
>>
>> This log would include connections opened but never used, port details,
>> connection lifetimes, re-use counts on persistent connections, SSL/TLS
>> presence/version (or not), and other connection details we find a need
>> for later.
> 
> 
>> The driver behind this is to help resolve a growing amount of user
>> queries regarding "happy eyeballs" idle connections and broken TLS
>> connections. We are also adding other potentially never-used connections
>> ourselves with the standby pools.
> 
> A couple of days ago, I came across another use case for logging
> connections unrelated to Happy Eyeballs. It sounds like this is going to
> be a generally useful feature. My use case is still being clarified, but
> here are the already known non-obvious requirements:
> 
> 1. Logging when the connection is received (unlike transactions that are
> logged when the transaction is completed).
> 
> 2. Logging client-to-Squid connections.
> 
> 3. Introduction of some kind of unique connection ID that can be
> associated with HTTP/etc transactions on that connection, correlating
> access.log and connection.log entries. [ uniqueness scope is being
> clarified, but I hope it would just be a single worker lifetime ]
> 

For IDs I was thinking of just emitting the MasterXaction::id value sent
in by TcpAcceptor. Assuming this is defined as the TCP transaction (see
below).

> 
> Let's coordinate our efforts!
> 

Sure.

> 
>> The Squid native access.log appears unsuitable for adjustment given its
>> widely used format and the number of details to be logged.
> 
> Agreed. We should support a similarly configurable log format and other
> existing logging knobs but use a different, dedicated log (or logs if we
> want to record from-Squid connections and to-Squid connections
> differently). Reusable logformat codes such as %la and %lp should be
> reused, but new ones will probably be needed as well.
> 

Ack.

> 
>> There is also a useful side effect of MasterXaction. At present the
>> TcpAcceptor generates a MasterXaction object to record initial
>> connection details for the handler; this creates a bit of complexity in
>> the recipient regarding whether the MasterXaction object applies to a
>> future HTTP request or one which is currently being parsed. This could
>> be simplified if the connection was a transaction and the HTTP requests
>> on it were each a different transaction spawned from the connection
>> MasterXaction when parsing a new request.
> 
> I am not sure what you mean by "if the connection was a transaction".

I mean defining the TCP actions of opening a socket/connection, its
entire lifetime, and close() as being one transaction, with the
TcpAcceptor-produced MasterXaction representing that cycle.

HTTP requests arriving on the connection would be separate transactions
spawned by the parsing actions. (Unfortunately the read(2) does not
always nicely match a parse event identifying the exact HTTP request start.)

So we have two new transaction types there. TCP transaction and HTTP
transaction, to go alongside ICAP and eCAP transaction etc.


> The MasterXaction is usually associated with at least one connection
> (via the small "t" transaction using that connection), and one
> connection is usually associated with at least one MasterTransaction,
> but transactions and connections are completely different concepts at
> Squid level IMO. I doubt we should try to merge the concept of a
> transaction and a connection somehow.

Generating the HTTP transaction MasterXaction object (child of
MasterXaction?) by cloning it from the TCP connection one represents
that relationship. The clone could be seeded with the parent's id value
to persist that info for later use (eg your start-of-HTTP-transaction
log entry).
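
As a toy shape, purely to illustrate the id-seeding idea (not a proposal
for the actual class layout):

#include <cstdint>

// A TCP-level transaction spawns HTTP-level ones, each carrying its
// parent's id so log entries can be correlated later.
struct MasterXaction {
    explicit MasterXaction(const uint64_t parent = 0)
        : id(++lastId), parentId(parent) {}

    MasterXaction spawnHttpXaction() const { return MasterXaction(id); }

    const uint64_t id;       // unique within one worker lifetime
    const uint64_t parentId; // 0 for the TCP (accept-time) transaction

private:
    static uint64_t lastId;
};
uint64_t MasterXaction::lastId = 0;

int main()
{
    const MasterXaction tcp;                           // accepted connection
    const MasterXaction http = tcp.spawnHttpXaction(); // parsed request
    return http.parentId == tcp.id ? 0 : 1;
}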

> 
> MasterTransaction is created at connection acceptance time because that
> is when the first HTTP/etc. transaction starts (parsing headers is not
> the first step) and, hence, that is when the master transaction starts.

Hmm. We have some issue here satisfying the people who want an HTTP
transaction only to represent the time of request+reply, versus browser
happy eyeballs (and now Squid) opening standby connections with long
durations before the first byte. That is one reason for supporting
separate MasterXaction transactions for TCP and HTTP.

> 
> We could make the connection acceptance event itself a transaction (an
> object of some new AcceptXaction class inherited from a gener

Re: [PATCH] Fixes to get more collapsed forwarding hits

2014-06-23 Thread Amos Jeffries
On 24/06/2014 4:07 p.m., Alex Rousskov wrote:
> Hello,
> 
> The attached patch contains several changes to improve the
> probability of getting a collapsed forwarding hit:
> 
> * Allow STORE_MEMORY_CLIENTs to open disk files if needed and possible.
> STORE_*_CLIENT designation is rather buggy (several known XXXs). Some
> collapsed clients are marked as STORE_MEMORY_CLIENTs (for the lack of
> info at determination time) but their hit content may actually come from
> a disk cache.
> 
> * Do not abandon writing a collapsed cache entry when we cannot cache
> the entry in RAM if the entry can be cached on disk instead. Both shared
> memory cache and the disk cache have to refuse to cache the entry for it
> to become non-collapsible. This dual refusal is difficult to detect
> because each cache may make the caching decision at different times.
> Added StoreEntry methods to track those decisions and react to them.

Hmm. FTR: IMHO the only reasons a response should be non-collapsible are
if it is a) private, b) authenticated to another user, c) explicitly
known non-cacheable, d) a non-matching variant of the URL, or e) we
already discarded some body bytes.

The fact that the memory/disk caches have not yet provided a
cacheable-size allow/reject result is no reason to be non-collapsible. It
is a reason to wait on that result to determine cacheability condition (c).
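
Stated as a predicate, with made-up field names just to pin that list
down (these flags do not correspond to real StoreEntry members):

// Hypothetical summary of conditions (a)-(e) above.
struct ResponseTraits {
    bool isPrivate;
    bool authedToOtherUser;
    bool knownNonCacheable;
    bool variantMismatch;
    bool bodyBytesDiscarded;
};

static bool nonCollapsible(const ResponseTraits &r)
{
    return r.isPrivate             // (a)
        || r.authedToOtherUser     // (b)
        || r.knownNonCacheable     // (c)
        || r.variantMismatch       // (d)
        || r.bodyBytesDiscarded;   // (e)
}

int main()
{
    const ResponseTraits r = {false, false, false, false, false};
    return nonCollapsible(r) ? 1 : 0;
}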

> 
> * Recognize disk cache as a potential source of the collapsed entry when
> the memory cache is configured. While collapsed entries would normally
> be found in the shared memory cache, caching policies and other factors
> may prohibit memory caching but still allow disk caching. Memory cache
> is still preferred.
> 
> * Do not use unknown entry size in StoreEntry::checkTooSmall()
> determination. The size of collapsed entries is often unknown, even when
> they are STORE_OK (because swap_hdr_sz is unknown when the other worker
> has created the cache entry).  The same code has been using this
> ignore-unknowns logic for the Content-Length header value, so the
> rejection of unknown entry size (added as a part of C++ conversion
> without a dedicated message in r5766) could have been a typo.
> 
> 
> If approved, I will commit the change for the last bullet separately
> because it is easier to isolate and is less specific to collapsed
> forwarding.
> 
> 
> The patch does not attempt to cover all possible collapsing cases that
> Squid currently misses, of course. More work remains in this area
> (including a serious STORE_*_CLIENT redesign), but the proposed fixes
> should make a dent.
> 
> These changes were developed for an earlier trunk version but the
> trunk-ported patch posted here also passed simple lab tests.
> 
> 
> Thank you,
> 
> Alex.
> 

+1. Although I do not know enough about store yet to do much of an
in-depth audit of the logic changes.

Amos


Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID

2014-06-23 Thread Amos Jeffries
On 24/06/2014 1:44 a.m., Tsantilas Christos wrote:
> On 06/23/2014 09:50 AM, Amos Jeffries wrote:
>> On 23/06/2014 5:43 a.m., Tsantilas Christos wrote:
>>> Hi all,
>>>
>>> The attached patch replaces existing annotation values with the new one
>>> received from helpers.
>>>
>>> Just one question. We are documenting key-value pairs in cf.data.pre
>>> only for url-rewriter helpers, but annotations work for all squid
>>> helpers.
>>> Should we move the related url-rewriter section to a global section? If
>>> yes where?
>>>
>>> For example something like the following in a global section should be
>>> enough:
>>>
>>> The interface for all helpers has been extended to support arbitrary
>>> lists of key=value pairs, with the syntax  key=value . Some keys have
>>> special meaning to Squid, as documented here.
>>>
>>> Currently Squid understands the following optional key=value pairs
>>> received from URL rewriters:
>>>   clt_conn_id=ID
>>>  Associates the received ID with the client TCP connection.
>>>  The clt_conn_id=ID pair is treated as a regular annotation but
>>>  it persists across all transactions on the client connection
>>>  rather than disappearing after the current request. A helper
>>>  may update the client connection ID value during subsequent
>>>  requests by returning a new ID value. To send the connection
>>>  ID to the URL rewriter, use url_rewrite_extras:
>>>  url_rewrite_extras clt_conn_id=%{clt_conn_id}note ...
>>>
>>
>> Design issue:
>>   * can we call this "client-tag=" or "connection-tag=" or
>> "connection-label=". We have ID used internally via InstanceId<>
>> template members, which are very different to this "ID" being
>> user-assigned.
> 
> The clt_conn_id is not a client tag, but it can be a connection-tag or
> connection-label.
> Lets wait Alex to answer on this.
> 

See my answer to Alex. By using this note the helper is explicitly
defining that connection == 1 client.


>>
>>
>> in src/Notes.cc:
>> * replaceOrAdd iterates over the notes list twice in the event of
>> removals (due to remove() iterating over it from scratch). Please fix
>> this to iterate over the list only once and do an in-place replace if
>> the key is found already there, add if end() is reached.
>>
>> * remove() appears to be unnecessary after the above correction.
> 
> I can fix replaceOrAdd a little, but the remove() method is still needed.
> 

By what code?  I see it only being used by replaceOrAdd(), and only to
perform the unnecessary loop. It is fine as a remove() implementation,
but to me it looks needless at this time.


> The replaceOrAdd() method will be:
>
> void
> NotePairs::replaceOrAdd(const NotePairs *src)
> {
>     for (std::vector<NotePairs::Entry *>::const_iterator i =
>              src->entries.begin(); i != src->entries.end(); ++i) {
>         remove((*i)->name.termedBuf());
>         entries.push_back(new NotePairs::Entry((*i)->name.termedBuf(),
>                                                (*i)->value.termedBuf()));
>     }
> }
> 
>>
>>
>> in Notes.*:
>> * findFirst() no longer makes sense. There being no "second" now.
>>   - Please rename to find().
> 
> This is not true. We can still support multiple values per key name.
> I am suggesting to not touch this one.
> 

Every helper callback is now using Update*() which uses replaceOrAdd()
to ensure uniqueness.

I see that the list output directly from the helper may contain
duplicates and might be used by the handler instead of the HttpRequest
merged list. We may want to change that to make the handler behaviour
match the behaviour of the ACL testing the HttpRequest list.


>>
>>
>> in src/client_side_request.cc:
>> * ClientRequestContext::clientRedirectDone was previously calling
>> SyncNotes to ensure the AccessLogEntry and HttpRequest shared a notes
>> list before updating the request. Order is important here because Sync
>> will kill Squid with an assert if the objects have different notes list
>> objects.
>> Until that is fixed please retain the order of:
>>   - sync first, then update the request.
>>
>> * same in ClientRequestContext::clientStoreIdDone
> 
> It is the same; it does not make a difference.
> UpdateRequestNotes will just copy helper notes to the request notes. The
> SyncNotes will check if the AccessLogEntry and HttpRequest objects point
> to the same NotePairs object.

If you are going to retain it the way it is please add a note to the
AccessLogEntry::notes member indicating that it must NEVER be set other
than by calling SyncNotes() with the current transaction's HttpRequest
object. Setting it any other way will guarantee crashes after this patch
goes in.

The burden will also be placed on auditors to check this with every
notes-related patch. (I would rather avoid that extra work TBH.)

Amos


Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID

2014-06-23 Thread Amos Jeffries
On 24/06/2014 8:19 a.m., Alex Rousskov wrote:
> On 06/23/2014 07:44 AM, Tsantilas Christos wrote:
>> On 06/23/2014 09:50 AM, Amos Jeffries wrote:
>>> On 23/06/2014 5:43 a.m., Tsantilas Christos wrote:
>>>>   clt_conn_id=ID
>>>>  Associates the received ID with the client TCP connection.
>>>>  The clt_conn_id=ID pair is treated as a regular annotation but
>>>>  it persists across all transactions on the client connection
>>>>  rather than disappearing after the current request. A helper
>>>>  may update the client connection ID value during subsequent
>>>>  requests by returning a new ID value. To send the connection
>>>>  ID to the URL rewriter, use url_rewrite_extras:
>>>>  url_rewrite_extras clt_conn_id=%{clt_conn_id}note ...
> 
> 
>>>   * can we call this "client-tag=" or "connection-tag=" or
>>> "connection-label=". We have ID used internally via InstanceId<>
>>> template members, which are very different to this "ID" being
>>> user-assigned.
> 
> 
>> The clt_conn_id is not a client tag, but it can be a connection-tag or
>> connection-label.
> 
> 
> "client-tag" does not work well indeed because the tag is specific to
> the connection, not the client, for some popular definitions of a
> "client". For example, many differently authenticated "clients" may
> share the same tagged connection (when that connection comes from a peer
> that aggregates traffic from many end-users).


My understanding was that a major use-case for this feature was
"
for faking various forms of
connection-based "authentication" when standard HTTP authentication
cannot be used.
"

Allowing client A to "log in" clients B, C, ... Z etc only makes sense if
the "client" and "user" are detached concepts at this level. Client
being strictly the client-server model type of client, for whom any
number of "users" may exist, or proxying for other further-away clients
in a multi-hop setup.

Either way by using this tag the helper is assuming that it is relevant
to all future requests on the whole connection. ie that there is either
one "client", or one "user" which the note represents.


> 
> "connection-tag" does not work well because the proposed feature is
> specific to the connection coming from an HTTP client. We may decide to
> tag connections to servers later.
> 
> 
> The internal instance IDs are not particularly relevant to this
> discussion about an external interface name IMO, but yes, there are many
> IDs used in Squid, for various needs. The "connectionId_" name of the
> new data member avoids confusion with the existing InstanceId members
> IMO, but if the proposed option is renamed, we may need to rename the
> data member as well. Please note that there is no "client" in the data
> member name because the class "manages a connection to a client" so the
> "client" scope should be assumed by default.
> 
> 
> Amos, does clt_conn_tag, clt_conn_mark, clt_conn_label, or clt_conn_note
> (or their longer, partially spelled out versions) sound better than
> "clt_conn_id" to you?

Yes. Any of them are better. "tag" has existing history from the
external_acl_type interface that may want to be considered.

Amos



Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID

2014-06-22 Thread Amos Jeffries
On 23/06/2014 5:43 a.m., Tsantilas Christos wrote:
> Hi all,
> 
> The attached patch replaces existing annotation values with the new one
> received from helpers.
> 
> Just one question. We are documenting key-value pairs in cf.data.pre
> only for url-rewriter helpers, but annotations work for all squid helpers.
> Should we move the related url-rewriter section to a global section? If
> yes where?
> 
> For example something like the following in a global section should be
> enough:
> 
> The interface for all helpers has been extended to support arbitrary
> lists of key=value pairs, with the syntax  key=value . Some keys have
> special meaning to Squid, as documented here.
> 
> Currently Squid understands the following optional key=value pairs
> received from URL rewriters:
>  clt_conn_id=ID
> Associates the received ID with the client TCP connection.
> The clt_conn_id=ID pair is treated as a regular annotation but
> it persists across all transactions on the client connection
> rather than disappearing after the current request. A helper
> may update the client connection ID value during subsequent
> requests by returning a new ID value. To send the connection
> ID to the URL rewriter, use url_rewrite_extras:
> url_rewrite_extras clt_conn_id=%{clt_conn_id}note ...
> 

Design issue:
 * can we call this "client-tag=" or "connection-tag=" or
"connection-label=". We have ID used internally via InstanceId<>
template members, which are very different to this "ID" being user-assigned.


in src/Notes.cc:
* replaceOrAdd iterates over the notes list twice in the event of
removals (due to remove() iterating over it from scratch). Please fix
this to iterate over the list only once and do an in-place replace if
the key is found already there, add if end() is reached (see the sketch
below).

* remove() appears to be unnecessary after the above correction.
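
To sketch the single-pass shape being requested (standalone, with
simplified stand-in types rather than the patch's real NotePairs):

#include <string>
#include <vector>

struct Entry { std::string name, value; };

struct NotePairs {
    std::vector<Entry> entries;

    void replaceOrAdd(const NotePairs &src) {
        for (const Entry &e : src.entries) {
            std::vector<Entry>::iterator i = entries.begin();
            for (; i != entries.end(); ++i) {
                if (i->name == e.name) {
                    i->value = e.value; // in-place replace, no second pass
                    break;
                }
            }
            if (i == entries.end())
                entries.push_back(e); // add when end() is reached
        }
    }
};

int main()
{
    NotePairs a, b;
    a.entries.push_back(Entry{"group", "old"});
    b.entries.push_back(Entry{"group", "new"});
    a.replaceOrAdd(b); // "group" becomes "new"; nothing else is added
    return 0;
}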


in Notes.*:
* findFirst() no longer makes sense. There being no "second" now.
 - Please rename to find().


in src/cf.data.pre:
* "kv-pairs" is a format syntax definition used throughout the
documentation. Please retain it. If the term needs re-defining that is
an issue outside this patch scope.

* Please reword "Squid understands the following optional key=value
pairs received from URL rewriters:". Since there are optional kv-pairs
already documented in the rewrite vs redirect vs no-change syntax we
need to specify that these are optional extensions.
Perhaps this:
 "In addition to the above kv-pairs Squid also understands the following
optional kv-pairs received from URL rewriters:"

* Please update the kv-pair description like so (where FOO is the term
tag or label selected to replace ID by the above mentioned design issue)
"
  Associates a FOO with the client TCP connection.
  The FOO is treated as a regular annotation but persists
  across future requests on the client connection
  rather than just the current request.
  A helper may update the FOO during subsequent
  requests by returning a new kv-pair.
"

* Reference to url_rewrite_extras should be directly after the helper
input syntax line. Like so:
"
  [channel-ID ] URL [ extras]

see url_rewrite_extras on how to send "extras" with
optional values to the helper.

"


in src/client_side_request.cc:
* ClientRequestContext::clientRedirectDone was previously calling
SyncNotes to ensure the AccessLogEntry and HttpRequest shared a notes
list before updating the request. Order is important here because Sync
will kill Squid with an assert if the objects have different notes list
objects.
Until that is fixed please retain the order of:
 - sync first, then update the request (see the sketch after this list).

* same in ClientRequestContext::clientStoreIdDone

* in ClientHttpRequest::doCallouts() please use SBuf::isEmpty() instead
of length() to check for existence.
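
For clarity, the required ordering in both callbacks looks roughly like
this (function names as referenced above; exact signatures may differ):

  // sync first: ensure the AccessLogEntry and HttpRequest share one
  // notes list, otherwise SyncNotes() will assert
  SyncNotes(*http->al, *http->request);
  // only then apply the helper reply annotations to the request
  UpdateRequestNotes(http->getConn(), *http->request, reply.notes);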


Amos




Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID

2014-06-22 Thread Amos Jeffries
On 23/06/2014 5:43 a.m., Tsantilas Christos wrote:
> Hi all,
> 
> The attached patch replaces existing annotation values with the new ones
> received from helpers.
> 
> Just one question. We are documenting key-value pairs in cf.data.pre
> only for url-rewriter helpers, but annotations work for all squid helpers.
> Should we move the related url-rewriter section to a global section? If
> yes where?
> 

The external_acl_type and store_id_program document their protocol and
keys as well. Those interfaces are ones where users are commonly required
to create custom helpers, so having it in these docs is helpful if a bit
odd. For the other interfaces (auth, logging, unlinkd, diskd, pinger, etc.)
which do not document the protocol in detail, the Squid bundled helpers
are generally used and the docs reference the wiki for the more advanced
users.

Amos



Re: [PATCH] validate -n parameter value

2014-06-22 Thread Amos Jeffries
On 17/06/2014 4:06 a.m., Alex Rousskov wrote:
> On 06/15/2014 12:05 AM, Amos Jeffries wrote:
>> +if (optarg) {
>> +SBuf t(optarg);
>> +::Parser::Tokenizer tok(t);
> 
> Please remove "t" as used-once and lacking a descriptive name. If you
> insist on keeping it, please make it "const" and try to give it a more
> descriptive name such as rawName or serviceName.


> 
>> +const CharacterSet chr = CharacterSet::ALPHA+CharacterSet::DIGIT;
> 
> Is there a document stating that only those characters are valid? My
> quick search resulted in [1] that seems to imply many other characters
> are accepted. However, [2] lists more prohibited characters. Both seem
> to allow "-" and "_" though. It would be good to provide a reference in
> the code or commit message to explain why we are restricting its value.
> 

Arbitrary design choice for guaranteed portability. Other characters can
be added if necessary, but most lack cross-platform usability for all
the situations in which the service name string is used.


> Do you want to validate the total service name length? [1] says the
> limit is 256 characters.
> 
> None of the above applies to -n used on Unix. Do we need to validate the
> service name (whatever that is) there?

IMO portable consistency is better for this type of thing than being
pedantically accepting of the OS-specific character sets.

Yes, length needs to be checked. Thank you.

Does an arbitrary 32 bytes seem acceptable, since this is a prefix on the
UDS sockets and paths whose final total still needs to fit within the 256
or so allowed by some OSes?
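
In other words, something like this stand-alone equivalent of the check
(illustrative only, not the actual Tokenizer code):

  // Hypothetical validation: alphanumeric only, arbitrary 32-byte cap.
  #include <cctype>
  #include <cstring>

  static bool
  validServiceName(const char *name)
  {
      const size_t len = name ? strlen(name) : 0;
      if (len == 0 || len > 32)
          return false;
      for (size_t i = 0; i < len; ++i) {
          if (!isalnum(static_cast<unsigned char>(name[i])))
              return false;
      }
      return true;
  }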


> 
>> +if (!tok.prefix(service_name, chr) || !tok.atEnd())
> 
> Please note that the above also rejects empty service names (-n "").
> That may be good, but the error messages do not reflect that restriction.
> 
> 
>> +fatal("Need to provide alphanumeric service name when using -n option");
>> +opt_signal_service = true;
>> +} else {
>> +fatal("Need to provide alphanumeric service name when using -n option");
>> +}
> 
> I recommend adjusting the fatal() messages to distinguish the two errors
> and avoid a possible misunderstanding that the service name was provided
> but was not alphanumeric in the second case:
> 
>   * Expected alphanumeric service name for the -n option but got: ...
>   * A service name is required for the -n option.
> 

"" is not an alphanumeric string. But okay.

> 
> I continue to dislike the undefined -n option meaning on non-Windows
> boxes, but I have no objection to this patch and believe the above
> changes can be done without a new review cycle.
> 
> 
> Cheers,
> 
> Alex.
> [1] http://msdn.microsoft.com/en-us/library/ms682450%28VS.85%29.aspx
> [2] http://www.anzio.com/resources/cannot-install-or-start-windows-service
> 



Re: [PATCH] remove cruft from libTrie

2014-06-21 Thread Amos Jeffries
On 4/06/2014 2:28 a.m., Kinkie wrote:
> Hi,
>   the attached patch removes some cruft from libTrie (c bindings and
> .cci files). No functional changes apart from that.
> 


+1, although calling the commit "cleanup: simplify libTrie" would be better.

Amos


[PATCH] Support PROXY protocol

2014-06-21 Thread Amos Jeffries
Support receiving PROXY protocol version 1 and 2.

PROXY protocol has been developed by Willy Tarreau of HAProxy for
communicating original src and dst IP:port details between proxies and
load balancers in a protocol-agnostic way.

stunnel, HAProxy and some other HTTP proxying software already support
it, and by adding support to Squid we can effectively chain these
proxies without having to rely on X-Forwarded-For headers.

This patch adds http(s)_port mode flag (proxy-surrogate) to signal the
protocol is in use, parsing and processing logics for the PROXY protocol
headers on new connections, and extends the follow_x_forwarded_for
(renamed proxy_forwarded_access) access control to manage inbound
connections.
 The indirect client security/trust model remains unchanged. As do all
HTTP related logics on the connection once PROXY protocol header has
been received.
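
For reference, a PROXY protocol version 1 header is a single text line
sent before any relayed bytes; per the HAProxy specification it looks
like this (addresses here are examples only):

  PROXY TCP4 192.0.2.10 203.0.113.5 51000 3128\r\n

followed immediately by the client's regular traffic on the same connection.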


Future Work:
 * support sending PROXY protocol to cache_peers
 * rework the PROXY parse logics as a Parser-NG child parser.

Amos
=== modified file 'doc/release-notes/release-3.5.sgml'
--- doc/release-notes/release-3.5.sgml  2014-06-04 15:30:16 +
+++ doc/release-notes/release-3.5.sgml  2014-06-22 04:45:09 +
@@ -26,40 +26,41 @@
 Known issues
 
Although this release is deemed good enough for use in many setups, please
note the existence of <url url="http://bugs.squid-cache.org/buglist.cgi?query_format=advanced&product=Squid&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&version=3.5"
name="open bugs against Squid-3.5">.
 
 Changes since earlier releases of Squid-3.5
 
The 3.5 change history can be <url url="http://www.squid-cache.org/Versions/v3/3.5/changesets/"
name="viewed here">.
 
 
 Major new features since Squid-3.4
 Squid 3.5 represents a new feature release above 3.4.
 
 The most important of these new features are:
 
Support libecap v1.0
Authentication helper query extensions
Support named services
Upgraded squidclient tool
Helper support for concurrency channels
+   Support PROXY protocol
 
 
 Most user-facing changes are reflected in squid.conf (see below).
 
 
 Support libecap v1.0
Details at <url url="http://wiki.squid-cache.org/Features/BLAH">.
 
 The new libecap version allows Squid to better check the version of
   the eCAP adapter being loaded as well as the version of the eCAP library
   being used.
 
 Squid-3.5 can support eCAP adapters built with libecap v1.0,
but no longer supports adapters built with earlier libecap versions
due to API changes.
 
 
 Authentication helper query extensions
Details at <url url="http://www.squid-cache.org/Doc/config/auth_param/">.
 
@@ -146,71 +147,99 @@
The default is to use X.509 certificate encryption instead.
 
 When performing TLS/SSL server certificates are always verified, the
results shown at debug level 3. The encrypted type is displayed at debug
level 2 and the connection is used to send and receive the messages
regardless of verification results.
 
 
 Helper support for concurrency channels
 Helper concurrency greatly reduces the communication lag between Squid
and its helpers allowing faster transaction speeds even on sequential
helpers.
 
 The Digest authentication, Store-ID, and URL-rewrite helpers packaged
with Squid have been updated to support concurrency channels. They will
auto-detect the channel-ID field and will produce the appropriate
response format.
With these helpers concurrency may now be set to 0 or any higher number as 
desired.
 
 
+Support PROXY protocol
+More info at <url url="http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt">
+
+PROXY protocol provides a simple way for proxies and tunnels of any kind to
+   relay the original client source details without having to alter or understand
+   the protocol being relayed on the connection.
+
+Squid currently supports receiving version 1 or 2 of the protocol.
+   A port which has been configured to receive this protocol may only be used to
+   receive traffic from client software sending in this protocol.
+   Regular forward-proxy HTTP traffic is not accepted.
+
+Squid can be configured by adding an http_port or https_port
+   with the proxy-surrogate mode flag. The proxy_forwarded_access
+   must also be configured with src ACLs to whitelist proxies which are
+   trusted to send correct client details.
+
+
+
+ http_port 3128 proxy-surrogate
+ proxy_forwarded_access allow localhost
+
+
+
 Changes to squid.conf since Squid-3.4
 
 There have been changes to Squid's configuration file since Squid-3.4.
 
 Squid supports reading configuration option parameters from external
files using the syntax parameters("/path/filename"). For example:
 
 acl whitelist dstdomain parameters("/etc/squid/whitelist.txt")
 
 
 The squid.conf macro ${service_name} is added to provide the service name
of the process parsing the config.
 
 There have also been changes to individual directives in the config file.
 
 This section gi

[RFC] connections logging

2014-06-20 Thread Amos Jeffries
I am playing with the idea of adding a new log file to record just the
connections handled by Squid (rather than the HTTP request/reply
transactions).

This log would include connections opened but never used, port details,
connection lifetimes, re-use counts on persistent connections, SSL/TLS
presence/version (or not), and other connection details we find a need
for later.
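
Purely as a hypothetical illustration of such a record (every field name
here is invented; no format is being proposed yet):

  1402531200.123 local=192.0.2.1:3128 remote=203.0.113.9:51234 lifetime=34.2 requests=5 reused=4 tls=TLSv1.2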

The driver behind this is to help resolve a growing amount of user
queries regarding "happy eyeballs" idle connections and broken TLS
connections. We are also adding other potentially never-used connections
ourselves with the standby pools.

The Squid native access.log appears unsuitable for adjustment given its
widely used format and the number of details to be logged.


There is also a useful side effect of MasterXaction. At present the
TcpAcceptor generates a MasterXaction object to record initial
connection details for the handler; this creates a bit of complexity in
the recipient regarding whether the MasterXaction object applies to a
future HTTP request or one which is currently being parsed. This could
be simplified if the connection was a transaction and each HTTP request
on it a different transaction spawned from the connection
MasterXaction when parsing a new request.

Amos


Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID

2014-06-19 Thread Amos Jeffries
On 20/06/2014 6:07 a.m., Tsantilas Christos wrote:
> On 06/16/2014 06:36 PM, Alex Rousskov wrote:
>> On 06/15/2014 12:07 AM, Amos Jeffries wrote:
>>> On 15/06/2014 4:58 a.m., Alex Rousskov wrote:
>>>> On 06/11/2014 08:52 AM, Tsantilas Christos wrote:
>>>>
>>>>> I must also note that this patch adds an inconsistency. All annotation
>>>>> key=values  pairs received from helpers, accumulated to the
>>>>> existing key
>>>>> notes values. The clt_conn_id=Id pair is always unique and replaces
>>>>> the
>>>>> existing clt_conn_id=Id annotation pair.
>>>>> We may want to make all annotations unique, or maybe implement a
>>>>> configuration mechanism to define which annotations are overwriting
>>>>> their previous values and which appending the new values.
>>>>
>>>> I suggest making all annotations unique (i.e., new values overwrite old
>>>> ones) because helpers that want to accumulate annotation values can do
>>>> that by returning old values along with new ones:
>>>>
>>>>received by helper: name=v1
>>>>returned by helper: name=v1,v2
>>>>
>>>> Please note that the opposite does not work: If annotation values are
>>>> always accumulated, a helper cannot overwrite/remove the old value.
>>
>>
>>> Doing that would mean passing all existing annotations to every helper
>>> lookup.
>>
>> Why would that mean the above?

All helpers may return message= or log=. Replacing existing notes means
we have to pass at least those to every helper explicitly.

>>
>> AFAICT, the helper should get only the annotations it needs. That need
>> is helper-specific and, hence, is configurable via the various _extras
>> and equivalent directives. That is already supported and does not need
>> to change.

Ideally yes. However a helper requiring previous helpers' annotations
because it *might* replace them is a new detail created by this proposed
replacement model. In the existing model of appending notes the secondary
helper only requires what it needs for its core action logics.

With the message= example, any 1-byte variance in the first helper's
message= output will result in extra churn on the second helper's response
cache and at least an extra lookup loading on the helper.


>>
>> Here is the overall sketch for supporting "unique annotations":
>>
>> 1. Send the helper the annotations it is configured to get
>> (no changes here).
>>
>> 2. For each unique annotation key received from the helper,
>> remove any old annotation(s) with the same key.
>>
>> 3. Store annotations received from the helper
>> (no changes here).
>>
>> To support explicit annotation deletion, we can adjust #3 to skip
>> key-value pairs with the value equal to '-'.
> 
> If there is not any objection  I will implement this scenario.
> 
> Looks that this approach is the best and cover more cases than the
> accumulated Notes values.
> If someones need to accumulate Note values he can configure squid to
> send old note value to helper and helper include it in its response.
> This is simple.

The only case it adds coverage for is giving helpers the ability to erase
previous helpers' annotations, while reducing the helper response cache
HIT ratio.

That said, I am okay with the design change. AFAIK there is nobody
making much complicated use of helper notes (yet).
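
To make the agreed semantics concrete, a hypothetical exchange under the
sketch above (annotation names invented):

  received by helper:  group=staff message=old-text
  returned by helper:  group=contractors message=-
  resulting notes:     group=contractors   (message erased by the '-' value)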

Amos


Re: [PATCH] validate -n parameter value

2014-06-16 Thread Amos Jeffries
On 16/06/2014 9:32 p.m., Kinkie wrote:
> LGTM, only:
> Are the first two columns really necessary in the statement
> ::Parser::Tokenizer tok
> ?
> 

Maybe not in this location. But it is required here in the "global"
namespace location; in other namespaces where I've been using it there
were clashes with component-specific Parser classes. So I've been keeping
consistency.

Amos



Re: [PATCH] Support client connection annotation by helpers via clt_conn_id=ID

2014-06-14 Thread Amos Jeffries
On 15/06/2014 4:58 a.m., Alex Rousskov wrote:
> On 06/11/2014 08:52 AM, Tsantilas Christos wrote:
> 
>> I must also note that this patch adds an inconsistency. All annotation
>> key=values  pairs received from helpers, accumulated to the existing key
>> notes values. The clt_conn_id=Id pair is always unique and replaces the
>> existing clt_conn_id=Id annotation pair.
>> We may want to make all annotations unique, or maybe implement a
>> configuration mechanism to define which annotations are overwriting
>> their previous values and which appending the new values.
> 
> I suggest making all annotations unique (i.e., new values overwrite old
> ones) because helpers that want to accumulate annotation values can do
> that by returning old values along with new ones:
> 
>   received by helper: name=v1
>   returned by helper: name=v1,v2
> 
> Please note that the opposite does not work: If annotation values are
> always accumulated, a helper cannot overwrite/remove the old value.
> 

Doing that would mean passing all existing annotations to every helper
lookup. Which would seriously degrade the helper result caching.

Amos


[PATCH] validate -n parameter value

2014-06-14 Thread Amos Jeffries
Update the service_name global to an SBuf and use Tokenizer to validate
that the -n argument is purely alphanumeric.

Amos
=== modified file 'src/WinSvc.cc'
--- src/WinSvc.cc   2014-01-24 01:57:15 +
+++ src/WinSvc.cc   2014-06-14 14:20:56 +
@@ -493,41 +493,41 @@
 
 #if defined(_MSC_VER) /* Microsoft C Compiler ONLY */
 
 newHandler = Squid_Win32InvalidParameterHandler;
 
 oldHandler = _set_invalid_parameter_handler(newHandler);
 
 _CrtSetReportMode(_CRT_ASSERT, 0);
 
 #endif
 #if USE_WIN32_SERVICE
 
 if (WIN32_run_mode == _WIN_SQUID_RUN_MODE_SERVICE) {
 char path[512];
 HKEY hndKey;
 
 if (signal(SIGABRT, WIN32_Abort) == SIG_ERR)
 return 1;
 
 /* Register the service Handler function */
-svcHandle = RegisterServiceCtrlHandler(service_name, WIN32_svcHandler);
+svcHandle = RegisterServiceCtrlHandler(service_name.c_str(), WIN32_svcHandler);
 
 if (svcHandle == 0)
 return 1;
 
 /* Set Process work dir to directory cointaining squid.exe */
 GetModuleFileName(NULL, path, 512);
 
 WIN32_module_name=xstrdup(path);
 
 path[strlen(path) - 10] = '\0';
 
 if (SetCurrentDirectory(path) == 0)
 return 1;
 
 safe_free(ConfigFile);
 
 /* get config file from Windows Registry */
 if (RegOpenKeyEx(HKEY_LOCAL_MACHINE, REGKEY, 0, KEY_QUERY_VALUE, 
&hndKey) == ERROR_SUCCESS) {
 DWORD Type = 0;
 DWORD Size = 0;
@@ -654,192 +654,195 @@
 status = GetLastError();
 debugs(1, DBG_IMPORTANT, "SetServiceStatus error " << status);
 }
 
 debugs(1, DBG_IMPORTANT, "Leaving Squid service");
 break;
 
 default:
 debugs(1, DBG_IMPORTANT, "Unrecognized opcode " << Opcode);
 }
 
 return;
 }
 
 void
 WIN32_RemoveService()
 {
 SC_HANDLE schService;
 SC_HANDLE schSCManager;
 
-if (!service_name)
-service_name = xstrdup(APP_SHORTNAME);
+if (service_name.isEmpty())
+service_name = SBuf(APP_SHORTNAME);
 
-strcat(REGKEY, service_name);
+const char *service =  service_name.c_str();
+strcat(REGKEY, service);
 
-keys[4] = service_name;
+keys[4] = service;
 
 schSCManager = OpenSCManager(NULL, /* machine (NULL == local)*/
                                 NULL, /* database (NULL == default) */
                                 SC_MANAGER_ALL_ACCESS /* access required */
 );
 
 if (!schSCManager)
 fprintf(stderr, "OpenSCManager failed\n");
 else {
-schService = OpenService(schSCManager, service_name, SERVICE_ALL_ACCESS);
+schService = OpenService(schSCManager, service, SERVICE_ALL_ACCESS);
 
 if (schService == NULL)
 fprintf(stderr, "OpenService failed\n");
 
 /* Could not open the service */
 else {
 /* try to stop the service */
 
 if (ControlService(schService, _WIN_SQUID_SERVICE_CONTROL_STOP,
&svcStatus)) {
 sleep(1);
 
 while (QueryServiceStatus(schService, &svcStatus)) {
 if (svcStatus.dwCurrentState == SERVICE_STOP_PENDING)
 sleep(1);
 else
 break;
 }
 }
 
 /* now remove the service */
 if (DeleteService(schService) == 0)
 fprintf(stderr, "DeleteService failed.\n");
 else
-printf("Service %s deleted successfully.\n", service_name);
+printf("Service " SQUIDSBUFPH " deleted successfully.\n", 
SQUIDSBUFPRINT(service_name));
 
 CloseServiceHandle(schService);
 }
 
 CloseServiceHandle(schSCManager);
 }
 }
 
 void
 WIN32_SetServiceCommandLine()
 {
-if (!service_name)
-service_name = xstrdup(APP_SHORTNAME);
+if (service_name.isEmpty())
+service_name = SBuf(APP_SHORTNAME);
 
-strcat(REGKEY, service_name);
+const char *service = service_name.c_str();
+strcat(REGKEY, service);
 
-keys[4] = service_name;
+keys[4] = service;
 
 /* Now store the Service Command Line in the registry */
 WIN32_StoreKey(COMMANDLINE, REG_SZ, (unsigned char *) WIN32_Command_Line, strlen(WIN32_Command_Line) + 1);
 }
 
 void
 WIN32_InstallService()
 {
 SC_HANDLE schService;
 SC_HANDLE schSCManager;
 char ServicePath[512];
 char szPath[512];
 int lenpath;
 
-if (!service_name)
-service_name = xstrdup(APP_SHORTNAME);
+if (service_name.isEmpty())
+service_name = SBuf(APP_SHORTNAME);
 
-strcat(REGKEY, service_name);
+const char *service = service_name.c_str();
+strcat(REGKEY, service);
 
-keys[4] = service_name;
+keys[4] = service;
 
 if ((lenpath = GetModuleFileName(NULL, ServicePath, 512)) == 0) {
 

Re: [PATCH 5/8] reconfiguration leaks: objects tied to http_port

2014-06-14 Thread Amos Jeffries
On 15/06/2014 4:44 a.m., Alex Rousskov wrote:
> On 06/13/2014 07:25 PM, Amos Jeffries wrote:
>> On 14/06/2014 8:07 a.m., Alex Rousskov wrote:
>>> On 04/25/2014 02:59 AM, Amos Jeffries wrote:
>>>> On 25/04/2014 12:55 p.m., Alex Rousskov wrote:
>>>>> Do not leak [SSL] objects tied to http_port and https_port on reconfigure.
>>>>>
>>>>> PortCfg objects were not destroyed at all (no delete call) and were
>>>>> incorrectly stored (excessive cbdata locking). This change adds
>>>>> destruction and removes excessive locking to allow the destructed
>>>>> object to be freed. It also cleans up forgotten(?) clientca and crlfile
>>>>> PortCfg members.
>>>>>
>>>>> This change fixes a serious leak but also carries an elevated risk:
>>>>> There is a lot of code throughout Squid that does not check the pointers
>>>>> to the objects that are now properly destroyed. It is possible that some
>>>>> of that code will crash some time after reconfigure. It is not possible
>>>>> to ensure that this does not happen without rewriting/fixing the
>>>>> offending code to use refcounting. Such a rewrite would be a relatively
>>>>> large change outside this patch scope. We may decide that it is better
>>>>> to leak than to take this additional risk.
>>>>>
>>>>> Alex.
>>>>>
>>>>
>>>> -0.
>>>>
>>>> I have a patch moving the SSL config options into a standalone
>>>> ref-counted object. That can be polished up and references added to each
>>>> ConnStateData fairly easily.
>>>
>>> Amos, what is the status of that patch? Any ETA? Do you expect your
>>> changes to be easily portable to v3.3?
>>
Stalled behind the larger works. If it is urgent I can dig it out and
polish it up.
>>
>> It could be back-ported to 3.3 if you like. The design is a new
>> Ref-Countable class to hold all the SSL options (and generated state)
>> leaving just a Pointer to it in the main config class.
>>  * Ports which needed a clone operation took a copy of the pointer and
>> share the context.
>>  * client/server context initialization functions take a Pointer to the
>> class and update its state content.
> 
> Why treat SSL options differently from all other port options? Should
> not the whole PortCfg object be refcounted instead?
> 
> Please note that if you refcount the SSL options and do nothing else,
> the leak will not be fixed. The SSL options (and other things) will
> continue to leak because the leaking port structure will keep pointing
> to them. Thus, as far as I can see, the patch you described will need
> the proposed leak fix anyway.

The other options are POD-style options. They can be reset relatively
okay by the parser's memset() and are filled out by the config parse
routines.

SSL options are partially that type of option and partially SSL
library objects generated from those options. Generated options should
be in SquidConfig2 or one of the new standalone config globals.

Also, the same set of config settings are needed by *_port, cache_peer,
and sslproxy_* directives. It makes sense to have one class type for SSL
config which is used by all three.
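
The rough shape of that design, assuming Squid's RefCountable/RefCount
idiom (class and member names invented for illustration):

  // Hypothetical shared holder for TLS options and generated state.
  class SslConfig : public RefCountable
  {
  public:
      typedef RefCount<SslConfig> Pointer;

      // raw squid.conf options
      char *certFile;
      char *keyFile;

      // state generated from the options at (re)configure time
      SSL_CTX *clientCtx;
  };

  // *_port, cache_peer and sslproxy_* would each hold only a
  // SslConfig::Pointer, so port clones share one settings object.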


On the other side AnyP::PortCfgPointer already exists to reference count
the port objects. We could spend time converting the raw pointers to use
it instead so your patch here is not dangerous.

Amos


Re: [PATCH 5/8] reconfiguration leaks: objects tied to http_port

2014-06-13 Thread Amos Jeffries
On 14/06/2014 8:07 a.m., Alex Rousskov wrote:
> On 04/25/2014 02:59 AM, Amos Jeffries wrote:
>> On 25/04/2014 12:55 p.m., Alex Rousskov wrote:
>>> Do not leak [SSL] objects tied to http_port and https_port on reconfigure.
>>>
>>> PortCfg objects were not destroyed at all (no delete call) and were
>>> incorrectly stored (excessive cbdata locking). This change adds
>>> destruction and removes excessive locking to allow the destructed
>>> object to be freed. It also cleans up forgotten(?) clientca and crlfile
>>> PortCfg members.
>>>
>>> This change fixes a serious leak but also carries an elevated risk:
>>> There is a lot of code throughout Squid that does not check the pointers
>>> to the objects that are now properly destroyed. It is possible that some
>>> of that code will crash some time after reconfigure. It is not possible
>>> to ensure that this does not happen without rewriting/fixing the
>>> offending code to use refcounting. Such a rewrite would be a relatively
>>> large change outside this patch scope. We may decide that it is better
>>> to leak than to take this additional risk.
>>>
>>> Alex.
>>>
>>
>> -0.
>>
>> I have a patch moving the SSL config options into a standalone
>> ref-counted object. That can be polished up and references added to each
>> ConnStateData fairly easily.
> 
> Amos, what is the status of that patch? Any ETA? Do you expect your
> changes to be easily portable to v3.3?

Stalled behind the larger works. If it is urgent I can dig it out and
polish it up.

It could be back-ported to 3.3 if you like. The design is a new
Ref-Countable class to hold all the SSL options (and generated state)
leaving just a Pointer to it in the main config class.
 * Ports which needed a clone operation took a copy of the pointer and
share the context.
 * client/server context initialization functions take a Pointer to the
class and update its state content.

Amos



Re: [PATCH 1/8] reconfiguration leaks: implicit ACLs

2014-06-13 Thread Amos Jeffries
On 14/06/2014 7:57 a.m., Alex Rousskov wrote:
> On 04/25/2014 01:58 AM, Amos Jeffries wrote:
>> On 25/04/2014 12:46 p.m., Alex Rousskov wrote:
>>> Do not leak implicit ACLs during reconfigure.
>>>
>>> Many ACLs implicitly created by Squid when grouping multiple ACLs were
>>> not destroyed because those implicit ACLs where not covered by the
>>> global ACL registry used for ACL destruction.
>>>
>>> See also: r13210 which did not go far enough because it incorrectly
>>> assumed that all InnerNode() children are aclRegister()ed and, hence,
>>> will be centrally freed.
> 
> 
>> -0.
> 
> Is this a "negative" vote from "Squid3 voting" rules point of view?
> http://wiki.squid-cache.org/MergeProcedure#Squid3_Voting

It is "I don't like it but not objecting to a commit if you do it".


> 
> 
>> I believe we should move to reference counting ACLs instead of
>> continuing to work around these edge cases.
> 
> I agree that reference counting is an overall better design for ACLs, of
> course. However, since refcounting ACLs would be a large change that
> nobody has volunteered to implement in the foreseeable future (AFAIK), I
> suggest that this [significant] leak fix should go in now.
> 
> Any other votes/opinions?
> 
> 
> Thank you,
> 
> Alex.
> 



Re: [code] [for discussion] map-trie

2014-06-12 Thread Amos Jeffries
On 12/06/2014 10:43 a.m., Kinkie wrote:
> Hi,
>   I've done some benchmarking, here are the results so far:
> The proposal I'm suggesting for dstdomain acl is at
> lp:~kinkie/squid/flexitrie . It uses the level-compact trie approach
> I've described in this thread (NOT a Patricia trie). As a comparison,
> lp:~kinkie/squid/domaindata-benchmark implements the same benchmark
> using the current splay-based implementation.
> 
> I've implemented a quick-n-dirty benchmarking tool
> (src/acl/testDomainDataPerf); it takes as input an acl definition -
> one dstdomain per line, as if it was included in squid.conf, and a
> hostname list file (one destination hostname per line).
> 
> I've run both variants of the code against the same dataset: a 4k
> entries domain list, containing both hostnames and domain names, and a
> 18M entries list of destination hostnames, both matching and not
> matching entries in the domain list (about 7.5M hits, 10.5M misses).
> 
> Tested 10 times on a Core 2 PC with plenty of RAM - source datasets
> are in the fs cache.
> level-compact-trie: the mean time is 11 sec; all runs take between
> 10.782 and 11.354 secs; 18 Mb of core used
> full-trie: mean is 7.5 secs +- 0.2secs; 85 Mb of core.
> splay-based: mean time is 16.3sec; all runs take between 16.193 and
> 16.427 secs; 14 Mb of core
> 
> I expect compact-trie to scale better as the number of entries in the
> list grows and with the number of clients and requests per second;
> furthermore using it removes 50-100 LOC, and makes code more readable.
> 
> IMO it is the best compromise in terms of performance, resources
> usage and expected scalability; before pursuing this further however
> I'd like to have some feedback.

Does making compact vs full trie configurable affect complexity much? I
think we have a sufficiently wide customer base that people will want to
go further now that it's known there is a much faster version.

Amos



Re: [RFC] unified port directive

2014-06-12 Thread Amos Jeffries
On 12/06/2014 8:09 a.m., Alex Rousskov wrote:
> On 06/11/2014 05:15 AM, Amos Jeffries wrote:
>>>> On 06/10/2014 12:09 AM, Kinkie wrote:
>>> I had understood that it would eventually be a catch-all directive
>>> for all squid service ports (possibly including FTP etc).
> 
>> That was indeed the long term intention.
> 
> If the long-term plan is to replace all *_port option with a single
> "port" or "listen" option, then I would like to hear why we should do
> that. The analysis presented so far was specific to HTTP (including
> HTTPS) so it does not really apply any more. Needless to say, the end
> goal has significant influence on the new directive name and internal
> code design.

Yes, the only ports we have to keep backward compatibility with here are
http_port and https_port. No other port has separate TCP and TLS variants.

> 
> For example, why replacing http_port and snmp_port with "port http" and
> "port snmp" is better than having distinct protocol-specific directives
> for those two protocols?

Today we have to configure (and explain to people):

 XY_port ... Y protocol=X
or
 XY_port ... Y
or
 X_port ... protocol=X
or
 X_port ... Y

NP: the *long-term* goal being worked towards is:

  port ... protocol=X
or
  port ... Y protocol=X


Today I am trying to get rid of the XY_port variants as unnecessary
duplication with "X_port ... Y" variants.

Removing the need for 's' in http(s)_port with the existing "ssl" parameter.

When the FTP transfer protocol is added will be the time to determine
whether the ftp_ prefix is actually necessary or the *existing* protocol=
parameter can be used alone. FWIW the only strong reason http_ and https_
are currently used is to determine which of the split config option lists
to add the PortCfg to. And this proposed step removes that need.

> 
> Replacing all current Squid directives with
> 
>   squid old_directive_name_here old_options_here ...
> 
> is obviously a bad idea. Thus, at some unknown point(s), merged
> directives become worse than dedicated ones. I suspect the key here is
> the amount of overlapping port options and typical configuration
> combinations. Is there enough common things about all Squid listening
> ports to warrant their merger?

shared needs by all ports, both existing and needed functionality:
 - host/ip:port resolution
 - select loops prioritization over other ports
 - SSL/TLS/DSTS options
 - startup/shutdown operation and timings

shared by TCP ports:
 - TcpAcceptor class

shared by UDP ports:
 - recv
 - separate outgoing IP:port selection

Note that the difference between the TCP and UDP classes is the listen
logic, which is initiated after the config line is fully received.

Amos


Re: [RFC] unified port directive

2014-06-11 Thread Amos Jeffries
On 11/06/2014 2:35 a.m., Francesco wrote:
> 
> On 10 Jun 2014, at 16:29, Alex Rousskov  
> wrote:
> 
>> On 06/10/2014 12:09 AM, Kinkie wrote:
>>> On Mon, Jun 9, 2014 at 6:47 PM, Alex Rousskov wrote:
>>>> On 06/08/2014 11:07 PM, Amos Jeffries wrote:
>>>>> I propose that we combine the http_port and https_port directives into
>>>>> a single directive called "port" with the old names as aliases and an
>>>>> option to select between TCP and TLS transport protocol.
>>
>>>> Just "port" is a bad name, IMO, because there are many other, non-HTTP
>>>> *_ports in squid.conf. Consider using "http_port" name for both SSL and
>>>> plain transports, with appropriate transport defaults (that may even
>>>> depend on the port value!).
>>
>>> How about "listen"? It's consistent with apache, clear, and 
>>> protocol-neutral.
>>
>> Why is being protocol neutral a good thing for an HTTP-specific(*) port
>> in an environment with many other protocol-specific ports?
>>
>> (*) In this context, both encrypted ("HTTPS") and plain ("HTTP")
>> transport connections are assumed to carry the same transfer protocol: HTTP.
> 
> Oh my bad. I had understood that it would eventually be a catch-all directive 
> for all squid service ports (possibly including FTP etc).
> 

That was indeed the long term intention.


Amos


[RFC] unified port directive

2014-06-08 Thread Amos Jeffries
Hi,

 I propose that we combine the http_port and https_port directives into
a single directive called "port" with the old names as aliases and an
option to select between TCP and TLS transport protocol.

The two directives already share almost all configuration options and
processing logic inside Squid. This will allow us to:
 1) de-duplicate a lot of code managing each set of ports as separate
lists, and
 2) more easily add UDP (CoAP), SCTP, SOCKS, and potentially other
protocols as the transport under HTTP in future.

I expect this to impact the configuration code
startup/reconfigure/shutdown, the comm/Mod*, and possibly other places
such as myportname ACL and getMyHostname() [which will improve].

Backward compatibility is easily done so the user visible impact is
small. Limited to users down-grading their Squid versions.

Amos


Re: Squid 3.4.5 warning about MGR_INDEX

2014-06-08 Thread Amos Jeffries
On 9/06/2014 1:36 a.m., Marcus Kool wrote:
> 
> Using Squid 3.4.5 I observed in cache.log the following warning:
> 
> 2014/06/08 09:50:42.804 kid1| disk.cc(92) file_open: file_open: error
> opening file /local/squid34/share/errors/templates/MGR_INDEX: (2) No
> such file or directory
> 2014/06/08 09:50:42.805 kid1| errorpage.cc(307) loadDefault: WARNING:
> failed to find or read error text file MGR_INDEX
> 
> For other error template files all is well.

This is an optional template. It is intended to be provided by
third-party cachemgr software. Apart from that warning Squid handles its
absence properly.

Amos


Re: issue with ICAP message for redirecting HTTPS/CONNECT

2014-06-08 Thread Amos Jeffries
On 9/06/2014 2:22 a.m., Marcus Kool wrote:
> I ran into an issue with the ICAP interface.
> The issue is that a GET/HTTP-based URL can be successfully rewritten but a
> CONNECT/HTTPS-based URL cannot.  I used debug_options ALL,9 to find out
> what is going wrong
> but I fail to understand Squid.

Nathan pretty accurately pinned the problem I think. Beyond that you had
two general Q's about CONNECT which I can answer.

Firstly, why the CONNECT does not have https:// is because CONNECT
requests use an "authority" URL syntax. RFC7230 explains it far better
than RFC2616 did, http://tools.ietf.org/html/rfc7230#section-5.3.3
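
That is, a CONNECT request names only the host:port authority, with no
scheme. For example:

  CONNECT example.com:443 HTTP/1.1
  Host: example.com:443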

Secondly, the sanityCheck parse function is attempting to find the reply
message syntax "HTTP/1.0 ..." and being passed the request "CONNECT
...". I am not sure exectly why the ICAP parser is using HttpReply:: to
parse a request header though, maybe just to eliminate the possibility
of it being an error reply before parsing the request details. Nothing
to worry about though if the ICAP syntax is valid and correct.

Amos


Re: https_port & CONNECT

2014-06-08 Thread Amos Jeffries
On 19/05/2014 6:57 a.m., Henrik Nordström wrote:
> It looks like Mozilla is finally adding support for TLS encrypted proxy
> connections.
> 
> So if we still have the problem with CONNECT not properly handling TLS
> then it might we a good idea to finally fix that.

If you are meaning bug 3371 (potentially lost initial bytes on a CONNECT)
then yes that is fixed.

If you are meaning anything else please expand "CONNECT not properly
handling TLS".

Amos


[RFC] flag day commits

2014-06-08 Thread Amos Jeffries
While discussing in IRC how we could improve our procedures and reduce
audit problems we hit on the issue of wide-ranging code cleanup or API
conversion patches/"projects".

The distinguishing factors for flag-day changes are that they are a)
large, b) essentially not changing logics in new ways ("polish" or
"cleanup" patches), and c) variously painful for everyone merging into
their incomplete work.

Some examples of these that we know of are:
 * removal of HERE macro
 * removal of some AsyncCall wrappers (ie CommCalls)
 * removal of .cci files
 * removal of _SQUID_INLINE_
 * multiple cases of Squid-3 coding guidelines renamings
 * SourceLayout shuffling actions (not the API cleanup)
 * removal of defines.h and/or all shufflings out of there
 * removal of typedefs.h and/or all shufflings out of there
 * removal of structs.h and/or all shufflings out of there
 * C++11 "Rule of Five" compliance check/update
 * several C++11 syntax upgrades changes once -std=c++11 is required
 * possible replacement for debugs() API
 * astyle version upgrade

When a change can, it should avoid being a flag-day. But if breaking into
a set of smaller incremental changes is difficult or causes too much
annoyance from repeated painful merges or wastes audit time on other
patches, then it seems better to end the ongoing pain with a flag-day
commit.


I propose that we dedicate a 5-day period in the Squid release cycle
when these types of patches are accepted into trunk. Accumulating a list
of proposals like above until the period occurs.

* From the release perspective the week or two prior to branching trunk
seems to be the best time for these to occur. During that period new
feature and logic commits to trunk are usually heldup in audit we are
trying to ensure a stable first beta. We also have minimal pressure/need
for back-porting patches across the change to already stable branches by
then.

* 5 days because it's a waste of time to write these very long in advance
and we may want a short period to ensure stability is retained. Also the
exact branching day is not predictable very long in advance.

* audit requirements are small. Just checking the patch is restricted to
pre-agreed painful act(s) and has no other code changes.


I also propose that this type of patch audit submission be accompanied
by clear how-to steps assisting porters of older code to get over the bump.
 For example, when we initiated astyle use globally I provided
details of how to merge all previous revnos from trunk, how to merge the
painful rev dropping rejects, and how to run the script on new code
manually. A simple process instead of painful fiddling with individual
patch rejects which the unwary would be faced with.
 NP: if we can't provide a simple how-to then re-consider the patch
suitability for being a flag-day change.


Amos


Re: HTTP/1.1: They're done.

2014-06-07 Thread Amos Jeffries
On 7/06/2014 5:42 p.m., Amos Jeffries wrote:
> Its official now ... RFC616 is obsolete!

Silly me, should have been RFC 2616.

Although RFC616 is an interesting read if one is interested in "ancient"
history.

Amos


Re: HTTP/1.1: They're done.

2014-06-06 Thread Amos Jeffries
Its official now ... RFC616 is obsolete!

On 7/06/2014 2:20 p.m., Mark Nottingham wrote:
> Oh! And one more thank you, to Mark Baker for serving as Shepherd for the 
> Caching doc.
> 
> Cheers,
> 
> 
> On 6 Jun 2014, at 10:09 pm, Mark Nottingham wrote:
> 
>> Just a quick note -
>>
>> The revision of HTTP/1.1’s specification, obsoleting RFC2616, is complete.
>>
>> See:
>> http://tools.ietf.org/html/rfc7230 - Message Syntax and Routing
>> http://tools.ietf.org/html/rfc7231 - Semantics and Content
>> http://tools.ietf.org/html/rfc7232 - Conditional Requests
>> http://tools.ietf.org/html/rfc7233 - Range Requests
>> http://tools.ietf.org/html/rfc7234 - Caching
>> http://tools.ietf.org/html/rfc7235 - Authentication
>>
>> Along with the related documents:
>> http://tools.ietf.org/html/rfc7236 - Authentication Scheme Registrations
>> http://tools.ietf.org/html/rfc7237 - Method Registrations
>>
>> Thanks to everyone who has commented upon, reviewed and otherwise 
>> contributed to them over this nearly seven-year(!) effort.
>>
>> Special thanks to our Area Directors over the years: Lisa Dusseault, Alexey 
>> Melnikov, Peter Saint-Andre and Barry Leiba, along with Yves Lafon, who 
>> helped edit Range Requests.
>>
>> Finally, please warmly thank both Roy Fielding and Julian Reschke the next 
>> time you see them (I believe beer would be appreciated); the amount of 
>> effort that they put into these documents is far, far more than they 
>> originally signed up for, and they’ve done an excellent job.
>>
>> Now, onwards to HTTP/2...
>>
>> P.S. This document set’s completion also has enabled the publication of 
>> these related non-WG documents:
>> http://tools.ietf.org/html/rfc7238 - The Hypertext Transfer Protocol Status 
>> Code 308 (Permanent Redirect)
>> http://tools.ietf.org/html/rfc7239 - Forwarded HTTP Extension
>> http://tools.ietf.org/html/rfc7240 - Prefer Header for HTTP
>>




Re: [squid-users] squid smp fails -k reconfigure

2014-06-06 Thread Amos Jeffries
On 6/06/2014 8:44 p.m., Amos Jeffries wrote:
> Hi Fernando
>  The answer to your repeated "why" questions is that Squid is a huge
> piece of software and the SMP changes are relatively new and incomplete.
> 
> Thank you for finding this bug. I'm forwarding to squid-dev where
> someone working on SMP may be able to help.
> 
> Amos
> 
> On 6/06/2014 4:54 a.m., Fernando Lozano wrote:
>> Hi there,
>>
>> Since I enabled SMP mode on my squid 3.4.3 server, reconfiguring is not
>> working consistently. Here are the relevant log entries:
>>
>> --
>> 2014/06/02 11:35:37| Set Current Directory to /cache
>> 2014/06/02 11:35:37 kid6| Reconfiguring Squid Cache (version 3.4.3)...
>> 2014/06/02 11:35:37 kid6| Logfile: closing log
>> stdio:/var/log/squid/access.log
>> 2014/06/02 11:35:37 kid5| Reconfiguring Squid Cache (version 3.4.3)...
>> ...
>> 2014/06/02 11:35:37 kid6| ERROR opening swap log
>> /cache/worker6/swap.state: (2) No such file or directory
>> 2014/06/02 11:35:37 kid5| ERROR opening swap log
>> /cache/worker5/swap.state: (2) No such file or directory
>> 2014/06/02 11:35:37 kid5| storeDirWriteCleanLogs: Starting...
>> 2014/06/02 11:35:37 kid5| log.clean.start() failed for dir #1
>> 2014/06/02 11:35:37 kid5|   Finished.  Wrote 0 entries.
>> 2014/06/02 11:35:37 kid5|   Took 0.00 seconds (  0.00 entries/sec).
>> FATAL: UFSSwapDir::openLog: Failed to open swap log.
>> Squid Cache (Version 3.4.3): Terminated abnormally.
>> FATAL: UFSSwapDir::openLog: Failed to open swap log.
>> Squid Cache (Version 3.4.3): Terminated abnormally.
>> --
>>
>> I find very strange that workers 6 and 5 try to get aufs cache stores.
>> They are supposed to be the rock store disker and the coordinator! My
>> squid.conf has:
>>
>> workers 4
>> cache_mem 6144 MB
>> cache_dir rock /cache/shared 3 min-size=1 max-size=31000
>> max-swap-rate=250 swap-timeout=350
>> cache_dir aufs /cache/worker${process_number} 25000 16 256
>> min-size=31001 max-size=346030080
>> logfile_rotate 4
>>
>> Would squid be having troubles with my cache_mem and cache_dir big sizes?
>>
>> Is squid -k reconfigure working well for everyone else with SMP?
>>
>> Other strange entries, from earlier in the cache.log:
>> -
>> 2014/06/01 03:13:05 kid5| Set Current Directory to /cache
>> 2014/06/01 03:13:05 kid5| Starting Squid Cache version 3.4.3 for
>> x86_64-redhat-linux-gnu...
>> 2014/06/01 03:13:05 kid5| Process ID 23990
>> 2014/06/01 03:13:05 kid5| Process Roles: disker
>> 2014/06/01 03:13:05 kid5| With 65536 file descriptors available
>> 2014/06/01 03:13:05 kid5| Initializing IP Cache...
>> 2014/06/01 03:13:05 kid5| DNS Socket created at 0.0.0.0, FD 7
>> 2014/06/01 03:13:05 kid5| Adding nameserver 200.20.212.75 from
>> /etc/resolv.conf
>> 2014/06/01 03:13:05 kid5| Adding nameserver 200.20.212.99 from
>> /etc/resolv.conf
>> 2014/06/01 03:13:05 kid5| Adding domain inmetro.gov.br from
>> /etc/resolv.conf
>> 2014/06/01 03:13:05 kid5| Adding domain inmetro.gov.br from
>> /etc/resolv.conf
>> 2014/06/01 03:13:05 kid5| helperOpenServers: Starting 10/100
>> 'basic_ldap_auth' processes
>> -
>>
>> If kid5 is a disker, why does it set up dns resolver and ldap auth
>> helpers? It looks like disker and coordinator try to process all
>> squid.conf directives, even when they are supposed not to do any
>> network-related stuff.
>>
>> Should I try to "hide" those directives from them?
>>
>> I also got something strange on shutdown:
>>
>> 
>> 2014/06/02 14:36:47| Set Current Directory to /cache
>> 2014/06/02 14:36:47 kid6| Preparing for shutdown after 0 requests
>> 2014/06/02 14:36:47 kid6| Waiting 5 seconds for active connections to
>> finish
>> ...
>> 2014/06/02 14:36:53 kid6| Shutting down...
>> 2014/06/02 14:36:53 kid6| Not currently OK to rewrite swap log.
>> 2014/06/02 14:36:53 kid6| storeDirWriteCleanLogs: Operation aborted.
>> -
>>
>> What means "not OK to rewrite swap log"? kid6 is the coordinator, it
>> should not mess with cache dirs!
>>
>>
>> []s, Fernando Lozano
>>
> 



Re: [PATCH] Updated Comm::Read I/O API

2014-06-05 Thread Amos Jeffries
On 5/06/2014 5:16 a.m., Alex Rousskov wrote:
> On 06/04/2014 09:10 AM, Amos Jeffries wrote:
> 
>>  * Comm::Read() which accepts the Comm::Connection pointer for the
>> socket to read on and an AsyncCall callback to be run when read is
>> ready. The Job is responsible for separately initiating read(2) or
>> alternative action when that callback is run.
>>
>>  * Comm::ReadNow() which accepts an SBuf buffer and a CommIoCbParams
>> initialized to contain the Comm::Connection pointer for the socket to
>> read on which will be filled out with result flag, xerrno, and size.
>> This synchronously performs read(2) operations to append bytes to the
>> provided buffer. It returns a comm_err_t flag for use in determining how
>> to handle the results and signalling one of OK, INPROGRESS, ERROR,
>> ERR_CLOSING as having happened.
> 
> 
>> +inline void comm_read(const Comm::ConnectionPointer &conn, char *buf, int len, AsyncCall::Pointer &callback)
>> +{
>> +assert(buf != NULL);
> 
> but:
> 
>> commHalfClosedCheck(void *)
>> {
> ...
>> comm_read(c, NULL, 0, call);
> 
> Please fix and double check that no other comm_read calls are using a
> NULL buffer. If you cannot be sure, let's not enforce this to avoid
> breaking working code.
> 

Oops. Fixed and all of src/ audited for this just to be sure.

The handler commHalfClosedReader() is now uselessly asserting size==0.
But I have left that alone for now.

> 
>> +if (retval > 0) { // data read most common case
> ...
>> +fd_bytes(params.conn->fd, retval, FD_READ);
> 
>> +/* Note - read 0 == socket EOF, which is a valid read */
>> +if (retval >= 0) {
>> +fd_bytes(fd, retval, FD_READ);
> 
> Please be consistent. Since bytes accounting is not your patch focus, I
> suggest calling fd_bytes() for zero returned value to avoid a subtle
> irrelevant change.
> 

Updating the API to improve performance is the overall goal of this (and
earlier patches). Removing a useless convenience-library jump and some
if-statements in the new SBuf pathway comes under that even if it is
small. Given the new flag returns optimizing makes sense at this point
rather than delaying. I did audit the fd_bytes() code path to ensure the
change was a no-op removal.


> 
>> +// if the caller did not supply a buffer, just schedule callback
> 
> Please detail along these lines:
> 
>   // without a buffer, just call back and the caller will ReadNow()
> 

We place no criteria on what the caller will do next. The connection
monitor callers will in fact close() instead of ReadNow().

I am using the first half of your statement, but not the "and the caller
will ..."


>> +/* For legacy callers : Attempt a read */
> 
> Please add something like "Keep in sync with Comm::ReadNow()!"
> 

Done, although this is not a strict requirement. ReadNow() is for updated
callers and may perform some things differently as per our new model
(ie. flag return values, SBuf usage, readv(2)?, etc.); the old read(2)
acts must be kept more in sync with old callers' expectations.


> 
>> +debugs(5, 3, "SBuf " << params.conn << ", size " << sz << ", retval " 
>> << retval << ", errno " << params.xerrno);
> 
>> +debugs(5, 3, "char FD " << fd << ", size " << ccb->size << ", retval " 
>> << retval << ", errno " << errno);
> 
> Please remove the "SBuf " and "char " prefixes. These lines are about
> Comm I/O, these specific prefixes do not make much sense here, and
> debugs() will supply the right context.

Removed.

> 
>> +SBuf::size_type sz = buf.spaceSize();
> 
> Please make const if possible.
> 

Done.

> 
>> +char *b = buf.rawSpace(sz);
> 
> Please rename "b" to "space", "rawSpace", or something like that to ease
> reading/searching.
> 
Done. "theBuf"

> 
>> +int retval = FD_READ_METHOD(params.conn->fd, b, sz);
> 
> Please make constant.
> 
Done.

> 
>> +} else if (retval == 0) { // remote closure (somewhat less) common
>> +// Note - read 0 == socket EOF, which is a valid read.
>> +params.flag = COMM_ERR_CLOSING;
> 
> It is a bad idea to overload COMM_ERR_CLOSING like this! The reaction to
> COMM_ERR_CLOSING in the callback is very different to the reaction to
> EOF! And EOF is not really an error. If you think this case deserves a
> dedicated flag, add COMM_EOF instead.
> 

Okay. Ad

Re: [PATCH] Update SBuf::trim

2014-06-04 Thread Amos Jeffries
On 4/06/2014 2:48 a.m., Alex Rousskov wrote:
> On 06/03/2014 08:22 AM, Amos Jeffries wrote:
>> On 4/06/2014 1:08 a.m., Alex Rousskov wrote:
>>> On 06/03/2014 04:46 AM, Amos Jeffries wrote:
>>>
>>>> This replaces the SBuf::trim() to use CharacterSet instead of an SBuf()
>>>> list of characters and memchr()
>>>>
>>>> It seems to be faster for CharacterSet lookup than repeated memchr
>>>> calls, but I'm not certain of that. It certainly makes for simpler parser
>>>> code with trim and a predefined CharacterSet than static SBuf set of chars.
>>>
>>> I agree that CharacterSet membership test should be faster than repeated
>>> memchr() calls.
>>>
>>> No objections to this patch, although I suspect that any code calling
>>> SBuf::trim() should actually use Tokenizer instead; we may be optimizing
>>> code that should not be used in the first place.
>>
>> The use-case that brought this up is scanning mime header lines.
>>
>>   Tokenizer tok(...);
>>   SBuf line;
>>   while (tok.prefix(line, CharacterSet::LF)) {
>> // drop optional trailing CR* sequence
>> line.trim(CharacterSet::CR, false, true);
> 
> The above does not make sense to me: After tok.prefix(), "line" will
> contain LF characters only. There will be no CR characters to trim.

Sorry, that is inverted. CharacterSet(not-LF)
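
I.e. the loop should have used the complement set, something like this
(sketch only; the non-LF set construction is illustrative):

  // tokenize up to (not including) each LF
  static const CharacterSet nonLF = CharacterSet::LF.complement("non-LF");
  SBuf line;
  while (tok.prefix(line, nonLF)) {
    // drop optional trailing CR* sequence
    line.trim(CharacterSet::CR, false, true);
    ... use line content, skip tok past the LF, and repeat.
  }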

Amos



[RFC] useful warnings

2014-06-04 Thread Amos Jeffries
The efforts to clean up Squid C++ compliance are going forwards in small
increments, painfully slowly. The last -W flag we managed to get added was
-Wshadow.

Since then I have come to the conclusion that we should add several
others, but the number of warnings they produce is unrealistic to fix on
top of the other work we all have on.

As a half measure I would like to enable those warnings such as
-Wconversion, -Weffc++, -Woverloaded-virtual, etc. which provide useful
warnings about nasty code problems immediately. But given that there are
hordes of issues existing right now, pairing each with a -Wno-error= so
that we can work on them jointly and more gradually before removing the
-Wno-error.
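
For example (flag list illustrative; -Wno-error=<warning> is the standard
GCC/Clang syntax for demoting one specific warning from error back to
warning):

  CXXFLAGS="-Wall -Werror -Wshadow \
            -Wconversion -Wno-error=conversion \
            -Woverloaded-virtual -Wno-error=overloaded-virtual"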

Amos


Re: [code] [for discussion] map-trie

2014-06-03 Thread Amos Jeffries
On 4/06/2014 2:40 a.m., Kinkie wrote:
> Hi all,
>   as an experiment and to encourage some discussion I prepared an
> alternate implementation of TrieNode which uses a std::map instead of
> an array to store a node's children.
> 
> The expected result is a worst case performance degradation on insert
> and delete from O(N) to O(N log R) where N is the length of the
> c-string being looked up, and R is the size of the alphabet (as R =
> 256, we're talking about 8x worse).
> 
> The expected benefit is a noticeable reduction in terms of memory use,
> especially for sparse key-spaces; it'd be useful e.g. in some lookup
> cases.
> 
> Comments?
> 

Since this is in libTrie used for ESI message body parsing the major
operations being done on it are insert and delete. These happen a
relatively large number of times per-request as the XML payload/body
gets parsed and a whole tree is constructed+destructed.

So the worst-case areas are worse than they would be for example in ACLs
based on Trie.
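
The structural trade-off under discussion, sketched (not the actual
libTrie classes):

  #include <map>

  // Array children: O(1) per-character step, but 256 pointers per node
  // regardless of actual fanout.
  struct ArrayTrieNode {
      ArrayTrieNode *children[256];
  };

  // Map children: O(log R) per-character step, memory proportional to
  // actual fanout -- cheaper for sparse key-spaces.
  struct MapTrieNode {
      std::map<unsigned char, MapTrieNode *> children;
  };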

Amos



Re: [PATCH] Update SBuf::trim

2014-06-03 Thread Amos Jeffries
On 4/06/2014 1:08 a.m., Alex Rousskov wrote:
> On 06/03/2014 04:46 AM, Amos Jeffries wrote:
> 
>> This replaces the SBuf::trim() to use CharacterSet instead of an SBuf()
>> list of characters and memchr()
>>
>> It seems to be faster for CharacterSet lookup than repeated memchr
>> calls, but I'm not certain of that. It certainly makes for simpler parser
>> code with trim and a predefined CharacterSet than static SBuf set of chars.
> 
> I agree that CharacterSet membership test should be faster than repeated
> memchr() calls.
> 
> No objections to this patch, although I suspect that any code calling
> SBuf::trim() should actually use Tokenizer instead; we may be optimizing
> code that should not be used in the first place.

The use-case that brought this up is scanning mime header lines.

  Tokenizer tok(...);
  SBuf line;
  while (tok.prefix(line, CharacterSet::LF)) {
// drop optional trailing CR* sequence
line.trim(CharacterSet::CR, false, true);
... use line content for something.

.. skip tok past the LF, and repeat.
  }

If there are no other objections I will drop this in at some point prior
to the parser-ng merge. No rush on approval until then.

Amos


