Re: [RFC] unified port directive

2014-06-11 Thread Amos Jeffries
On 11/06/2014 2:35 a.m., Francesco wrote:
 
 On 10 Jun 2014, at 16:29, Alex Rousskov rouss...@measurement-factory.com 
 wrote:
 
 On 06/10/2014 12:09 AM, Kinkie wrote:
 On Mon, Jun 9, 2014 at 6:47 PM, Alex Rousskov wrote:
 On 06/08/2014 11:07 PM, Amos Jeffries wrote:
 I propose that we combine the http_port and https_port directives into
 a single directive called port with the old names as aliases and an
 option to select between TCP and TLS transport protocol.

 Just "port" is a bad name, IMO, because there are many other, non-HTTP
 *_ports in squid.conf. Consider using the "http_port" name for both SSL and
 plain transports, with appropriate transport defaults (that may even
 depend on the port value!).

 How about listen? It's consistent with apache, clear, and 
 protocol-neutral.

 Why is being protocol neutral a good thing for an HTTP-specific(*) port
 in an environment with many other protocol-specific ports?

 (*) In this context, both encrypted (HTTPS) and plain (HTTP)
 transport connections are assumed to carry the same transfer protocol: HTTP.
 
 Oh my bad. I had understood that it would eventually be a catch-all directive 
 for all squid service ports (possibly including FTP etc).
 

That was indeed the long term intention.
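For concreteness, the kind of unification under discussion might look something like this in squid.conf (purely hypothetical syntax, sketched for illustration; no such transport= option exists):

```
# today: two directives for the same HTTP service over different transports
http_port 3128
https_port 3129 cert=/etc/squid/proxy.pem

# sketched unified form: one directive, transport selected by an option
http_port 3128 transport=tcp
http_port 3129 transport=tls cert=/etc/squid/proxy.pem
```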


Amos


Jenkins build is back to normal : 3.HEAD-coadvisor #309

2014-06-11 Thread noc
See http://build.squid-cache.org/job/3.HEAD-coadvisor/309/



Build failed in Jenkins: 3.HEAD-amd64-OpenBSD-5.4 #95

2014-06-11 Thread noc
See http://build.squid-cache.org/job/3.HEAD-amd64-OpenBSD-5.4/95/

--
Started by upstream project 3.HEAD-amd64-centos-6 build number 370
originally caused by:
 Started by an SCM change
 Started by an SCM change
 Started by an SCM change
 Started by an SCM change
Building remotely on ypg-openbsd-54 (gcc farm amd64-openbsd 5.4 openbsd-5.4 
openbsd amd64-openbsd-5.4 amd64) in workspace 
http://build.squid-cache.org/job/3.HEAD-amd64-OpenBSD-5.4/ws/
$ bzr revision-info -d 
http://build.squid-cache.org/job/3.HEAD-amd64-OpenBSD-5.4/ws/
info result: bzr revision-info -d 
http://build.squid-cache.org/job/3.HEAD-amd64-OpenBSD-5.4/ws/ returned 0. 
Command output: 13454 squid...@squid-cache.org-20140607001438-gi9xlrr7mccrgvk4
 stderr: 
[3.HEAD-amd64-OpenBSD-5.4] $ bzr pull --overwrite 
http://bzr.squid-cache.org/bzr/squid3/trunk/
bzr: ERROR: Connection error: Couldn't resolve host 'bzr.squid-cache.org' 
[Errno -5] no address associated with name
ERROR: Failed to pull
Since BZR itself isn't crash safe, we'll clean the workspace so that on the 
next try we'll do a clean pull...
Retrying after 10 seconds
Cleaning workspace...
$ bzr branch http://bzr.squid-cache.org/bzr/squid3/trunk/ 
http://build.squid-cache.org/job/3.HEAD-amd64-OpenBSD-5.4/ws/
bzr: ERROR: Connection error: Couldn't resolve host 'bzr.squid-cache.org' 
[Errno -5] no address associated with name
ERROR: Failed to branch http://bzr.squid-cache.org/bzr/squid3/trunk/
Retrying after 10 seconds
Cleaning workspace...
$ bzr branch http://bzr.squid-cache.org/bzr/squid3/trunk/ 
http://build.squid-cache.org/job/3.HEAD-amd64-OpenBSD-5.4/ws/
bzr: ERROR: Connection error: Couldn't resolve host 'bzr.squid-cache.org' 
[Errno -5] no address associated with name
ERROR: Failed to branch http://bzr.squid-cache.org/bzr/squid3/trunk/



Build failed in Jenkins: 3.HEAD-amd64-ubuntu-saucy #246

2014-06-11 Thread noc
See http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/246/changes

Changes:

[Francesco Chemolli] Fixed bug in CharacterSet::addRange()

[Francesco Chemolli] Complete ConnOpener::connect change from r13459

[Amos Jeffries] Windows: fix various libip build issues

* Missing include ws2tcpip.h for IPv6 definitions
* Alternative IN6_ARE_ADDR_EQUAL definition required
* 'byte' is a reserved / system defined type on Windows,
resolve variable shadowing by renaming to ipbyte.

[Amos Jeffries] Windows: rename TcpLogger::connect

Windows sockets API is mapped via #define macros. connect() macro and
this TcpLogger method collide. Rename the method doConnect().

[Amos Jeffries] Windows: rename ConnOpener::connect

Windows sockets API is mapped via #define macros. connect() macro and
this ConnOpener method collide. Rename the method doConnect().
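The collision these renames work around can be sketched in a few lines (the macro below is a stand-in for illustration, not the real winsock mapping):

```cpp
#include <cassert>

// Stand-in for an SDK socket function that platform headers map to via a
// function-like macro (as described above for the Windows sockets API).
static int fake_connect(int fd) { return fd + 100; }
#define connect(fd) fake_connect(fd)

struct Opener {
    // A method named connect(int) could not be called as o.connect(fd):
    // the preprocessor would rewrite the call to o.fake_connect(fd) and
    // the build would fail. Renaming the method sidesteps the macro.
    int doConnect(int fd) { return fd; }
};
```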

[Amos Jeffries] Revert rename of Comm::Flag ERROR

On MinGW at least ERROR is a #define'd macro resulting in build failure.
Revert to the old name COMM_ERROR until we can find a better one that
does not duplicate 'comm'.

[Amos Jeffries] RFC2616 obsoleted by RFC 7230 et al

[Amos Jeffries] CharacterSet: update RFC723x references and add some missing 
sets

[Amos Jeffries] Docs: fix Comm::ReadNow() retval text

--
[...truncated 10211 lines...]
Making uninstall in auth
make[3]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth'
Making uninstall in basic
make[4]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth/basic'
make[4]: Nothing to be done for `uninstall'.
make[4]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth/basic'
Making uninstall in ntlm
make[4]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth/ntlm'
make[4]: Nothing to be done for `uninstall'.
make[4]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth/ntlm'
Making uninstall in negotiate
make[4]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth/negotiate'
make[4]: Nothing to be done for `uninstall'.
make[4]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth/negotiate'
Making uninstall in digest
make[4]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth/digest'
make[4]: Nothing to be done for `uninstall'.
make[4]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth/digest'
make[4]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth'
make[4]: Nothing to be done for `uninstall-am'.
make[4]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth'
make[3]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/auth'
Making uninstall in http
make[3]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/http'
make[3]: Nothing to be done for `uninstall'.
make[3]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/http'
Making uninstall in ip
make[3]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/ip'
make[3]: Nothing to be done for `uninstall'.
make[3]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/ip'
Making uninstall in icmp
make[3]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/icmp'
make[3]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/icmp'
Making uninstall in ident
make[3]: Entering directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/ident'
make[3]: Nothing to be done for `uninstall'.
make[3]: Leaving directory 
`http://build.squid-cache.org/job/3.HEAD-amd64-ubuntu-saucy/ws/btlayer-00-default/squid-3.HEAD-BZR/_build/src/ident'
Making uninstall in log
make[3]: Entering directory 

[PATCH] Support client connection annotation by helpers via clt_conn_id=ID

2014-06-11 Thread Tsantilas Christos

TCP client connection tagging is useful for faking various forms of
connection-based authentication when standard HTTP authentication
cannot be used. A URL rewriter or external ACL helper may mark the
authenticated client connection to avoid going through
authentication steps during subsequent requests on the same connection,
and to share connection authentication information with Squid ACLs,
other helpers, and logs.


After this change, Squid accepts an optional clt_conn_id=ID pair from a
helper and associates the received ID with the client TCP connection.
Squid treats the received clt_conn_id=ID pair as a regular annotation, 
but also keeps it across all requests on the same client connection. A 
helper may update the client connection ID value during subsequent requests.


This patch documents the clt_conn_id key=value pair in the cf.data.pre file 
only for URL rewriters. Because annotations are common to all helpers, we 
may want to add a special section at the beginning of cf.data.pre covering 
all helpers. Suggestions are welcome.


I must also note that this patch adds an inconsistency. All annotation 
key=value pairs received from helpers are accumulated onto the existing 
values for their keys. The clt_conn_id=ID pair, by contrast, is always 
unique and replaces any existing clt_conn_id=ID annotation pair.
We may want to make all annotations unique, or perhaps implement a 
configuration mechanism to define which annotations overwrite their 
previous values and which append new ones.


This is a Measurement Factory project


Regards,
   Christos
Support client connection annotation by helpers via clt_conn_id=ID.
  
TCP client connection tagging is useful for faking various forms of
connection-based authentication when standard HTTP authentication cannot be
used. A URL rewriter or external ACL helper may mark the authenticated
client connection to avoid going through authentication steps during
subsequent requests on the same connection and to share connection
authentication information with Squid ACLs, other helpers, and logs.
 
After this change, Squid accepts an optional clt_conn_id=ID pair from a 
helper and associates the received ID with the client TCP connection.
Squid treats the received clt_conn_id=ID pair as a regular annotation, but
also keeps it across all requests on the same client connection. A helper may
update the client connection ID value during subsequent requests.
  
To send the clt_conn_id=ID pair to a URL rewriter, use url_rewrite_extras with a
%{clt_conn_id}note macro.

This is a Measurement Factory project
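A sketch of the configuration side described above (the format string is illustrative; only the %{clt_conn_id}note macro itself comes from the patch documentation):

```
# pass the current connection ID (if any) to the URL rewriter
url_rewrite_extras "clt_conn_id=%{clt_conn_id}note"

# the helper may then return a clt_conn_id=ID kv-pair in its reply, e.g.
#   OK clt_conn_id=4f2a
# which Squid associates with the client TCP connection.
```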

=== modified file 'src/Notes.cc'
--- src/Notes.cc	2014-04-30 09:41:25 +
+++ src/Notes.cc	2014-06-10 15:20:30 +
@@ -14,40 +14,41 @@
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
  *  the Free Software Foundation; either version 2 of the License, or
  *  (at your option) any later version.
  *
  *  This program is distributed in the hope that it will be useful,
  *  but WITHOUT ANY WARRANTY; without even the implied warranty of
  *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  *  GNU General Public License for more details.
  *
  *  You should have received a copy of the GNU General Public License
  *  along with this program; if not, write to the Free Software
  *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
  *
  */
 
 #include "squid.h"
 #include "AccessLogEntry.h"
 #include "acl/FilledChecklist.h"
 #include "acl/Gadgets.h"
+#include "client_side.h"
 #include "ConfigParser.h"
 #include "globals.h"
 #include "HttpReply.h"
 #include "HttpRequest.h"
 #include "SquidConfig.h"
 #include "Store.h"
 #include "StrList.h"
 
 #include <algorithm>
 #include <string>
 
 Note::Value::~Value()
 {
     aclDestroyAclList(&aclList);
 }
 
 Note::Value::Pointer
 Note::addValue(const String &value)
 {
 Value::Pointer v = new Value(value);
@@ -186,40 +187,54 @@
 return value.size() ? value.termedBuf() : NULL;
 }
 
 const char *
 NotePairs::findFirst(const char *noteKey) const
 {
     for (std::vector<NotePairs::Entry *>::const_iterator i = entries.begin(); i != entries.end(); ++i) {
         if ((*i)->name.cmp(noteKey) == 0)
             return (*i)->value.termedBuf();
     }
     return NULL;
 }
 
 void
 NotePairs::add(const char *key, const char *note)
 {
 entries.push_back(new NotePairs::Entry(key, note));
 }
 
 void
+NotePairs::remove(const char *key)
+{
+    std::vector<NotePairs::Entry *>::iterator i = entries.begin();
+    while (i != entries.end()) {
+        if ((*i)->name.cmp(key) == 0) {
+            delete *i;
+            i = entries.erase(i);
+        } else {
+            ++i;
+        }
+    }
+}
+
+void
 NotePairs::addStrList(const char *key, const char *values)
 {
 String strValues(values);
 const char *item;
 const char *pos = NULL;
 int ilen = 0;
     while (strListGetItem(&strValues, ',', &item, &ilen, &pos)) {
 String v;
 

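The append-vs-replace semantics discussed above can be sketched in isolation with a toy pair list (hypothetical minimal types for illustration; not Squid's actual NotePairs class):

```cpp
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for a note list: ordinary annotations accumulate,
// but a key such as clt_conn_id is kept unique by removing old entries
// before adding the new one.
struct MiniNotePairs {
    std::vector<std::pair<std::string, std::string>> entries;

    void add(const std::string &key, const std::string &value) {
        entries.emplace_back(key, value);
    }

    // Mirrors the NotePairs::remove() added by the patch: erase every
    // entry with a matching key so a follow-up add() replaces it.
    void remove(const std::string &key) {
        for (auto i = entries.begin(); i != entries.end();) {
            if (i->first == key)
                i = entries.erase(i);
            else
                ++i;
        }
    }

    // Unique-key update: the clt_conn_id=ID behaviour described above.
    void replaceOrAdd(const std::string &key, const std::string &value) {
        remove(key);
        add(key, value);
    }

    size_t count(const std::string &key) const {
        size_t n = 0;
        for (const auto &e : entries)
            if (e.first == key)
                ++n;
        return n;
    }
};
```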
Re: [code] [for discussion] map-trie

2014-06-11 Thread Kinkie
Hi,
  I've done some benchmarking, here are the results so far:
The proposal I'm suggesting for dstdomain acl is at
lp:~kinkie/squid/flexitrie . It uses the level-compact trie approach
I've described in this thread (NOT a Patricia trie). As a comparison,
lp:~kinkie/squid/domaindata-benchmark implements the same benchmark
using the current splay-based implementation.

I've implemented a quick-n-dirty benchmarking tool
(src/acl/testDomainDataPerf); it takes as input an acl definition -
one dstdomain per line, as if it was included in squid.conf, and a
hostname list file (one destination hostname per line).

I've run both variants of the code against the same dataset: a 4k
entries domain list, containing both hostnames and domain names, and a
18M entries list of destination hostnames, both matching and not
matching entries in the domain list (about 7.5M hits, 10.5M misses).

Tested 10 times on a Core 2 PC with plenty of RAM - source datasets
are in the fs cache.
level-compact-trie: the mean time is 11 sec; all runs take between
10.782 and 11.354 secs; 18 MB of core used
full-trie: mean is 7.5 secs +- 0.2 secs; 85 MB of core.
splay-based: mean time is 16.3 sec; all runs take between 16.193 and
16.427 secs; 14 MB of core

I expect compact-trie to scale better as the number of entries in the
list grows and with the number of clients and requests per second;
furthermore using it removes 50-100 LOC, and makes code more readable.

IMO it is the best compromise in terms of performance, resource
usage, and expected scalability; before pursuing this further, however,
I'd like to have some feedback.
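For readers unfamiliar with the approach, the reversed-host lookup that makes a trie attractive for dstdomain can be sketched as follows. This is a plain std::map-children character trie, not the level-compacted variant being benchmarked, and the matching rules are simplified (a leading-dot entry matches subdomains only; a bare entry matches exactly):

```cpp
#include <algorithm>
#include <map>
#include <string>

// Toy dstdomain-style lookup: store acl entries reversed, so that a
// ".example.com" entry becomes a prefix of any matching reversed host.
struct DomainTrie {
    struct Node {
        std::map<char, Node *> kids;  // sparse children, as in the std::map variant
        bool terminal = false;        // an acl entry ends at this node
        ~Node() { for (auto &kv : kids) delete kv.second; }
    };
    Node root;

    void insert(std::string domain) {
        std::reverse(domain.begin(), domain.end());
        Node *n = &root;
        for (char c : domain) {
            Node *&slot = n->kids[c];
            if (!slot)
                slot = new Node;
            n = slot;
        }
        n->terminal = true;
    }

    bool match(std::string host) const {
        std::reverse(host.begin(), host.end());
        const Node *n = &root;
        char prev = '\0';
        for (char c : host) {
            if (n->terminal && prev == '.')  // consumed a ".suffix" entry
                return true;
            auto it = n->kids.find(c);
            if (it == n->kids.end())
                return false;
            n = it->second;
            prev = c;
        }
        return n->terminal;                  // exact-entry match
    }
};
```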

Thanks


On Wed, Jun 4, 2014 at 4:11 PM, Alex Rousskov
rouss...@measurement-factory.com wrote:
 On 06/04/2014 02:06 AM, Kinkie wrote:

 there are use cases
 for using a Trie (e.g. prefix matching for dstdomain ACL); these may
 be served by other data structures, but none we could come up with on
 the spot.

 std::map and StringIdentifier are the ones we came up with on the spot.
 Both may require some adjustments to address the use case you are after.

 Alex.


 A standard trie uses quite a lot of RAM for those use cases.
 There are quite a lot of areas where we can improve so this one is not
 urgent. Still I'd like to explore it as it's synchronous code (thus
 easier for me to follow) and it's a nice area to tinker with.

 On Tue, Jun 3, 2014 at 10:12 PM, Alex Rousskov
 rouss...@measurement-factory.com wrote:
 On 06/03/2014 08:40 AM, Kinkie wrote:
 Hi all,
   as an experiment and to encourage some discussion I prepared an
 alternate implementation of TrieNode which uses a std::map instead of
 an array to store a node's children.

 The expected result is a worst case performance degradation on insert
 and delete from O(N) to O(N log R) where N is the length of the
 c-string being looked up, and R is the size of the alphabet (as R =
 256, we're talking about 8x worse).
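The memory side of that trade-off is visible in the node layouts themselves (hypothetical minimal nodes, for illustration):

```cpp
#include <map>

// An array-of-children node pays for all 256 slots up front; a std::map
// node pays per existing child, which is what shrinks sparse tries.
struct ArrayNode {
    ArrayNode *children[256] = {};       // 2 KiB of pointers on a 64-bit host
};

struct MapNode {
    std::map<char, MapNode *> children;  // small when empty, grows per child
};
```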

 The expected benefit is a noticeable reduction in terms of memory use,
 especially for sparse key-spaces; it'd be useful e.g. in some lookup
 cases.

 Comments?


 To evaluate these optimizations, we need to know the targeted use cases.
 Amos mentioned ESI as the primary Trie user. I am not familiar with ESI
 specifics (and would be surprised to learn you want to optimize ESI!),
 but would recommend investigating a different approach if your goal is
 to optimize search/identification of strings from a known-in-advance set.


 Cheers,

 Alex.



-- 
Francesco