Re: SMP: logging

2010-02-24 Thread Adrian Chadd
On 24 February 2010 06:55, Amos Jeffries squ...@treenet.co.nz wrote:

 Ah, I did not realize cache.log daemon logging is not supported yet. One
 more reason to start with simple O_APPEND. As a side effect, we would be
 able to debug daemon log starting problems as well :-).


 Yay. Definitely +3 then. :)

Uhm, is O_APPEND defined as an atomic write? I didn't think so. It may
be under Linux and it may be under certain FreeBSD versions, but that's
more likely a side-effect of VFS locking than part of the actual
specification.



adrian


Re: SMP: logging

2010-02-24 Thread Adrian Chadd
On 24 February 2010 18:06, Adrian Chadd adr...@squid-cache.org wrote:

 Uhm, is O_APPEND defined as an atomic write? I didn't think so. It may
 be under Linux and it may be under certain FreeBSD versions, but that's
 more likely a side-effect of VFS locking than part of the actual
 specification.

.. and it certainly won't be supported for logging-to-NFS.

I'd honestly just investigate a logging layer that implements some
kind of IPC mechanism (sockets, sysvshm, etc) that can handle logs
from multiple processes.

Or you go down the apache path - lock, append, unlock. Eww.
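
For illustration, that lock/append/unlock pattern is roughly this (a
sketch, not Squid code; error handling omitted and the fd assumed
opened with O_APPEND):

#include <string.h>
#include <sys/file.h>
#include <unistd.h>

static void
locked_append(int fd, const char *line)
{
    flock(fd, LOCK_EX);               /* lock: serialise writers */
    write(fd, line, strlen(line));    /* append                  */
    flock(fd, LOCK_UN);               /* unlock                  */
}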



adrian


Re: [squid-users] 'gprof squid squid.gmon' only shows the initial configuration functions

2009-12-09 Thread Adrian Chadd
Talk to the freebsd guys (eg me) about pmcstat and support for your
hardware. You may just need to find / organise a backport of the
particular hardware support for your platform. I've been working on
profiling Lusca with pmcstat and some new-ish tools which use and
extend it in useful ways.

gprof data is almost certainly too unreliable to be useful on modern
CPUs. Too much can and will happen between profiling ticks.

I can hazard a few guesses about where your CPU is going. Likely
candidate is poll() if your Squid is too old. First thing to do is
organise porting the kqueue() stuff if it isn't already included.

I can make more educated guesses about where the likely CPU hog
culprits are given workload and configuration file information.



Adrian

2009/12/10 Guy Bashkansky guy...@gmail.com:
 Is there an oprofile version for FreeBSD?  I thought it was limited to
 Linux.  On FreeBSD I tried pmcstat, but it gives an initialization
 error.

 My version of Squid is old and customized (so I can't upgrade) and may
 not have built-in timers.  In which version did they appear?

 As for gprof - even with the event loop on top, the rest of the
 table might still give some idea why the CPU is overloaded.  The problem is
 - I see only initial configuration functions:

                                 called/total       parents
 index  %time    self descendents  called+self    name           index
                                 called/total       children
                                                    spontaneous
 [1]     63.4    0.17        0.00                 _mcount [1]
 -----------------------------------------------
               0.00        0.10       1/1           _start [3]
 [2]     36.0    0.00        0.10       1         main [2]
               0.00        0.10       1/1           parseConfigFile [4]
 ...
 -----------------------------------------------
                                                    spontaneous
 [3]     36.0    0.00        0.10                 _start [3]
               0.00        0.10       1/1           main [2]
 -----------------------------------------------
               0.00        0.10       1/1           main [2]
 [4]     36.0    0.00        0.10       1         parseConfigFile [4]
               0.00        0.09       1/1           readConfigLines [5]
               0.00        0.00     169/6413        parse_line [6]
 ...
 -----------------------------------------------

 System info:

 # uname -m -r -s
 FreeBSD 6.2-RELEASE-p9 amd64

 # gcc -v
 Using built-in specs.
 Configured with: FreeBSD/amd64 system compiler
 Thread model: posix
 gcc version 3.4.6 [FreeBSD] 20060305


 There are 7 fork()s for unlinkd/diskd helpers.  Can these fork()s
 affect profiling info?

 On Wed, Dec 9, 2009 at 2:04 AM, Robert Collins
 robe...@robertcollins.net wrote:
 On Tue, 2009-12-08 at 15:32 -0800, Guy Bashkansky wrote:
 I've built squid with the -pg flag and run it in the no-daemon mode
 (-N flag), without the initial fork().

 I send it the SIGTERM signal which is caught by the signal handler, to
 flag graceful exit from main().

 I expect to see meaningful squid.gmon, but 'gprof squid squid.gmon'
 only shows the initial configuration functions:

 gprof isn't terribly useful anyway - due to squid's callback-based model,
 it will see nearly all the time belonging to the event loop.

 oprofile and/or squid's built-in analytic timers will get much better
 info.

 -Rob





Re: your suggestion for range_offset_limit

2009-11-26 Thread Adrian Chadd
The trick, at least in squid-2, is to make sure that quick abort isn't
occurring. Otherwise it will begin downloading the whole object, return
the requested range bit, and then abort the remainder of the fetch.
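
In squid.conf terms that means something like this (a sketch using the
standard squid-2 directives; tune to taste):

# fetch the whole object from the start, even for a range request
range_offset_limit -1
# never quick-abort the remainder once the client has its range
quick_abort_min -1 KB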



Adrian

2009/11/25 Amos Jeffries squ...@treenet.co.nz:
 Matthew Morgan wrote:

 On Wed, Nov 25, 2009 at 7:09 PM, Amos Jeffries squ...@treenet.co.nz
 wrote:

 Matthew Morgan wrote:

 Sorry it's taking me so long to get this done, but I do have a question.

 You suggested making getRangeOffsetLimit a member of HttpReply.  There
 are
 two places where this method currently needs to be called: one is
 CheckQuickAbort2() in store_client.cc.  This one will be easy, as I can
 just
 do entry->getReply()->getRangeOffsetLimit().

 The other is HttpStateData::decideIfWeDoRanges in http.cc.  Here, all we
 have access to is an HttpRequest object.  I looked through the source to
 see
 if I could find where a request owned or had access to a reply, but I
 don't
 see anything like that.  If getRangeOffsetLimit were a member of
 HttpReply,
 what do you suggest doing here?  I could make a static version of the
 method, but that wouldn't allow caching the result.

 Ah. I see. Quite right.

 After a bit more thought I find my original request a bit weird.

 Yes it should be a _Request_ member and do its caching there. You can go
 ahead with that now while we discuss whether to do a slight tweak on top
 of
 the basic feature.


 [cc'ing squid-dev so others can provide input]

 I'm not certain of the behavior we want here if we do open the ACLs to
 reply
 details. Some discussion is in order.

 Simple way would be to not cache the lookup the first time when reply
 details are not provided.

 It would mean making it return potentially two different values across
 the
 transaction.

  1) based on only the request details, to decide if a range request is
 possible,
 and then
  2) based on the additional reply details, to see if the abort could be
 done.

 No problem if the reply details cause an increase in the limit. But if
 they
 restrict it we enter grounds of potentially making a request then
 canceling
 it and being unable to store the results.


 Or, taking the maximum of the two across the two calls, so it can only
 increase? That would be slightly trickier, involving a flag as well to
 short-circuit the reply lookups instead of just a magic cache value.

 Am I seriously over-thinking things today?


 Amos

 Here's a question, too: is this feature going to benefit anyone?  I
 realized later that it will not solve my problem, because all the
 traffic that was getting force downloaded ended up being from windows
 updates.  The urls showing up in netstat and such were just weird
 because the windows update traffic was actually coming from limelight.
  My ultimate solution was to write a script that reads access.log,
 checks for windows update urls that are not cached, and manually
 downloads them one at a time after hours.

 If there is anyone at all who would benefit from this I would still be
 *more* than glad to code it (as I said, it would be my first real open
 source contribution...very exciting), but I just wondered if anyone
 will actually use it.

 I believe people will find more control here useful.

 Windows update service packs are a big reason, but there are also similar
 range issues with Adobe Reader online PDFs, google maps/earth, and flash
 videos when paused/resumed. Potentially other stuff, but I have not heard of
 problems.

 This will allow anyone to fine-tune the places where ranges are permitted
 or forced to fully cache, avoiding the problems a blanket limit adds.


 As to which approach would be better, I don't know enough about that
 data path to really suggest.  When I initially made my changes, I just
 replaced each reference to Config.range_offset_limit or whatever.
 Today I went back and read some more of the code, but I'm still
 figuring it out.  How often would the limit change based on the
 request vs. the reply?

 Just the once, the first time it is checked for the reply.
 And most likely in the case of testing for a reply MIME type. The other
 useful info I can think of is all request data.

 You can ignore it if you like. I'm just worrying over a borderline case.
 Someone else can code a fix if they find it a problem or need to do mime
 checks.

 Amos
 --
 Please be using
  Current Stable Squid 2.7.STABLE7 or 3.0.STABLE20
  Current Beta Squid 3.1.0.15




Re: squid-smp: synchronization issue solutions

2009-11-19 Thread Adrian Chadd
Right. That's the easy bit. I could even do that in Squid-2 with a
little bit of luck. The hard bit is rewriting the relevant code which
relies on cbdata-style reference counting behaviour. That is the
tricky bit.



Adrian

2009/11/20 Robert Collins robe...@robertcollins.net:
 On Wed, 2009-11-18 at 10:46 +0800, Adrian Chadd wrote:
 Plenty of kernels nowadays do a bit of TCP and socket processing in
 process/thread context; so you need to do your socket TX/RX in
 different processes/threads to get parallelism in the networking side
 of things.

 Very good point.

 You could fake it somewhat by pushing socket IO into different threads
 but then you have all the overhead of shuffling IO and completed IO
 between threads. This may be .. complicated.

 The event loop I put together for -3 should be able to do that without
 changing the loop - just extending the modules that hook into it.

 -Rob



Re: Recent Facebook Issues

2009-10-09 Thread Adrian Chadd
I've emailed the facebook NOC directly about the issue.

Thanks,


Adrian

2009/10/9 Kinkie gkin...@gmail.com:
 You can try to access facebook with konqueror. It complains
 rather loudly, drops the excess data and the site generally doesn't
 work (it has been doing so for a few days, but only NOW I'm connecting
 the wires...)

  Kinkie

 On Fri, Oct 9, 2009 at 2:52 AM, Adrian Chadd adr...@squid-cache.org wrote:
 Ok, this happens for all versions?

 I can bring this up with facebook engineering if someone provides me
 with further information.


 Adrian

 2009/10/9 Amos Jeffries squ...@treenet.co.nz:
 Thanks to several people I've managed to track down why the facebook issues
 are suddenly appearing and why it's intermittent.

 On the "sometimes works, sometimes doesn't" problem: facebook.com does
 User-Agent header checks and sends back one of four pages.
 1) a generic page saying 'please use another browser'.
 2) a redirect to login for each of IE, Firefox and Safari
 3) a home page (if cookies sent initially)

 going through the login redirects to the page also presented at (3) above.

 The home page is the real problem. When no cookies are presented it ships
 without Content-Length (fine).
 When they _are_ present, ie after the user has logged in, it ships with
 Content-Length: 18487 and a data size of 18576.

 Amos






 --
    /kinkie




Re: Recent Facebook Issues

2009-10-08 Thread Adrian Chadd
Ok, this happens for all versions?

I can bring this up with facebook engineering if someone provides me
with further information.


Adrian

2009/10/9 Amos Jeffries squ...@treenet.co.nz:
 Thanks to several people I've managed to track down why the facebook issues
 are suddenly appearing and why it's intermittent.

 On the "sometimes works, sometimes doesn't" problem: facebook.com does
 User-Agent header checks and sends back one of four pages.
 1) a generic page saying 'please use another browser'.
 2) a redirect to login for each of IE, Firefox and Safari
 3) a home page (if cookies sent initially)

 going through the login redirects to the page also presented at (3) above.

 The home page is the real problem. When no cookies are presented it ships
 without Content-Length (fine).
 When they _are_ present, ie after the user has logged in, it ships with
 Content-Length: 18487 and a data size of 18576.

 Amos




Re: Segfault in HTCP CLR request on 64-bit

2009-10-02 Thread Adrian Chadd
The whole struct is on the local stack. Hence bzero() or memset() to 0.
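
As a sketch of the fix being discussed (htcpStuff is the struct from the
bug report; exact field layout aside):

htcpStuff stuff;
memset(&stuff, '\0', sizeof(stuff)); /* zeroes every field, including
                                      * the stuff.S.req_hdrs pointer */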

2009/10/2 Matt W. Benjamin m...@linuxbox.com:
 Bzero?  Is it an already-allocated array/byte sequence?  (Apologies, I 
 haven't seen the code.)  Assignment to NULL/0 is in fact correct for 
 initializing a sole pointer, and using bzero for that certainly isn't 
 typical.  Also, for initializing a byte range, memset is preferred [see Linux 
 BZERO(3), which refers to POSIX.1-2008 on that point].

 STYLE(9) says use NULL rather than 0, and it is clearer.  But C/C++ 
 programmers should know that NULL is 0.  And note that at least through 1998, 
 initialization to 0 was the preferred style in C++, IIRC.

 Matt

 - Adrian Chadd adr...@squid-cache.org wrote:

 I've just replied to the ticket in question. It should probably just
 be a bzero() rather than setting the pointer to 0. Which should
 really
 be setting it to NULL.

 Anyway, please test whether the bzero() works. If it does then I'll
 commit that fix to HEAD and 2.7.

 2009/9/28 Jason Noble ja...@linuxbox.com:
  I have opened a bug for this issue here:
 http://bugs.squid-cache.org/show_bug.cgi?id=2788  Also, the previous
 patch was not generated against head so I re-rolled the patch against
 current head and attached to the bug report

 --

 Matt Benjamin

 The Linux Box
 206 South Fifth Ave. Suite 150
 Ann Arbor, MI  48104

 http://linuxbox.com

 tel. 734-761-4689
 fax. 734-769-8938
 cel. 734-216-5309




Re: Segfault in HTCP CLR request on 64-bit

2009-09-25 Thread Adrian Chadd
Could you please create a bugzilla report for this, complete with a
patch against Squid-2.HEAD and 2.7? I'll then commit it.

2009/9/26 Jason Noble ja...@linuxbox.com:
 I recently ran into an issue where Squid 2.7 would segfault trying to issue
 HTCP CLR requests.  I found the segfault only occurred on 64-bit machines.
  While debugging, I found that the value of stuff.S.req_hdrs was not
 initialized but later, strlen was being called on it.  This seems to -- by
 chance -- not fail on 32 bit builds, but always segfaults on 64-bit.  The
 attached patch fixed the problem for me and it seems good programming
 practice to properly initialize pointers to prevent issues such as this.  As
 the htcpStuff struct is used in other places, I have concerns that other
 issues may be lurking as well, although I have yet to run into them.

 Regards,
 Jason



Re: Squid-smp : Please discuss

2009-09-15 Thread Adrian Chadd
2009/9/15 Sachin Malave sachinmal...@gmail.com:
 On Tue, Sep 15, 2009 at 1:18 AM, Adrian Chadd adr...@squid-cache.org wrote:
 Guys,

 Please look at what other multi-CPU network applications do, how they
 work and don't work well, before continuing this kind of discussion.

 Everything that has been discussed has already been done to death
 elsewhere. Please don't re-invent the wheel, badly.

 Yes, synchronization is always expensive. So we must target only those
 areas where shared data is updated infrequently. Also, if we are making
 threads then the amount of work done must be more as compared to the
 overheads required in thread creation, synchronization and scheduling.

Current generation CPUs are a lot, lot better at the thread-style sync
primitives than older CPUs.

There's other things to think about, such as lockless queues,
transactional memory hackery, atomic instructions in general, etc,
etc, which depend entirely upon the type of hardware being targetted.

 If we try to provide locks to existing data structures then the
 synchronization factor will definitely affect our design.

 Redesigning such structures and their behavior is time consuming
 and may change the whole design of Squid.


Adrian


Re: Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?

2009-09-15 Thread Adrian Chadd
But in that case, ACCESS_REQ_PROXY_AUTH would be returned rather than
ACCESS_DENIED..



Adrian

2009/9/15 Robert Collins robe...@robertcollins.net:
 On Tue, 2009-09-15 at 15:22 +1000, Adrian Chadd wrote:
 G'day. This question is aimed mostly at Henrik, who I recall replying
 to a similar question years ago but without explaining why.

 Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL?

 The particular bit in src/client_side.c:

 int require_auth = (answer == ACCESS_REQ_PROXY_AUTH ||
 aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;

 Is there any particular reason why auth is tried again? It forces a
 pop-up on browsers that already have done authentication via NTLM.

 Because it should? Perhaps you can expand on where you are seeing this -
 I suspect a misconfiguration or some such.

 It's entirely appropriate to signal HTTP_PROXY_AUTHENTICATION_REQUIRED
 when a user is denied access to a resource *and if they log in
 differently they could get access*.

 -Rob



Re: Squid-smp : Please discuss

2009-09-15 Thread Adrian Chadd
If you want to start looking at -threading- inside Squid, I'd suggest
thinking first how you'd create a generic thread helper framework
that allows Squid to run multiple internal threads that can do
stuff, and then implement some message/data queues and handle
notification between threads.

You can then push some stuff into these worker threads as an
experiment and see exactly what the issues are.

Building worker threads into Squid is easy. Making them do anything?
Not so easy :)
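
A minimal sketch of that shape (pthreads; all names here are made up,
and a real framework would also need N workers, shutdown, and completion
notification back to the main event loop):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct job {
    struct job *next;
    int payload;                    /* whatever the work item carries */
};

static struct job *queue_head;
static pthread_mutex_t queue_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_cv = PTHREAD_COND_INITIALIZER;

static void *
worker_main(void *unused)
{
    (void) unused;
    for (;;) {
        pthread_mutex_lock(&queue_mtx);
        while (queue_head == NULL)
            pthread_cond_wait(&queue_cv, &queue_mtx);
        struct job *j = queue_head;
        queue_head = j->next;
        pthread_mutex_unlock(&queue_mtx);

        printf("worker handled job %d\n", j->payload);  /* "do stuff" */
        free(j);
    }
    return NULL;
}

static void
submit_job(int payload)
{
    struct job *j = malloc(sizeof(*j));
    j->payload = payload;
    pthread_mutex_lock(&queue_mtx);
    j->next = queue_head;           /* LIFO for brevity */
    queue_head = j;
    pthread_cond_signal(&queue_cv);
    pthread_mutex_unlock(&queue_mtx);
}

int
main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker_main, NULL);
    submit_job(1);
    submit_job(2);
    sleep(1);                       /* crude: let the worker drain it */
    return 0;
}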


Adrian

2009/9/15 Sachin Malave sachinmal...@gmail.com:
 On Tue, Sep 15, 2009 at 1:38 AM, Adrian Chadd adr...@squid-cache.org wrote:
 2009/9/15 Sachin Malave sachinmal...@gmail.com:
 On Tue, Sep 15, 2009 at 1:18 AM, Adrian Chadd adr...@squid-cache.org 
 wrote:
 Guys,

 Please look at what other multi-CPU network applications do, how they
 work and don't work well, before continuing this kind of discussion.

 Everything that has been discussed has already been done to death
 elsewhere. Please don't re-invent the wheel, badly.

 Yes, synchronization is always expensive. So we must target only those
 areas where shared data is updated infrequently. Also, if we are making
 threads then the amount of work done must be more as compared to the
 overheads required in thread creation, synchronization and scheduling.

 Current generation CPUs are a lot, lot better at the thread-style sync
 primitives than older CPUs.

 There's other things to think about, such as lockless queues,
 transactional memory hackery, atomic instructions in general, etc,
 etc, which depend entirely upon the type of hardware being targetted.

 If we try to provide locks to existing data structures then the
 synchronization factor will definitely affect our design.

 Redesigning such structures and their behavior is time consuming
 and may change the whole design of Squid.


 Adrian




 And current-generation libraries are also far better than the old ones.
 With OpenMP, for example, creating threads and handling synchronization
 issues is very easy...

 Automatic locks are provided; you need not design your own locking
 mechanisms. Just a statement and you can lock the shared
 variable...
 Then the major work remaining is to identify the shared accesses.

 I WANT TO USE the OPENMP library.

 ANY suggestions?




Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?

2009-09-14 Thread Adrian Chadd
G'day. This question is aimed mostly at Henrik, who I recall replying
to a similar question years ago but without explaining why.

Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL?

The particular bit in src/client_side.c:

int require_auth = (answer == ACCESS_REQ_PROXY_AUTH ||
aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;

Is there any particular reason why auth is tried again? It forces a
pop-up on browsers that already have done authentication via NTLM.

I've written a patch to fix this in Squid-2.7:

http://www.creative.net.au/diffs/2009-09-15-squid-2.7-auth_required_on_auth_acl_deny.diff

I'll create a bugtraq entry when I have some more background
information about this.

Thanks,


adrian


Re: Squid-smp : Please discuss

2009-09-14 Thread Adrian Chadd
Guys,

Please look at what other multi-CPU network applications do, how they
work and don't work well, before continuing this kind of discussion.

Everything that has been discussed has already been done to death
elsewhere. Please don't re-invent the wheel, badly.



Adrian

2009/9/15 Robert Collins robe...@robertcollins.net:
 On Tue, 2009-09-15 at 14:27 +1200, Amos Jeffries wrote:


 RefCounting done properly forms a lock on certain read-only types like
 Config. Though we are currently handling that for Config by leaking
 the
 memory out every gap.

 SquidString is not thread-safe. But StringNG with its separate
 refcounted
 buffers is almost there. Each thread having a copy of StringNG sharing
 a
 SBuf equates to a lock with copy-on-write possibly causing issues we
 need
 to look at if/when we get to that scope.

 General rule: you do /not/ want thread-safe objects for high usage
 objects like RefCount and StringNG.

 synchronisation is expensive; design to avoid synchronisation and hand
 offs as much as possible.

 -Rob




squid-2 - vary and x-accelerator-vary differences?

2009-08-04 Thread Adrian Chadd
G'day,

I just noticed in src/HttpReply.c that the vary expire option
(Config.onoff.vary_ignore_expire) is checked if the reply has HDR_VARY
set, but it is not checked if HDR_X_ACCELERATOR_VARY is set.

Everywhere else the code checks them both consistently and
assembles Vary header contents consistently from both.

Is this an oversight/bug? Is it intentional behaviour?

Thanks,


Adrian


multiple store-dir issues

2009-07-19 Thread Adrian Chadd
G'day,

I've fixed a potentially risky situation in Lusca relating to the
initialisation of the storeIOState cbdata type. Each storedir has a
different idea of how the allocation should be free()'ed.

The relevant commit in Lusca is r14208 -
http://code.google.com/p/lusca-cache/source/detail?r=14208 .

I'd like this approach to be included in Squid-2.HEAD and backported
to Squid-2.7 / Squid-2.6.

Thanks,


adrian


Re: multiple store-dir issues

2009-07-19 Thread Adrian Chadd
2009/7/20 Henrik Nordstrom hen...@henriknordstrom.net:

 I've fixed a potentially risky situation in Lusca relating to the
 initialisation of the storeIOState cbdata type. Each storedir has a
 different idea of how the allocation should be free()'ed.

 Risky in what sense?

Ah. I just re-re-re-read the code again and I now understand what is
going on. There are multiple definitions of storeIOState cbdata
being allocated instead of one. The definitions are local to each
module.

Ok. Sorry for the noise. I'll commit a fix to COSS for the
initialisation issue someone reported during reconfigure.



Adrian


Re: Hello from Mozilla

2009-07-16 Thread Adrian Chadd
2009/7/17 Ian Hickson i...@hixie.ch:

 That way you are still speaking HTTP right until the protocol change
 occurs, so any and all HTTP compatible changes in the path(s) will
 occur.

 As mentioned earlier, we need the handshake to be very precisely defined
 because otherwise people could trick unsuspecting servers into opting in,
 or rather appearing to opt in, and could then send all kinds of commands
 down to those servers.

Would you please provide an example of where an unsuspecting server is
tricked into doing something?

 Ian, don't you see and understand the semantic difference between
 speaking HTTP and speaking a magic bytecode that is intended to look
 HTTP-enough to fool a bunch of things until the upgrade process occurs
 ? Don't you understand that the possible set of things that can go wrong
 here is quite unbounded ? Don't you understand the whole reason for
 known ports and protocol descriptions in the first place?

 Apparently not.

Ok. Look at this.

The byte sequence "GET / HTTP/1.0\r\nHost: foo\r\nConnection:
close\r\n\r\n" is not byte-equivalent to the sequence "GET /
HTTP/1.0\r\nConnection: close\r\nHost: foo\r\n\r\n".

The same two byte sequences interpreted as HTTP protocol exchanges are
equivalent.

There's a mostly-expected understanding that what happens over port 80
is HTTP. The few cases where that has broken (specifically Shoutcast,
but I do see other crap on port 80 from time to time..) have been by
people who have implemented a mostly-HTTP-looking protocol, tested
that it mostly works via a few gateways/firewalls/proxies, and then
deployed it.

 My suggestion is to completely toss the whole "pretend to be HTTP" thing
 out of the window and look at extending or adding a new HTTP mechanism
 for negotiating proper tunneling on port 80. If this involves making
 CONNECT work on port 80 then so be it.

 Redesigning HTTP is really much more work than I intend to take on here.
 HTTP already has an Upgrade mechanism, reusing it seems the right thing to
 do.

What you intend to take on here and what should be taken on here are
very relevant.
You're intending to do stuff over tcp/80 which looks like HTTP but
isn't HTTP. Everyone who implements anything HTTP gateway related (be
it a transparent proxy, a firewall, a HTTP router, etc) suddenly may
have to implement your websockets stuff as well. So all of a sudden
your attempt to not extend HTTP ends up extending HTTP.

 The point is, there may be a whole lot of stuff going on with HTTP
 implementations that you're not aware of.

 Sure, but with the exception of man-in-the-middle proxies, this isn't a big
 deal -- the people implementing the server side are in control of what the
 HTTP implementation is doing.

That may be your understanding of how the world works, but out here in
the rest of the world, the people who deploy the edge and the people
who deploy the core may not be the same people. There may be a dozen
layers of red tape, equipment lifecycle, security features, etc, that
need to be handled before websockets happy stuff can be deployed
everywhere it needs to.

Please don't discount man-in-the-middle -anything- as being easy to deal with.

 In all cases except a man-in-the-middle proxy, this seems to be what we
 do. I'm not sure how we can do anything in the case of such a proxy, since
 by definition the client doesn't know it is present.

.. so you're still not speaking HTTP?

Ian, are you absolutely certain that everywhere you use the
internet, there is no man in the middle between you and the server
you're speaking to? Haven't you ever worked at any form of corporate
or enterprise environment? What about existing captive portal
deployments like wifi hotspots, some of which still use squid-2.5
(eww!) as their http firewall/proxy to control access to the internet?
That stuff is going to need upgrading, sure, but I'd rather see the
upgrade happen once to a well thought out and reasonably well designed
protocol, versus having lots of little upgrades need to occur because
your HTTP but not quite HTTP exchange on port 80 isn't thought out
enough.




Adrian


Re: Hello from Mozilla

2009-07-15 Thread Adrian Chadd
2009/7/15 Ian Hickson i...@hixie.ch:
 On Tue, 14 Jul 2009, Alex Rousskov wrote:

 WebSocket made the handshake bytes look like something Squid thinks it
 understands. That is the whole point of the argument. You are sending an
 HTTP-looking message that is not really an HTTP message. I think this is
 a recipe for trouble, even though it might solve some problem in some
 environments.

 Could you elaborate on what bytes Squid thinks it should change in the
 WebSocket handshake?

Anything which it can under the HTTP/1.x RFCs.

Maybe I missed it - why exactly again aren't you just talking HTTP on
the HTTP port(s), and doing a standard HTTP upgrade?


Adrian


Re: Hello from Mozilla

2009-07-15 Thread Adrian Chadd
2009/7/15 Amos Jeffries squ...@treenet.co.nz:

 a) Getting a dedicated WebSocket port assigned.
   * You and the client needing it have an argument to get that port opened
 through the firewall.
   * Squid and other proxies can be altered to allow CONNECT through to safe
 defined ports (80 is not one). Or to do the WebSocket upgrade itself.

 b) accepting that the network being traversed is screwed beyond redemption
 by its own policy or admin.

I think the fundamental mistake being made here by Ian (and
potentially others) is breaking the assumption that specific protocols
exist on the well-known ports. Suddenly treating stuff on port 80 as
almost-but-not-quite HTTP is bound to cause issues, both with devices
speaking valid HTTP (eg Squid) and with firewalls etc which may treat
the exchange as not-HTTP and decide to start dropping things. Or worse -
passing it through, sort of.

Ian - I understand your motivations here but I think it shows a
fundamental misunderstanding of the glue which keeps the internet
mostly functioning together. Here's a question for you - would you run
a mythical protocol, call it foonet, over IP, if it looked
almost-but-not-quite like IP so people could run it on their existing
IP networks? Can you see any particular issues with that? Other slots
in the mythical OSI stack shouldn't be treated any differently.


Adrian


Re: [PATCH] Bug 2680: ** helper errors after -k rotate

2009-07-15 Thread Adrian Chadd
Note that winbind has a hard-coded limit that is by default very low.

Opening 2n ntlm_auth helpers may make things blow up in horrible ways.



Adrian

2009/7/16 Robert Collins robe...@robertcollins.net:
 On Thu, 2009-07-16 at 14:08 +1200, Amos Jeffries wrote:

 Both reconfigure and helper recovery use startHelpers() where the
 limit
 needs to take place.
 The DOS bug fix broke *rotate* (reconfigure has an async step added by
 Alex
 that prevents it being a problem).

 s/rotate/reconfigure then :) In my mind one is a subset of the other.

  If someone is running hundreds of helpers on openwrt/olpc then
 things
  are broken already :). I'd really suggest that such environments
  pipeline through a single helper rather than many concurrent
 helpers.
 Such platforms are single core and you'll get better usage of memory
  doing many requests in a single helper than one request each to many
  helpers.

 lol, NTLM concurrent? try it!

 I did. IIRC the winbindd is fully capable of handling multiple
 overlapping requests, and each NTLM helper is *solely* a thunk layer
 between squid's format and the winbindd *state*.

 ASCII art time, 3 requests:
 Multiple helpers:
       /--1-helper--\
 squid-*---2-helper---* winbindd [state1, state2, state3]
       \--3-helper--/
 One helper:
 squid-*--1-helper---* winbindd [state1, state2, state3]

 -Rob




squid-2.HEAD hanging with 304 not modified responses

2009-06-29 Thread Adrian Chadd
G'day guys,

I've fixed a bug in Lusca which was introduced with Benno's method_t
stuff. The specific bug is revalidation replies 'hanging' until the
upstream socket closes, forcing an end of message to occur.

The history and patch are here:
http://code.google.com/p/lusca-cache/source/detail?r=14103

Those of you toying with Squid-2.HEAD (eg Mark) - would you mind
verifying that you can reproduce it on Squid-2.HEAD and comment on the
fix?

Thanks,


adrian


NTLM authentication popups, etc

2009-06-16 Thread Adrian Chadd
I'm working on a couple of paid squid + active directory deployments
and they're both seeing the occasional NTLM auth popup happening.

The workaround is pretty simple - just enable the IP auth cache. This
however doesn't solve the fundamental problem(s), whatever they are.

The symptom is logs like this:

[2009/06/15 16:20:17, 1] libsmb/ntlmssp.c:ntlmssp_update(334)
  got NTLMSSP command 1, expected 3

And vice versa (expected 3, got 1.) These correspond to states in
samba/source/include/ntlmssp.h - 1 is NTLMSSP_NEGOTIATE; 3 is
NTLMSSP_AUTH.
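
For reference, as an enum sketch (values 1 and 3 as stated above; 2 is
the server's challenge in between):

enum {
    NTLMSSP_NEGOTIATE = 1,  /* client -> server: start the handshake */
    NTLMSSP_CHALLENGE = 2,  /* server -> client: challenge           */
    NTLMSSP_AUTH      = 3   /* client -> server: credentials         */
};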

The conclusion here is that there's a disconnect between the
authentication state of the client -and- the authentication state of
ntlm_auth.

I'm trying to eliminate the possibilities here.

The stateful helper stuff seems correct enough, so requests aren't
being queued to already busy stateful helpers.

The other two possibilities I can immediately think of:

* 1 - authentication is aborted somewhere for whatever reason; an
authentication helper is stuck at the wrong point in the state engine;
the next request coming along starts at NTLMSSP_NEGOTIATE but the
ntlm_auth helper it is handed to is at NTLMSSP_AUTH (from the partial
authentication attempt earlier); error
* 2 - the web browser is stuffing different phases of the negotiation
down different connections to the proxy.

Now, debugging (1) shouldn't be difficult at all. I'm going to try and
determine the code paths that lead to and from an aborted auth
request, add in some debugging and see if the helper is closed.

Debugging (2) without full logs (impractical in this environment) and
full traffic dump (again, impractical in production) is going to be a
bit more difficult. I'm thinking about adding some hacky code to the
Squid ntlm auth class which keeps a log of the auth blobs
sent/received from/to the client and ntlm_auth. I can then dump the
entire conversation out to cache.log whenever authentication
fails/errors. This should at least give me a hint as to what is going
on.

(1) can explain the "client state == NTLMSSP_NEGOTIATE but ntlm_auth
state is NTLMSSP_AUTH" problem but not vice versa. (2) explains both.
It is quite possible it is the combination of both, however.

Now, the reason this is getting somewhat annoying and why I'd like to
try and understand/fix it is that -another- problem seen by one of
these clients is negotiate/ntlm authentication from IE (at least IE8)
through Squid. I've got packet dumps showing the browser sending
different phases of the negotiation down separate proxy connections
and then reusing the original one incorrectly. My medium term plan is
to take whatever evidence I have of this behaviour and throw it at the
IE group(s) at Microsoft but in the short term I'd like to make
certain the proxy authentication side of things is completely
blameless before I hand off stuff to third parties.

Ideas? Comments?



adrian


Re: Very odd problem running squid 2.7 on Windows

2009-05-25 Thread Adrian Chadd
strtoul(). But if you want to verify the -whole- thing is numeric,
just write a bit of C which does this:

int isNumeric(const char *str)
{
    for (; *str; str++) {
        if (! isdigit(*str))
            return -1;
    }
    return 1;
}
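
(And the strtoul() route as a sketch - the helper name is made up, and
note strtoul() itself skips leading whitespace and accepts a sign, which
this check does not reject:)

#include <stdlib.h>

int isNumericStrtoul(const char *str)
{
    char *end;
    (void) strtoul(str, &end, 10);
    /* numeric only if at least one digit was consumed and we hit the NUL */
    return (end != str && *end == '\0');
}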

2009/5/25 Amos Jeffries squ...@treenet.co.nz:
 Guido Serassio wrote:

 Hi,

 At 16.17 24/05/2009, Adrian Chadd wrote:

 Well as Amos said, this isn't the way to call getservbyname().

 getservbyname() doesn't translate ports to ports; it translates
 tcp/udp service names to ports. It should be returning NULL if it
 can't find the service string in the file.

 Methinks numeric values shouldn't be handed to getservbyname() under
 Windows. :)

 So, we have just found a Squid bug  :-)

 Regards

 Yes. The question becomes, though: what's the fastest way to detect
 numeric-only strings?

 Amos
 --
 Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
  Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1




Re: Very odd problem running squid 2.7 on Windows

2009-05-25 Thread Adrian Chadd
Actually, it should probably be 1 vs 0; -1 still evaluates to true if
you write if (func()).

 I think the last few hours of fixing bad C and putting in error
checking messed me around a bit. I just wrote a quick bit of C to
double-check.

(Of course in C++ there's a native bool type, no? :)

Sorry!


Adrian

2009/5/25 Kinkie gkin...@gmail.com:
 On Mon, May 25, 2009 at 2:21 PM, Adrian Chadd adr...@squid-cache.org wrote:
 int
 isUnsignedNumeric(const char *str)
 {
    for (; *str; str++) {
        if (! isdigit(*str))
            return -1;
    }
    return 1;
 }

 Wouldn't returning 0 on false instead of -1 be easier?
 Just a random thought..


 --
    /kinkie




Re: Very odd problem running squid 2.7 on Windows

2009-05-24 Thread Adrian Chadd
Well as Amos said, this isn't the way to call getservbyname().

getservbyname() doesn't translate ports to ports; it translates
tcp/udp service names to ports. It should be returning NULL if it
can't find the service string in the file.

Methinks numeric values shouldn't be handed to getservbyname() under Windows. :)
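
The safer pattern is roughly this (a sketch with a made-up helper name;
POSIX headers shown - on Windows it would be Winsock2.h):

#include <stdlib.h>
#include <netdb.h>        /* getservbyname(), struct servent */
#include <arpa/inet.h>    /* ntohs() */

static unsigned short
port_from_token(const char *token, const char *proto)
{
    char *end;
    unsigned long n = strtoul(token, &end, 10);
    if (end != token && *end == '\0')
        return (unsigned short) n;  /* all digits: take the number as-is */
    struct servent *s = getservbyname(token, proto);
    return s ? ntohs((unsigned short) s->s_port) : 0;  /* 0 = not found */
}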

adrian

2009/5/24 Guido Serassio guido.seras...@acmeconsulting.it:
 Hi,

 At 04.38 24/05/2009, Adrian Chadd wrote:

 Can you craft a small C program to replicate the behaviour?

 Sure, I wrote the following test program:

 #include <stdio.h>
 #include <Winsock2.h>

 void main(void)
 {
    u_short i, converted;
    WSADATA wsaData;
    struct servent *port = NULL;
    char token[32];
    const char proto[] = "tcp";

    WSAStartup(2, &wsaData);

    for (i=1; i<65535; i++)
    {
        sprintf(token, "%d", i);
        port = getservbyname(token, proto);
        if (port != NULL) {
            converted = ntohs((u_short) port->s_port);
            if (i != converted)
                printf("%d %d\n", i, converted);
        }
    }
    WSACleanup();
 }

 And this is the result on my Windows XP x64 machine (similar results on
 Windows 2000 and Vista):

 2 512
 258 513
 524 3074
 770 515
 782 3587
 1288 2053
 1792 7
 1807 3847
 2050 520
 2234 47624
 2304 9
 2311 1801
 2562 522
 2564 1034
 2816 11
 3328 13
 3586 526
 3853 3343
 4352 17
 4354 529
 4610 530
 4864 19
 4866 531
 5120 20
 5122 532
 5376 21
 5632 22
 5888 23
 6400 25
 7170 540
 7938 543
 8194 544
 8706 546
 8962 547
 9472 37
 10752 42
 10767 3882
 11008 43
 11266 556
 12054 5679
 13058 563
 13568 53
 13570 565
 13579 2869
 14380 11320
 14856 2106
 15372 3132
 15629 3389
 16165 9535
 16897 322
 17920 70
 18182 1607
 18183 1863
 19977 2382
 20224 79
 20233 2383
 20480 80
 20736 81
 20738 593
 21764 1109
 22528 88
 22550 5720
 22793 2393
 23049 2394
 23809 349
 24335 3935
 25602 612
 25856 101
 25858 613
 26112 102
 27392 107
 27655 1900
 27904 109
 28160 110
 28416 111
 28928 113
 29952 117
 30208 118
 30222 3702
 30464 119
 31746 636
 34049 389
 34560 135
 35072 137
 35584 139
 36106 2701
 36362 2702
 36608 143
 36618 2703
 36874 2704
 37905 4500
 38400 150
 38919 1944
 39173 1433
 39426 666
 39429 1434
 39936 156
 39945 2460
 40448 158
 42250 2725
 43520 170
 44806 1711
 45824 179
 45826 691
 47383 6073
 47624 2234
 47873 443
 47878 1723
 48385 445
 49166 3776
 49664 194
 49926 1731
 50188 3268
 50437 1477
 50444 3269
 50693 1478
 51209 2504
 52235 3020
 53005 3535
 53249 464
 53510 1745
 54285 3540
 55309 3544
 56070 1755
 56579 989
 56585 2525
 56835 990
 57347 992
 57603 993
 57859 994
 58115 995
 59397 1512
 60674 749
 62469 1524
 62980 1270
 64257 507
 65040 4350

 It seems that sometimes (!!!) getservbyname() will incorrectly return
 something ...

 Regards

 Guido


 adrian

 2009/5/24 Guido Serassio guido.seras...@acmeconsulting.it:
  Hi,
 
  One user has reported a very strange problem using cache_peer directive
  on
  2.7 STABLE6 running on Windows:
 
  When using the following config:
 
  cache_peer 192.168.0.63 parent 3329 0 no-query
  cache_peer rea.acmeconsulting.loc parent 3328 3130
 
  the result is always:
 
  2009/05/23 12:35:28| Configuring 192.168.0.63 Parent 192.168.0.63/3329/0
  2009/05/23 12:35:28| Configuring rea.acmeconsulting.loc Parent
  rea.acmeconsulting.loc/13/3130
 
  Very odd 
 
 Debugging the code, I have found where the problem is situated.
 
 The following is GetService() from cache_cf.c:
 
  static u_short
  GetService(const char *proto)
  {
     struct servent *port = NULL;
     char *token = strtok(NULL, w_space);
     if (token == NULL) {
         self_destruct();
         return -1;              /* NEVER REACHED */
     }
     port = getservbyname(token, proto);
     if (port != NULL) {
          return ntohs((u_short) port->s_port);
     }
     return xatos(token);
  }
 
  When the value of port->s_port is 3328, ntohs() always returns 13
  (3328 is 0x0D00, which byte-swaps to 0x000D). Other values seem to
  work fine.
 
  Any idea ?
 
  Regards
 
  Guido
 
 
 
  -
  
  Guido Serassio
  Acme Consulting S.r.l. - Microsoft Certified Partner
  Via Lucia Savarino, 1           10098 - Rivoli (TO) - ITALY
  Tel. : +39.011.9530135  Fax. : +39.011.9781115
  Email: guido.seras...@acmeconsulting.it
  WWW: http://www.acmeconsulting.it/
 
 


 -
 
 Guido Serassio
 Acme Consulting S.r.l. - Microsoft Certified Partner
 Via Lucia Savarino, 1           10098 - Rivoli (TO) - ITALY
 Tel. : +39.011.9530135  Fax. : +39.011.9781115
 Email: guido.seras...@acmeconsulting.it
 WWW: http://www.acmeconsulting.it/




Re: Very odd problem running squid 2.7 on Windows

2009-05-23 Thread Adrian Chadd
Can you craft a small C program to replicate the behaviour?






adrian

2009/5/24 Guido Serassio guido.seras...@acmeconsulting.it:
 Hi,

 One user has reported a very strange problem using cache_peer directive on
 2.7 STABLE6 running on Windows:

 When using the following config:

 cache_peer 192.168.0.63 parent 3329 0 no-query
 cache_peer rea.acmeconsulting.loc parent 3328 3130

 the result is always:

 2009/05/23 12:35:28| Configuring 192.168.0.63 Parent 192.168.0.63/3329/0
 2009/05/23 12:35:28| Configuring rea.acmeconsulting.loc Parent
 rea.acmeconsulting.loc/13/3130

 Very odd 

 Debugging the code, I have found where the problem is situated.

 The following is GetService() from cache_cf.c:

 static u_short
 GetService(const char *proto)
 {
struct servent *port = NULL;
char *token = strtok(NULL, w_space);
if (token == NULL) {
self_destruct();
return -1;  /* NEVER REACHED */
}
port = getservbyname(token, proto);
if (port != NULL) {
 return ntohs((u_short) port->s_port);
}
return xatos(token);
 }

 When the value of port->s_port is 3328, ntohs() always returns 13
 (3328 is 0x0D00, which byte-swaps to 0x000D). Other values seem to work
 fine.

 Any idea ?

 Regards

 Guido



 -
 
 Guido Serassio
 Acme Consulting S.r.l. - Microsoft Certified Partner
 Via Lucia Savarino, 1   10098 - Rivoli (TO) - ITALY
 Tel. : +39.011.9530135  Fax. : +39.011.9781115
 Email: guido.seras...@acmeconsulting.it
 WWW: http://www.acmeconsulting.it/




Re: Is it really necessary for fatal() to dump core?

2009-05-19 Thread Adrian Chadd
2009/5/19 Mark Nottingham m...@yahoo-inc.com:
 I'm going to push back on that; the administrator doesn't really have any
 need to get a core when, for example, append_domain doesn't start with '.'.

 Squid.conf is bloated as it is; if there are cases where a core could be
 conceivably useful, they should be converted to fatal_dump. From what I've
 seen they'll be a small minority at best...

Well, I'd be interested in seeing some better defined characteristics
of stuff with some sort of defined expectations and behaviour. Like
an API. :)

Right now, fatal, assert, etc are all used interchangeably for quite a
wide variety of reasons, and the codebase may be much better off if
someone starts off by fixing these a bit.




Adrian


Re: Is it really necessary for fatal() to dump core?

2009-05-18 Thread Adrian Chadd
just make that behaviour configurable?

core_on_fatal {on|off}
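
i.e. roughly this in tools.c (a sketch; core_on_fatal is the
hypothetical new toggle, shown as a plain flag rather than the real
Config machinery):

#include <stdio.h>
#include <stdlib.h>

static int core_on_fatal = 0;   /* would be parsed from squid.conf */

void
fatal(const char *message)
{
    fprintf(stderr, "FATAL: %s\n", message);
    if (core_on_fatal)
        abort();                /* old behaviour: leave a core */
    exit(1);                    /* new default: exit without one */
}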



Adrian

2009/5/19 Mark Nottingham m...@yahoo-inc.com:
 tools.c:fatal() dumps core because it calls abort.

 Considering that the core can be quite large (esp. on a 64bit system), and
 that there's fatal_dump() as well if you really want one, can we just make
 fatal() exit(1) instead of abort()ing?

 Cheers,

 --
 Mark Nottingham       m...@yahoo-inc.com





Re: 3.0 assertion in comm.cc:572

2009-05-11 Thread Adrian Chadd
2009/5/11 Amos Jeffries squ...@treenet.co.nz:
 We have one user with a fairly serious production machine hitting this
 assertion.
 It's an attempted comm_read of a closed FD after reconfigure.

 Nasty, but I think the asserts can be converted to a nop return. Does anyone
 know of a subsystem that would fail badly after a failed read with all its
 sockets and networking closed anyway?

That will bite you later on if/when you wanted to move to support
Windows overlapped IO / POSIX AIO style kernel async IO on network
sockets. You don't want reads scheduled on FDs that are closed; nor
do you want the FD closed during the execution of the read.

Figure out what is scheduling a read / what is scheduling the
completion incorrectly and fix the bug.



Adrian


/dev/poll solaris 10 fixes

2009-05-03 Thread Adrian Chadd
I'm giving my /dev/poll (Solaris 10) code a good thrashing on some
updated Sun hardware. I've fixed one silly bug of mine in 2.7 and
2.HEAD.

If you're running Solaris 10 and not using the /dev/poll code then
please try out the current CVS version(s) or wait for tomorrow's
snapshots.

I'll commit whatever other fixes are needed in this environment here :)

Thanks,


Adrian


Squid-2/Lusca async io shortcomings..

2009-04-10 Thread Adrian Chadd
Hi all,

I've been braindumping my thoughts into the Lusca blog during some
experimental development to eliminate the data copy in the disk store
read path. This shows up as the number 1 CPU abuser in my test CDN
deployment - where I see a 99% hit rate on a set of large objects
(> 16meg).

My first idea was to avoid having to paper over the storage code
shortcomings with refcounted buffers, and modify various bits of code
to keep the store supplied read buffer around until the completion of
said read IO. This mirrors the requirements for various other
underlying async io implementations such as posix AIO and windows
completion IO.

Unfortunately the store layer and the async IO code don't handle
event cancellation right (ie, you can't do it), but the temporary read
buffer in async_io.c + the callback data pointer check papers over
that. Store reads and writes may be scheduled and in flight when some
other part of code calls storeClose() and nothing really tries to wait
around for the read IO to complete.

So either the store layer needs to be made slightly more sane (which I
may attempt later), or the whole mess can stay a mess and be papered
over by abusing refcounted buffers all the way down to the IO layer.

Anyway, I know there are other developers out there working on
filesystem code for Squid-3 and I'm reasonably certain (read: at last
check a few months ago) the store layer and IO layers are just as
grimy - so hopefully my braindumping will save some more of you a
whole lot of headache. :)




Adrian


Re: Feature: quota control

2009-02-26 Thread Adrian Chadd
I'm looking at implementing this as part of a contract for squid-2.

I was going to take a different approach - that is, I'm not going to
implement quota control or management in squid; I'm going to provide
the hooks to squid to allow external controls to handle the quota.



adrian

2009/2/21 Pieter De Wit pie...@insync.za.net:
 Hi Guys,

 I would like to offer my time in working on this feature - I have not done
 any squid dev, but since I would like to see this feature in Squid, I
 thought I would take it on.

 I have briefly contacted Amos off list and we agreed that there is no set
 in stone way of doing this. I would like to propose that we then start
 throwing around some ideas and let's see if we can get this into squid :)

 Some ideas that Amos quickly said :

   - Based on delay pools
   - Use of external helpers to track traffic


 The way I see this happening is that a Quota is like a pool that empties
 based on 2 classes - bytes and requests. Requests will be for things like
 the number of requests, i.e. a person is only allowed to download 5 exe's
 per day or 5 requests of 1meg or something like that (it just popped into
 my head :) )

 Bytes is a pretty straight forward one, the user is only allowed x amount of
 bytes per y amount of time.

 Anyways - let the ideas fly :)

 Cheers,

 Pieter




Resigning from squid-core

2009-01-31 Thread Adrian Chadd
Hi all,

It's been a tough decision, but I'm resigning from any further active
role in the Squid core group and cutting back on contributing towards
Squid development.

I'd like to wish the rest of the active developers all the best in the
future, and thank everyone here for helping me develop and test my
performance and feature related Squid work.



Adrian


Re: Buffer/String split, take2

2009-01-21 Thread Adrian Chadd
2009/1/21 Kinkie gkin...@gmail.com:

 What I fear from the DC approach is that we'll end up with lots of
 duplicate code between the 'buffer' classes, to gain a tiny little bit
 of efficiency and semantic clarity. If that approach has to be taken,
 then I'd rather take the variant of the note - in fact that's quite in
 line with what the current (agreeably ugly) code does.

The trouble is that the current, agreeably ugly code actually works
(for values of "works") right now, and the last thing the project
needs is for that "works" bit to be disturbed too much.

 In my opinion the 'universal buffer' model can be adapted quite easily
 to address different uses by extending its allocation strategy - it's
 a self-contained function of code exactly for this purpose, and it
 could be extended again by using Strategy patterns to do whatever the
 caller wishes. It would be trivial for instance for users to request
 that the underlying memory be allocated by the pageful, or to request
 preallocation of a certain amount of memory if they know they'll be
 using, etc.
 Having a wide interface is a drawback of the Universal approach,

But you don't know how that memory should be arranged. If it's just for
strings, then you know the memory should be arranged in whatever makes
sense to minimise memory allocator overheads. In the parsing codepath,
that involves parsing and creating references to an already-allocated
large chunk of RAM, instead of copying into separately allocated
areas. For things like disk IO (and later on, network IO too!) this
may not be as obvious a case. In fact, based on the -provider-
(anonymous? disk? network? some peer module?) you may want to request
pages from -them- to put data into for various reasons, as simply
grabbing an anonymous page from the system allocator and filling it
with data may need -another- copy step later on.

This is why I'm saying that right now, focusing on -just- the String
stuff and the minimum required to do copy-free parsing and copying in
and out of the store is probably the best bet. A universal buffer
method is probably over-reaching things. There's a lot of code in
Squid which needs tidying up, and whatever we come up with, -all- of it
-has- to happen -regardless- of what buffer abstraction(s) we choose.

 Regarding vector i/o, it's almost a no-brainer at a first glance:
 given UniversalBuffer, implement UniversalBufferList and make MemBuf
 use the latter to implement producer-consumer semantics. Then use this
 for writev(). produce and consume become then extremely lightweight
 calls. Let me remind you that currently MemBuf happily memmoves
 contents at each consume, and other producer-consumer classes I could
 find (BodyPipe and StoreEntry) are entirely different beasts, which
 would benefit from having their interfaces changed to use
 UniversalBuffers, but probably not their innards.

And again, what I'm saying here is that a conservative, cautious
approach now is likely to save a lot of risk in the development path
forward.

 Regarding Adrian's proposal, he and I discussed the issue extensively.
 I don't agree with him that the current String will give us the best
 long-term benefits. My expectation is (but we can only know after we
 have at least some extensive use of it) that the cheap substringing
 features of the current UniversalBuffer implementation will give us
 substantial benefits in the long term.
 I agree with him that fixing the most broken parts of the String
 interface is a sensible strategy for merging whatever String
 implementation we end up choosing.

 I fear that if we focus too much on the long-term, we may end up
 losing sight of the medium-term, and thus we risk reaching neither
 because short-term no one does anything. EVERYONE keeps on asserting
 that squid (2 and 3) has low-level issues to be fixed, yet at the same
 time only Adrian does something in squid-2, and I feel I'm the only
 one trying to do something in squid-3 - PLEASE correct me and prove me
 wrong.

*shrug* I think people keep choosing the wrong bits to bite off. I'm
not specifically talking about you Kinkie, this certainly isn't the
only instance where the problem isn't really fully understood.

The problem in my eyes is that no one understands the entire Squid-3
codebase well enough to start to understand what needs to happen and begin
engineering an actual path forward. Everyone knows their little
corner of the codebase. Squid-3 seems to be plagued by little
mini-projects which focus on specific areas without much knowledge of
how it all holds together, and all kinds of busted behaviour ensues.

 There's another issue which worries me: the current implementation has
 been in the works for 5 months; there have been two extensive reviews,
 two half-rewrites and endless discussions. Now the issue crops up that
 the basic design - whose blueprint has also been available for 5
 months in the wiki - is not good, and that we may end up having to
 basically start from scratch. How can we as 

Re: Ref-counted strings in Squid-2/Cacheboy

2009-01-21 Thread Adrian Chadd
I'd like to avoid having to write to those pages if possible. Leaving
the incoming data as read-only will save another write-back pass for
those pages through the cache/bus, and in the case of tiny objects
(ie, where parsing becomes a -big- part of the overhead), that may end
up hurting.

NUL terminated strings make iteration easier (you only need an address
register and a check for 0) but current CPUs with their
plenty-of-registers and superscalar execution mostly make that point
moot. You can check, increment the pointer and decrement a length
value pretty damned quickly. :)
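
As a trivial sketch of that pointer-plus-length style (made-up helper):

#include <stddef.h>

int count_spaces(const char *buf, size_t len)
{
    int n = 0;
    for (; len; len--, buf++)   /* check, increment, decrement */
        if (*buf == ' ')
            n++;
    return n;
}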

There aren't all that many places that assume C buffer semantics for
String. Most of it isn't all that hairy (access_log, etc); some of it
is only hairy because of the use of _C_ string library functions with
String.buf() (ftp); the biggest annoyances are the vary code and the
client-side code. Oh, and one has to copy the buffer anyway for regexp
lookups (the POSIX regex API requires a NUL-terminated string), at least
until we convert to PCRE, which can and does take a length parameter to
a regex run function. :)

The point is, once you've been forced to tidy up the String users by
removing the assumption that NUL will occur, you'll (hopefully) have
been forced to write nicer replacement code, and everyone benefits
from that.


Adrian

2009/1/21 Henrik Nordstrom hen...@henriknordstrom.net:
 Fri 2009-01-16 at 12:53 -0500, Adrian Chadd wrote:

 So far, so good. It turns out doing this as an intermediary step
 worked out better than trying to replace the String code in its
 entirety with replacement code which doesn't assume NUL terminated
 strings.

 Just a thought, but is there really any parsing step where we can not
 just overwrite the next octet with a \0 to get null-terminated strings?
 This is what the parser does today, right?

 The HTTP parser certainly can in-place null-terminate everything. Header
 names always end with a ':' which we always throw away, and the data ends
 with a newline which is also thrown away.

 Regards
 Henrik




Re: IRC Meetup logs up in the wiki

2009-01-21 Thread Adrian Chadd
 Uhm, guess I go on holiday and miss out on EVERYTHING. I got back on the
 17th and would have loved to attend had I the presence of mind to have
 checked.

:) Hey, someone got a holiday! Quick, he's relaxed enough now to work! :)


 Sorry guys.

 In other news I've got some new exposed counters for squid-2 performance -
 will port to 3.1 and then submit for review. Also planning to extend
 cachemgr to output xml as an alternative, which will allow far simpler
 processing and xsl transforms.

Do you have the patches against Squid-2 available?


adrian


 Extended cacti monitoring of all relevant bits is in process and will be
 available soon.

 Regardt




Re: Buffer/String split, take2

2009-01-20 Thread Adrian Chadd
2009/1/20 Alex Rousskov rouss...@measurement-factory.com:

 Please voice your opinion: which design would be best for Squid 3.2 and
 the foreseeable future.

[snip]

I'm about 2/3rds of the way along the actual implementation path of
this in Cacheboy so I can provide an opinion based on increasing
amounts of experience. :)

[Warning: long, somewhat rambly post follows, from said experience.]

The thing I'm looking at right now is what buffer design is required
to adequately handle the problem set. There's a few things which we
currently do very stupidly in any Squid related codebase:

* storeClientCopy - which Squid-2.HEAD and Cacheboy avoid the copy on,
but it exposes issues (see below);
* storeAppend - the majority of data coming -into- the cache (ie,
anything from an upstream server, very applicable today for forward
proxies, not as applicable for high-hit-rate reverse proxies) is still
memcpy()'ed, and this can use up a whole lot of bus time;
* creating strings - most strings are created during parsing; few are
generated themselves, and those which are, are at least half static
data which shouldn't be re-generated over and over and over again;
* duplicating strings - httpHeaderClone() and friends - dup'ing
happens quite often, and making it cheap for the read only copies
which are made would be fantastic
* later on, being able to use it for disk buffers, see below
* later on, being able to properly use it for the memory cache, again see below

The biggest problems I've hit thus far stem from the data pipeline
from server - memstore - store client - client side. At the moment,
the storeClientCopy() call aggregates data across the 4k stmem page
size (at least in squid-2/cacheboy, I think its still 4k in squid-3)
and thus if your last access gave you half a page, your next access
can get data from both the other half of the page and whatever is in
the next buffer. Just referencing the stmem pages in 2.HEAD/Cacheboy
means that you can (and do) end up with a large number of small reads
from the memory store. You save on the referencing, but fail on the
work chunk size. You end up having to have a sensible reference
counted buffer design -and- a vector list to operate on it with.
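
Concretely, by "reference counted buffer design -and- a vector list" I
mean something like the following sketch (illustrative only, not the
actual stmem code):

    #include <stddef.h>

    typedef struct {
        char *data;
        size_t size;
        int refcount;       /* page is freed when this hits zero */
    } buf_t;

    typedef struct {
        buf_t *buf;         /* shared stmem page, not a copy */
        size_t off, len;    /* the region of it this client was handed */
    } bufref_t;

    typedef struct {
        bufref_t *vec;      /* the small reads, strung together */
        int count;
    } bufvec_t;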

The string type right now makes sense if it references a contiguous,
linear block of memory (ie, a sub-region of a contig buffer). This is
how its treated today. For almost all of the lifting inside Squid
proper, that may be enough. There may however be a need later on for
string-like and buffer-like operations on buffer -vectors- - for
example, if you're doing some kind of content scanning over incoming
data, you may wish to buffer your incoming data until you have enough
data to match that string which is straddling two buffers - and the
current APIs don't support it. Well, nothing in Squid supports it
currently, but I think its worth thinking about for the longer term.

Certainly though, I think that picking a sensible string API with
absolutely no direct buffer access out of a few controlled areas (eg,
translating a list of strings or list of buffers into an iovec for
writev(), for example) is the way to go. That will equip Squid with a
decent enough set of tools to start converting everything else which
currently uses C strings over to using Squid Strings and eventually
reap the benefits of the zero-cost string duplication.
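
And one of those few controlled areas, sketched against the bufref_t /
bufvec_t types above (again illustrative only):

    #include <sys/uio.h>

    /* translate a buffer vector into an iovec for writev();
       capped at 16 regions here just to keep the sketch short */
    static ssize_t bufvec_write(int fd, const bufvec_t *bv)
    {
        struct iovec iov[16];
        int i, n = bv->count < 16 ? bv->count : 16;
        for (i = 0; i < n; i++) {
            iov[i].iov_base = bv->vec[i].buf->data + bv->vec[i].off;
            iov[i].iov_len = bv->vec[i].len;
        }
        return writev(fd, iov, n);
    }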

Ok, to summarise, and this may not exactly be liked by the majority of
fellow developers:

I think the benefits that augmenting/fixing the current SquidString
API and tidying up all the bad places where its used right now is
going to give you the maximum long-term benefit. There's a lot of
legacy code right now which absolutely needs to be trashed and
rewritten. I think the smartest path forward is to ignore 95% of the
decision about deciding which buffering method to use for now, fix the
current String API and all the code which uses it so its sensible (and
fixing it so its sensible won't take long; fixing the code which
uses it will take longer) and at that point the codebase will be in
much better shape to decide which will be the better path forward.

Now, just so people don't think I'm stirring trouble, I've gone
through this myself in both a squid-2 branch and Cacheboy, and here's
what I found:

* there's a lot of code which uses C strings created from Strings;
* there's a lot of code which init'ed strings from C strings, where
the length was already known and thrown out;
* there's a lot of code which init'ed strings from C strings which
were once Strings;
* there's even code which init's strings -from- a string, but only by
using strBuf(s) (I'm pointing at the http header related code here,
ugh)
* all the stuff which directly accesses the string buffer code can and
should be tossed, immediately - unfortunately there's a lot of it, the
majority being in what I gather is very long-lived code in
src/client_side.c (and what it became in squid-3)

So what I'm sort of doing now in Cacheboy-head, combined 

Ref-counted strings in Squid-2/Cacheboy

2009-01-16 Thread Adrian Chadd
I've just created a branch off of my Cacheboy tree and dumped in the
first set of changes relating to ref-counted strings into it.

They're not as useful and flexible as the end-goal we all want -
specifically, this pass just creates ref counted NUL-terminated C
strings, so creating references of regions of other strings / buffers
isn't possible. But it does mean that duplicating header sets (ie,
httpHeaderClone() I think?) becomes bloody cheap. The next move -
removing the requirement for the NUL-termination - is slightly hairer,
but still completely doable (and I've done it in a previous branch in
sourceforge, so I know whats required.) Thats when the real benefits
start to appear.
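
In sketch form, the intermediate step looks something like this (names
are illustrative; the real code is in the branch linked below):

    #include <stdlib.h>

    typedef struct {
        char *buf;          /* still NUL-terminated at this stage */
        size_t len;
        int refcount;
    } strbuf_t;

    typedef struct {
        strbuf_t *s;
    } String;

    /* the httpHeaderClone()-style duplicate becomes O(1) */
    static String stringDup(String a)
    {
        a.s->refcount++;
        return a;
    }

    static void stringClean(String *a)
    {
        if (a->s && --a->s->refcount == 0) {
            free(a->s->buf);
            free(a->s);
        }
        a->s = NULL;
    }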

So far, so good. It turns out doing this as an intermediary step
worked out better than trying to replace the String code in its
entirety with replacement code which doesn't assume NUL terminated
strings.

http://code.google.com/p/cacheboy/source/list?path=/branches/CACHEBOY_HEAD_strref

This, and all the other gunk thats gone into cacheboy over the last
few months during the reorganisation and tidyup, still mostly
represents where I think Squid core codebase should have gone / should
be going at the present time.

Enjoy. :)


Adrian


Re: [PATCH] WCCPv2 documentation and cleanup for bug 2404

2009-01-10 Thread Adrian Chadd
Have you tested these changes against various WCCPv2 implementations?

I do recall some structure definitions in the draft mis-matching the
wide number of IOS versions out there, this is why I'm curious.



Adrian

2009/1/10 Amos Jeffries squ...@treenet.co.nz:
 This patch:
  - adds a reference to each struct mentioning the exact draft
   RFC section where that struct is defined.
  - fixes sent mask structure fields to match draft. (bug 2404)
  - removes two duplicate useless structs

 Submitting as a patch to give anyone interested time to double-check the
 code changes.


 As a result we are a step closer toward splitting the code into a separate
 library. It's highlighted some of the WCCPv2 issues and a pathway forward
 now clear:
  - move types definitions to a protocol types header (wccp2_types.h ?)
  - correct mangled definitions for generic use. including code in that.
  - add capability handling
  - add hash/mask service negotiation
  - add sibling peer discovery through WCCP group details ??


 Amos
 --
 Please be using
  Current Stable Squid 2.7.STABLE5 or 3.0.STABLE11
  Current Beta Squid 3.1.0.3



Re: When can we make Squid using multi-CPU?

2009-01-07 Thread Adrian Chadd
2009/1/8 Alex Rousskov rouss...@measurement-factory.com:

 SMP support has been earmarked for Squid v3.2 but there is currently not
 enough resources to make it happen (AFAICT) so it may have to wait until
 v3.3 or later.

 FWIW, I think that multi-core scalability in many environments would not
 require another Squid rewrite, especially if initial support does not
 have to do better than running multiple Squids.

Well, people are already doing that where it's suitable. What's really
missing for those sorts of setups is a simple(!) storage-only backend
and some smarts in Squid to be able to push and pull stuff out of a
shared storage backend, rather than relaying through it.

The trouble, as I've found here, is if you're trying to aggregate a
bunch of forward proxy squid instances on one box through one backend
squid instance - all of a sudden you end up with lots of RAM wastage
and things die at high loads with all the duplicate data floating
around in socket buffers. :/



Adrian


Re: When can we make Squid using multi-CPU?

2009-01-05 Thread Adrian Chadd
I've been looking into what would be needed to thread squid as part of
my cacheboy squid-2 fork.

Basically, I've been working on breaking out a bunch of the core code
into libraries, which I can then check and verify are thread-safe. I
can then use these bits in threaded code.

My first goal was probably to break out the ACL and internal URL
rewriter code into threads, but the current use of the callback data
setup in Squid makes passing cbdata pointers into other threads quite
uhm, tricky.

The basic problem is that although a given chunk of memory backing a
cbdata pointer will remain valid for as long as the reference exists,
the -data itself- may not be valid at any point. So if thread A
creates a cbdata pointer and passes it into thread B to do something
(say an ACL lookup), there's no way (at the moment) for thread B to
guarantee, at any/all points during its execution, that the data behind
the pointer will stay valid without a whole lot of pissing around with
locking, which I'd absolutely like to avoid in a high performance
network application, even given the apparently wonderful locking
performance of current hardware. :)

So for the time being, I'm looking at what would be needed for a basic
inter-thread batch event/callback message queue, sort of like
AsyncCalls in squid-3 but minus 100% of the legacy cruft; and then
I'll see what kind of tasks can be pushed out to the threads.
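
The queue itself is the easy part. A minimal sketch of the sort of
thing I mean, in plain pthreads (illustrative; not the actual cacheboy
code, and note the payload must not be a bare cbdata pointer for the
reasons above):

    #include <pthread.h>
    #include <stdlib.h>

    typedef void (*cb_func)(void *arg);

    typedef struct cb_msg {
        cb_func func;
        void *arg;              /* note: not a cbdata pointer! */
        struct cb_msg *next;
    } cb_msg;

    typedef struct {
        pthread_mutex_t lock;
        pthread_cond_t wakeup;
        cb_msg *head, *tail;
    } cb_queue;

    static void cb_queue_push(cb_queue *q, cb_func func, void *arg)
    {
        cb_msg *m = malloc(sizeof(*m));
        m->func = func;
        m->arg = arg;
        m->next = NULL;
        pthread_mutex_lock(&q->lock);
        if (q->tail)
            q->tail->next = m;
        else
            q->head = m;
        q->tail = m;
        pthread_cond_signal(&q->wakeup);
        pthread_mutex_unlock(&q->lock);
    }

    /* take the whole pending batch under one lock hold, then run
     * the callbacks with the lock dropped */
    static void cb_queue_run(cb_queue *q)
    {
        cb_msg *m, *next;
        pthread_mutex_lock(&q->lock);
        while (q->head == NULL)
            pthread_cond_wait(&q->wakeup, &q->lock);
        m = q->head;
        q->head = q->tail = NULL;
        pthread_mutex_unlock(&q->lock);
        for (; m; m = next) {
            next = m->next;
            m->func(m->arg);
            free(m);
        }
    }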

Hopefully a bunch of stuff can be easily pushed out to threads with a
minimum amount of effort, such as some/all of the ACL lookups, some
URL rewriting, some GZIP and other kinds of basic content manipulation,
and the freakishly simple (comparatively) server-side HTTP code
(src/http.c). But doing that requires making sure a bunch of the low
level code is suitably re-entrant/thread-safe/etc, and this includes
a -lot- of stuff (lib/, debug, logging, memory allocation, some
statistics gathering, chunks of the HTTP parsing and packing routines,
the packer routines, membufs, etc.)

Thankfully (in Cacheboy) I've broken out almost all of the needed
stuff into top-level libraries which can be independently audited for
thread-happiness. There's just some loose ends which need tidying up.
For example, almost all of the code in libhttp/ in cacheboy (ie, basic
http header and header entry stuff, parsing, range request headers,
cc, headers, etc) are thread-safe, but the functions -they- call (such
as the base64 functions) use static buffers which may or may not be
thread-safe. Stuff which calls the legacy non-safe inet_* routines, or
perhaps the non thread-safe strtok() and other string.h functions, all
need to be fixed.

Threading the rest of it would take a lot, -lot- more time. A
thread-aware storage backend (disk, memory, store index) is definitely
an integral part of making a threaded Squid, and a whole lot more code
modularity and reorganisation would have to take place for that to
occur.

Want to help? :)


Adrian

2009/1/4 ShuXin Zheng zhengshu...@gmail.com:
 I've done this before, running multiple squids on one machine to use
 multiple CPUs, but they can't share the same store-fs, and I must
 configure multiple IPs on the same machine. Can we rewrite squid as
 follows:

 thread0 ... threadn   (n = CPU number; each thread's client side is
                        non-blocking and can accept many connections)
      |
      v
  access check
      |
      v
  http header parse
      |
      v
  acl filter
      |
      v
  check local cache
      |
      v
  neighbor <--+                +----------+--- ufs
              |--- forward ----| store fs |--- aufs
  webserver <-+                +----------+--- coss
                               (shared by thread0, thread1, ...)




 2009/1/4  anest...@cisdi.com:

 I've found the best way is to run multiple copies of squid on a single
 machine, and use LVS to load balance between the squid processes.

 -- Joe

 Quoting Adrian Chadd adr...@squid-cache.org:

 when someone decides to either help code it up, or donate towards the
 effort.



 adrian

 2009/1/3 ShuXin Zheng zhengshu...@gmail.com:

 Hi, Squid can currently only use one CPU, but multi-CPU hardware
 machines are so popular. This is a great waste. How can we use
 the multiple CPUs? Can we separate some CPU-intensive parallel
 sections to run on different CPUs? OMP (http://openmp.org/wp/) gives
 us some ideas about using multiple CPUs, so can we use this
 technology in Squid?

 Thanks

Re: When can we make Squid using multi-CPU?

2009-01-03 Thread Adrian Chadd
when someone decides to either help code it up, or donate towards the effort.



adrian

2009/1/3 ShuXin Zheng zhengshu...@gmail.com:
 Hi, Squid can currently only use one CPU, but multi-CPU hardware
 machines are so popular. This is a great waste. How can we use
 the multiple CPUs? Can we separate some CPU-intensive parallel
 sections to run on different CPUs? OMP (http://openmp.org/wp/) gives
 us some ideas about using multiple CPUs, so can we use this
 technology in Squid?

 Thanks

 --
 zsxxsz




Re: Introductions

2008-12-31 Thread Adrian Chadd
Welcome!

2008/12/30 Regardt van de Vyver sq...@vdvyver.net:
 Hi Dev Team.

 My name is Regardt van de Vyver, a technology enthusiast who tinkers with
 squid on a regular basis. I've been involved in development for around 12
 years and am an active participant on numerous open source projects.

 Right now I'm focussed on improving and extending performance metrics for
 squid, specifically related to SNMP and the cachemanager.

 I'd like to take a more active role in the coming year from a dev
 perspective and feel the 1st step here is to at least get my butt onto the
 dev mailing list ;-)

 I look forward to getting involved.

 Regards,

 Regardt van de Vyver




src/debug.cc : amos?

2008-12-25 Thread Adrian Chadd
Amos, whats this for in src/debug.cc ?

//*AYJ:*/if (!Config.onoff.buffered_logs)
fflush(debug_log);



Adrian


Re: Migrating debug code from src/ to src/debug/

2008-12-25 Thread Adrian Chadd
Ok, besides the lacking build dependency on src/core and src/debug, I
think the first round of changes are finished. That is, the ctx/debug
routines and all that they depend on have been shuffled out of src/
and into src/core / src/debug as appropriate.

I've pushed the changes to the launchpad URL mentioned previously.

I'd like some feedback and some assistance figuring out how/where to
convince src/Makefile.am that the two above directories are build
prereqs for almost everything. There are a -lot- of build targets in
that Makefile under Squid-3 and I'm not sure that I want to add to the
mess in a naive way.

Thanks,



Adrian


Re: X-Vary-Options support

2008-12-21 Thread Adrian Chadd
2008/12/20 Mark Nottingham m...@yahoo-inc.com:
 I agree. My impression was that it's pretty specific to their requirements,
 not a good general solution.

Well, I'm all ears about a slightly more flexible solution. I mean,
this is an X-* header; we could simply document it as a Squid specific
feature once a few basic concerns have been addressed, and leave
nutting out the right solution to the IETF group. :)



Adrian


Re: Migrating debug code from src/ to src/debug/

2008-12-21 Thread Adrian Chadd
2008/12/18 Adrian Chadd adr...@freebsd.org:
 I've begun fiddling with migrating the bulk of the debug code out of
 src/ and into src/debug/; as per the source reorganisation wiki page.

The next step is migrating some other stuff out and doing some API
hiding hijinx of the debugging logfile code - a bunch of code directly
frobs the debug log fd/filehandle for various nefarious purposes. Grr.

The other next thing is to sort out where to put the SquidTime stuff,
which is used by the debug code. I'll create src/core for now in my
branch to put this random stuff; I'll worry about the final
destination for it all later.

I couldn't tease apart ctx and debug all that much in cacheboy (and I
couldn't figure out how it should or may be done as an exercise
either) so I'll just lump them together.



Adrian


Re: Migrating debug code from src/ to src/debug/

2008-12-21 Thread Adrian Chadd
Would someone perhaps enlighten me why Squid-3 is trying to install
src/SquidTime.h as part of some build rule, and why moving it out of
the way (into src/core/) has resulted in make install completely
failing?

I'm having some real trouble understanding all of the gunk thats in
the Squid-3 src/Makefile.am and its starting to give me a headache.

Thanks,


Adrian


Migrating debug code from src/ to src/debug/

2008-12-18 Thread Adrian Chadd
I've begun fiddling with migrating the bulk of the debug code out of
src/ and into src/debug/; as per the source reorganisation wiki page.

The first step is to just relocate the syslog facility code out, which
I've done.

The next step is to break out the debug code which handles the actual
debugging into src/debug/.

The changes can be viewed at
http://bazaar.launchpad.net/~adrian-squid-cache/squid/adrian_src_reorganise/
.

I'll post again when I've finished the debug code shuffle so I can
figure out the right way to submit the change request.



Adrian


X-Vary-Options support

2008-12-17 Thread Adrian Chadd
Hi,

I've got a small contract to get Squid going in front of a small group
of Mediawiki servers and one of the things which needs adding is the
X-Vary-Options support.

So is there any reason whatsoever that it can't be committed to
Squid-2.HEAD as-is, and at least backported (but not committed to
start with) to squid-2.7?

I remember the Wiki guys' issues wrt Variant purging, which I'm hoping
Y! and Benno have sorted out, and I'm not looking to commit anything
relating to that now - just the X-Vary-Options support.

Thanks,


Adrian


Re: Request for new round of SBuf review

2008-12-06 Thread Adrian Chadd
Howdy,

As most of you aren't aware, Kinkie, alex and I had a bit of a
discussion about this on IRC rather than on the mailing list, so
there's probably some other stuff which should be posted here.

Kinkie, are you able to post some updated code + docs after our discussion?

My main suggestion to Kinkie was to take his code and see how well it
worked with some test use cases - the easiest and most relevant one
being parsing HTTP requests and building HTTP replies. I think that a
few test case implementations outside of the Squid codebase will be
helpful in both understanding the issues which this sort of class is
trying to solve.

I would really be against integrating it into Squid mainline until
we've all had a chance to play with it without being burdened by the
rest of Squid. :)



Adrian


2008/12/4 Kinkie [EMAIL PROTECTED]:
 Hi all,
   I feel that SBuf may just be complete enough to be considered a
 viable replacement for SquidString, as a first step towards
 integration.
 I'd appreciate anyone's help in giving it a check to gather feedback
 and suggestions.

 Doxygen documentation for the relevant classes is available at
 http://eu.squid-cache.org/~kinkie/sbuf-docs/ , the code is at
 lp:~kinkie/squid/stringng
 (https://code.launchpad.net/~kinkie/squid/stringng).

 Thanks!

 --
/kinkie




Re: The cache deny QUERY change... partial rollback?

2008-12-01 Thread Adrian Chadd
2008/12/1 Henrik Nordstrom [EMAIL PROTECTED]:
 After analyzing a large cache with significantly declining hit ratio
 over the last months I have came to the conclusion that the removal of
 cache deny QUERY can have a very negative impact on hit ratio, this due
 to a number of flash video sites (youtube, google, various porno sites
 etc) who include per-view unique query parameters in the URL and
 responding with a cachable response.

 Because of this I suggest that we add back the cache deny rule in the
 recommended config, but leave the refresh_pattern change as-is.

 People running reverse proxies or combating these cache busting sites
 using store rewrites know how to change the cache rules, while many
 users running general proxy servers are quite negatively impacted by
 these sites if caching of query urls is allowed.

Hm, thats kind of interesting actually. Whats it displacing from the
cache? Is the drop of hit ratio due to the removal of other cachable
large objects, or other cachable small objects? Is it -just- flash
video thats exhibiting this behaviour?

Are you able to put up some examples and statistics? I really think
the right thing to do here is look at what various sites are doing and
try to open a dialogue with them. Chances are they don't really know
exactly how to (ab)use HTTP to get the semantics they want whilst
retaining control over their content.
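
For reference, the rule Henrik is suggesting we restore is roughly the
old recommended-config pair:

    acl QUERY urlpath_regex cgi-bin \?
    cache deny QUERY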



Adrian


Re: Rv: Why not BerkeleyDB based object store?

2008-11-26 Thread Adrian Chadd
I thought about it a while ago, but I'm just out of time, to be honest.
Writing objects to disk only if they're popular or you need the RAM to
handle concurrent accesses for large objects for some reason would
probably way way improve disk performance as the amount of writing
would drop drastically.

Sponsorship for investigating and developing this is gladly accepted :)


Adrian


2008/11/26 Mark Nottingham [EMAIL PROTECTED]:
 Just a tangental thought; has there been any investigation into reducing the
 amount of write traffic with the existing stores?

 E.g., establishing a floor for reference count; if it doesn't have n refs,
 don't write to disk? This will impact hit rate, of course, but may mitigate
 in situations where disk caching is desirable, but writing is the
 bottleneck...


 On 26/11/2008, at 9:14 AM, Kinkie wrote:

 On Tue, Nov 25, 2008 at 10:23 PM, Pablo Rosatti
 [EMAIL PROTECTED] wrote:

 Amazon uses BerkeleyDB for several critical parts of its website. The
 Chicago Mercatile Exchange uses BerkeleyDB for backup and recovery of its
 trading database. And Google uses BerkeleyDB to process Gmail and Google
 user accounts. Are you sure BerkeleyDB is not a good idea to replace the
 Squid filesystems even COSS?

 Squid3 uses a modular storage backend system, so you're more than
 welcome to try to code it up and see how it compares.
 Generally speaking, the needs of a data cache such as squid are very
 different from those of a general-purpose backend storage.
 Among the other key differences:
 - the data in the cache has little or no value.
  it's important to know whether a file was corrupted, but it can
 always be thrown away and fetched from the origin server at a
 relatively low cost
 - workload is mostly writes
  a well-tuned forward proxy will have a hit-rate of roughly 30%,
 which means 3 writes for every read on average
 - data is stored in incremental chunks

 Given these characteristics, a long list of mechanisms database-like
 systems have such as journaling, transactions etc. are a  waste of
 resources.
 COSS is explicitly designed to handle a workload of this kind. I would
 not trust any valuable data to it, but it's about as fast as it gets
 for a cache.

 IMHO BDB might be much more useful as a metadata storage engine, as
 those have a very different access pattern than a general-purpose
 cache store.
 But if I had any time to devote to this, my priority would be in
 bringing 3.HEAD COSS up to speed with the work Adrian has done in 2.

 --
   /kinkie

 --
 Mark Nottingham   [EMAIL PROTECTED]





Re: omit to loop-forever processing some regex acls

2008-11-26 Thread Adrian Chadd
G'day!

If these are patches against Squid-2 then please put them into the
Squid bugzilla so we don't lose them.

There's a different process for Squid-3 submissions.

Thanks!


Adrian


2008/11/26 Matt Benjamin [EMAIL PROTECTED]:
 --

 Matt Benjamin

 The Linux Box
 206 South Fifth Ave. Suite 150
 Ann Arbor, MI  48104

 http://linuxbox.com

 tel. 734-761-4689
 fax. 734-769-8938
 cel. 734-216-5309



Re: access_log acl not observing my_port

2008-11-13 Thread Adrian Chadd
g'day!

Just create a ticket in the Squid bugzilla and put the patch into there.

Thanks for your contribution!



Adrian


2008/11/13 Stephen Thorne [EMAIL PROTECTED]:
 G'day,

 I've been looking into a problem we've observed where this situation
 does not work as expected, this is in squid-2.7.STABLE4:

 acl direct myport 8080
 access_log /var/log/squid/direct_proxy.log common direct

 I did some tracing through the code and established that this chain of
 events occurs:
 httpRequestFree calls clientAclChecklistCreate calls aclChecklistCreate

 But aclChecklistCacheInit is the function that populates the
 checklist->my_port, which is required for a myport acl to work, and it
 isn't called.

 I have attached a patch that fixes this particular problem for me, which
 simply calls aclChecklistCacheInit in clientAclChecklistCreate.

 --
 Regards,
 Stephen Thorne
 Development Engineer
 NetBox Blue - 1300 737 060

 Scanned by the NetBox from NetBox Blue
 (http://netboxblue.com/)


 Scanned by the NetBox from NetBox Blue
 (http://netboxblue.com/)




delayed forwarding is in Squid-2.HEAD

2008-10-16 Thread Adrian Chadd
G'day,

I've just committed the delayed forwarding stuff into Squid-2.HEAD.

Thanks,



Adrian


Re: [PATCH] Check half-closed descriptors at most once per second.

2008-09-25 Thread Adrian Chadd
2008/9/25 Alex Rousskov [EMAIL PROTECTED]:

 This revision resurrects 1 check/sec limit, but hopefully with fewer
 bugs. In my limited tests, CPU usage seems to be back to normal.

Woo, thanks!

 The DescriptorSet class has O(1) complexity for search, insertion,
 and deletion. It uses about 2*sizeof(int)*MaxFD bytes. The splay tree
 that previously stored half-closed descriptors uses less RAM for a
 small number of descriptors but has O(log n) complexity.

 The DescriptorSet code should probably get its own .h and .cc files,
 especially if it is going to be used by deferred reads.

Could you do that sooner rather than later? I'd like to try using this
code for deferred reads and delay pools.
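
For the archives: an O(1)-everything integer set with that exact
2*sizeof(int)*MaxFD footprint is usually the dense-array-plus-
index-array trick; a rough sketch (names illustrative, not the actual
DescriptorSet API):

    /* Two arrays of MaxFD ints: index[fd] holds fd's position in the
       dense fds[] array, or -1 if fd is not a member. */
    typedef struct {
        int *index;
        int *fds;
        int count;
    } fdset_t;

    static int fdset_has(const fdset_t *s, int fd)
    {
        return s->index[fd] >= 0;
    }

    static void fdset_add(fdset_t *s, int fd)
    {
        if (fdset_has(s, fd))
            return;
        s->index[fd] = s->count;
        s->fds[s->count++] = fd;
    }

    static void fdset_del(fdset_t *s, int fd)
    {
        int pos = s->index[fd];
        int last;
        if (pos < 0)
            return;
        last = s->fds[--s->count]; /* move last member into the hole */
        s->fds[pos] = last;
        s->index[last] = pos;
        s->index[fd] = -1;
    }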

Thanks!



Adrian


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-23 Thread Adrian Chadd
2008/9/23 Martin Langhoff [EMAIL PROTECTED]:

 Any way we can kludge our way around it for the time being? Does squid
 take any signal that gets it to shed its index?

It'd be pretty trivial to write a few cachemgr hooks to implement that
kind of behaviour. 'flush memory cache', 'flush disk cache entirely',
etc.

The trouble is that the index is -required- at the moment for the disk
cache. If you flush the index, you flush the disk cache entirely.

 There's no hard limit for squid and squid (any version) handles
 memory allocation failures very very poorly (read: crashes.)

 Is it relatively sane to run it with a tight rlimit and restart it
 often? Or just monitor it and restart it?

It probably won't like that very much if you decide to also use disk caching.

 You can limit the amount of cache_mem which limits the memory cache
 size; you could probably modify the squid codebase to start purging
 objects at a certain object count rather than based on the disk+memory
 storage size. That wouldn't be difficult.

 Any chance of having patches that do this?

I could probably do that in a week or so once I've finished my upcoming travel.
Someone could try beating me to it..


 The big problem: you won't get Squid down to 24meg of RAM with the
 current tuning parameters. Well, I couldn't; and I'm playing around

 Hmmm...

 with Squid on OLPC-like hardware (SBC with 500mhz geode, 256/512mb
 RAM.) Its something which will require quite a bit of development to
 slim some of the internals down to scale better with restricted
 memory footprints. Its on my personal TODO list (as it mostly is in
 line with a bunch of performance work I'm slowly working towards) but
 as the bulk of that is happening in my spare time, I do not have a
 fixed timeframe at the moment.

 Thanks for that -- at whatever pace, progress is progress. I'll stay
 tuned. I'm not on squid-devel, but generally interested in any news on
 this track; I'll be thankful if you CC me or rope me into relevant
 threads.

Ok.

 Is there interest within the squid dev team in moving towards a memory
 allocation model that is more tunable and/or relies more on the
 abilities of modern kernels to do memory mgmt? Or an alternative
 approach to handle scalability (both down to small devices and up to
 huge kit) more dynamically and predictably?

You'll generally find the squid dev team happy to move in whatever
directions make sense. The problem isn't direction as so much as the
coding to make it happen. Making Squid operate well in small memory
footprints turns out to be quite relevant to higher performance and
scalability; the problem is in the doing.

I'm hoping to start work on some stuff to reduce the memory footprint
in my squid-2 branch (cacheboy) once the current round of IPv6
preparation is completed and stable. The developers working on Squid-3
are talking about similar stuff.


Adrian


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-23 Thread Adrian Chadd
2008/9/24 Martin Langhoff [EMAIL PROTECTED]:

 Good hint, thanks! If we did have such a control, what is the wired
 memory that squid will use for each entry? In an email earlier I
 wrote...

sizeof(StoreEntry) per index entry, basically.


  - Each index entry takes between 56 bytes and 88 bytes, plus
 additional, unspecified overhead. Is 1KB per entry a reasonable
 conservative estimate?

1kb per entry is pretty conservative. The per-object overhead includes
the StoreEntry, the couple of structures for the memory/disk
replacement policies, plus the MD5 URL for the index hash, whatever
other stuff hangs off MemObject for in-memory objects.

You'll find that the RAM requirements grow a bit more for things like
in-memory cache objects as the full reply headers stay in memory, and
are copied whenever anyone wants to request it.

  - Discussions about compressing or hashing the URL in the index are
 recurrent - is the uncompressed URL there? That means up to 4KB per
 index entry?

The uncompressed URL and headers are in memory during:

* request/reply handling
* in-memory object; (objects with MemObject's allocated); on-disk
entries just have the MD5 URL hash per StoreEntry.

HTH,

Oh, and I'll be in the US from October for a few months; I can always
do a side-trip out to see you guys if there's enough interest.


Adrian


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-22 Thread Adrian Chadd
G'day,

I've looked into this a bit (and have a couple of OLPC laptops to do
testing with) and .. well, its going to take a bit of effort to make
squid fit.

There's no hard limit for squid and squid (any version) handles
memory allocation failures very very poorly (read: crashes.)

You can limit the amount of cache_mem which limits the memory cache
size; you could probably modify the squid codebase to start purging
objects at a certain object count rather than based on the disk+memory
storage size. That wouldn't be difficult.
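
As a rough starting point, the knobs that exist today look something
like this (values purely illustrative, not a tested small-RAM recipe):

    # keep the memory cache and its objects small
    cache_mem 8 MB
    maximum_object_size_in_memory 32 KB

    # a small disk cache keeps the in-RAM index small too
    cache_dir ufs /var/spool/squid 512 16 256

    # hand freed memory back to the system instead of pooling it
    memory_pools off

Remember cache_mem only caps the memory cache, not the total process
footprint; the index and request-handling overheads sit on top of it.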

The big problem: you won't get Squid down to 24meg of RAM with the
current tuning parameters. Well, I couldn't; and I'm playing around
with Squid on OLPC-like hardware (SBC with 500mhz geode, 256/512mb
RAM.) Its something which will require quite a bit of development to
slim some of the internals down to scale better with restricted
memory footprints. Its on my personal TODO list (as it mostly is in
line with a bunch of performance work I'm slowly working towards) but
as the bulk of that is happening in my spare time, I do not have a
fixed timeframe at the moment.


Adrian


2008/9/23 Martin Langhoff [EMAIL PROTECTED]:
 Hi!

 I am working on the School Server (aka XS: a Fedora 9 spin, tailored
 to run on fairly limited hw), I'm preparing the configuration settings
 for it. It's a somewhat new area for me -- I've setup Squid before on
 mid-range hardware... but this is... different.

 So I'm interested in understanding more aobut the variables affecting
 memory footprint and how I can set a _hard limit_ on the wired memory
 that squid allocates.

 In brief:

  - The workload is relatively light - 3K clients is the upper bound.

  - The XS will (in some locations) be hooked to *very* unreliable
 power... uncontrolled shutdowns are the norm. Is this ever a problem with 
 Squid?

  - After a bad shutdown, graceful recovery is the most important
 aspect. If a few cached items are lost, we can cope...

  - The XS hardware runs many services (mostly webbased), so Squid gets
 only a limited slice of memory. To make matters worse, I *really*
 don't want the core working set (Squid, Pg, Apache/PHP) to get paged
 out. So I am interested in pegging the max memory Squid will take to itself.

  - The XS hw is varied. In small schools it may have 256MB RAM (likely
 to be running on XO hardware + usb-connected ext hard-drive).
 Medium-to-large schools will have the recommended 1GB RAM and a cheap
 SATA disk. A few very large schools will be graced with more RAM (2 or
 4GB).

 .. so RAM allocation for Squid will prob range between 24MB at the
 lower-end and 96MB at the 1GB recommended RAM.

 My main question is: how would you tune Squid 3 so that

  - it does not allocate directly more than 24MB / 96MB? (Assume that
 the linux kernel will be smart about mmapped stuff, and aggressive
 about caching -- I am talking about the memory Squid will claim to
 itself).

  - still gives us good thoughput? :-)



 So far Google has turned up very little info, and it seems to be
 rather old. What I've found can be summarised as follows:

  - The index is malloc'd, so the number of entries in the index will
 be the dominant concern WRT memory footprint.

  - Each index entry takes between 56 bytes and 88 bytes, plus
 additional, unspecified overhead. Is 1KB per entry a reasonable
 conservative estimate?

  - Discussions about compressing or hashing the URL in the index are
 recurrent - is the uncompressed URL there? That means up to 4KB per
 index entry?

  - The index does not seem to be mmappable or otherwise

 We can rely on the (modern) linux kernel doing a fantastic job at
 caching disk IO and shedding those cached entries when under memory
 pressure, so I am likely to set Squid's own cache to something really
 small. Everything I read points to the index being my main concern -
 is there a way to limit (a) the total memory the index is allowed to
 take or (b) the number of index entries allowed?

 Does the above make sense in general? Or am I barking up the wrong tree?


 cheers,



 martin
 --
  [EMAIL PROTECTED]
  [EMAIL PROTECTED] -- School Server Architect
  - ask interesting questions
  - don't get distracted with shiny stuff - working code first
  - http://wiki.laptop.org/go/User:Martinlanghoff
 ___
 Server-devel mailing list
 [EMAIL PROTECTED]
 http://lists.laptop.org/listinfo/server-devel




Re: [MERGE] Connection pinning patch

2008-09-21 Thread Adrian Chadd
Its a 900-odd line patch; granted, a lot of it is boiler plate for
config parsing and management, but I recall the issues connection
pinning had when it was introduced and I'd hate to try and be the one
debugging whatever crazy stuff pops up in 3.1 combined with the
changes to the workflow connection pinning introduces.

I don't pretend to completely understand the implications for ICAP
either. Is there any documentation for how connection pinning should
behave with ICAP and friends?

Is there any particular rush to get this in for this release at such a
late point in the release cycle?

Could we hold off of it until the next release, and just focus on
getting whats currently in 3.HEAD released and stable?



Adrian


2008/9/21 Tsantilas Christos [EMAIL PROTECTED]:
 Hi all,

 This patch fixes the bug 1632
 (http://www.squid-cache.org/bugs/show_bug.cgi?id=1632)
 It is based on the original squid2.5 connection pinning patch developed by
 Henrik (http://devel.squid-cache.org/projects.html#pinning) and the related
 squid 2.6 connection pinning code.

 Although I spent many hours looking at pinned connections, I am still not
 absolutely sure that it does not have bugs. However the code is very similar
 to that in squid2.6 (where the pinning code has run for years) and I hope
 it will be easy to fix problems and bugs.

 Regards,
Christos



Re: [MERGE] Connection pinning patch

2008-09-21 Thread Adrian Chadd
2008/9/22 Alex Rousskov [EMAIL PROTECTED]:


 It would help if there was a document describing what connection pinning
 is and what are the known pitfalls. Do we have such a document? Is RFC
 4559 enough?

I'll take another read. I think we should look at documenting these
sorts of features somewhere else though.

 If not, Christos, can you write one and have Adrian and others
 contribute pitfalls? It does not have to be long -- just a few
 paragraphs describing the basics of the feature. We can add that
 description to code documentation too.

I'd be happy to help troll over the 2.X code and see what its doing.
Henrik and Steven know the code better than I do; I've just spent some
time figuring out how it interplays with load balancing to peers and
such.

 ICAP and eCAP do not care about HTTP connections or custom headers. Is
 connection pinning more than connection management via some custom
 headers?

Nope; it just changes the semantics a little and some code may assume
things work a certain way.

 Sine NTLM authentication forwarding appears to be a required feature for
 many and since connection pinning patch is not trivial (but is not huge
 either), I would rather see it added now (after the proper review
 process, of course). It could be the right icing on 3.1 cake for many
 users. I do realize that, like any 900-line patch, it may cause problems
 even if it is reviewed and tested.

*nodnod* I'm just making sure the reasons for pushing it through are
recorded somewhere during the process.



Adrian


Re: Strategy

2008-09-21 Thread Adrian Chadd
Put this stuff on hold, get Squid-3.1 out of the way, sort out the
issues surrounding that before you start throwing more code into
Squid-3 trunk, and -then- have this discussion.

We can sort this stuff out in a short period of time if its our only focus.



Adrian

2008/9/22 Amos Jeffries [EMAIL PROTECTED]:
 On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote:
 Alex Rousskov wrote:

  * Look for simpler warts with localized impact. We have plenty of them
  and your energy would be well spent there. If you have a choice, do
 not
  try to improve something as fundamental and as critical as String.
  Localized single-use code should receive a lot less scrutiny than
  fundamental classes.
 

 Agreed, but that said: if you, Kinkie, pick one of the hard ones, it
 causes a thorough discussion, as String has, and comes up with a good
 API. That's not just a step in the right direction but a giant leap. And worth doing
 if you can spare the time (months in some cases).
 The follow on effects will be better and easier code in other areas
 depending on it.

 Amos,

 I think the above work-long-enough-and-you-will-make-it analysis and
 a few other related comments do not account for one important factor:
 cost (and the limited resources this project has). Please compare the
 following estimates (all numbers are very approximate, of course):

  Kinkie's time to draft a String class:   2 weeks
  Kinkie's time to fix the String class:   6 weeks
  Reviewers' time to find bugs and
   convince Kinkie that they are bugs: 2 weeks
  Total:  10 weeks

  Reviewer's time to write a String class: 3 weeks
  Total:   3 weeks


 Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the
 reviewers gain 1 week themselves. So I stand by, if he feels strongly
 enough to do it.

 If you add to the above that one reviewer cannot review and work on
 something else at the same time, the waste goes well above 200%.

 Which is wrong. We can review one thing and work on another project.


 Compare the above with a regular project that does not require writing
 complex or fundamental classes (again, numbers are approximate):

 Kinkie's time to complete a regular project:   1 week
 Reviewer's time to complete a regular project: 1 week

 After which both face the hard project again. Which remains hard and could
 have cut off 5 days of the regular project.


 If we want Squid code to continue to be a playground for half-finished
 code and ideas, then we should abandon the review process. Let's just
 commit everything that compiles and that the committer is happy with.

 I assume you are being sarcastic.

 Otherwise, let's do our best to find a project for everyone, without
 sacrificing the quality of the output or wasting resources. For example,
 if a person wants String to implement his pet project, but cannot make a
 good String, it may be possible to trade String implementation for a few
 other pet projects that the person can do.

 Then that trade needs to be discussed with the person before they start.
 I get the idea you are trying to manage this FOSS like you would a company
 project. That approach has been tried and failed miserably in FOSS.

 This will not be smooth and
 easy, but it is often doable because most of us share the goal of making
 the best open source proxy.

  * When assessing the impact of your changes, do not just compare the
 old
  code with the one submitted for review. Consider how your classes
 stand
  on their own and how they _will_ be used. Providing a poor but
  easier-to-abuse interface is often a bad idea even if that interface
 is,
  in some aspects, better than the old hard-to-use one.
 
  Noone else is tackling the issues that I'm working on. Should they be
  left alone? Or should I aim for the perfect solution each time?

 Perfect varies, and will change. As the baseline 'worst' code in Squid
 improves. The perfect API this year may need changing later. Aim for the
 best you can find to do, and see if its good enough for inclusion.

 Right. The problems come when it is not good enough, and you cannot fix
 it on your own. I do not know how to avoid these ugly situations.

 Teamwork. Which I thought we were starting to get in the String API after
 earlier attempts at solo by whoever wrote SquidString and myself on the
 BetterString mk1, mk2, mk3.

 I doubt any of us could do a good job of something so deep without help.
 Even you needed Henrik to review and find issues with AsyncCalls, maybe
 others I don't know about before that.

 The fact remains these things NEED someone to kick us into a team and work
 on it.


 for example, Alex had no issues with wordlist when it first came out.

 This was my first review of the proposed class, but I doubt it would
 have changed if I reviewed it earlier.

 Thank you,

 Alex.


 Amos





Re: Strategy

2008-09-21 Thread Adrian Chadd
And in the meantime, if someone (eg kinkie) wants to work on this
stuff some more, I suggest sitting down and writing some of the
support code which would use it.

Write a HTTP parser, HTTP response builder, do some benchmaking,
perhaps glue it to something like libevent or some other comm
framework and do some benchmarking there.
See how it performs, how it behaves, see if it does everything y'all
want cleanly. _Then_ have this discussion.



Adrian

2008/9/22 Adrian Chadd [EMAIL PROTECTED]:
 Put this stuff on hold, get Squid-3.1 out of the way, sort out the
 issues surrounding that before you start throwing more code into
 Squid-3 trunk, and -then- have this discussion.

 We can sort this stuff out in a short period of time if its our only focus.



 Adrian

 2008/9/22 Amos Jeffries [EMAIL PROTECTED]:
 On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote:
 Alex Rousskov wrote:

  * Look for simpler warts with localized impact. We have plenty of them
  and your energy would be well spent there. If you have a choice, do
 not
  try to improve something as fundamental and as critical as String.
  Localized single-use code should receive a lot less scrutiny than
  fundamental classes.
 

 Agreed, but that said: if you, Kinkie, pick one of the hard ones, it
 causes a thorough discussion, as String has, and comes up with a good
 API. That's not just a step in the right direction but a giant leap. And worth doing
 if you can spare the time (months in some cases).
 The follow on effects will be better and easier code in other areas
 depending on it.

 Amos,

 I think the above work-long-enough-and-you-will-make-it analysis and
 a few other related comments do not account for one important factor:
 cost (and the limited resources this project has). Please compare the
 following estimates (all numbers are very approximate, of course):

  Kinkie's time to draft a String class:   2 weeks
  Kinkie's time to fix the String class:   6 weeks
  Reviewers' time to find bugs and
   convince Kinkie that they are bugs: 2 weeks
  Total:  10 weeks

  Reviewer's time to write a String class: 3 weeks
  Total:   3 weeks


 Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the
 reviewers gain 1 week themselves. So I stand by, if he feels strongly
 enough to do it.

 If you add to the above that one reviewer cannot review and work on
 something else at the same time, the waste goes well above 200%.

 Which is wrong. We can review one thing and work on another project.


 Compare the above with a regular project that does not require writing
 complex or fundamental classes (again, numbers are approximate):

 Kinkie's time to complete a regular project:   1 week
 Reviewer's time to complete a regular project: 1 week

 After which both face the hard project again. Which remains hard and could
 have cut off 5 days of the regular project.


 If we want Squid code to continue to be a playground for half-finished
 code and ideas, then we should abandon the review process. Let's just
 commit everything that compiles and that the committer is happy with.

 I assume you are being sarcastic.

 Otherwise, let's do our best to find a project for everyone, without
 sacrificing the quality of the output or wasting resources. For example,
 if a person wants String to implement his pet project, but cannot make a
 good String, it may be possible to trade String implementation for a few
 other pet projects that the person can do.

 Then that trade needs to be discussed with the person before they start.
 I get the idea you are trying to manage this FOSS like you would a company
 project. That approach has been tried and failed miserably in FOSS.

 This will not be smooth and
 easy, but it is often doable because most of us share the goal of making
 the best open source proxy.

  * When assessing the impact of your changes, do not just compare the
 old
  code with the one submitted for review. Consider how your classes
 stand
  on their own and how they _will_ be used. Providing a poor but
  easier-to-abuse interface is often a bad idea even if that interface
 is,
  in some aspects, better than the old hard-to-use one.
 
  Noone else is tackling the issues that I'm working on. Should they be
  left alone? Or should I aim for the perfect solution each time?

 Perfect varies, and will change. As the baseline 'worst' code in Squid
 improves. The perfect API this year may need changing later. Aim for the
 best you can find to do, and see if its good enough for inclusion.

 Right. The problems come when it is not good enough, and you cannot fix
 it on your own. I do not know how to avoid these ugly situations.

 Teamwork. Which I thought we were starting to get in the String API after
 earlier attempts at solo by whoever wrote SquidString and myself on the
 BetterString mk1, mk2, mk3.

 I doubt any of us could do a good job of something so deep without help.
 Even you

Re: Strategy

2008-09-21 Thread Adrian Chadd
only focus should really have been our main focus at that short
period of time, not the only thing we care about.

Sheesh. :P



Adrian

2008/9/22 Alex Rousskov [EMAIL PROTECTED]:
 On Mon, 2008-09-22 at 10:36 +0800, Adrian Chadd wrote:
 Put this stuff on hold, get Squid-3.1 out of the way, sort out the
 issues surrounding that before you start throwing more code into
 Squid-3 trunk, and -then- have this discussion.

 If this stuff is WordList, then put this stuff on hold is my
 suggestion as well.

 If this stuff is String, then I think the basic design choices can be
 discussed now, but waiting is even better for me, so I am happy to
 follow your suggestion :-).

 If this stuff is how we improve teamwork, then I am happy to
 continue any _constructive_ discussions since releasing 3.1 can benefit
 from teamwork as well.

 We can sort this stuff out in a short period of time if its our only focus.

 The only focus? You must be dreaming :-).

 Alex.


 2008/9/22 Amos Jeffries [EMAIL PROTECTED]:
  On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote:
  Alex Rousskov wrote:
 
   * Look for simpler warts with localized impact. We have plenty of them
   and your energy would be well spent there. If you have a choice, do
  not
   try to improve something as fundamental and as critical as String.
   Localized single-use code should receive a lot less scrutiny than
   fundamental classes.
  
 
  Agreed, but that said: if you, Kinkie, pick one of the hard ones, it
  causes a thorough discussion, as String has, and comes up with a good
  API. That's not just a step in the right direction but a giant leap. And worth doing
  if you can spare the time (months in some cases).
  The follow on effects will be better and easier code in other areas
  depending on it.
 
  Amos,
 
  I think the above work-long-enough-and-you-will-make-it analysis and
  a few other related comments do not account for one important factor:
  cost (and the limited resources this project has). Please compare the
  following estimates (all numbers are very approximate, of course):
 
   Kinkie's time to draft a String class:   2 weeks
   Kinkie's time to fix the String class:   6 weeks
   Reviewers' time to find bugs and
convince Kinkie that they are bugs: 2 weeks
   Total:  10 weeks
 
   Reviewer's time to write a String class: 3 weeks
   Total:   3 weeks
 
 
  Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the
  reviewers gain 1 week themselves. So I stand by, if he feels strongly
  enough to do it.
 
  If you add to the above that one reviewer cannot review and work on
  something else at the same time, the waste goes well above 200%.
 
  Which is wrong. We can review one thing and work on another project.
 
 
  Compare the above with a regular project that does not require writing
  complex or fundamental classes (again, numbers are approximate):
 
  Kinkie's time to complete a regular project:   1 week
  Reviewer's time to complete a regular project: 1 week
 
  After which both face the hard project again. Which remains hard and could
  have cut off 5 days of the regular project.
 
 
  If we want Squid code to continue to be a playground for half-finished
  code and ideas, then we should abandon the review process. Let's just
  commit everything that compiles and that the committer is happy with.
 
  I assume you are being sarcastic.
 
  Otherwise, let's do our best to find a project for everyone, without
  sacrificing the quality of the output or wasting resources. For example,
  if a person wants String to implement his pet project, but cannot make a
  good String, it may be possible to trade String implementation for a few
  other pet projects that the person can do.
 
  Then that trade needs to be discussed with the person before they start.
  I get the idea you are trying to manage this FOSS like you would a company
  project. That approach has been tried and failed miserably in FOSS.
 
  This will not be smooth and
  easy, but it is often doable because most of us share the goal of making
  the best open source proxy.
 
   * When assessing the impact of your changes, do not just compare the
  old
   code with the one submitted for review. Consider how your classes
  stand
   on their own and how they _will_ be used. Providing a poor but
   easier-to-abuse interface is often a bad idea even if that interface
  is,
   in some aspects, better than the old hard-to-use one.
  
   Noone else is tackling the issues that I'm working on. Should they be
   left alone? Or should I aim for the perfect solution each time?
 
  Perfect varies, and will change. As the baseline 'worst' code in Squid
  improves. The perfect API this year may need changing later. Aim for the
  best you can find to do, and see if its good enough for inclusion.
 
  Right. The problems come when it is not good enough, and you cannot fix
  it on your own. I do not know how to avoid

Re: SBuf review

2008-09-18 Thread Adrian Chadd
2008/9/19 Amos Jeffries [EMAIL PROTECTED]:

 I kind of fuzzily disagree, the point of this is to replace MemBuf + String
 with SBuf. Not implement both again independently duplicating stuff.

I'll say it again - ignore MemBuf. Ignore MemBuf for now. Leave it as
a NUL-terminated dynamic buffer with some printf append like
semantics.

When you've implemented a non-NUL-terminated ref-counted memory region
implementation and you layer some basic strings semantics on top of
it, you can slowly convert or eliminate the bulk of the MemBuf users
over.
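
The layering I mean is roughly this (a sketch; names are made up, not
kinkie's actual SBuf API):

    #include <stddef.h>

    typedef struct {
        char *mem;          /* shared, ref-counted backing store */
        size_t size;
        int refcount;
    } region_t;

    typedef struct {
        region_t *r;
        size_t off, len;    /* a window into r->mem; no NUL assumed */
    } sbuf_t;

    /* substring is O(1): bump the refcount, narrow the window
       (bounds checks omitted for brevity) */
    static sbuf_t sbuf_substr(sbuf_t s, size_t off, size_t len)
    {
        s.r->refcount++;
        s.off += off;
        s.len = len;
        return s;
    }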

You're going to find plenty of places where the string handling is
plain old horrible. Don't try to cater for those situations with
things like NULL strings. I tried that, its ugly. Aim to implement
something which'll cater to something narrow to begin with - like
parsing HTTP headers - and look to -rewrite- larger parts of the code
later on. Don't try to invent things which will somehow seamlessly fit
into the existing code and provide the same semantics. Some of said
semantics is plain shit.

I still don't get why this is again becoming so freakishly complicated.



Adrian


Re: [MERGE] WCCPv2 Config Cleanup

2008-09-13 Thread Adrian Chadd
2008/9/13 Amos Jeffries [EMAIL PROTECTED]:

 This one was easy and isolated, so I went and did it early.
 It's back-compatible, so people don't have to use the new names if they
 like. But its clearer for the newbies until the big cleanup you mention
 below is stable.

Well, the newbies still need to know about the different kinds of
redirection/assignment methods; what would be nice is if it were
mostly autonegotiated per-host per-service group, and if wccp2d could
setup/teardown the GRE tunnels as required.

 The WCCPv2 stuff works fine (for what it does); it could do with some
 better documentation but what it really needs is to be broken out from
 Squid itself and run as a seperate daemon.


 I've been waiting most of a year for your work on that direction in Squid-2
 to be ported over. There does not appear to be any sign of it happening in
 time for 3.1.
 The rest of us are largely concentrating on cleaning other components.

I still haven't done all that much with the WCCPv2 stuff yet. I'll be
breaking out the source code in Cacheboy after I finish the next set
of IPv6 changes; the wccp2d code will then use the config registry
type stuff we've discussed and reuse the core code for comms,
debugging, logging, etc.



Adrian


Re: [MERGE] WCCPv2 Config Cleanup

2008-09-12 Thread Adrian Chadd
The specification defines them as separate entities and using them in
this fashion makes it clearer for people working on the code.



Adrian

2008/9/13 Henrik Nordstrom [EMAIL PROTECTED]:
 On fre, 2008-09-12 at 20:39 +1200, Amos Jeffries wrote:

 +#define WCCP2_FORWARDING_METHOD_GRE  WCCP2_METHOD_GRE
 +#define WCCP2_FORWARDING_METHOD_L2   WCCP2_METHOD_L2

 +#define WCCP2_PACKET_RETURN_METHOD_GRE   WCCP2_METHOD_GRE
 +#define WCCP2_PACKET_RETURN_METHOD_L2WCCP2_METHOD_L2

 Do we still need these? Why not use WCCP2_METHOD_ everywhere if ther are
 the same value?

 Regards
 Henrik




Re: [MERGE] WCCPv2 Config Cleanup

2008-09-12 Thread Adrian Chadd
Amos, why are you pushing through changes to the WCCP configuration
stuff at this point in the game?

The WCCPv2 stuff works fine (for what it does); it could do with some
better documentation but what it really needs is to be broken out from
Squid itself and run as a seperate daemon.




Adrian

2008/9/13 Henrik Nordstrom [EMAIL PROTECTED]:
 With the patch the code uses WCCP2_METHOD_.. in some places (config
 parsing/dumping) and the context specific ones in other places. This is
 even more confusing.

 Very minor detail in any case.


 On lör, 2008-09-13 at 09:49 +0800, Adrian Chadd wrote:
 The specification defines them as separate entities and using them in
 this fashion makes it clearer for people working on the code.



 Adrian

 2008/9/13 Henrik Nordstrom [EMAIL PROTECTED]:
  On fre, 2008-09-12 at 20:39 +1200, Amos Jeffries wrote:
 
  +#define WCCP2_FORWARDING_METHOD_GRE  WCCP2_METHOD_GRE
  +#define WCCP2_FORWARDING_METHOD_L2   WCCP2_METHOD_L2
 
  +#define WCCP2_PACKET_RETURN_METHOD_GRE   WCCP2_METHOD_GRE
  +#define WCCP2_PACKET_RETURN_METHOD_L2    WCCP2_METHOD_L2
 
  Do we still need these? Why not use WCCP2_METHOD_ everywhere if they are
  the same value?
 
  Regards
  Henrik
 
 




Re: squid-2.HEAD: storeCleanup and -F option (foreground rebuild)

2008-09-12 Thread Adrian Chadd
I've committed a slightly modified version of this - store_rebuild.c
r1.80. Take a look and see if it works for you.

Thanks!



Adrian

2008/8/5 Alexander V. Lukyanov [EMAIL PROTECTED]:
 Hello!

 I use squid in transparent mode, so I don't want degraded performance
 during rebuild and cleanup. Here is a patch I use to make storeCleanup
 do all the work at once, before squid starts processing requests, when
 the -F option is specified on the command line.

 Index: store_rebuild.c
 ===
 RCS file: /squid/squid/src/store_rebuild.c,v
 retrieving revision 1.80
 diff -u -p -r1.80 store_rebuild.c
 --- store_rebuild.c 1 Sep 2007 23:09:32 -   1.80
 +++ store_rebuild.c 5 Aug 2008 05:51:43 -
 @@ -68,7 +68,8 @@ storeCleanup(void *datanotused)
 hash_link *link_ptr = NULL;
 hash_link *link_next = NULL;
 validnum_start = validnum;
 -while (validnum - validnum_start < 500) {
 +int limit = opt_foreground_rebuild ? 1 << 30 : 500;
 +while (validnum - validnum_start < limit) {
    if (++bucketnum >= store_hash_buckets) {
        debug(20, 1) ("  Completed Validation Procedure\n");
        debug(20, 1) ("  Validated %d Entries\n", validnum);
 @@ -147,8 +148,8 @@ storeRebuildComplete(struct _store_rebui
 debug(20, 1) ("  Took %3.1f seconds (%6.1f objects/sec).\n", dt,
    (double) counts.objcount / (dt > 0.0 ? dt : 1.0));
 debug(20, 1) ("Beginning Validation Procedure\n");
 -eventAdd("storeCleanup", storeCleanup, NULL, 0.0, 1);
 safe_free(RebuildProgress);
 +storeCleanup(0);
  }

  /*




Re: squid-2.HEAD: fwdComplete/Fail before comm_close

2008-09-12 Thread Adrian Chadd
Hiya,

Could you please verify this is still a problem in the latest 2.HEAD
and if so lodge a bugzilla bug report with the patch?

Thanks!


Adrian


2008/8/5 Alexander V. Lukyanov [EMAIL PROTECTED]:
 Hello!

 Some time ago I had core dumps just after these messages:
Short response from ...
httpReadReply: Excess data from ...

 I believe this patch fixes these problems.

 Index: http.c
 ===
 RCS file: /squid/squid/src/http.c,v
 retrieving revision 1.446
 diff -u -p -r1.446 http.c
 --- http.c  25 Jun 2008 22:11:20 -  1.446
 +++ http.c  5 Aug 2008 06:05:29 -
 @@ -755,6 +757,7 @@ httpAppendBody(HttpStateData * httpState
 /* Is it an incomplete reply? */
 if (httpState->chunk_size > 0) {
    debug(11, 2) ("Short response from '%s' on port %d. Expecting %" PRINTF_OFF_T " octets more\n", storeUrl(entry), comm_local_port(fd), httpState->chunk_size);
 +   fwdFail(httpState->fwd, errorCon(ERR_INVALID_RESP, HTTP_BAD_GATEWAY, httpState->fwd->request));
    comm_close(fd);
    return;
 }
 @@ -774,6 +777,7 @@ httpAppendBody(HttpStateData * httpState
    ("httpReadReply: Excess data from \"%s %s\"\n",
    RequestMethods[orig_request->method].str,
    storeUrl(entry));
 +   fwdComplete(httpState->fwd);
comm_close(fd);
return;
 }




Re: squid-2.HEAD:

2008-09-12 Thread Adrian Chadd
have you dumped this into bugzilla?

Thanks!

2008/9/3 Alexander V. Lukyanov [EMAIL PROTECTED]:
 Hello!

 I have noticed lots of 'impossible keep-alive' messages in the log.
 It appears that httpReplyBodySize incorrectly returns -1 for 304 Not
 Modified replies. Patch to fix it is attached.

 --
   Alexander.
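
The patch itself travels as an attachment; the gist of the fix, per
RFC 2616 (a 304, like 204 and 1xx, carries no message body), would be
along these lines - a sketch of the idea, not the actual patch:

    /* Sketch only -- the real fix is the attached patch.  A reply with
     * no body by definition has a body size of 0, not "unknown" (-1);
     * -1 is what made the keep-alive check log "impossible keep-alive". */
    typedef long long body_size_t;   /* stand-in for squid_off_t */

    static body_size_t
    replyBodySizeSketch(int status, body_size_t content_length)
    {
        if (status == 304 || status == 204 || (status >= 100 && status < 200))
            return 0;                /* no message body by definition */
        return content_length;       /* may still legitimately be -1 (unknown) */
    }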



Re: Where to document APIs?

2008-09-11 Thread Adrian Chadd
2008/9/11 Alex Rousskov [EMAIL PROTECTED]:

 To clarify:

 Longer API documents, .dox file in docs/, or maybe src/ next to the .cc

 Basic rules the code need to fulfill, or until the API documentation
 grows large, in the .h or .cc file.

 You all have seen the current API notes for Comm and AsyncCalls. Do you
 think they should go into a .dox or .h file?

 I think they are big enough (and growing) to justify a .dox file. I will
 probably add those files to trunk (next to the corresponding .h files)
 unless there are better ideas.

What's wrong with inline documentation again?
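
For reference, inline documentation here means doxygen comments kept
next to the declarations they describe, roughly like this (the
signature below is approximate, not the actual Comm API):

    /// \ingroup Comm
    /// Schedule a read on an open descriptor. The callback is dialed exactly
    /// once: on completion, on error, or not at all if it is cancelled first
    /// via comm_read_cancel().
    /// \param fd        descriptor previously registered with comm
    /// \param buf       destination; must remain valid until the callback fires
    /// \param size      maximum number of bytes to read
    /// \param callback  AsyncCall to schedule when the read completes
    void comm_read(int fd, char *buf, int size, AsyncCall::Pointer &callback);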



Adrian


Australian Development Meetup 2008 - Notes

2008-09-11 Thread Adrian Chadd
G'day,

I've started publishing the notes from the presentations and developer
discussions that we held at the Yahoo!7 offices last month.
You can find them at
http://www.squid-cache.org/Conferences/AustraliaMeeting2008/ .

I'm going to try and make sure any further
mini-conferences/discussions/etc which happen go up there so people
get more of an idea of what's going on.

Who knows, eventually there may be enough interest to hold a
reasonably formal Squid conference somewhere.. :)



Adrian


Re: [MERGE] Config cleanups

2008-09-10 Thread Adrian Chadd
You have the WCCPv2 stuff around the wrong way.

The redirection method has nothing to do with the assignment method.

You can and do have L2 redirection with hash assignment. You probably
won't have GRE redirection with mask assignment though, but I think
it's entirely possible.

Keep the options separate, and named whatever they are in the wccp2 draft.

I'd also suggest committing each chunk that's different separately -
ie, the wccp stuff separate, the ACL tidyup separate, the default
storage stuff separate, etc. That makes backing out patches easier if
needed.

2c,



Adrian

2008/9/10 Amos Jeffries [EMAIL PROTECTED]:
 This update removes several magic-number options in the WCCPv2
 configuration, replacing them with user-friendly text options.

 This should help with a lot of config confusion where these are needed until
 they are obsoleted properly.

 # Bazaar merge directive format 2 (Bazaar 0.90)
 # revision_id: [EMAIL PROTECTED]
 # target_branch: file:///src/squid/bzr/trunk/
 # testament_sha1: 7b319238106ae2926697f85b2ec58c3476abc121
 # timestamp: 2008-09-11 03:50:49 +1200
 # base_revision_id: [EMAIL PROTECTED]
 #   q5rnfdpug13p94fl
 #
 # Begin patch
 === modified file 'src/cf.data.depend'
 --- src/cf.data.depend  2008-04-03 05:31:29 +
 +++ src/cf.data.depend  2008-09-10 15:22:08 +
 @@ -47,6 +47,7 @@
  tristate
  uri_whitespace
  ushort
 +wccp2_method
  wccp2_service
  wccp2_service_info
  wordlist

 === modified file 'src/cf.data.pre'
 --- src/cf.data.pre 2008-08-09 06:24:33 +
 +++ src/cf.data.pre 2008-09-10 15:47:36 +
 @@ -831,8 +831,8 @@

  NOCOMMENT_START
  #Allow ICP queries from local networks only
 -icp_access allow localnet
 -icp_access deny all
 +#icp_access allow localnet
 +#icp_access deny all
  NOCOMMENT_END
  DOC_END

 @@ -856,8 +856,8 @@

  NOCOMMENT_START
  #Allow HTCP queries from local networks only
 -htcp_access allow localnet
 -htcp_access deny all
 +#htcp_access allow localnet
 +#htcp_access deny all
  NOCOMMENT_END
  DOC_END

 @@ -883,7 +883,7 @@
  NAME: miss_access
  TYPE: acl_access
  LOC: Config.accessList.miss
 -DEFAULT: none
 +DEFAULT: allow all
  DOC_START
Use to force your neighbors to use you as a sibling instead of
a parent.  For example:
 @@ -897,11 +897,6 @@

By default, allow all clients who passed the http_access rules
to fetch MISSES from us.
 -
 -NOCOMMENT_START
 -#Default setting:
 -# miss_access allow all
 -NOCOMMENT_END
  DOC_END

  NAME: ident_lookup_access
 @@ -1555,9 +1550,7 @@

  icp-port:  Used for querying neighbor caches about
 objects.  To have a non-ICP neighbor
 -specify '7' for the ICP port and make sure the
 -neighbor machine has the UDP echo port
 -enabled in its /etc/inetd.conf file.
 +specify '0' for the ICP port.
NOTE: Also requires icp_port option enabled to send/receive
  requests via this method.

 @@ -1955,7 +1948,7 @@
  NAME: maximum_object_size_in_memory
  COMMENT: (bytes)
  TYPE: b_size_t
 -DEFAULT: 8 KB
 +DEFAULT: 512 KB
  LOC: Config.Store.maxInMemObjSize
  DOC_START
Objects greater than this size will not be attempted to kept in
 @@ -2124,7 +2117,7 @@
which can be changed with the --with-coss-membuf-size=N configure
option.
  NOCOMMENT_START
 -cache_dir ufs @DEFAULT_SWAP_DIR@ 100 16 256
 +# cache_dir ufs @DEFAULT_SWAP_DIR@ 100 16 256
  NOCOMMENT_END
  DOC_END

 @@ -2291,7 +2284,7 @@
  NAME: access_log cache_access_log
  TYPE: access_log
  LOC: Config.Log.accesslogs
 -DEFAULT: none
 +DEFAULT: @DEFAULT_ACCESS_LOG@ squid
  DOC_START
These files log client request activities. Has a line every HTTP or
ICP request. The format is:
 @@ -2314,9 +2307,9 @@

And priority could be any of:
err, warning, notice, info, debug.
 -NOCOMMENT_START
 -access_log @DEFAULT_ACCESS_LOG@ squid
 -NOCOMMENT_END
 +
 +   Default:
 +   access_log @DEFAULT_ACCESS_LOG@ squid
  DOC_END

  NAME: log_access
 @@ -2342,14 +2335,17 @@

  NAME: cache_store_log
  TYPE: string
 -DEFAULT: @DEFAULT_STORE_LOG@
 +DEFAULT: none
  LOC: Config.Log.store
  DOC_START
Logs the activities of the storage manager.  Shows which
objects are ejected from the cache, and which objects are
 -   saved and for how long.  To disable, enter none. There are
 -   not really utilities to analyze this data, so you can safely
 +   saved and for how long.  To disable, enter none or remove the line.
 +   There are not really utilities to analyze this data, so you can safely
    disable it.
 +NOCOMMENT_START
 +# cache_store_log @DEFAULT_STORE_LOG@
 +NOCOMMENT_END
  DOC_END

  NAME: cache_swap_state cache_swap_log
 @@ -3085,7 +3081,7 @@
  NAME: request_header_max_size
  COMMENT: (KB)
  TYPE: b_size_t
 -DEFAULT: 20 KB
 +DEFAULT: 64 KB
  LOC: Config.maxRequestHeaderSize
  DOC_START
This specifies the 

Re: Comm API notes

2008-09-10 Thread Adrian Chadd
2008/9/11 Alex Rousskov [EMAIL PROTECTED]:
 * I/O cancellation.

  To cancel an interest in a read operation, call comm_read_cancel()
  with an AsyncCall object. This call guarantees that the passed Call
  will be canceled (see the AsyncCall API for call cancellation
  definitions and details). Naturally, the code has to store the
  original read callback Call pointer to use this interface. This call
  does not guarantee that the read operation has not already happened.
  This call guarantees that the read operation will not happen.

As I said earlier, you can't guarantee that with asynchronous IO. The
call may be in progress and not completed. I'm assuming you'd count
"in progress" as "has already happened", but unlike the latter, you
can't cancel it at the OS level.

As long as the API keeps all the relevant OS-related structures in
place to allow the IO to complete, and callers to the cancellation
function are prepared to handle the case where the IO "is happening"
versus "has already happened", then I'm happy.
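
A self-contained toy model of those semantics (none of this is Squid
code): cancellation only guarantees the callback is never dialed,
while the underlying I/O may still run to completion:

    #include <functional>
    #include <memory>

    // Toy model: a Call wraps the handler; cancel() guarantees the handler
    // is never invoked, but says nothing about the OS-level I/O itself.
    struct Call {
        std::function<void(int)> handler;   // takes the byte count read
        bool cancelled = false;
        void cancel() { cancelled = true; }
        void dial(int bytes) { if (!cancelled && handler) handler(bytes); }
    };

    using CallPtr = std::shared_ptr<Call>;
    // The comm layer would keep one CallPtr per pending read and dial it when
    // the read finishes; a caller that kept its own copy can cancel at any
    // time, even while the read is still in flight at the OS level.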

  You cannot reliably cancel an interest in a read operation using the old
  comm_read_cancel call that uses a function pointer. The handler may
  even get called after the old comm_read_cancel was called. This old API
  will be removed.

I really did think I had fixed removing the pending callbacks from the
callback queue when I implemented this. (I.e., I thought I implemented
enough for the POSIX read/write API but not enough for
overlapped/POSIX IO.) What were people seeing pre-AsyncCalls?

  It is OK to call comm_read_cancel (both old and new) at any time as
  long as the descriptor has not been closed and there is either no read
  interest registered or the passed parameters match the registered
  ones. If the descriptor has been closed, the behavior is undefined.
  Otherwise, if parameters do not match, you get an assertion.

  To cancel other operations, close the descriptor with comm_close.

I'm still not happy with comm_close() being used in that way; it seems
you aren't either, and are stipulating that new user code abort jobs
via alternative paths.

I'm also not happy with the idea of using close handlers to unwind the
state associated with a descriptor; how deep do close handlers
actually get? Would we be better off in the long run by stipulating a
more rigid shutdown process (eg - shutting down a client-side fd would
not involve comm_close(fd), but ConnStateData::close(), which would
handle clearing the clientHttpRequests and such, then itself + fd)?

  Raw socket descriptors may be replaced with unique IDs or small
  objects that help detect stale descriptor/socket usage bugs and
  encapsulate access to socket-specific information. New user code
  should treat descriptor integers as opaque objects.

I do agree with this. As Henrik said, this makes Windows porting a bit
easier. There are still other problems to tackle to properly abuse
overlapped IO in any sensible fashion, mostly surrounding IO
scheduling and callback scheduling..



adrian


Re: Comm API notes

2008-09-10 Thread Adrian Chadd
2008/9/11 Alex Rousskov [EMAIL PROTECTED]:
 Here is a replacement text:

  The comm_close API will be used exclusively for stop future I/O,
  schedule a close callback call, and cancel all other callbacks
  purposes. New user code should not use comm_close for the purpose of
  immediately ending a job via a close handler call.

Yup.

(As part of another email) I'd also make it completely clear that the
underlying socket and IO may not be immediately closed via a
comm_close() until pending scheduled IO events occur, and that
callers should be prepared for the situation where the underlying
buffer(s) and other resources must stay immutable until the
kernel-side work completes.

This is partially why I wanted explicit notification, cancellation or
not, so the owners of things like buffers would know when they were
able to modify/reuse them again - or the immutable semantics must be
enforced some other way.



Adrian


Re: How to buffer a POST request

2008-09-09 Thread Adrian Chadd
Well, I've got a proof of concept which works well but it's -very-
ugly. This is one of those things that may have been slightly easier
to do in Squid-3 with Alex's BodyPipe changes. I haven't stared at the
BodyPipe code to know whether it's doing all the right kinds of
buffering for this application.

The problem is that Squid-2's request body data pipeline doesn't do
any of its own buffering - it doesn't do anything at all until a
consumer says "give me some more request body data please", at which
point the data is copied out of conn->in.buf (the client-side incoming
socket buffer), consumed, and passed on to the caller.

I thought about a clean implementation which would involve the
request body pipeline code consuming socket buffer data until a
certain threshold is reached, then feeding that back up to the request
body consumer, but I decided that was too difficult for this
particular contract.

Instead, the hack here is to just keep reading data into the
client-side socket buffer - it's already doing double duty as a
request body buffer anyway - until an ACL match fires to begin
forwarding. It's certainly not clean but it seems to work in local
testing. I haven't yet tested connection aborts and such to make sure
that connections are properly cleaned up.
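
In outline, the hack looks something like this - a hypothetical
sketch where every name is illustrative, not the actual Squid-2
structures:

    #include <cstddef>
    #include <string>

    // Hypothetical stand-ins: the real code reuses conn->in.buf directly.
    struct Conn {
        std::string inBuf;      // socket buffer doubling as the body buffer
        bool forwarding = false;
    };

    // Stand-in for the "begin forwarding" ACL match.
    static bool forwardAclMatches(const Conn &c) {
        return c.inBuf.size() >= 64 * 1024;   // e.g. "enough body buffered"
    }

    // Called whenever more client bytes arrive.
    static void onClientData(Conn &c, const char *data, std::size_t n) {
        c.inBuf.append(data, n);              // just keep buffering
        if (!c.forwarding && forwardAclMatches(c))
            c.forwarding = true;              // now release headers + body upstream
    }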

I'll look at posting a patch to squid-dev in a day or two once my
client has had a look at it.

Thanks,



Adrian


2008/8/8 Adrian Chadd [EMAIL PROTECTED]:
 Well I'm still going through the process of planning out what changes
 need to happen.

 I know what changes need to happen long-term but this project doesn't
 have that sort of scope..



 Adrian

 2008/8/8 Mark Nottingham [EMAIL PROTECTED]:
 You said you were doing it :)


 On 08/08/2008, at 4:40 PM, Adrian Chadd wrote:

 Way to dob me in!


 Adrian

 2008/8/8 Mark Nottingham [EMAIL PROTECTED]:

 I took at stab at:
 http://wiki.squid-cache.org/Features/RequestBuffering


 On 22/07/2008, at 4:40 PM, Henrik Nordstrom wrote:

 It's not a bug. A feature request in the wiki is more appropriate.

 wiki.squid-cache.org/Features/

 Regards
 Henrik

 On mån, 2008-07-21 at 17:50 -0700, Mark Nottingham wrote:

 I couldn't find an open bug for this, so I opened
 http://www.squid-cache.org/bugs/show_bug.cgi?id=2420


 On 11/06/2008, at 3:29 AM, Henrik Nordstrom wrote:

 On ons, 2008-06-11 at 12:51 +0300, Mikko Kettunen wrote:

 Yes, I read something about this on squid-users list, there seems
 to be
 8kB buffer for this if I understood right.

 The buffer is bigger than that. But not unlimited.

 The big change needed is that there currently isn't anything delaying
 forwarding of the request headers until sufficient amount of the
 request
 body has been buffered.

 Regards
 Henrik

 --
 Mark Nottingham   [EMAIL PROTECTED]


 --
 Mark Nottingham   [EMAIL PROTECTED]




 --
 Mark Nottingham   [EMAIL PROTECTED]






Squid-2.HEAD URL regression with CONNECT

2008-09-09 Thread Adrian Chadd
G'day,

Squid-2.HEAD doesn't seem to handle CONNECT URLs anymore; I get something like:

[start]
The requested URL could not be retrieved

While trying to retrieve the URL: www.gmail.com:443

The following error was encountered:

* Invalid URL
[end]

Benno, could you please double/triple check that your method and url
related changes to Squid-2.HEAD didn't break CONNECT?

Thanks!


Adrian


Re: /bzr/squid3/trunk/ r9176: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address.

2008-09-08 Thread Adrian Chadd
I've been thinking about doing exactly this after I've been knee-deep
in the DNS code.
It may not be a bad idea to have generic udp/tcp incoming/outgoing
addresses which can then be overridden per-protocol.



Adrian

2008/9/9 Amos Jeffries [EMAIL PROTECTED]:
 
 revno: 9176
 committer: Alex Rousskov [EMAIL PROTECTED]
 branch nick: trunk
 timestamp: Mon 2008-09-08 17:52:06 -0600
 message:
   Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming
 address.
 modified:
   src/htcp.cc


 I think this is one of those cleanup situations where we wanted to split
 the protocol away from generic udp_*_address and make it an
 htcp_outgoing_address. Yes?

 Amos





Re: /bzr/squid3/trunk/ r9176: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address.

2008-09-08 Thread Adrian Chadd
Hah, Amos just exposed my early-onset short-term memory loss!

(Time to get a bigger whiteboard..)



Adrian

2008/9/9 Amos Jeffries [EMAIL PROTECTED]:
 I've been thinking about doing exactly this after I've been knee-deep
 in the DNS code.
 It may not be a bad idea to have generic udp/tcp incoming/outgoing
  addresses which can then be overridden per-protocol.


 WTF? We discussed this months ago and came to the conclusion it would be
 good to have a two-layered outgoing address/port assignment.

 a) base default of random system-assigned outbound address port.

 b) override per-component/protocol  in/out bound address/port with
 individual config options.

 Amos


 Adrian

 2008/9/9 Amos Jeffries [EMAIL PROTECTED]:
 
 revno: 9176
 committer: Alex Rousskov [EMAIL PROTECTED]
 branch nick: trunk
 timestamp: Mon 2008-09-08 17:52:06 -0600
 message:
   Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming
 address.
 modified:
   src/htcp.cc


 I think this is one of those cleanup situations where we wanted to split
 the protocol away from generic udp_*_address and make it an
 htcp_outgoing_address. Yes?

 Amos









Re: [PATCH] Send 407 on url_rewrite_access/storeurl_access

2008-09-07 Thread Adrian Chadd
Thanks! Don't forget to bug me if it's not sorted out in the next week or so.



Adrian

2008/9/8 Diego Woitasen [EMAIL PROTECTED]:
 http://www.squid-cache.org/bugs/show_bug.cgi?id=2455

 On Sun, Sep 07, 2008 at 09:28:30AM +0800, Adrian Chadd wrote:
 It looks fine; could you dump it into bugzilla for the time being?
  (We're working on the Squid-2 -> bzr merge stuff at the moment!)



 Adrian

 2008/9/7 Diego Woitasen [EMAIL PROTECTED]:
   This patch applies to Squid 2.7.STABLE4.
 
  If we use a proxy_auth acl on {storeurl,url_rewrite}_access and the user
  isn't authenticated previously, send 407.
 
  regards,
 Diego
 
 
  diff --git a/src/client_side.c b/src/client_side.c
  index 23c4274..4f75ea0 100644
  --- a/src/client_side.c
  +++ b/src/client_side.c
  @@ -448,19 +448,71 @@ clientFinishRewriteStuff(clientHttpRequest * http)

   }

  -static void
  -clientAccessCheckDone(int answer, void *data)
  +void
  +clientSendErrorReply(clientHttpRequest * http, int answer)
   {
  -clientHttpRequest *http = data;
  err_type page_id;
  http_status status;
  ErrorState *err = NULL;
  char *proxy_auth_msg = NULL;
  +
  +proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
  +
  +int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
  +
  +debug(33, 5) ("Access Denied: %s\n", http->uri);
  +debug(33, 5) ("AclMatchedName = %s\n",
  +   AclMatchedName ? AclMatchedName : "null");
  +debug(33, 5) ("Proxy Auth Message = %s\n",
  +   proxy_auth_msg ? proxy_auth_msg : "null");
  +
  +/*
  + * NOTE: get page_id here, based on AclMatchedName because
  + * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
  + * clobbered in the clientCreateStoreEntry() call
  + * just below.  Pedro Ribeiro [EMAIL PROTECTED]
  + */
  +page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
  +http->log_type = LOG_TCP_DENIED;
  +http->entry = clientCreateStoreEntry(http, http->request->method,
  +   null_request_flags);
  +if (require_auth) {
  +   if (!http->flags.accel) {
  +   /* Proxy authorisation needed */
  +   status = HTTP_PROXY_AUTHENTICATION_REQUIRED;
  +   } else {
  +   /* WWW authorisation needed */
  +   status = HTTP_UNAUTHORIZED;
  +   }
  +   if (page_id == ERR_NONE)
  +   page_id = ERR_CACHE_ACCESS_DENIED;
  +} else {
  +   status = HTTP_FORBIDDEN;
  +   if (page_id == ERR_NONE)
  +   page_id = ERR_ACCESS_DENIED;
  +}
  +err = errorCon(page_id, status, http->orig_request);
  +if (http->conn->auth_user_request)
  +   err->auth_user_request = http->conn->auth_user_request;
  +else if (http->request->auth_user_request)
  +   err->auth_user_request = http->request->auth_user_request;
  +/* lock for the error state */
  +if (err->auth_user_request)
  +   authenticateAuthUserRequestLock(err->auth_user_request);
  +err->callback_data = NULL;
  +errorAppendEntry(http->entry, err);
  +
  +}
  +
  +static void
  +clientAccessCheckDone(int answer, void *data)
  +{
  +clientHttpRequest *http = data;
  +
  debug(33, 2) ("The request %s %s is %s, because it matched '%s'\n",
     RequestMethods[http->request->method].str, http->uri,
     answer == ACCESS_ALLOWED ? "ALLOWED" : "DENIED",
     AclMatchedName ? AclMatchedName : "NO ACL's");
  -proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
  http->acl_checklist = NULL;
  if (answer == ACCESS_ALLOWED) {
     safe_free(http->uri);
  @@ -469,47 +521,7 @@ clientAccessCheckDone(int answer, void *data)
     http->redirect_state = REDIRECT_PENDING;
     clientRedirectStart(http);
  } else {
  -   int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
  -   debug(33, 5) ("Access Denied: %s\n", http->uri);
  -   debug(33, 5) ("AclMatchedName = %s\n",
  -   AclMatchedName ? AclMatchedName : "null");
  -   debug(33, 5) ("Proxy Auth Message = %s\n",
  -   proxy_auth_msg ? proxy_auth_msg : "null");
  -   /*
  -    * NOTE: get page_id here, based on AclMatchedName because
  -    * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
  -    * clobbered in the clientCreateStoreEntry() call
  -    * just below.  Pedro Ribeiro [EMAIL PROTECTED]
  -    */
  -   page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
  -   http->log_type = LOG_TCP_DENIED;
  -   http->entry = clientCreateStoreEntry(http, http->request->method,
  -   null_request_flags);
  -   if (require_auth

Re: [PATCH] Send 407 on url_rewrite_access/storeurl_access

2008-09-06 Thread Adrian Chadd
It looks fine; could you dump it into bugzilla for the time being?
(We're working on the Squid-2 -> bzr merge stuff at the moment!)



Adrian

2008/9/7 Diego Woitasen [EMAIL PROTECTED]:
 This patch applies to Squid 2.7.STABLE4.

 If we use a proxy_auth acl on {storeurl,url_rewrite}_access and the user
 isn't authenticated previously, send 407.

 regards,
Diego


 diff --git a/src/client_side.c b/src/client_side.c
 index 23c4274..4f75ea0 100644
 --- a/src/client_side.c
 +++ b/src/client_side.c
 @@ -448,19 +448,71 @@ clientFinishRewriteStuff(clientHttpRequest * http)

  }

 -static void
 -clientAccessCheckDone(int answer, void *data)
 +void
 +clientSendErrorReply(clientHttpRequest * http, int answer)
  {
 -clientHttpRequest *http = data;
 err_type page_id;
 http_status status;
 ErrorState *err = NULL;
 char *proxy_auth_msg = NULL;
 +
 +proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
 +
 +int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
 +
 +debug(33, 5) ("Access Denied: %s\n", http->uri);
 +debug(33, 5) ("AclMatchedName = %s\n",
 +   AclMatchedName ? AclMatchedName : "null");
 +debug(33, 5) ("Proxy Auth Message = %s\n",
 +   proxy_auth_msg ? proxy_auth_msg : "null");
 +
 +/*
 + * NOTE: get page_id here, based on AclMatchedName because
 + * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
 + * clobbered in the clientCreateStoreEntry() call
 + * just below.  Pedro Ribeiro [EMAIL PROTECTED]
 + */
 +page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
 +http->log_type = LOG_TCP_DENIED;
 +http->entry = clientCreateStoreEntry(http, http->request->method,
 +   null_request_flags);
 +if (require_auth) {
 +   if (!http->flags.accel) {
 +   /* Proxy authorisation needed */
 +   status = HTTP_PROXY_AUTHENTICATION_REQUIRED;
 +   } else {
 +   /* WWW authorisation needed */
 +   status = HTTP_UNAUTHORIZED;
 +   }
 +   if (page_id == ERR_NONE)
 +   page_id = ERR_CACHE_ACCESS_DENIED;
 +} else {
 +   status = HTTP_FORBIDDEN;
 +   if (page_id == ERR_NONE)
 +   page_id = ERR_ACCESS_DENIED;
 +}
 +err = errorCon(page_id, status, http->orig_request);
 +if (http->conn->auth_user_request)
 +   err->auth_user_request = http->conn->auth_user_request;
 +else if (http->request->auth_user_request)
 +   err->auth_user_request = http->request->auth_user_request;
 +/* lock for the error state */
 +if (err->auth_user_request)
 +   authenticateAuthUserRequestLock(err->auth_user_request);
 +err->callback_data = NULL;
 +errorAppendEntry(http->entry, err);
 +
 +}
 +
 +static void
 +clientAccessCheckDone(int answer, void *data)
 +{
 +clientHttpRequest *http = data;
 +
 debug(33, 2) ("The request %s %s is %s, because it matched '%s'\n",
    RequestMethods[http->request->method].str, http->uri,
    answer == ACCESS_ALLOWED ? "ALLOWED" : "DENIED",
    AclMatchedName ? AclMatchedName : "NO ACL's");
 -proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
 http->acl_checklist = NULL;
 if (answer == ACCESS_ALLOWED) {
    safe_free(http->uri);
 @@ -469,47 +521,7 @@ clientAccessCheckDone(int answer, void *data)
    http->redirect_state = REDIRECT_PENDING;
    clientRedirectStart(http);
 } else {
 -   int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
 -   debug(33, 5) ("Access Denied: %s\n", http->uri);
 -   debug(33, 5) ("AclMatchedName = %s\n",
 -   AclMatchedName ? AclMatchedName : "null");
 -   debug(33, 5) ("Proxy Auth Message = %s\n",
 -   proxy_auth_msg ? proxy_auth_msg : "null");
 -   /*
 -    * NOTE: get page_id here, based on AclMatchedName because
 -    * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
 -    * clobbered in the clientCreateStoreEntry() call
 -    * just below.  Pedro Ribeiro [EMAIL PROTECTED]
 -    */
 -   page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
 -   http->log_type = LOG_TCP_DENIED;
 -   http->entry = clientCreateStoreEntry(http, http->request->method,
 -   null_request_flags);
 -   if (require_auth) {
 -   if (!http->flags.accel) {
 -   /* Proxy authorisation needed */
 -   status = HTTP_PROXY_AUTHENTICATION_REQUIRED;
 -   } else {
 -   /* WWW authorisation needed */
 -   status = HTTP_UNAUTHORIZED;
 -   }
 -   if (page_id == ERR_NONE)
 -   page_id = ERR_CACHE_ACCESS_DENIED;
 -   } else {
 -

Re: [RFC] COSS removal from 3.0

2008-09-04 Thread Adrian Chadd
2008/9/4 Amos Jeffries [EMAIL PROTECTED]:
 I'm expecting to roll 3.0.STABLE9 sometime over the next 5 days.

 One update still to be done is the removal of COSS.

 I had planned on just dead-coding (disabling) it. But with the configure
 recursion being dynamic that's not easily possible.

 I'm currently considering dropping an #error abortion into the top of all
 COSS code files to kill any builds trying to use it. Anyone have a better
 way? Or drop the code entirely from 3.0?

I think that's perfectly fine for now.



Adrian


Re: squid-2.HEAD:

2008-09-03 Thread Adrian Chadd
2008/9/3 Alexander V. Lukyanov [EMAIL PROTECTED]:
 Hello!

 I have noticed lots of 'impossible keep-alive' messages in the log.
 It appears that httpReplyBodySize incorrectly returns -1 for 304 Not
 Modified replies. Patch to fix it is attached.

Hm, I'd have to eyeball the rest of the code to make sure that's the right fix.
Can you throw it into a bugzilla ticket for me? I've got a couple
other Squid-2.HEAD patches to stare at and commit.


Adrian


Re: [MERGE] Address Alex and Amos' comments.

2008-09-03 Thread Adrian Chadd
I still need to eyeball the relative URL stuff, but..


bb:approve

2008/9/3 Bundle Buggy [EMAIL PROTECTED]:
 Bundle Buggy has detected this merge request.

 For details, see:
 http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E
 Project: Squid




Re: Using cached headers in ACLs

2008-09-03 Thread Adrian Chadd
nope!



adrian

2008/9/4 Diego Woitasen [EMAIL PROTECTED]:
 Hi,
    As I've explained in my introduction, I'm working on changes
    to the "cache" statement and refresh_pattern to allow easy flash
    video caching and maybe other things. The first thing that I'm
    trying to change is that ACLs used in "cache" would match
    against cached headers. For example, if the cached headers for
    some URL contain "Content-Type: video/flv", I serve that object
    from cache.

    Is there any contraindication if I use cached headers in that
    way?

Regards,
Diego

 --
 ---
 Diego Woitasen - XTECH
 www.xtech.com.ar




Re: [MERGE] Address Alex and Amos' comments.

2008-09-03 Thread Adrian Chadd
bb:approve

(Sorry, I wasn't setup to vote until now!)

2008/9/4 Adrian Chadd [EMAIL PROTECTED]:
 I still need to eyeball the relative URL stuff, but..


 bb:approve

 2008/9/3 Bundle Buggy [EMAIL PROTECTED]:
 Bundle Buggy has detected this merge request.

 For details, see:
 http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E
 Project: Squid





Re: [MERGE] Address Alex and Amos' comments.

2008-09-03 Thread Adrian Chadd
bb:approve

(third time lucky?)

2008/9/4 Adrian Chadd [EMAIL PROTECTED]:
 bb:approve

 (Sorry, I wasn't setup to vote until now!)

 2008/9/4 Adrian Chadd [EMAIL PROTECTED]:
 I still need to eyeball the relative URL stuff, but..


 bb:approve

 2008/9/3 Bundle Buggy [EMAIL PROTECTED]:
 Bundle Buggy has detected this merge request.

 For details, see:
 http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E
 Project: Squid






Re: [MERGE] Address Alex and Amos' comments.

2008-09-03 Thread Adrian Chadd
bb:approve

2008/9/3 Bundle Buggy [EMAIL PROTECTED]:
 Bundle Buggy has detected this merge request.

 For details, see:
 http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E
 Project: Squid




Re: [MERGE] Address Alex and Amos' comments.

2008-09-03 Thread Adrian Chadd
bb:approve

*sigh!*

2008/9/3 Bundle Buggy [EMAIL PROTECTED]:
 Bundle Buggy has detected this merge request.

 For details, see:
 http://bundlebuggy.aaronbentley.com/project/squid/request/%3C200809030444.m834iIKI048580%40harfy.jeamland.net%3E
 Project: Squid




Re: [squid-users] state of gzip/transfer-encoding?

2008-09-02 Thread Adrian Chadd
So how much is needed exactly to support this when we currently don't
support HTTP/1.1?


Adrian

2008/9/2 Amos Jeffries [EMAIL PROTECTED]:
 Chris Woodfield wrote:
 Squid does not do transfer encoding of objects on its own;


 Just curious: does Squid plan to develop an object-compression
 feature like Apache's mod_deflate? Thanks.


 Yes. As an eCAP module. It should be out shortly after eCAP support is
 stabilized.

 Amos




Re: pseudo-specs for a String class: char

2008-09-02 Thread Adrian Chadd
2008/9/2 Kinkie [EMAIL PROTECTED]:

 Read: might be useful for the HTTP parser.
 Stuff like:

 KBuf reqhdr = (whatever gets in from comm);
 KBuf hdrline = reqhdr.nextToken("\n");
 if (hdrline[hdrline.len()-1] == '\r')
    hdrline.truncate(hdrline.len()-1); // chop \r off

Ok, so what would the underlying manipulations be?



Adrian
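
Presumably something like this: token/substring operations reduce to
offset/length arithmetic over shared, refcounted backing storage, with
no byte copying (a sketch of the idea, not Kinkie's code):

    #include <cstddef>
    #include <cstring>
    #include <memory>
    #include <vector>

    // Sketch: KBuf-style values share one refcounted store and differ only
    // in (offset, length).
    struct Buf {
        std::shared_ptr<std::vector<char> > store;   // shared backing storage
        size_t off = 0, len = 0;

        // nextToken('\n'): carve off the prefix up to the delimiter and step
        // this buffer past it -- two length updates, zero copies.
        Buf nextToken(char delim) {
            const char *base = store->data() + off;
            const char *p = static_cast<const char *>(std::memchr(base, delim, len));
            size_t n = p ? static_cast<size_t>(p - base) : len;
            Buf tok = *this;        // bumps the refcount; no bytes move
            tok.len = n;
            size_t skip = n + (p ? 1 : 0);
            off += skip;
            len -= skip;
            return tok;
        }
    };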


Re: [squid-users] state of gzip/transfer-encoding?

2008-09-02 Thread Adrian Chadd
2008/9/3 Amos Jeffries [EMAIL PROTECTED]:

 The existing experimental patch for 3.0.pre2 adds a ClientStreams handler
 for de/encoding as needed.

Right.

 It's really only a matter of caching things properly (ETag from server or
 squid-generated), and fiddling the headers slightly as they transit Squid
 in both directions. Should not matter which HTTP/1.x version is used at
 the header level, since both can handle compressed data objects.

 3.1 already does TE decoding. But we can't do the TE encoding until 1.1,
 only Content-Type re-coding.

So it's not specifically TE, it's just Content-Encoding fiddling as appropriate?
OK, that makes much more sense.


Adrian


Re: [MERGE] Rework urlAbsolute to be a little more streamlined.

2008-09-01 Thread Adrian Chadd
One zeroes, one doesn't.
calloc is meant to be x objects of size y, but it's effectively also a bzero().
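
In standard-library terms (Squid's xmalloc/xcalloc essentially wrap
these with out-of-memory checks):

    #include <cstdlib>

    int main() {
        char *a = static_cast<char *>(std::malloc(16));    // contents indeterminate
        char *b = static_cast<char *>(std::calloc(4, 4));  // "4 objects of size 4", all zeroed
        // calloc(n, sz) behaves like malloc(n * sz) followed by
        // memset(p, 0, n * sz), plus an overflow check on n * sz.
        std::free(a);
        std::free(b);
        return 0;
    }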


Adrian

2008/9/2 Amos Jeffries [EMAIL PROTECTED]:

 On 01/09/2008, at 1:01 PM, Amos Jeffries wrote:

 Resubmit this patch, including changes based on comments by various
 people.

 - Mention RFC text in relation to changing the default behaviour in
 relation
  to unknown HTTP methods.
 - Use safe_free instead of xfree.
  - Rework urlAbsolute to use snprintf in a slightly better way. Snprintf is
    now used to construct the initial portion of the url and the rest is
    added on using POSIX string routines.


  I'm sure you can still crop the xstrdup() usage by adding a JIT allocation
  of urlbuf before if (req->protocol == PROTO_URN)
  and returning urlbuf plain at the end.

 As in malloc it?

 Yes with xmalloc or xcalloc. (I'm still not sure why we have two).

 Amos





Re: binary data

2008-09-01 Thread Adrian Chadd
Erk! I just read that patch!

man isprint. Oh, and man ctype. I'm sure there's a C++ equivalent
somewhere which makes more sense to use.
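
I.e. something along these lines; the unsigned char cast is the part
that matters for binary data, since passing a negative plain char to
isprint() is undefined behaviour:

    #include <cctype>

    // True when the byte can be emitted verbatim in a log or header dump.
    static bool printable(char c) {
        return std::isprint(static_cast<unsigned char>(c)) != 0;
    }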


Adrian

2008/9/2 Amos Jeffries [EMAIL PROTECTED]:
 Henrik,
  There's been some user interest in porting the binary data hack

 http://www.squid-cache.org/Versions/v3/3.0/changesets/b8877.patch

 to 2.7. How say you?

 Amos





Re: pseudo-specs for a String class

2008-08-31 Thread Adrian Chadd
Do you really want to provide a 'consume' interface for a low-level
representation of memory?

I think trying to replace MemBuf with this new buffer is a bit silly.
Sure, use it -in- MemBuf, along with all the other places that buffers
are used.

What about strtok()? Why would you want to tokenise data?


Adrian

2008/8/31 Kinkie [EMAIL PROTECTED]:
 +1. With a view of re-using MemBuf in the final product. Starting from
 KinkieBuf. (Joke, me taking a dig at the content filterers again).

 I've gotten a bit forward, now I'm a bit at a loss about where to go next.

 Current interface:

 class KBuf {
    KBuf();
    KBuf(const KBuf &S);
    KBuf(const char *S, u_int32_t Ssize);
    KBuf(const char *S); //null-terminated
    ~KBuf();
    bool isNull();
    KBuf operator = (KBuf S);
    KBuf operator = (char const *S, u_int32_t Ssize);
    KBuf operator = (char const *S);  //null-terminated
    KBuf append(KBuf S);
    KBuf append(const char * S, u_int32_t Slen);
    KBuf append(const char * S); //null-terminated
    KBuf append(const char c);   //To be removed?
    KBuf appendf(const char *fmt, ...); //to be copied over from membuf
    std::ostream & print (std::ostream &os); // for operator <<
    void dump(ostream &os); //dump debugging info
    const int operator [] (int pos);
    int cmp(KBuf S); //strcmp()
    bool operator == (const KBuf &S);
    bool operator < (KBuf S);
    bool operator > (KBuf S);
    void truncate(u_int32_t to_size);
    KBuf consume(u_int32_t howmuch); //from MemBuf
    void terminate(void); //null-terminate
    static ostream & stats(ostream &os);
    char *exportCopy(void);
    char *exportRefIKnowWhatImDoing(void);
    KBuf nextToken(const char *delim); //strtok()
    KBuf substr(u_int32_t from, u_int32_t to);
    u_int32_t index(char c);
    u_int32_t rindex(char c);
 }

 on x86, sizeof(KBuf)=16

 Now I'm a bit at a loss as to how to best integrate with iostream.
 There's basically four possibilities:

 1. KBuf kb; kb.append(stringstream)
   cheapest implementation, but each of those requires two to three
   copies of the contents
 2. KBuf kb; stringstream ss = kb->stream(); ss << (blah).
   this seems to be the official way of extending iostreams;
   performed by making KBuf a subclass of stringbuf.
   extends sizeof(KBuf) by 32 bytes, and many of its calls need to
   housekeep two states.
 3. KBuf kb; stringstream ss = kb->stream(); ss << (blah)
   performed by using an adapter class. The coding effort starts to
   be quite noticeable, as keeping the stringbuf and the KBuf in sync is
   not trivial.
 4. KBuf kb; kb << (blah). Requires KBuf to be a subclass of an ostream.
   there's a significant coding effort, AND balloons the size of KBuf
   to 156 bytes.

 What's your take on how to better address this?


 --
  /kinkie




Re: Refresh patterns and ACLs

2008-08-29 Thread Adrian Chadd
2008/8/30 Henrik Nordstrom [EMAIL PROTECTED]:

 Make sure you can collapse those ACLs down to something sensible for
 software processing before you go down that path!

 It's relatively easy to make a unified lookup tree of such structure,
 and even if you don't it's still as fast or faster than the current acl
 scheme.

Oh, I'm sure we could beat the current way we're using ACLs; the
question is whether we can dramatically improve complicated ACL
processing in the future.

A big problem with the way we're inlining fast-path ACL lookups is
that they suddenly become impossible to farm out to other threads.



Adrian


Re: Refresh patterns and ACLs

2008-08-28 Thread Adrian Chadd
2008/8/29 Kinkie [EMAIL PROTECTED]:

 YES please..
 I'm quite familiar with the JunOS ACL format and it resembes this
 pretty closely, it's very flexible..

Make sure you can collapse those ACLs down to something sensible for
software processing before you go down that path!




Adrian


  1   2   3   4   5   6   7   8   9   >