Re: SMP: logging

2010-02-24 Thread Adrian Chadd
On 24 February 2010 18:06, Adrian Chadd  wrote:

> Uhm, is O_APPEND defined as an atomic write? I didn't think so. It may
> be under Linux and it may be under certain FreeBSD versions, but that's
> likely a side-effect of VFS locking rather than the actual specification.

.. and it certainly won't be supported for logging-to-NFS.

I'd honestly just investigate a logging layer that implements some
kind of IPC mechanism (sockets, sysvshm, etc) that can handle logs
from multiple processes.

Or you could go down the Apache path - lock, append, unlock. Eww.
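
For the record, that dance looks something like this - a minimal sketch
assuming POSIX flock(); the function name is illustrative, not Squid code:

#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* "The apache path": serialise appends from multiple processes with an
 * advisory lock around each write. */
static int
locked_log_append(int fd, const char *line)
{
    ssize_t n;

    if (flock(fd, LOCK_EX) != 0)        /* lock */
        return -1;
    n = write(fd, line, strlen(line));  /* append; fd opened with O_APPEND */
    flock(fd, LOCK_UN);                 /* unlock */
    return n < 0 ? -1 : 0;
}

Note flock() is advisory and has its own NFS caveats, so it doesn't rescue
the logging-to-NFS case either.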



adrian


Re: SMP: logging

2010-02-24 Thread Adrian Chadd
On 24 February 2010 06:55, Amos Jeffries  wrote:

>> Ah, I did not realize cache.log daemon logging is not supported yet. One
>> more reason to start with simple O_APPEND. As a side effect, we would be
>> able to debug daemon log starting problems as well :-).
>>
>
> Yay. Definitely +3 then. :)

Uhm, is O_APPEND defined as an atomic write? I didn't think so. It may
be under Linux and it may be under certain FreeBSD versions, but that's
likely a side-effect of VFS locking rather than the actual specification.
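
For reference, what POSIX actually promises - a minimal sketch, with the
function name and file mode purely illustrative:

#include <fcntl.h>

int
open_shared_log(const char *path)
{
    /* POSIX guarantees that each write() on an O_APPEND descriptor
     * atomically repositions to end-of-file before writing, so
     * concurrent writers can't clobber each other's offsets.  It does
     * NOT guarantee that one large write() lands as a single unbroken
     * record; whether writes interleave is up to the implementation,
     * and all bets are off on NFS. */
    return open(path, O_WRONLY | O_APPEND | O_CREAT, 0640);
}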



adrian


Re: [squid-users] 'gprof squid squid.gmon' only shows the initial configuration functions

2009-12-09 Thread Adrian Chadd
Talk to the freebsd guys (eg me) about pmcstat and support for your
hardware. You may just need to find / organise a backport of the
particular hardware support for your platform. I've been working on
profiling Lusca with pmcstat and some new-ish tools which use and
extend it in useful ways.

gprof data is almost certainly too unreliable to be useful on modern
CPUs. Too much can and will happen between profiling ticks.

I can hazard a few guesses about where your CPU is going. A likely
candidate is poll() if your Squid is too old. The first thing to do is
to organise porting the kqueue() support if it isn't already included.
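
For anyone unfamiliar with it, the kqueue() API looks roughly like this -
a minimal read-readiness sketch only; the real comm loop is far more
involved:

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

int
wait_for_read(int fd)
{
    struct kevent change, event;
    int kq = kqueue();

    if (kq < 0)
        return -1;
    /* Register interest once; unlike poll(), the kernel remembers the
     * interest set between calls instead of rescanning an FD array. */
    EV_SET(&change, fd, EVFILT_READ, EV_ADD | EV_ENABLE, 0, 0, NULL);
    return kevent(kq, &change, 1, &event, 1, NULL);
}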

I can make more educated guesses about where the likely CPU hog
culprits are given workload and configuration file information.



Adrian

2009/12/10 Guy Bashkansky :
> Is there an oprofile version for FreeBSD?  I thought it is limited to
> Linux.  On FreeBSD I tried pmcstat, but it gives an initialization
> error.
>
> My version of Squid is old and customized (so I can't upgrade) and may
> not have builtin timers.  In which version did they appear?
>
> As for gprof - even with the event loop on top, the rest of the
> table might still give some idea of why the CPU is overloaded.  The
> problem is - I see only initial configuration functions:
>
>                                 called/total       parents
> index  %time    self descendents  called+self    name           index
>                                 called/total       children
>                                                    <spontaneous>
> [1]     63.4    0.17        0.00                 _mcount [1]
> -----------------------------------------------
>               0.00        0.10       1/1           _start [3]
> [2]     36.0    0.00        0.10       1         main [2]
>               0.00        0.10       1/1           parseConfigFile [4]
> <...>
> -----------------------------------------------
>                                                    <spontaneous>
> [3]     36.0    0.00        0.10                 _start [3]
>               0.00        0.10       1/1           main [2]
> -----------------------------------------------
>               0.00        0.10       1/1           main [2]
> [4]     36.0    0.00        0.10       1         parseConfigFile [4]
>               0.00        0.09       1/1           readConfigLines [5]
>               0.00        0.00     169/6413        parse_line [6]
> <...>
>
> System info:
>
> # uname -m -r -s
> FreeBSD 6.2-RELEASE-p9 amd64
>
> # gcc -v
> Using built-in specs.
> Configured with: FreeBSD/amd64 system compiler
> Thread model: posix
> gcc version 3.4.6 [FreeBSD] 20060305
>
>
> There are 7 fork()s for unlinkd/diskd helpers.  Can these fork()s
> affect profiling info?
>
> On Wed, Dec 9, 2009 at 2:04 AM, Robert Collins
>  wrote:
>> On Tue, 2009-12-08 at 15:32 -0800, Guy Bashkansky wrote:
>>> I've built squid with the -pg flag and run it in the no-daemon mode
>>> (-N flag), without the initial fork().
>>>
>>> I send it the SIGTERM signal which is caught by the signal handler, to
>>> flag graceful exit from main().
>>>
>>> I expect to see meaningful squid.gmon, but 'gprof squid squid.gmon'
>>> only shows the initial configuration functions:
>>
>> gprof isn't terribly useful anyway - due to squid's callback-based model,
>> it will see nearly all the time as belonging to the event loop.
>>
>> oprofile and/or squid's built-in analytic timers will get much better
>> info.
>>
>> -Rob
>>
>
>


Re: your suggestion for range_offset_limit

2009-11-26 Thread Adrian Chadd
The trick, at least in squid-2, is to make sure that quick abort isn't
occurring. Otherwise it will begin downloading the whole object, return
the requested range bit, and then abort the remainder of the fetch.
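
In squid.conf terms that means something like the following - the
directive names are real 2.7 directives, the values only illustrative:

# Fetch the whole object whenever a client asks for a range ...
range_offset_limit -1

# ... and don't let the quick-abort logic throw the rest of it away
# once the client has its range and disconnects.
quick_abort_min -1 KB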



Adrian

2009/11/25 Amos Jeffries :
> Matthew Morgan wrote:
>>
>> On Wed, Nov 25, 2009 at 7:09 PM, Amos Jeffries 
>> wrote:
>>>
>>> Matthew Morgan wrote:

 Sorry it's taking me so long to get this done, but I do have a question.

 You suggested making getRangeOffsetLimit a member of HttpReply.  There
 are
 two places where this method currently needs to be called: one is
 CheckQuickAbort2() in store_client.cc.  This one will be easy, as I can
 just
 do entry->getReply()->getRangeOffsetLimit().

 The other is HttpStateData::decideIfWeDoRanges in http.cc.  Here, all we
 have access to is an HttpRequest object.  I looked through the source to
 see
 if I could find where a request owned or had access to a reply, but I
 don't
 see anything like that.  If getRangeOffsetLimit were a member of
 HttpReply,
 what do you suggest doing here?  I could make a static version of the
 method, but that wouldn't allow caching the result.
>>>
>>> Ah. I see. Quite right.
>>>
>>> After a bit more thought I find my original request a bit weird.
>>>
>>> Yes it should be a _Request_ member and do its caching there. You can go
>>> ahead with that now while we discuss whether to do a slight tweak on top
>>> of
>>> the basic feature.
>>>
>>>
>>> [cc'ing squid-dev so others can provide input]
>>>
>>> I'm not certain of the behavior we want here if we do open the ACLs to
>>> reply
>>> details. Some discussion is in order.
>>>
>>> Simple way would be to not cache the lookup the first time when reply
>>> details are not provided.
>>>
>>> It would mean making it return potentially two different values across
>>> the
>>> transaction.
>>>
>>>  1) based only on request details, to decide if a range request is
>>> possible,
>>> and then
>>>  2) based on additional reply details, to see if the abort could be done.
>>>
>>> No problem if the reply details cause an increase in the limit. But if
>>> they
>>> restrict it we enter grounds of potentially making a request then
>>> canceling
>>> it and being unable to store the results.
>>>
>>>
>>> Or, taking the maximum of the two across two calls, so it can only
>>> increase?
>>>  That would be slightly trickier, involving a flag as well to
>>> short-circuit the reply lookups instead of just a magic cache value.
>>>
>>> Am I seriously over-thinking things today?
>>>
>>>
>>> Amos
>>
>> Here's a question, too: is this feature going to benefit anyone?  I
>> realized later that it will not solve my problem, because all the
>> traffic that was getting force downloaded ended up being from windows
>> updates.  The urls showing up in netstat and such were just weird
>> because the windows update traffic was actually coming from limelight.
>>  My ultimate solution was to write a script that reads access.log,
>> checks for windows update urls that are not cached, and manually
>> downloads them one at a time after hours.
>>
>> If there is anyone at all who would benefit from this I would still be
>> *more* than glad to code it (as I said, it would be my first real open
>> source contribution...very exciting), but I just wondered if anyone
>> will actually use it.
>
> I believe people will find more control here useful.
>
> Windows update service packs are a big reason, but there are also similar
> range issues with Adobe Reader online PDFs, google maps/earth, and flash
> videos when paused/resumed. Potentially other stuff, but I have not heard of
> problems.
>
> This will allow anyone to fine-tune the places where ranges are permitted or
> forced to fully cache, avoiding the problems a blanket limit adds.
>
>>
>> As to which approach would be better, I don't know enough about that
>> data path to really suggest.  When I initially made my changes, I just
>> replaced each reference to Config.range_offset_limit or whatever.
>> Today I went back and read some more of the code, but I'm still
>> figuring it out.  How often would the limit change based on the
>> request vs. the reply?
>
> Just the once, the first time it is checked for the reply.
> And most likely in the case of testing for a reply mime type. The other
> useful info I can think of is all request data.
>
> You can ignore if you like. I'm just worrying over a borderline case.
> Someone else can code a fix if they find it a problem or need to do mime
> checks.
>
> Amos
> --
> Please be using
>  Current Stable Squid 2.7.STABLE7 or 3.0.STABLE20
>  Current Beta Squid 3.1.0.15
>
>


Re: squid-smp: synchronization issue & solutions

2009-11-19 Thread Adrian Chadd
Right. That's the easy bit. I could even do that in Squid-2 with a
little bit of luck. The hard bit is rewriting the relevant code which
relies on cbdata-style reference counting behaviour. That is the
tricky bit.



Adrian

2009/11/20 Robert Collins :
> On Wed, 2009-11-18 at 10:46 +0800, Adrian Chadd wrote:
>> Plenty of kernels nowadays do a bit of TCP and socket processing in
>> process/thread context; so you need to do your socket TX/RX in
>> different processes/threads to get parallelism in the networking side
>> of things.
>
> Very good point.
>
>> You could fake it somewhat by pushing socket IO into different threads
>> but then you have all the overhead of shuffling IO and completed IO
>> between threads. This may be .. complicated.
>
> The event loop I put together for -3 should be able to do that without
> changing the loop - just extending the modules that hook into it.
>
> -Rob
>


Re: squid-smp: synchronization issue & solutions

2009-11-17 Thread Adrian Chadd
Plenty of kernels nowadays do a bit of TCP and socket processing in
process/thread context; so you need to do your socket TX/RX in
different processes/threads to get parallelism in the networking side
of things.

You could fake it somewhat by pushing socket IO into different threads
but then you have all the overhead of shuffling IO and completed IO
between threads. This may be .. complicated.


Adrian

2009/11/18 Gonzalo Arana :
> On Tue, Nov 17, 2009 at 12:45 PM, Alex Rousskov
>  wrote:
>> On 11/17/2009 04:09 AM, Sachin Malave wrote:
>>
>> 
>>
>>> I AM THINKING ABOUT HYBRID OF BOTH...
>>>
>>> Somebody might implement process model, Then we would merge both
>>> process and thread models .. together we could have a better squid..
>>> :)
>>> What do u think? 
>
> In my limited squid experience, CPU usage is hardly a bottleneck.  So,
> why not just use smp for the cpu/disk-intensive parts?
>
> The candidates I can think of are:
>  * evaluating regular expressions (url_regex acls).
>  * aufs/diskd (squid already has support for this).
>
> Best regards,
>
> --
> Gonzalo A. Arana
>
>


Re: squid-smp

2009-10-15 Thread Adrian Chadd
Oh, I can absolutely give you guys food for thought. I was just hoping
someone else would have already done a bit of the legwork.

Things to think about:

* Do you really, -really- want to reinvent the malloc wheel? This is
separate from caching results and pre-made class instances. There's
been a lot of work on well-performing, thread-aware malloc libraries.
* Do you want to run things in multiple processes or multiple threads?
Or support both?
* How much of the application do you want to push out into separate
threads? Run lots of "copies" of Squid concurrently, with some locking
going on? Break up individual parts of the processing pipeline into
threads? (eg, what I'm going to be experimenting with soon - handling
ICP/HTCP in a separate thread for some basic testing)
* Survey the current codebase and figure out what depends upon what -
in a way that you can use for figuring out what needs to be made
re-entrant and what may need locking. Think about how to achieve all
of this. Best example of this - you're going to need to figure out how
to do concurrent debug logging and memory allocation - so see what
that code uses, what that code's code uses, etc
* 10GE cards are dumping individual PCIe channels to CPUs, which means
that the "most efficient" way of pumping data around will be to
somehow throw individual connections onto specific CPUs, and keep them
there. There's no OS support for this yet, but OSes may be "magical"
(ie, handing you sockets in specific threads via accept() and hoping
that the NIC doesn't reorganise its connection->PCIe channel hash)
* Do you think it's worth being able to migrate specific connections
between threads? Or once they're in a thread, are they there for good?
* If you split up squid into "lots of threads running the whole app",
what and where would you envisage locking and blocking? What about
data sharing? How would that scale given a handful of example
workloads? What about in abnormal situations? How well will things
degrade?
* What about using message passing and message queues? Where would it
be appropriate? Where wouldn't it be appropriate? Why?

Here's an example:

* Imagine you're doing store lookups using message passing with your
"store" being a separate thread with a message queue. Think about how
you'd handle, say, ICP peering between two caches doing > 10,000
requests a second. What repercussions does that have for the locking
of the message queues between other threads? What are the other
threads doing?

With that in mind, survey the kinds of ways that current network apps
"do" threading:

* look at the various ways apache does it - eg, the per-connection
thread+process hybrid model, the event-worker thread model, etc
* look at memcached - one thread doing accept'ing, farming requests
off to other threads that just run a squid-like event loop. Minimal
inter-thread communication for the most part
* investigate what the concurrency hooks for various frameworks do -
eg, the boost asio library stuff has "colours" which you mark thread
events with. These colours dictate which events need to be run
sequentially and which can run in parallel
* look at all of the random blogs written by windows networking coders
- they're further ahead on the massively-concurrent network
application stack because Windows has had it for a number of years.

Now. You've mentioned you've looked at the others and you think major
replumbing is going to be needed. Here's a hint - it's going to be
needed. Thinking you can avoid it is silly. Figuring out what you can
do right now that doesn't lock you into a specific trajectory is -not-
silly. For example, figuring out what APIs need to be changed to make
them re-entrant is not silly. Most of the stuff in lib/ that returns
static char buffers needs to be changed. That can be done
-now- without having to lock yourself into a particular concurrency
model.
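
The conversion pattern is mechanical; a sketch, where fmt_octet_addr is
a made-up example rather than an actual lib/ routine:

#include <stdio.h>

/* Before: one hidden static buffer - not re-entrant, unusable from
 * two threads at once. */
const char *
fmt_octet_addr(unsigned ip)
{
    static char buf[16];
    snprintf(buf, sizeof(buf), "%u.%u.%u.%u",
             (ip >> 24) & 0xff, (ip >> 16) & 0xff,
             (ip >> 8) & 0xff, ip & 0xff);
    return buf;
}

/* After: the caller supplies the storage, so concurrent calls can't
 * collide and no locking is needed. */
const char *
fmt_octet_addr_r(unsigned ip, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
             (ip >> 24) & 0xff, (ip >> 16) & 0xff,
             (ip >> 8) & 0xff, ip & 0xff);
    return buf;
}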

2c,



Adrian

2009/10/15 Amos Jeffries :
> Adrian Chadd wrote:
>>
>> 2009/10/15 Sachin Malave :
>>
>>> It's not like we want to make the project bad. Squid was not deployed on
>>> smp before because we did not have shared memory architectures
>>> (multi-cores), and the library support for multi-threading was like a
>>> nightmare for people. Now things have changed: it is very easy to
>>> manage threads, people have multi-core machines at their desktops, and
>>> as hardware is available now or later, somebody has to try and build
>>> SMP support. Think about the future...
>>>
>>> To cope with internet speed & the increase in the number of users, Squid
>>> must use a multi-core architecture and distribute its work
>>
>> I 100% agree with your comments. I agree 100% that Squid needs to be
>> made scalable on

Re: squid-smp

2009-10-15 Thread Adrian Chadd
2009/10/15 Sachin Malave :

> It's not like we want to make the project bad. Squid was not deployed on
> smp before because we did not have shared memory architectures
> (multi-cores), and the library support for multi-threading was like a
> nightmare for people. Now things have changed: it is very easy to
> manage threads, people have multi-core machines at their desktops, and
> as hardware is available now or later, somebody has to try and build
> SMP support. Think about the future...
>
> To cope with internet speed & the increase in the number of users, Squid
> must use a multi-core architecture and distribute its work

I 100% agree with your comments. I agree 100% that Squid needs to be
made scalable on multi-core boxes.

Writing threaded code may be easier now than in the past, but the ways
of screwing up stability, debuggability, performance and such -haven't-
changed. This is what I'm trying to get across. :)



Adrian


Re: Operating System that give best performance on Squid

2009-10-14 Thread Adrian Chadd
I should also say I've pushed the same under Linux in my little
cacheboy CDN. The peak loads were around 200mbit worth of ~300kbyte
cached replies. This was on 32-bit Intel, 64-bit Intel and even an
IA-64 box.



Adrian

2009/10/15 Adrian Chadd :
> I'm pushing upwards of 100mbit of small objects on FreeBSD using
> Squid-2.7/Lusca and COSS.
>
> You can push quite a bit more if your workload fits in memory and/or
> is large objects.
>
> 2009/10/14 Paul Khadra :
>>
>> Dear all,
>>
>> We are planning to install squid on a server where we expect to push 100
>> mbps - 200 mbps (if possible).
>>
>> Which OS is best tested with squid and which one can give the highest
>> performance ?
>>
>> Thank you, Paul
>> --
>> View this message in context: 
>> http://www.nabble.com/Operating-System-that-give-best-performance-on-Squid-tp25888502p25888502.html
>> Sent from the Squid - Development mailing list archive at Nabble.com.
>>
>>
>


Re: squid-smp

2009-10-14 Thread Adrian Chadd
2009/10/14 Amos Jeffries :

[snip]

I still find it very amusing that no one else has sat down and talked
about the last 20+ years of writing threaded, concurrent code and
what the pros/cons of them would be here; nor what other projects are
doing.

Please don't sit down and talk about how to shoehorn SMP into some
existing Squid-3 "thing" (be it AsyncCalls, or anything really) before
doing this. You'll just be re-inventing the same mistakes made in the
past and it will make the project look bad.



Adrian


Re: Operating System that give best performance on Squid

2009-10-14 Thread Adrian Chadd
I'm pushing upwards of 100mbit of small objects on FreeBSD using
Squid-2.7/Lusca and COSS.

You can push quite a bit more if your workload fits in memory and/or
is large objects.

2009/10/14 Paul Khadra :
>
> Dear all,
>
> We are planning to install squid on a server where we expect to push 100
> mbps - 200 mbps (if possible).
>
> Which OS is best tested with squid and which one can give the highest
> performance ?
>
> Thank you, Paul
> --
> View this message in context: 
> http://www.nabble.com/Operating-System-that-give-best-performance-on-Squid-tp25888502p25888502.html
> Sent from the Squid - Development mailing list archive at Nabble.com.
>
>


Re: Recent Facebook Issues

2009-10-09 Thread Adrian Chadd
I've emailed the facebook NOC directly about the issue.

Thanks,


Adrian

2009/10/9 Kinkie :
> You can try to access facebook with konqueror. It complains
> rather loudly, drops the excess data and the site generally doesn't
> work (it has been doing so for a few days, but only NOW am I connecting the
> wires...)
>
>  Kinkie
>
> On Fri, Oct 9, 2009 at 2:52 AM, Adrian Chadd  wrote:
>> Ok, this happens for all versions?
>>
>> I can bring this up with facebook engineering if someone provides me
>> with further information.
>>
>>
>> Adrian
>>
>> 2009/10/9 Amos Jeffries :
>>> Thanks to several people I've managed to track down why the facebook issues
>>> are suddenly appearing and why it's intermittent.
>>>
>>> On the sometimes-works-sometimes-doesn't problem: facebook.com does
>>> User-Agent header checks and sends back one of four pages.
>>> 1) a generic page saying 'please use another browser'.
>>> 2) a redirect to login for each of IE, Firefox and Safari
>>> 3) a home page (if cookies sent initially)
>>>
>>> going through the login redirects to the page also presented at (3) above.
>>>
>>> The home page is the real problem. When no cookies are presented it ships
>>> without Content-Length (fine).
>>> When they _are_ present, ie after the user has logged in, it ships with
>>> Content-Length: 18487 and a data size of 18576.
>>>
>>> Amos
>>>
>>>
>>
>
>
>
> --
>    /kinkie
>
>


Re: Recent Facebook Issues

2009-10-08 Thread Adrian Chadd
Ok, this happens for all versions?

I can bring this up with facebook engineering if someone provides me
with further information.


Adrian

2009/10/9 Amos Jeffries :
> Thanks to several people I've managed to track down why the facebook issues
> are suddenly appearing and why it's intermittent.
>
> On the sometimes-works-sometimes-doesn't problem: facebook.com does
> User-Agent header checks and sends back one of four pages.
> 1) a generic page saying 'please use another browser'.
> 2) a redirect to login for each of IE, Firefox and Safari
> 3) a home page (if cookies sent initially)
>
> going through the login redirects to the page also presented at (3) above.
>
> The home page is the real problem. When no cookies are presented it ships
> without Content-Length (fine).
> When they _are_ present, ie after the user has logged in, it ships with
> Content-Length: 18487 and a data size of 18576.
>
> Amos
>
>


Re: Segfault in HTCP CLR request on 64-bit

2009-10-02 Thread Adrian Chadd
The whole struct is on the local stack. Hence bzero() or memset() to 0.
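
i.e. something along these lines - the field names are illustrative
only; see the real htcpStuff in htcp.c:

#include <string.h>

struct htcp_stuff_example {
    int op;
    const char *req_hdrs;   /* the pointer left uninitialised in bug 2788 */
};

void
build_clr(void)
{
    struct htcp_stuff_example stuff;

    /* Stack memory holds whatever was there before, so zero the whole
     * struct up front; every pointer member then starts out NULL. */
    memset(&stuff, 0, sizeof(stuff));
    /* ... now fill in only the fields this request actually uses ... */
}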

2009/10/2 Matt W. Benjamin :
> Bzero?  Is it an already-allocated array/byte sequence?  (Apologies, I 
> haven't seen the code.)  Assignment to NULL/0 is in fact correct for 
> initializing a sole pointer, and using bzero for that certainly isn't 
> typical.  Also, for initializing a byte range, memset is preferred [see Linux 
> BZERO(3), which refers to POSIX.1-2008 on that point].
>
> STYLE(9) says use NULL rather than 0, and it is clearer.  But C/C++ 
> programmers should know that NULL is 0.  And note that at least through 1998, 
> initialization to 0 was the preferred style in C++, IIRC.
>
> Matt
>
> - "Adrian Chadd"  wrote:
>
>> I've just replied to the ticket in question. It should probably just
>> be a bzero() rather than setting the pointer to 0. Which should
>> really
>> be setting it to NULL.
>>
>> Anyway, please test whether the bzero() works. If it does then I'll
>> commit that fix to HEAD and 2.7.
>>
>> 2009/9/28 Jason Noble :
>> > I have opened a bug for this issue here:
>> http://bugs.squid-cache.org/show_bug.cgi?id=2788  Also, the previous
>> patch was not generated against head so I re-rolled the patch against
>> current head and attached to the bug report
>
> --
>
> Matt Benjamin
>
> The Linux Box
> 206 South Fifth Ave. Suite 150
> Ann Arbor, MI  48104
>
> http://linuxbox.com
>
> tel. 734-761-4689
> fax. 734-769-8938
> cel. 734-216-5309
>
>


Re: Segfault in HTCP CLR request on 64-bit

2009-10-01 Thread Adrian Chadd
I've just replied to the ticket in question. It should probably just
be a bzero() rather than setting the pointer to 0. Which should really
be setting it to NULL.

Anyway, please test whether the bzero() works. If it does then I'll
commit that fix to HEAD and 2.7.

2009/9/28 Jason Noble :
> I have opened a bug for this issue here: 
> http://bugs.squid-cache.org/show_bug.cgi?id=2788  Also, the previous patch 
> was not generated against head so I re-rolled the patch against current head 
> and attached to the bug report


Re: Segfault in HTCP CLR request on 64-bit

2009-09-25 Thread Adrian Chadd
Could you please create a bugzilla report for this, complete with a
patch against Squid-2.HEAD and 2.7? I'll then commit it.

2009/9/26 Jason Noble :
> I recently ran into an issue where Squid 2.7 would segfault trying to issue
> HTCP CLR requests.  I found the segfault only occurred on 64-bit machines.
>  While debugging, I found that the value of stuff.S.req_hdrs was not
> initialized but later, strlen was being called on it.  This seems to -- by
> chance -- not fail on 32 bit builds, but always segfaults on 64-bit.  The
> attached patch fixed the problem for me and it seems good programming
> practice to properly initialize pointers to prevent issues such as this.  As
> the htcpStuff struct is used in other places, I have concerns that other
> issues may be lurking as well, although I have yet to run into them.
>
> Regards,
> Jason
>


Re: Squid-smp : Please discuss

2009-09-14 Thread Adrian Chadd
If you want to start looking at -threading- inside Squid, I'd suggest
thinking first about how you'd create a generic thread "helper" framework
that allows Squid to run multiple internal threads that can do
"stuff", and then implement some message/data queues and handle
notification between threads.

You can then push some "stuff" into these worker threads as an
experiment and see exactly what the issues are.

Building worker threads into Squid is easy. Making them do anything?
Not so easy :)
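
The skeleton itself is genuinely small - a sketch assuming plain
pthreads, where none of these names exist in Squid. The hard part it
glosses over is getting completions back into the main event loop safely:

#include <pthread.h>
#include <stdlib.h>

struct job {
    struct job *next;
    void (*fn)(void *);
    void *data;
};

static struct job *queue_head = NULL;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_wake = PTHREAD_COND_INITIALIZER;

/* Called from the main thread: queue work, wake one sleeping worker. */
void
job_submit(void (*fn)(void *), void *data)
{
    struct job *j = malloc(sizeof(*j));
    if (j == NULL)
        return;         /* error handling elided in this sketch */
    j->fn = fn;
    j->data = data;
    pthread_mutex_lock(&queue_lock);
    j->next = queue_head;
    queue_head = j;
    pthread_cond_signal(&queue_wake);
    pthread_mutex_unlock(&queue_lock);
}

/* Each worker thread: sleep until work appears, run it, repeat. */
void *
worker_main(void *arg)
{
    struct job *j;
    (void) arg;
    for (;;) {
        pthread_mutex_lock(&queue_lock);
        while (queue_head == NULL)
            pthread_cond_wait(&queue_wake, &queue_lock);
        j = queue_head;
        queue_head = j->next;
        pthread_mutex_unlock(&queue_lock);
        j->fn(j->data);
        free(j);
    }
    return NULL;
}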


Adrian

2009/9/15 Sachin Malave :
> On Tue, Sep 15, 2009 at 1:38 AM, Adrian Chadd  wrote:
>> 2009/9/15 Sachin Malave :
>>> On Tue, Sep 15, 2009 at 1:18 AM, Adrian Chadd  
>>> wrote:
>>>> Guys,
>>>>
>>>> Please look at what other multi-CPU network applications do, how they
>>>> work and don't work well, before continuing this kind of discussion.
>>>>
>>>> Everything that has been discussed has already been done to death
>>>> elsewhere. Please don't re-invent the wheel, badly.
>>
>>> Yes, synchronization is always expensive. So we must target only those
>>> areas where shared data is updated infrequently. Also, if we are making
>>> a thread, then the amount of work done must be more than the
>>> overheads required in thread creation, synchronization & scheduling.
>>
>> Current generation CPUs are a lot, lot better at the thread-style sync
>> primitives than older CPUs.
>>
>> There are other things to think about, such as lockless queues,
>> transactional memory hackery, atomic instructions in general, etc,
>> etc, which depend entirely upon the type of hardware being targeted.
>>
>>> If we try to provide locks for existing data structures then the
>>> synchronization factor will definitely affect our design.
>>
>>> Redesigning such structures and their behavior is time consuming
>>> and may change the whole design of Squid.
>>
>>
>> Adrian
>>
>
>
>
> And current generation libraries are also far better than older ones;
> with OpenMP, creating threads and handling synchronization issues
> is very easy...
>
> Automatic locks are provided; you need not design your own locking
> mechanisms. Just a statement and you can lock the shared
> variable...
> Then the major work remaining is to identify the shared accesses.
>
> I WANT TO USE the OPENMP library.
>
> ANY suggestions?
>
>


Re: Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?

2009-09-14 Thread Adrian Chadd
But in that case, ACCESS_REQ_PROXY_AUTH would be returned rather than
ACCESS_DENIED.



Adrian

2009/9/15 Robert Collins :
> On Tue, 2009-09-15 at 15:22 +1000, Adrian Chadd wrote:
>> G'day. This question is aimed mostly at Henrik, who I recall replying
>> to a similar question years ago but without explaining why.
>>
>> Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL?
>>
>> The particular bit in src/client_side.c:
>>
>> int require_auth = (answer == ACCESS_REQ_PROXY_AUTH ||
>> aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
>>
>> Is there any particular reason why auth is tried again? It forces a
>> pop-up on browsers that already have done authentication via NTLM.
>
> Because it should? Perhaps you can expand on where you are seeing this -
> I suspect a misconfiguration or some such.
>
> Its entirely appropriate to signal HTTP_PROXY_AUTHENTICATION_REQUIRED
> when a user is denied access to a resource *and if they log in
> differently they could get access*.
>
> -Rob
>


Re: Squid-smp : Please discuss

2009-09-14 Thread Adrian Chadd
2009/9/15 Sachin Malave :
> On Tue, Sep 15, 2009 at 1:18 AM, Adrian Chadd  wrote:
>> Guys,
>>
>> Please look at what other multi-CPU network applications do, how they
>> work and don't work well, before continuing this kind of discussion.
>>
>> Everything that has been discussed has already been done to death
>> elsewhere. Please don't re-invent the wheel, badly.

> Yes, synchronization is always expensive. So we must target only those
> areas where shared data is updated infrequently. Also, if we are making
> a thread, then the amount of work done must be more than the
> overheads required in thread creation, synchronization & scheduling.

Current generation CPUs are a lot, lot better at the thread-style sync
primitives than older CPUs.

There are other things to think about, such as lockless queues,
transactional memory hackery, atomic instructions in general, etc,
etc, which depend entirely upon the type of hardware being targeted.

> If we try to provide locks for existing data structures then the
> synchronization factor will definitely affect our design.

> Redesigning such structures and their behavior is time consuming
> and may change the whole design of Squid.


Adrian


Re: Squid-smp : Please discuss

2009-09-14 Thread Adrian Chadd
Guys,

Please look at what other multi-CPU network applications do, how they
work and don't work well, before continuing this kind of discussion.

Everything that has been discussed has already been done to death
elsewhere. Please don't re-invent the wheel, badly.



Adrian

2009/9/15 Robert Collins :
> On Tue, 2009-09-15 at 14:27 +1200, Amos Jeffries wrote:
>>
>>
>> RefCounting done properly forms a lock on certain read-only types like
>> Config. Though we are currently handling that for Config by leaking
>> the
>> memory out every gap.
>>
>> SquidString is not thread-safe. But StringNG with its separate
>> refcounted
>> buffers is almost there. Each thread having a copy of StringNG sharing
>> a
>> SBuf equates to a lock with copy-on-write possibly causing issues we
>> need
>> to look at if/when we get to that scope.
>
> General rule: you do /not/ want thread-safe objects for high usage
> objects like RefCount and StringNG.
>
> synchronisation is expensive; design to avoid synchronisation and hand
> offs as much as possible.
>
> -Rob
>
>


Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on http_access DENY?

2009-09-14 Thread Adrian Chadd
G'day. This question is aimed mostly at Henrik, who I recall replying
to a similar question years ago but without explaining why.

Why does Squid-2 return HTTP_PROXY_AUTHENTICATION_REQUIRED on a denied ACL?

The particular bit in src/client_side.c:

int require_auth = (answer == ACCESS_REQ_PROXY_AUTH ||
aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;

Is there any particular reason why auth is tried again? It forces a
pop-up on browsers that have already done authentication via NTLM.

I've written a patch to fix this in Squid-2.7:

http://www.creative.net.au/diffs/2009-09-15-squid-2.7-auth_required_on_auth_acl_deny.diff

I'll create a bugtraq entry when I have some more background
information about this.

Thanks,


adrian


Re: separate these to new list?: "Build failed..."

2009-08-16 Thread Adrian Chadd
2009/8/16 Henrik Nordstrom :
> sön 2009-08-16 klockan 10:23 +1000 skrev Robert Collins:
>
>> If the noise is too disturbing to folk we can investigate these... I
>> wouldn't want anyone to leave the list because of these reports.
>
> I would expect the number of reports to decline significantly as we
> learn to check commits better to avoid getting flamed in failed build
> reports an hour later.. combined with the filtering just applied which
> already reduced it to 1/6.
>
> But seriously, it would be a sad day if these reports becomes so
> frequent compared to other discussions that developers no longer would
> like to stay subscribed. We then have far more serious problems..

There are more build failure messages on squid-dev than actual
development discussion.

Perhaps the build failure email should start spamming the person who
did the commit, rather than squid-dev.



Adrian


squid-2 - vary and x-accelerator-vary differences?

2009-08-04 Thread Adrian Chadd
G'day,

I just noticed in src/HttpReply.c that the vary expire option
(Config.onoff.vary_ignore_expire) is checked if the reply has HDR_VARY
set, but it does not check if HDR_X_ACCELERATOR_VARY is set.

Everywhere else the code checks them both consistently and
assembles "Vary" header contents consistently from both.

Is this an oversight/bug? Is it intentional behaviour?

Thanks,


Adrian


Re: multiple store-dir issues

2009-07-19 Thread Adrian Chadd
2009/7/20 Henrik Nordstrom :

>> I've fixed a potentially risky situation in Lusca relating to the
>> initialisation of the storeIOState cbdata type. Each storedir has a
>> different idea of how the allocation should be free()'ed.
>
> Risky in what sense?

Ah. I just re-re-re-read the code again and I now understand what is
going on. There are multiple definitions of "storeIOState cbdata"
being allocated instead of one. The definitions are local to each
module.

Ok. Sorry for the noise. I'll commit a fix to COSS for the
initialisation issue someone reported during reconfigure.



Adrian


multiple store-dir issues

2009-07-19 Thread Adrian Chadd
G'day,

I've fixed a potentially risky situation in Lusca relating to the
initialisation of the storeIOState cbdata type. Each storedir has a
different idea of how the allocation should be free()'ed.

The relevant commit in Lusca is r14208 -
http://code.google.com/p/lusca-cache/source/detail?r=14208 .

I'd like this approach to be included in Squid-2.HEAD and backported
to Squid-2.7 / Squid-2.6.

Thanks,


adrian


Re: Hello from Mozilla

2009-07-16 Thread Adrian Chadd
2009/7/17 Ian Hickson :

>> That way you are still speaking HTTP right until the "protocol change"
>> occurs, so any and all HTTP compatible changes in the path(s) will
>> occur.
>
> As mentioned earlier, we need the handshake to be very precisely defined
> because otherwise people could trick unsuspecting servers into opting in,
> or rather appearing to opt in, and could then send all kinds of commands
> down to those servers.

Would you please provide an example of where an unsuspecting server is
tricked into doing something?

>> Ian, don't you see and understand the semantic difference between
>> "speaking HTTP" and "speaking a magic bytecode that is intended to look
>> HTTP-enough to fool a bunch of things until the upgrade process occurs"
>> ? Don't you understand that the possible set of things that can go wrong
>> here is quite unbounded ? Don't you understand the whole reason for
>> "known ports" and protocol descriptions in the first place?
>
> Apparently not.

Ok. Look at this.

The byte sequence "GET / HTTP/1.0\r\nHost: foo\r\nConnection:
close\r\n\r\n" is not byte equivalent to the sequence "GET /
HTTP/1.0\r\nConnection: close\r\nHost: foo\r\n\r\n"

The same two byte sequences, interpreted as an HTTP protocol exchange, are equivalent.

There's a mostly-expected understanding that what happens over port 80
is HTTP. The few cases where that has broken (specifically Shoutcast,
but I do see other crap on port 80 from time to time..) have been by
people who have implemented a mostly HTTP-looking protocol, tested
that it mostly works via a few gateways/firewalls/proxies, and then
deployed it.

>> My suggestion is to completely toss the whole "pretend to be HTTP" thing
>> out of the window and look at extending or adding a new HTTP mechanism
>> for negotiating proper tunneling on port 80. If this involves making
>> CONNECT work on port 80 then so be it.
>
> Redesigning HTTP is really much more work than I intend to take on here.
> HTTP already has an Upgrade mechanism, reusing it seems the right thing to
> do.

What you intend to take on here and what should be taken on here are
very relevant.
You're intending to do stuff over tcp/80 which looks like HTTP but
isn't HTTP. Everyone who implements anything HTTP gateway related (be
it a transparent proxy, a firewall, a HTTP "router", etc) suddenly may
have to implement your websockets stuff as well. So all of a sudden
your attempt to not extend HTTP ends up extending HTTP.

>> The point is, there may be a whole lot of stuff going on with HTTP
>> implementations that you're not aware of.
>
> Sure, but with the exception of man-in-the-middle proxies, this isn't a big
> deal -- the people implementing the server side are in control of what the
> HTTP implementation is doing.

That may be your understanding of how the world works, but out here in
the rest of the world, the people who deploy the edge and the people
who deploy the core may not be the same people. There may be a dozen
layers of red tape, equipment lifecycle, security features, etc, that
need to be handled before "websockets happy" stuff can be deployed
everywhere it needs to be.

Please don't discount man-in-the-middle -anything- as being "easy" to deal with.

> In all cases except a man-in-the-middle proxy, this seems to be what we
> do. I'm not sure how we can do anything in the case of such a proxy, since
> by definition the client doesn't know it is present.

.. so you're still not speaking HTTP?

Ian, are you absolutely certain that everywhere you use "the
internet", there is no "man in the middle" between you and the server
you're speaking to? Haven't you ever worked in any form of corporate
or enterprise environment? What about existing "captive portal"
deployments like wifi hotspots, some of which still use squid-2.5
(eww!) as their http firewall/proxy to control access to the internet?
That stuff is going to need upgrading, sure, but I'd rather see the
upgrade happen once to a well-thought-out and reasonably well-designed
protocol, versus having lots of little upgrades needing to occur because
your "HTTP but not quite HTTP" exchange on port 80 isn't thought out
enough.




Adrian


Re: [PATCH] Bug 2680: ** helper errors after -k rotate

2009-07-15 Thread Adrian Chadd
NOte that winbind has a hard coded limit that is by default very low.

Opening 2n ntlm_auth helpers may make things blow up in horrible ways.



Adrian

2009/7/16 Robert Collins :
> On Thu, 2009-07-16 at 14:08 +1200, Amos Jeffries wrote:
>>
>> Both reconfigure and helper recovery use startHelpers() where the
>> limit
>> needs to take place.
>> The DOS bug fix broke *rotate* (reconfigure has an async step added by
>> Alex
>> that prevents it being a problem).
>
> s/rotate/reconfigure then :) In my mind one is a subset of the other.
>
>> > If someone is running hundreds of helpers on openwrt/olpc then
>> things
>> > are broken already :). I'd really suggest that such environments
>> > pipeline through a single helper rather than many concurrent
>> helpers.
>> > Such platorms are single core and you'll get better usage of memory
>> > doing many requests in a single helper than one request each to many
>> > helpers.
>>
>> lol, NTLM concurrent? try it!
>
> I did. IIRC the winbindd is fully capable of handling multiple
> overlapping requests, and each NTLM helper is *solely* a thunk layer
> between squid's format and the winbindd *state*.
>
> ASCII art time, 3 requests:
> Multiple helpers:
>       /--1-helper--\
> squid-*---2-helper---* winbindd [state1, state2, state3]
>       \--3-helper--/
> One helper:
> squid-*--1-helper---* winbindd [state1, state2, state3]
>
> -Rob
>
>


Re: Hello from Mozilla

2009-07-15 Thread Adrian Chadd
2009/7/16 Ian Hickson :

>> Right down to the HTTP/1.1 reserved protocol label (do get that changed
>> please).
>
> If we're faking HTTP, then it has to look like HTTP.

The message here is "don't fake HTTP". "Speak HTTP over port 80".

> I'm getting very mixed messages here.
>
> Is there a reliable way to open a bidirectional non-HTTP TCP/IP connection
> through a Squid man-in-the-middle proxy over port 80 to a remote server
> that normally acts like an HTTP server? If so, what is the sequence of
> bytes that will act in this way?

That is the wrong question. The whole point of speaking HTTP on port
80 is to be able to speak a variety of byte sequences, all of which
match the HTTP protocol specification, in order to get the job done.

At the point you're speaking on port TCP/80, you're not just speaking
a sequence of bytes any more. You're speaking HTTP. There are plenty
of sequences of bytes that can occur that are -semantically
identical-.


Adrian


Re: Hello from Mozilla

2009-07-15 Thread Adrian Chadd
2009/7/16 Ian Hickson :
> We actually used to do that, but we got requests to make it more
> compatible with the HTTP Upgrade mechanism so that people could add the
> support to their HTTP servers instead of having to put code in front of
> their servers.

Right. So why not extend the spec a little more to make a
tunneling-based upgrade process or something over HTTP?

That way you are still speaking HTTP right until the "protocol change"
occurs, so any and all HTTP compatible changes in the path(s) will
occur.

This includes things like authentication, which I believe Henrik mentioned.

> Well, since Upgrade is a hop-by-hop header, apparently that's a moot point
> anyway, because man-in-the-middle proxies will always break it if they're
> present. So I'm not convinced that allowing HTTP modifications matters.

Ian, don't you see and understand the semantic difference between
"speaking HTTP" and "speaking a magic bytecode that is intended to
look HTTP-enough to fool a bunch of things until the upgrade process
occurs" ? Don't you understand that the possible set of things that
can go wrong here is quite unbounded ? Don't you understand the whole
reason for "known ports" and protocol descriptions in the first place?

> But the point is that it is a recognisable handshake and so could be
> implemented as a switch before hitting the HTTP server, or it could be
> implemented in the HTTP server itself (as some people apparently want). It
> fails with man-in-the-middle proxies, but then that's what the TLS-over-
> port-443 solution is intended for.

My suggestion is to completely toss the whole "pretend to be HTTP"
thing out of the window and look at extending or adding a new HTTP
mechanism for negotiating proper tunneling on port 80. If this
involves making CONNECT work on port 80 then so be it. The point is,
there may be a whole lot of stuff going on with HTTP implementations
that you're not aware of.

I'd rather invest my time in making certain that what you speak on
port 80 is -still HTTP- (and what you speak to proxies which are
relaying your websocket data around is also HTTP) right until a well
understood protocol upgrade occurs.

Frankly, I'm curious how the process got this far inside the
websockets community -without- having -anyone- with HTTP experience
step up and state all the reasons this is a bad, bad idea. Surely
mozilla has some smart HTTP clued up people on board? :)


Adrian


Re: Hello from Mozilla

2009-07-15 Thread Adrian Chadd
2009/7/15 Amos Jeffries :

> a) Getting a dedicated WebSocket port assigned.
>   * You and the client needing it have an argument to get that port opened
> through the firewall.
>   * Squid and other proxies can be altered to allow CONNECT through to safe
> defined ports (80 is not one). Or to do the WebSocket upgrade itself.
>
> b) accepting that the network being traversed is screwed beyond redemption
> by its own policy or admin.

I think the fundamental mistake being made here by Ian (and
potentially others) is breaking the assumption that specific protocols
exist on the well-known ports. Suddenly treating stuff on port 80 as
"almost but not quite HTTP" is bound to cause issues, both devices
speaking valid HTTP (eg Squid) and firewalls etc which may treat the
exchange as "not HTTP" and decide to start dropping things. Or worse -
passing it through, "sort of".

Ian - I understand your motivations here but I think it shows a
fundamental misunderstanding of the glue which keeps the internet
mostly functioning together. Here's a question for you - would you run
a mythical protocol, call it "foonet", over IP, if it looked
almost-but-not-quite like IP so people could run it on their existing
IP networks? Can you see any particular issues with that? Other slots
in the mythical OSI stack shouldn't be treated any differently.


Adrian


Re: Hello from Mozilla

2009-07-14 Thread Adrian Chadd
2009/7/15 Ian Hickson :
> On Tue, 14 Jul 2009, Alex Rousskov wrote:
>>
>> WebSocket made the handshake bytes look like something Squid thinks it
>> understands. That is the whole point of the argument. You are sending an
>> HTTP-looking message that is not really an HTTP message. I think this is
>> a recipe for trouble, even though it might solve some problem in some
>> environments.
>
> Could you elaborate on what bytes Squid thinks it should change in the
> WebSocket handshake?

Anything which it can under the HTTP/1.x RFCs.

Maybe I missed it - why exactly again aren't you just talking HTTP on
the HTTP port(s), and doing a standard HTTP upgrade?


Adrian


squid-2.HEAD hanging with 304 not modified responses

2009-06-29 Thread Adrian Chadd
G'day guys,

I've fixed a bug in Lusca which was introduced with Benno's method_t
stuff. The specific bug is revalidation replies 'hanging' until the
upstream socket closes, forcing an end of message to occur.

The history and patch are here:
http://code.google.com/p/lusca-cache/source/detail?r=14103

Those of you toying with Squid-2.HEAD (eg Mark) - would you mind
verifying that you can reproduce it on Squid-2.HEAD and comment on the
fix?

Thanks,


adrian


NTLM authentication popups, etc

2009-06-16 Thread Adrian Chadd
I'm working on a couple of paid squid + active directory deployments
and they're both seeing the occasional NTLM auth popup happening.

The workaround is pretty simple - just enable the IP auth cache. This
however doesn't solve the fundamental problem(s), whatever they are.

The symptom is logs like this:

[2009/06/15 16:20:17, 1] libsmb/ntlmssp.c:ntlmssp_update(334)
  got NTLMSSP command 1, expected 3

And vice versa (expected 3, got 1.) These correspond to states in
samba/source/include/ntlmssp.h - 1 is NTLMSSP_NEGOTIATE; 3 is
NTLMSSP_AUTH.

The conclusion here is that there's a disconnect between the
authentication state of the client -and- the authentication state of
ntlm_auth.

I'm trying to eliminate the possibilities here.

The stateful helper stuff seems correct enough, so requests aren't
being queued to already busy stateful helpers.

The other two possibilities I can immediately think of:

* 1 - authentication is aborted somewhere for whatever reason; an
authentication helper is stuck at the wrong point in the state engine;
the next request coming along starts at NTLMSSP_NEGOTIATE but the
ntlm_auth helper it is handed to is at NTLMSSP_AUTH (from the partial
authentication attempt earlier); error
* 2 - the web browser is stuffing different phases of the negotiation
down different connections to the proxy.

Now, debugging (1) shouldn't be difficult at all. I'm going to try and
determine the code paths that lead to and from an aborted auth
request, add in some debugging and see if the helper is closed.

Debugging (2) without full logs (impractical in this environment) and
full traffic dump (again, impractical in production) is going to be a
bit more difficult. I'm thinking about adding some hacky code to the
Squid ntlm auth class which keeps a log of the auth blobs
sent/received from/to the client and ntlm_auth. I can then dump the
entire conversation out to cache.log whenever authentication
fails/errors. This should at least give me a hint as to what is going
on.

(1) can explain the client state == NTLMSSP_NEGOTIATE but ntlm_auth
state is NTLMSSP_AUTH problem but not vice versa. (2) explains both.
It is quite possible it is the combination of both however.

Now, the reason this is getting somewhat annoying and why I'd like to
try and understand/fix it is that -another- problem seen by one of
these clients is negotiate/ntlm authentication from IE (at least IE8)
through Squid. I've got packet dumps showing the browser sending
different phases of the negotiation down separate proxy connections
and then reusing the original one incorrectly. My medium term plan is
to take whatever evidence I have of this behaviour and throw it at the
IE group(s) at Microsoft but in the short term I'd like to make
certain the proxy authentication side of things is completely
blameless before I hand off stuff to third parties.

Ideas? Comments?



adrian


Re: Very odd problem running squid 2.7 on Windows

2009-05-25 Thread Adrian Chadd
Actually, it should probably be 1 vs 0; -1 still evaluates to true if
you write "if (func())".

I think the last few hours of fixing bad C and putting in error
checking messed me around a bit. I just wrote a quick bit of C to
double-check.

(Of course in C++ there's native bool types, no? :)

Sorry!
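
So, corrected: 0 for false so a bare if() behaves; treating the empty
string as non-numeric is my addition here, not part of the original:

#include <ctype.h>

int
isUnsignedNumeric(const char *str)
{
    if (*str == '\0')
        return 0;       /* empty string is not a number */
    for (; *str; str++) {
        if (!isdigit((unsigned char) *str))
            return 0;   /* false is now 0, safe in a bare if() */
    }
    return 1;
}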


Adrian

2009/5/25 Kinkie :
> On Mon, May 25, 2009 at 2:21 PM, Adrian Chadd  wrote:
>> int
>> isUnsignedNumeric(const char *str)
>> {
>>    for (; *str; str++) {
>>        if (! isdigit(*str))
>>            return -1;
>>    }
>>    return 1;
>> }
>
> Wouldn't returning 0 on false instead of -1 be easier?
> Just a random thought..
>
>
> --
>    /kinkie
>
>


Re: Very odd problem running squid 2.7 on Windows

2009-05-25 Thread Adrian Chadd
int
isUnsignedNumeric(const char *str)
{
    for (; *str; str++) {
        if (! isdigit(*str))
            return -1;
    }
    return 1;
}

2009/5/25 Adrian Chadd :
> strtoul(). But if you want to verify the -whole- thing is numeric,
> just write a bit of C which does this:
>
> int isNumeric(const char *str)
> {
>
> }
>
> 2009/5/25 Amos Jeffries :
>> Guido Serassio wrote:
>>>
>>> Hi,
>>>
>>> At 16.17 24/05/2009, Adrian Chadd wrote:
>>>>
>>>> Well as Amos said, this isn't the way to call getservbyname().
>>>>
>>>> getservbyname() doesn't translate ports to ports; it translates
>>>> tcp/udp service names to ports. It should be returning NULL if it
>>>> can't find the service string in the file.
>>>>
>>>> Methinks numeric values shouldn't be handed to getservbyname() under
>>>> Windows. :)
>>>
>>> So, we have just found a Squid bug  :-)
>>>
>>> Regards
>>
>> Yes. The question becomes, though: what's the fastest way to detect numeric-only strings?
>>
>> Amos
>> --
>> Please be using
>>  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
>>  Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1
>>
>>
>


Re: Very odd problem running squid 2.7 on Windows

2009-05-25 Thread Adrian Chadd
strtoul(). But if you want to verify the -whole- thing is numeric,
just write a bit of C which does this:

int isNumeric(const char *str)
{

}

2009/5/25 Amos Jeffries :
> Guido Serassio wrote:
>>
>> Hi,
>>
>> At 16.17 24/05/2009, Adrian Chadd wrote:
>>>
>>> Well as Amos said, this isn't the way to call getservbyname().
>>>
>>> getservbyname() doesn't translate ports to ports; it translates
>>> tcp/udp service names to ports. It should be returning NULL if it
>>> can't find the service string in the file.
>>>
>>> Methinks numeric values shouldn't be handed to getservbyname() under
>>> Windows. :)
>>
>> So, we have just found a Squid bug  :-)
>>
>> Regards
>
> Yes. The question becomes, though: what's the fastest way to detect numeric-only strings?
>
> Amos
> --
> Please be using
>  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
>  Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1
>
>


Re: Very odd problem running squid 2.7 on Windows

2009-05-24 Thread Adrian Chadd
Well as Amos said, this isn't the way to call getservbyname().

getservbyname() doesn't translate ports to ports; it translates
tcp/udp service names to ports. It should be returning NULL if it
can't find the service string in the file.

Methinks numeric values shouldn't be handed to getservbyname() under Windows. :)
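
The numbers back that up: 3328 is 0x0D00 and 13 is 0x000D, a straight
byte swap, and the same pattern holds for the other pairs in Guido's
table (2/512, 1792/7, ...). A sketch of the guard, reusing the existing
GetService() body from this thread; isUnsignedNumeric() is a hypothetical
helper along the lines discussed in the other thread:

static u_short
GetService(const char *proto)
{
    struct servent *port = NULL;
    char *token = strtok(NULL, w_space);
    if (token == NULL) {
        self_destruct();
        return -1;              /* NEVER REACHED */
    }
    /* Only service *names* go to getservbyname(); Windows mangles
     * numeric strings, returning s_port in the wrong byte order. */
    if (!isUnsignedNumeric(token))
        port = getservbyname(token, proto);
    if (port != NULL)
        return ntohs((u_short) port->s_port);
    return xatos(token);
}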

adrian

2009/5/24 Guido Serassio :
> Hi,
>
> At 04.38 24/05/2009, Adrian Chadd wrote:
>>
>> Can you craft a small C program to replicate the behaviour?
>
> Sure, I wrote the following test program:
>
> #include <winsock2.h>
> #include <stdio.h>
>
> void main(void)
> {
>    u_short i, converted;
>    WSADATA wsaData;
>    struct servent *port = NULL;
>    char token[32];
>    const char proto[] = "tcp";
>
>    WSAStartup(2, &wsaData);
>
>    for (i=1; i<65535; i++)
>    {
>        sprintf(token, "%d", i);
>        port = getservbyname(token, proto);
>        if (port != NULL) {
>            converted=ntohs((u_short) port->s_port);
>            if (i != converted)
>                printf("%d %d\n", i, converted);
>       }
>    }
>    WSACleanup();
> }
>
> And this is the result on my Windows XP x64 machine (similar results on
> Windows 2000 and Vista):
>
> 2 512
> 258 513
> 524 3074
> 770 515
> 782 3587
> 1288 2053
> 1792 7
> 1807 3847
> 2050 520
> 2234 47624
> 2304 9
> 2311 1801
> 2562 522
> 2564 1034
> 2816 11
> 3328 13
> 3586 526
> 3853 3343
> 4352 17
> 4354 529
> 4610 530
> 4864 19
> 4866 531
> 5120 20
> 5122 532
> 5376 21
> 5632 22
> 5888 23
> 6400 25
> 7170 540
> 7938 543
> 8194 544
> 8706 546
> 8962 547
> 9472 37
> 10752 42
> 10767 3882
> 11008 43
> 11266 556
> 12054 5679
> 13058 563
> 13568 53
> 13570 565
> 13579 2869
> 14380 11320
> 14856 2106
> 15372 3132
> 15629 3389
> 16165 9535
> 16897 322
> 17920 70
> 18182 1607
> 18183 1863
> 19977 2382
> 20224 79
> 20233 2383
> 20480 80
> 20736 81
> 20738 593
> 21764 1109
> 22528 88
> 22550 5720
> 22793 2393
> 23049 2394
> 23809 349
> 24335 3935
> 25602 612
> 25856 101
> 25858 613
> 26112 102
> 27392 107
> 27655 1900
> 27904 109
> 28160 110
> 28416 111
> 28928 113
> 29952 117
> 30208 118
> 30222 3702
> 30464 119
> 31746 636
> 34049 389
> 34560 135
> 35072 137
> 35584 139
> 36106 2701
> 36362 2702
> 36608 143
> 36618 2703
> 36874 2704
> 37905 4500
> 38400 150
> 38919 1944
> 39173 1433
> 39426 666
> 39429 1434
> 39936 156
> 39945 2460
> 40448 158
> 42250 2725
> 43520 170
> 44806 1711
> 45824 179
> 45826 691
> 47383 6073
> 47624 2234
> 47873 443
> 47878 1723
> 48385 445
> 49166 3776
> 49664 194
> 49926 1731
> 50188 3268
> 50437 1477
> 50444 3269
> 50693 1478
> 51209 2504
> 52235 3020
> 53005 3535
> 53249 464
> 53510 1745
> 54285 3540
> 55309 3544
> 56070 1755
> 56579 989
> 56585 2525
> 56835 990
> 57347 992
> 57603 993
> 57859 994
> 58115 995
> 59397 1512
> 60674 749
> 62469 1524
> 62980 1270
> 64257 507
> 65040 4350
>
> It seems that sometimes (!!!) getservbyname() will incorrectly return
> something ...
>
> Regards
>
> Guido
>
>
>> adrian
>>
>> 2009/5/24 Guido Serassio :
>> > Hi,
>> >
>> > One user has reported a very strange problem using cache_peer directive
>> > on
>> > 2.7 STABLE6 running on Windows:
>> >
>> > When using the following config:
>> >
>> > cache_peer 192.168.0.63 parent 3329 0 no-query
>> > cache_peer rea.acmeconsulting.loc parent 3328 3130
>> >
>> > the result is always:
>> >
>> > 2009/05/23 12:35:28| Configuring 192.168.0.63 Parent 192.168.0.63/3329/0
>> > 2009/05/23 12:35:28| Configuring rea.acmeconsulting.loc Parent
>> > rea.acmeconsulting.loc/13/3130
>> >
>> > Very odd 
>> >
>> > Debugging the code, I have found where is situated the problem.
>> >
>> > The following if GetService() from cache_cf.c:
>> >
>> > static u_short
>> > GetService(const char *proto)
>> > {
>> >    struct servent *port = NULL;
>> >    char *token = strtok(NULL, w_space);
>> >    if (token == NULL) {
>> >        self_destruct();
>> >        return -1;              /* NEVER REACHED */
>> >    }
>> >    port = getservbyname(token, proto);
>> >    if (port != NULL) {
>> >        return ntohs((u_short) port->s_port);
>> >    }
>> >    return xatos(token);
>> > }
>> >
>> > When the value of port->s_port is 3328, ntohs() always returns 13.
>> > Other values seems to work fine.
>> >
>> > Any idea ?
>> >
>> > Regards
>> >
>> > Guido
>> >
>> >
>> >
>> > -
>> > 
>> > Guido Serassio
>> > Acme Consulting S.r.l. - Microsoft Certified Partner
>> > Via Lucia Savarino, 1           10098 - Rivoli (TO) - ITALY
>> > Tel. : +39.011.9530135  Fax. : +39.011.9781115
>> > Email: guido.seras...@acmeconsulting.it
>> > WWW: http://www.acmeconsulting.it/
>> >
>> >
>
>
> -
> 
> Guido Serassio
> Acme Consulting S.r.l. - Microsoft Certified Partner
> Via Lucia Savarino, 1           10098 - Rivoli (TO) - ITALY
> Tel. : +39.011.9530135  Fax. : +39.011.9781115
> Email: guido.seras...@acmeconsulting.it
> WWW: http://www.acmeconsulting.it/
>
>


Re: Very odd problem running squid 2.7 on Windows

2009-05-23 Thread Adrian Chadd
Can you craft a small C program to replicate the behaviour?
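
Something along these lines might do it (an untested sketch: walk every
numeric token through the same calls GetService() makes and print the
ports where getservbyname() unexpectedly returns a hit; the Winsock
setup is only needed on the Windows side):

#include <stdio.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <netinet/in.h>
#include <netdb.h>
#endif

int
main(void)
{
    int port;
#ifdef _WIN32
    WSADATA wsa;
    WSAStartup(MAKEWORD(2, 2), &wsa);
#endif
    for (port = 1; port <= 65535; port++) {
        char buf[16];
        struct servent *se;
        sprintf(buf, "%d", port);
        se = getservbyname(buf, "tcp");
        /* print the token plus the value ntohs() then produces,
         * the same conversion GetService() applies */
        if (se != NULL)
            printf("%d %d\n", port, ntohs((unsigned short) se->s_port));
    }
    return 0;
}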






adrian

2009/5/24 Guido Serassio :
> Hi,
>
> One user has reported a very strange problem using cache_peer directive on
> 2.7 STABLE6 running on Windows:
>
> When using the following config:
>
> cache_peer 192.168.0.63 parent 3329 0 no-query
> cache_peer rea.acmeconsulting.loc parent 3328 3130
>
> the result is always:
>
> 2009/05/23 12:35:28| Configuring 192.168.0.63 Parent 192.168.0.63/3329/0
> 2009/05/23 12:35:28| Configuring rea.acmeconsulting.loc Parent
> rea.acmeconsulting.loc/13/3130
>
> Very odd 
>
> Debugging the code, I have found where is situated the problem.
>
> The following is GetService() from cache_cf.c:
>
> static u_short
> GetService(const char *proto)
> {
>    struct servent *port = NULL;
>    char *token = strtok(NULL, w_space);
>    if (token == NULL) {
>        self_destruct();
>        return -1;              /* NEVER REACHED */
>    }
>    port = getservbyname(token, proto);
>    if (port != NULL) {
>        return ntohs((u_short) port->s_port);
>    }
>    return xatos(token);
> }
>
> When the value of port->s_port is 3328, ntohs() always returns 13.
> Other values seems to work fine.
>
> Any idea ?
>
> Regards
>
> Guido
>
>
>
> -
> 
> Guido Serassio
> Acme Consulting S.r.l. - Microsoft Certified Partner
> Via Lucia Savarino, 1   10098 - Rivoli (TO) - ITALY
> Tel. : +39.011.9530135  Fax. : +39.011.9781115
> Email: guido.seras...@acmeconsulting.it
> WWW: http://www.acmeconsulting.it/
>
>


Re: Is it really necessary for fatal() to dump core?

2009-05-19 Thread Adrian Chadd
2009/5/19 Mark Nottingham :
> I'm going to push back on that; the administrator doesn't really have any
> need to get a core when, for example, append_domain doesn't start with .'.
>
> Squid.conf is bloated as it is; if there are cases where a core could be
> conceivably useful, they should be converted to fatal_dump. From what I've
> seen they'll be a small minority at best...

Well, I'd be interested in seeing some better defined characteristics
of "stuff" with some sort of defined expectations and behaviour. Like
an API. :)

Right now, fatal, assert, etc. are all used interchangeably for quite a
wide variety of reasons, and the codebase may be much better off if
someone starts off by fixing these a bit.




Adrian


Re: Is it really necessary for fatal() to dump core?

2009-05-18 Thread Adrian Chadd
just make that behaviour configurable?

core_on_fatal {on|off}
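
A sketch of what that might look like in tools.c (core_on_fatal and the
Config.onoff field are hypothetical names, and fatal_common() is
assumed to remain the existing cleanup/logging path):

/* Sketch only: Config.onoff.core_on_fatal is a hypothetical new
 * squid.conf toggle, not existing code. */
void
fatal(const char *message)
{
    fatal_common(message);          /* existing cleanup + logging */
    if (Config.onoff.core_on_fatal)
        abort();                    /* current behaviour: leave a core */
    exit(1);                        /* otherwise a plain exit; use
                                     * fatal_dump() to force a core */
}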



Adrian

2009/5/19 Mark Nottingham :
> tools.c:fatal() dumps core because it calls abort.
>
> Considering that the core can be quite large (esp. on a 64bit system), and
> that there's fatal_dump() as well if you really want one, can we just make
> fatal() exit(1) instead of abort()ing?
>
> Cheers,
>
> --
> Mark Nottingham       m...@yahoo-inc.com
>
>
>


Re: 3.0 assertion in comm.cc:572

2009-05-11 Thread Adrian Chadd
2009/5/11 Amos Jeffries :
> We have one user with a fairly serious production machine hitting this
> assertion.
> It's an attempted comm_read of closed FD after reconfigure.
>
> Nasty, but I think the asserts can be converted to a nop return. Does anyone
> know of a subsystem that would fail badly after a failed read with all its
> sockets and networking closed anyway?

That will bite you later on if/when you wanted to move to support
Windows overlapped IO / POSIX AIO style kernel async IO on network
sockets. You don't want read's scheduled on FDs that are closed; nor
do you want the FD closed during the execution of the read.

Figure out what is scheduling a read / what is scheduling the
completion incorrectly and fix the bug.



Adrian


Re: Squid logs into MySQL database

2009-05-11 Thread Adrian Chadd
G'day!

Thanks for that. Would you like to have it included in Squid-2.HEAD
(and thus in the next Squid-2.x release?)

thanks,


Adrian


2009/5/11 Visolve Squid Team :
> Hi All,
>
> We have released an earlier version of an external program( plug-in ) to log
> squid access to MySQL database using logfile_daemon feature in squid 2.7.
> The plug-in is available at :
> http://www.visolve.com/squid/squid-mysqllog.php
>
> Do send your comments for the improvement.
>
> Thanks,
> ViSolve Squid Team.
>
>


/dev/poll solaris 10 fixes

2009-05-03 Thread Adrian Chadd
I'm giving my /dev/poll (Solaris 10) code a good thrashing on some
updated Sun hardware. I've fixed one silly bug of mine in 2.7 and
2.HEAD.

If you're running Solaris 10 and not using the /dev/poll code then
please try out the current CVS version(s) or wait for tomorrow's
snapshots.

I'll commit whatever other fixes are needed in this environment here :)

Thanks,


Adrian


Squid-2/Lusca async io shortcomings..

2009-04-10 Thread Adrian Chadd
Hi all,

I've been braindumping my thoughts into the Lusca blog during some
experimental development to eliminate the data copy in the disk store
read path. This shows up as the number 1 CPU abuser in my test CDN
deployment - where I see a 99% hit rate on a set of large objects (>
16meg.)

My first idea was to avoid having to paper over the storage code
shortcomings with refcounted buffers, and modify various bits of code
to keep the store supplied read buffer around until the completion of
said read IO. This mirrors the requirements for various other
underlying async io implementations such as posix AIO and windows
completion IO.

Unfortunately the store layer and the async IO code don't handle
event cancellation right (ie, you can't do it) but the temporary read
buffer in async_io.c + the callback data pointer check papers over
that. Store reads and writes may be scheduled and in flight when some
other part of code calls storeClose() and nothing really tries to wait
around for the read IO to complete.

So either the store layer needs to be made slightly more sane (which I
may attempt later), or the whole mess can stay a mess and be papered
over by abusing refcounted buffers all the way down to the IO layer.
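
For the record, the shape of the refcounted buffer being talked about
is roughly this (an illustrative sketch with made-up names, not code
from the tree; xmalloc/xcalloc/xfree are the usual Squid allocation
wrappers):

typedef struct {
    char *data;
    size_t size;
    int refcount;
} buf_t;

buf_t *
buf_create(size_t size)
{
    buf_t *b = xcalloc(1, sizeof(buf_t));
    b->data = xmalloc(size);
    b->size = size;
    b->refcount = 1;
    return b;
}

buf_t *
buf_ref(buf_t *b)
{
    b->refcount++;
    return b;
}

void
buf_unref(buf_t *b)
{
    assert(b->refcount > 0);
    if (--b->refcount == 0) {
        xfree(b->data);
        xfree(b);
    }
}

The point being that an in-flight IO completion holds its own
reference, so a storeClose() elsewhere can't yank the buffer out from
under the read.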

Anyway, I know there are other developers out there working on
filesystem code for Squid-3 and I'm reasonably certain (read: at last
check a few months ago) the store layer and IO layers are just as
grimey - so hopefully my braindumping will save some more of you a
whole lot of headache. :)




Adrian


Re: Feature: quota control

2009-03-27 Thread Adrian Chadd
Just to add to this - implementing it as a delay pool inside Squid
flattens traffic into one byte pool. Various places may not do this at
all - there may be "free" versus "non-free" (which means one set of
ACLs inside Squid); there may be "cheap" versus "expensive" (again,
possibly requiring multiple delay pools and multiple ACLs to map it
all together; again all inside Squid) - things get very messy, very
quickly.

This is why my proposal (which I hope -finally- gets approved so I can
begin work on it ASAP! :) involves passing off the traffic assignment
to an external daemon that implements -all- of the traffic assignment
and accounting logic. Squid will then just send requests for traffic
and interim updates like you've said.
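
As a rough illustration of that split, a helper speaking something like
the UNITS/AMOUNT/MINIMUM exchange Amos describes below could be as dumb
as this (hypothetical protocol and user keys; none of this is settled):

/* Hypothetical quota helper sketch: one user key per line on stdin,
 * one "UNITS AMOUNT MINIMUM" answer per line on stdout. */
#include <stdio.h>
#include <string.h>

int
main(void)
{
    char line[1024];
    while (fgets(line, sizeof(line), stdin) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';
        /* Real logic would consult whatever accounting backend
         * implements the site's policy (free vs non-free, cheap vs
         * expensive, per-user caps, etc). */
        if (strcmp(line, "some-blocked-user") == 0)
            printf("bytes 0 0\n");              /* quota exhausted */
        else
            printf("bytes 10485760 524288\n");  /* 10MB; re-ask at 512KB */
        fflush(stdout);
    }
    return 0;
}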

2c,


Adrian

2009/2/26 Amos Jeffries :
> Robert Collins wrote:
>>
>> On Fri, 2009-02-27 at 10:00 +1100, Mark Nottingham wrote:
>>>
>>> Honestly, if I wanted to do byte-based quotas today, I'd have an
>>>  external ACL helper talking to an external logging helper; that way,  you
>>> can just log the response sizes to a daemon and then another  daemon would
>>> use that information to make a decision at access time.  The only even
>>> mildly hard part about this is sharing state between the  daemons, but if
>>> you don't need the decisions to be real-time, it's not  that bad (especially
>>> considering that in any serious deployment,  you'll have state issues
>>> between multiple boxes anyway).
>>
>> Sure; I think that would fit with 'ensuring enough hooks' :P
>>
>> -Rob
>
> The brief description of what I gave Pieter to start with was:
>
>  A pool based on DelayPools in that Squid decrements live as traffic goes
> through. With a helper/ACL hook to retrieve the initial pool size and to
> call as needed to check for current quotas.
>
>  How the helper operates is not relevant to Squid. Thats important.
>
> The key things being that; its always called for new visitors to assign the
> start quota, and when the quota is nearing empty its called again to see if
> they get more.
>
> Helper would need to send back "UNITS AMOUNT MINIMUM" where UNITS is the
> unit of quota (seconds, bytes, requests, misses?, other?), AMOUNT being a
> integer count of units the client is allowed to use, and MINIMUM is the
> level of units where the helper is to be asked for an update.
>
> 0 remaining units results in an Error page 'quota exceeded' or somesuch.
>
> Amos
> --
> Please be using
>  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE13
>  Current Beta Squid 3.1.0.5
>
>


Re: Feature: quota control

2009-02-26 Thread Adrian Chadd
I'm looking at implementing this as part of a contract for squid-2.

I was going to take a different approach - that is, i'm not going to
implement quota control or management in squid; I'm going to provide
the hooks to squid to allow external controls to handle the "quota".



adrian

2009/2/21 Pieter De Wit :
> Hi Guys,
>
> I would like to offer my time in working on this feature - I have not done
> any squid dev, but since I would like to see this feature in Squid, I
> thought I would take it on.
>
> I have briefly contacted Amos off list and we agreed that there is no "set
> in stone" way of doing this. I would like to propose that we then start
> throwing around some ideas and let's see if we can get this into squid :)
>
> Some ideas that Amos quickly said :
>
>   - "Based" on delay pools
>   - Use of external helpers to track traffic
>
>
> The way I see this happening is that a Quota is like a pool that empties
> based on 2 classes - bytes and requests. Requests will be for things like
> the number of requests, i.e. a person is only allowed to download 5 exe's
> per day or 5 requests of >1meg or something like that (it just popped into
> my head :) )
>
> Bytes is a pretty straight forward one, the user is only allowed x amount of
> bytes per y amount of time.
>
> Anyways - let the ideas fly :)
>
> Cheers,
>
> Pieter
>
>


Resigning from squid-core

2009-01-31 Thread Adrian Chadd
Hi all,

It's been a tough decision, but I'm resigning from any further active
role in the Squid core group and cutting back on contributing towards
Squid development.

I'd like to wish the rest of the active developers all the best in the
future, and thank everyone here for helping me develop and test my
performance and feature related Squid work.



Adrian


Re: IRC Meetup logs up in the wiki

2009-01-21 Thread Adrian Chadd
> Uhm, guess I go on holiday and miss out on EVERYTHING I got back on the
> 17th and would have loved to attend had I the precence of mind to have
> checked.

:) Hey, someone got a holiday! Quick, he's relaxed enough now to work! :)

>
> Sorry guys.
>
> In other news I've got some new exposed counters for squid-2 performance -
> will port to 3.1 and then submit for review. Also planning to extend
> cachemgr to output in xml as alternative, will allow far simpler processing
> and xsl transforms.

Do you have the patches against Squid-2 available?


adrian

>
> Extended cacti monitoring of all relevant bits is in process and will be
> available soon.
>
> Regardt
>
>


Re: Ref-counted strings in Squid-2/Cacheboy

2009-01-21 Thread Adrian Chadd
I'd like to avoid having to write to those pages if possible. Leaving
the incoming data as read-only will save another write-back pass for
those pages through the cache/bus, and in the case of tiny objects
(ie, where parsing becomes a -big- part of the overhead), that may end
up hurting.

NUL terminated strings make iteration easier (you only need an address
register and a check for 0) but current CPUs with their
plenty-of-registers and superscalar execution mostly make that point
moot. You can check, increment the pointer and decrement a length
value pretty damned quickly. :)
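
In concrete terms, the two scans being compared look like this (a
trivial sketch):

#include <stdio.h>
#include <string.h>

static void do_octet(int c) { putchar(c); }

int
main(void)
{
    const char *s = "example";
    size_t n, len = strlen(s);
    const char *p;

    /* NUL-terminated scan: one address register, stop on 0. */
    for (p = s; *p != '\0'; p++)
        do_octet(*p);
    putchar('\n');

    /* (pointer,length) scan: no NUL needed, so the "string" can alias
     * a sub-region of a larger read-only parse buffer. */
    for (p = s, n = len; n > 0; p++, n--)
        do_octet(*p);
    putchar('\n');
    return 0;
}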

There aren't all that many places that assume C buffer semantics for
String. Most of it isn't all that hairy (access_log, etc); some of it
is only hairy because of the use of _C_ string library functions with
String.buf() (ftp); the biggest annoyance is the vary code and the
client-side code. Oh, and one has to copy the buffer anyway for regexp
lookups (POSIX regex API requires a NUL terminated string), at least
until we convert to PCRE which can and does take a length parameter to
a regex run function. :)

The point is, once you've been forced to tidy up the String users by
removing the assumption that NUL will occur, you'll (hopefully) have
been forced to write nicer replacement code, and everyone benefits
from that.


Adrian

2009/1/21 Henrik Nordstrom :
> fre 2009-01-16 klockan 12:53 -0500 skrev Adrian Chadd:
>
>> So far, so good. It turns out doing this as an intermediary step
>> worked out better than trying to replace the String code in its
>> entirety with replacement code which doesn't assume NUL terminated
>> strings.
>
> Just a thought, but is there really any parsing step where we can not
> just overwrite the next octet with a \0 to get null-terminated strings?
> This is what the parser does today, right?
>
> The HTTP parser certainly can in-place null-terminate everything. Header
> names always ends with a : which we always throw away, and the data ends
> with a newline which is also thrown away.
>
> Regards
> Henrik
>
>


Re: Buffer/String split, take2

2009-01-21 Thread Adrian Chadd
2009/1/21 Kinkie :

> What I fear from the D&C approach is that we'll end up with lots of
> duplicate code between the 'buffer' classes, to gain a tiny little bit
> of efficiency and semantic clarity. If that approach has to be taken,
> then I'd rather take the variant of the note - in fact that's quite in
> line with what the current (agreeably ugly) code does.

The trouble is that the current, agreeably ugly code, actually works
(for values of "works") right now, and the last thing the project
needs is for that "works" bit to be disturbed too much.

> In my opinion the 'universal buffer' model can be adapted quite easily
> to address different uses by extending its allocation strategy - it's
> a self-contained function of code exactly for this purpose, and it
> could be extended again by using Strategy patterns to do whatever the
> caller wishes. It would be trivial for instance for users to request
> that the underlying memory be allocated by the pageful, or to request
> preallocation of a certain amount of memory if they know they'll be
> using, etc.
> Having a wide interface is a drawback of the Universal approach,

But you don't know how that memory should be arranged. If its just for
strings, then you know the memory should be arranged in whatever makes
sense to minimise memory allocator overheads. In the parsing codepath,
that involves parsing and creating references to an already-allocated
large chunk of RAM, instead of copying into separately allocated
areas. For things like disk IO (and later on, network IO too!) this
may not be as obvious a case. In fact, based on the -provider-
(anonymous? disk? network? some peer module?) you may want to request
pages from -them- to put data into for various reasons, as simply
grabbing an anonymous page from the system allocator and filling it
with data may need -another- copy step later on.

This is why I'm saying that right now, focusing on -just- the String
stuff and the minimum required to do copy-free parsing and copying in
and out of the store is probably the best bet. A "universal" buffer
method is probably over-reaching things. There's a lot of code in
Squid which needs tidying up, and whatever we come up with, -all- of it
-has- to happen -regardless- of what buffer abstraction(s) we choose.

> Regarding vector i/o, it's almost a no-brainer at a first glance:
> given UniversalBuffer, implement UniversalBufferList and make MemBuf
> use the latter to implement producer-consumer semantics. Then use this
> for writev(). produce and consume become then extremely lightweight
> calls. Let me remind you that currently MemBuf happily memmoves
> contents at each consume, and other producer-consumer classes I could
> find (BodyPipe and StoreEntry) are entirely different beasts, which
> would benefit from having their interfaces changed to use
> UniversalBuffers, but probably not their innards.

And again, what I'm saying here is that a conservative, cautious
approach now is likely to save a lot of risk in the development path
forward.

> Regarding Adrian's proposal, he and I discussed the issue extensively.
> I don't agree with him that the current String will give us the best
> long-term benefits. My expectation is (but we can only know after we
> have at least some extensive use of it) that the cheap substringing
> features of the current UniversalBuffer implementation will give us
> substantial benefits in the long term.
> I agree with him that fixing the most broken parts of the String
> interface is a sensible strategy for merging whatever String
> implementation we end up choosing.

> I fear that if we focus too much on the long-term, we may end up
> losing sight of the medium-term, and thus we risk reaching neither
> because short-term noone does anything. EVERYONE keeps on asserting
> that squid (2 and 3) has low-level issues to be fixed, yet at the same
> time only Adrian does something in squid-2, and I feel I'm the only
> one trying to do something in squid-3 - PLEASE correct me and prove me
> wrong.

*shrug* I think people keep choosing the wrong bits to bite off. I'm
not specifically talking about you Kinkie, this certainly isn't the
only instance where the problem isn't really fully understood.

The problem in my eyes is that noone understands the entire Squid-3
codebase enough to start to understand what needs to happen and begin
engineering an actual path forward. Everyone knows their little
"corner" of the codebase. Squid-3 seems to be plagued by little
mini-projects which focus on specific areas without much knowledge of
how it all holds together, and all kinds of busted behaviour ensues.

> There's another issue which worries me: the current implementation has
> been in the works for 5 months; there have been two extensive reviews,
> two half-rewrites and endless discussions. Now the issue crops up that
> the basic design - whose blueprint has also been available for 5
> months in the wiki - is not good, and that we may end up having to
> basical

Re: Buffer/String split, take2

2009-01-20 Thread Adrian Chadd
2009/1/20 Alex Rousskov :

> Please voice your opinion: which design would be best for Squid 3.2 and
> the foreseeable future.

[snip]

I'm about 2/3rds of the way along the actual implementation path of
this in Cacheboy so I can provide an opinion based on increasing
amounts of experience. :)

[Warning: long, somewhat rambly post follows, from said experience.]

The thing I'm looking at right now is what buffer design is required
to adequately handle the problem set. There's a few things which we
currently do very stupidly in any Squid related codebase:

* storeClientCopy - which Squid-2.HEAD and Cacheboy avoid the copy on,
but it exposes issues (see below);
* storeAppend - the majority of data coming -into- the cache (ie,
anything from an upstream server, very applicable today for forward
proxies, not as applicable for high-hit-rate reverse proxies) is still
memcpy()'ed, and this can use up a whole lot of bus time;
* creating strings - most strings are created during parsing; few are
generated themselves, and those which are, are at least half static
data which shouldn't be re-generated over and over and over again;
* duplicating strings - httpHeaderClone() and friends - dup'ing
happens quite often, and making it cheap for the read only copies
which are made would be fantastic
* later on, being able to use it for disk buffers, see below
* later on, being able to properly use it for the memory cache, again see below

The biggest problems I've hit thus far stem from the data pipeline
from server -> memstore -> store client -> client side. At the moment,
the storeClientCopy() call aggregates data across the 4k stmem page
size (at least in squid-2/cacheboy, I think its still 4k in squid-3)
and thus if your last access gave you half a page, your next access
can get data from both the other half of the page and whatever is in
the next buffer. Just referencing the stmem pages in 2.HEAD/Cacheboy
means that you can (and do) end up with a large number of small reads
from the memory store. You save on the referencing, but fail on the
"work chunk size." You end up having to have a sensible reference
counted buffer design -and- a vector list to operate on it with.

The string type right now makes sense if it references a contiguous,
linear block of memory (ie, a sub-region of a contig buffer). This is
how its treated today. For almost all of the lifting inside Squid
proper, that may be enough. There may however be a need later on for
string-like and buffer-like operations on buffer -vectors- - for
example, if you're doing some kind of content scanning over incoming
data, you may wish to buffer your incoming data until you have enough
data to match that string which is straddling two buffers - and the
current APIs don't support it. Well, nothing in Squid supports it
currently, but I think it's worth thinking about for the longer term.

Certainly though, I think that picking a sensible string API with
absolutely no direct buffer access out of a few controlled areas (eg,
translating a list of strings or list of buffers into an iovec for
writev(), for example) is the way to go. That will equip Squid with a
decent enough set of tools to start converting everything else which
currently uses C strings over to using Squid Strings and eventually
reap the benefits of the zero-cost string duplication.
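
That controlled translation is small and mechanical, something like
this (illustrative (pointer,length) string type only, not Squid code):

#include <sys/types.h>
#include <sys/uio.h>

struct sstr {
    const char *buf;
    size_t len;
};

#define MAX_IOV 16

/* Gather a list of strings into an iovec and write them in one call,
 * with no intermediate copy into a contiguous buffer. */
static ssize_t
sstr_writev(int fd, const struct sstr *s, int count)
{
    struct iovec iov[MAX_IOV];
    int i, n = count < MAX_IOV ? count : MAX_IOV;

    for (i = 0; i < n; i++) {
        iov[i].iov_base = (void *) s[i].buf;
        iov[i].iov_len = s[i].len;
    }
    return writev(fd, iov, n);
}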

Ok, to summarise, and this may not exactly be liked by the majority of
fellow developers:

I think that augmenting/fixing the current SquidString API and tidying
up all the bad places where it's used right now is going to give you
the maximum long-term benefit. There's a lot of
legacy code right now which absolutely needs to be trashed and
rewritten. I think the smartest path forward is to ignore 95% of the
decision about deciding which buffering method to use for now, fix the
current String API and all the code which uses it so its sensible (and
fixing it so its "sensible" won't take long; fixing the code which
uses it will take longer) and at that point the codebase will be in
much better shape to decide which will be the better path forward.

Now, just so people don't think I'm stirring trouble, I've gone
through this myself in both a squid-2 branch and Cacheboy, and here's
what I found:

* there's a lot of code which uses C strings created from Strings;
* there's a lot of code which init'ed strings from C strings, where
the length was already known and thrown out;
* there's a lot of code which init'ed strings from C strings which
were once Strings;
* there's even code which init's strings -from- a string, but only by
using strBuf(s) (I'm pointing at the http header related code here,
ugh)
* all the stuff which directly accesses the string buffer code can and
should be tossed, immediately - unfortunately there's a lot of it, the
majority being in what I gather is very long-lived code in
src/client_side.c (and what it became in squid-3)

So what I'm sort of doing now in Cacheboy-head, combined with tidying
up some of

Ref-counted strings in Squid-2/Cacheboy

2009-01-16 Thread Adrian Chadd
I've just created a branch off of my Cacheboy tree and dumped in the
first set of changes relating to ref-counted strings into it.

They're not as useful and flexible as the end-goal we all want -
specifically, this pass just creates ref counted NUL-terminated C
strings, so creating references of regions of other strings / buffers
isn't possible. But it does mean that duplicating header sets (ie,
httpHeaderClone() I think?) becomes bloody cheap. The next move -
removing the requirement for the NUL-termination - is slightly hairer,
but still completely doable (and I've done it in a previous branch in
sourceforge, so I know whats required.) Thats when the real benefits
start to appear.

So far, so good. It turns out doing this as an intermediary step
worked out better than trying to replace the String code in its
entirety with replacement code which doesn't assume NUL terminated
strings.

http://code.google.com/p/cacheboy/source/list?path=/branches/CACHEBOY_HEAD_strref

This, and all the other gunk that's gone into cacheboy over the last
few months during the reorganisation and tidyup, still mostly
represents where I think Squid core codebase should have gone / should
be going at the present time.

Enjoy. :)


Adrian


Re: [PATCH] WCCPv2 documentation and cleanup for bug 2404

2009-01-10 Thread Adrian Chadd
Have you tested these changes against various WCCPv2 implementations?

I do recall some structure definitions in the draft mis-matching the
wide number of IOS versions out there, this is why I'm curious.



Adrian

2009/1/10 Amos Jeffries :
> This patch:
>  - adds a reference to each struct mentioning the exact draft
>   RFC section where that struct is defined.
>  - fixes sent mask structure fields to match draft. (bug 2404)
>  - removes two duplicate useless structs
>
> Submitting as a patch to give anyone interested time to double-check the
> code changes.
>
>
> As a result we are a step closer toward splitting the code into a separate
> library. It's highlighted some of the WCCPv2 issues and a pathway forward
> now clear:
>  - move types definitions to a protocol types header (wccp2_types.h ?)
>  - correct mangled definitions for generic use. including code in that.
>  - add capability handling
>  - add hash/mask service negotiation
>  - add sibling peer discovery through WCCP group details ??
>
>
> Amos
> --
> Please be using
>  Current Stable Squid 2.7.STABLE5 or 3.0.STABLE11
>  Current Beta Squid 3.1.0.3
>


Re: When can we make Squid using multi-CPU?

2009-01-07 Thread Adrian Chadd
2009/1/8 Alex Rousskov :

> SMP support has been earmarked for Squid v3.2 but there is currently not
> enough resources to make it happen (AFAICT) so it may have to wait until
> v3.3 or later.
>
> FWIW, I think that multi-core scalability in many environments would not
> require another Squid rewrite, especially if initial support does not
> have to do better than running multiple Squids.

Well, people are already doing that where it's suitable. What's really
missing for those sorts of setups is a simple(!) storage-only backend
and some smarts in Squid to be able to push and pull stuff out of a
shared storage backend, rather than relaying through it.

The trouble, as I've found here, is if you're trying to aggregate a
bunch of forward proxy squid instances on one box through one backend
squid instance - all of a sudden you end up with lots of RAM wastage
and things die at high loads with all the duplicate data floating
around in socket buffers. :/



Adrian


Re: When can we make Squid using multi-CPU?

2009-01-05 Thread Adrian Chadd
I've been looking into what would be needed to thread squid as part of
my cacheboy squid-2 fork.

Basically, I've been working on breaking out a bunch of the core code
into libraries, which I can then check and verify are thread-safe. I
can then use these bits in threaded code.

My first goal was probably to break out the ACL and internal URL
rewriter code into threads, but the current use of the callback data
setup in Squid makes passing cbdata pointers into other threads quite
uhm, "tricky".

The basic problem is that although a given chunk of memory backing a
cbdata pointer will remain valid for as long as the reference exists,
the -data itself- may not be valid at any point. So if thread A
creates a cbdata pointer and passes it into thread B to do something
(say an ACL lookup), there's no way (at the moment) for thread B to
guarantee at any/all points during its execution that the data in B
will stay valid without a whole lot of pissing around with locking,
which I'd absolutely like to avoid doing in a high performance network
application, even given the apparently wonderful locking performance of
current hardware. :)

So for the time being, I'm looking at what would be needed for a basic
inter-thread "batch" event/callback message queue, sort of like
AsyncCalls in squid-3 but minus 100% of the legacy cruft; and then
I'll see what kind of tasks can be pushed out to the threads.
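
The shape of that queue is roughly this (a pthread sketch with invented
names; nothing like the actual AsyncCalls code):

#include <pthread.h>
#include <stdlib.h>

typedef void (*callback_fn)(void *data);

struct call {
    callback_fn cb;
    void *data;
    struct call *next;
};

struct call_queue {
    pthread_mutex_t lock;       /* assumed already initialised */
    struct call *head, *tail;
};

/* Producer side: any thread may append a callback. */
static void
queue_post(struct call_queue *q, callback_fn cb, void *data)
{
    struct call *c = malloc(sizeof(*c));
    c->cb = cb;
    c->data = data;
    c->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = c;
    else
        q->head = c;
    q->tail = c;
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: the owning thread grabs the whole batch under the
 * lock, then runs the callbacks with no locks held. */
static void
queue_run(struct call_queue *q)
{
    struct call *c;
    pthread_mutex_lock(&q->lock);
    c = q->head;
    q->head = q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    while (c) {
        struct call *next = c->next;
        c->cb(c->data);
        free(c);
        c = next;
    }
}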

Hopefully a bunch of stuff can be easily pushed out to threads with a
minimum amount of effort, such as some/all of the ACL lookups, some
URL rewriting, some GZIP and other kinds of basic content manipulation,
and the freakishly simple (comparatively) server-side HTTP code
(src/http.c). But doing that requires making sure a bunch of the low
level code is suitably re-enterant/thread-safe/etc, and this includes
a -lot- of stuff (lib/, debug, logging, memory allocation, some
statistics gathering, chunks of the HTTP parsing and packing routines,
the packer routines, membufs, etc.)

Thankfully (in Cacheboy) I've broken out almost all of the needed
stuff into top-level libraries which can be independently audited for
thread-happiness. There's just some loose ends which need tidying up.
For example, almost all of the code in libhttp/ in cacheboy (ie, basic
http header and header entry stuff, parsing, range request headers,
cc, headers, etc) are thread-safe, but the functions -they- call (such
as the base64 functions) use static buffers which may or may not be
thread-safe. Stuff which calls the legacy non-safe inet_* routines, or
perhaps the non thread-safe strtok() and other string.h functions, all
need to be fixed.

Threading the rest of it would take a lot, -lot- more time. A
thread-aware storage backend (disk, memory, store index) is definitely
an integral part of making a threaded Squid, and a whole lot more code
modularity and reorganisation would have to take place for that to
occur.

Want to help? :)


Adrian

2009/1/4 ShuXin Zheng :
> I've ever do this to run multi-squid on one machine which can use multi-CPU,
> but can't share the same store-fs, and must configure multi-IP on the same
> machine. Can we rewrite squid as follow:
>
> thread0 (client side, non-blocking, can accept many connections),
> thread1 ... threadn (n = CPU count), each running the same pipeline:
>
>   access check
>     -> http header parse
>     -> acl filter
>     -> check local cache
>
> with all threads feeding a shared forwarding/storage layer:
>
>   webserver / neighbor <-- forward --> store fs (ufs / aufs / coss)
>
>
>
> 2009/1/4  :
>>
>> I've found the best way is to run multiple copies of squid on a single
>> machine, and use LVS to load balance between the squid processes.
>>
>> -- Joe
>>
>> Quoting Adrian Chadd :
>>
>>> when someone decides to either help code it up, or donate towards the
>>> effort.
>>>
>>>
>>>
>>> adrian
>>>
>>> 2009/1/3 ShuXin Zheng :
>

Re: When can we make Squid using multi-CPU?

2009-01-03 Thread Adrian Chadd
when someone decides to either help code it up, or donate towards the effort.



adrian

2009/1/3 ShuXin Zheng :
> Hi, Squid current can only use one CPU, but multi-CPU hardware
> machines are so popular. These are so greatly wastely. How can we use
> the multi-CPU? Can we separate some parallel sections which are CPU
> wasting to run on different CPU? OMP(http://openmp.org/wp/) gives us
> some thinking about using multi-CPU, so can we use these technology in
> Squid?
>
> Thanks
>
> --
> zsxxsz
>
>


Re: Introductions

2008-12-31 Thread Adrian Chadd
Welcome!

2008/12/30 Regardt van de Vyver :
> Hi Dev Team.
>
> My name is Regardt van de Vyver, a technology enthusiast who tinkers with
> squid on a regular basis. I've been involved in development for around 12
> years and am an active participant on numerous open source projects.
>
> Right now I'm focussed on improving and extending performance metrics for
> squid, specifically related to SNMP and the cachemanager.
>
> I'd like to take a more active role in the coming year from a dev
> perspective and feel the 1st step here is to at least get my butt onto the
> dev mailing list ;-)
>
> I look forward to getting involved.
>
> Regards,
>
> Regardt van de Vyver
>
>


Re: Migrating debug code from src/ to src/debug/

2008-12-25 Thread Adrian Chadd
Ok, besides the missing build dependency on src/core and src/debug, I
think the first round of changes is finished. That is, the ctx/debug
routines and all that they depend on have been shuffled out of src/
and into src/core / src/debug as appropriate.

I've pushed the changes to the launchpad URL mentioned previously.

I'd like some feedback and some assistance figuring out how/where to
convince src/Makefile.am that the two above directories are build
prereqs for almost everything. There are a -lot- of build targets in
that Makefile under Squid-3 and I'm not sure that I want to add to the
mess in a naive way.

Thanks,



Adrian


src/debug.cc : amos?

2008-12-25 Thread Adrian Chadd
Amos, whats this for in src/debug.cc ?

//*AYJ:*/if (!Config.onoff.buffered_logs)
fflush(debug_log);



Adrian


Re: Migrating debug code from src/ to src/debug/

2008-12-21 Thread Adrian Chadd
Would someone perhaps enlighten me why Squid-3 is trying to install
src/SquidTime.h as part of some build rule, and why moving it out of
the way (into src/core/) has resulted in "make install" completely
failing?

I'm having some real trouble understanding all of the gunk thats in
the Squid-3 src/Makefile.am and its starting to give me a headache.

Thanks,


Adrian


Re: Migrating debug code from src/ to src/debug/

2008-12-21 Thread Adrian Chadd
2008/12/18 Adrian Chadd :
> I've begun fiddling with migrating the bulk of the debug code out of
> src/ and into src/debug/; as per the source reorganisation wiki page.

The next step is migrating some other stuff out and doing some API
hiding hijinx of the debugging logfile code - a bunch of code directly
frobs the debug log fd/filehandle for various nefarious purposes. Grr.

The other next thing is to sort out where to put the SquidTime stuff,
which is used by the debug code. I'll create "src/core" for now in my
branch to put this random stuff; I'll worry about the final
destination for it all later.

I couldn't tease apart ctx and debug all that much in cacheboy (and I
couldn't figure out how it should or may be done as an exercise
either) so I'll just lump them together.



Adrian


Re: X-Vary-Options support

2008-12-21 Thread Adrian Chadd
2008/12/20 Mark Nottingham :
> I agree. My impression was that it's pretty specific to their requirements,
> not a good general solution.

Well, I'm all ears about a slightly more flexible solution. I mean,
this is an X-* header; we could simply document it as a Squid specific
feature once a few basic concerns have been addressed, and leave
nutting out the "right" solution to the IETF group. :)



Adrian


Migrating debug code from src/ to src/debug/

2008-12-18 Thread Adrian Chadd
I've begun fiddling with migrating the bulk of the debug code out of
src/ and into src/debug/; as per the source reorganisation wiki page.

The first step is to just relocate the syslog facility code out, which
I've done.

The next step is to break out the debug code which handles the actual
debugging into src/debug/.

The changes can be viewed at
http://bazaar.launchpad.net/~adrian-squid-cache/squid/adrian_src_reorganise/
.

I'll post again when I've finished the debug code shuffle so I can
figure out the "right way" to submit the change request.



Adrian


Re: [MERGE] Polish for ZPH Patch. Creating single-line config

2008-12-18 Thread Adrian Chadd
(As I've forgotten my bundlebuggy login!)

This patch actually starts breaking code out into src/ip versus -just-
implementing the above modification to the zph code.

Which I'm all for, but I think the commit line is misleading.



Adrian

2008/12/12 Bundle Buggy :
> Bundle Buggy has detected this merge request.
>
> For details, see:
> http://bundlebuggy.aaronbentley.com/project/squid/request/%3C20081212080618.3EDD887DA3%40treenet.co.nz%3E
> Project: Squid
>
>


Re: X-Vary-Options support

2008-12-17 Thread Adrian Chadd
2008/12/17 Henrik Nordstrom :

> I am a bit uneasy about adding features known to be flawed in design.
> Once the header format is added it becomes hard to change.
>
> Sorry, don't remember the details right now what was flawed. See
> archives for earlier discussion.

From the archives:

* not handling q= values, and
* not handling cookie names and specific attribute values, versus just
arbitrary bits of the cookie (as I think it does now?)

Is that it? I can take a look at the patch and see if I can extend it
to support the cookie stuff at least.


X-Vary-Options support

2008-12-17 Thread Adrian Chadd
Hi,

I've got a small contract to get Squid going in front of a small group
of Mediawiki servers and one of the things which needs adding is the
X-Vary-Options support.

So is there any reason whatsoever that it can't be committed to
Squid-2.HEAD as-is, and at least backported (but not committed to
start with) to squid-2.7?

I remember the Wiki guys' issues wrt Variant purging, which I'm hoping
Y! and Benno have sorted out, and I'm not looking to commit anything
relating to that now - just the X-Vary-Options support.

Thanks,


Adrian


Re: Request for new round of SBuf review

2008-12-06 Thread Adrian Chadd
Howdy,

As most of you aren't aware, Kinkie, alex and I had a bit of a
discussion about this on IRC rather than on the mailing list, so
there's probably some other stuff which should be posted here.

Kinkie, are you able to post some updated code + docs after our discussion?

My main suggestion to Kinkie was to take his code and see how well it
worked with some test use cases - the easiest and most relevant one
being parsing HTTP requests and building HTTP replies. I think that a
few test case implementations outside of the Squid codebase will be
helpful in both understanding the issues which this sort of class is
trying to solve.

I would really be against integrating it into Squid mainline until
we've all had a chance to play with it without being burdened by the
rest of Squid. :)



Adrian


2008/12/4 Kinkie <[EMAIL PROTECTED]>:
> Hi all,
>   I feel that SBuf may just be complete enough to be considered a
> viable replacement for SquidString, as a first step towards
> integration.
> I'd appreciate anyone's help in giving it a check to gather feedback
> and suggestions.
>
> Doxygen documentation for the relevant classes is available at
> http://eu.squid-cache.org/~kinkie/sbuf-docs/ , the code is at
> lp:~kinkie/squid/stringng
> (https://code.launchpad.net/~kinkie/squid/stringng).
>
> Thanks!
>
> --
>/kinkie
>
>


Re: The cache deny QUERY change... partial rollback?

2008-12-01 Thread Adrian Chadd
2008/12/1 Henrik Nordstrom <[EMAIL PROTECTED]>:
> After analyzing a large cache with significantly declining hit ratio
> over the last months I have came to the conclusion that the removal of
> cache deny QUERY can have a very negative impact on hit ratio, this due
> to a number of flash video sites (youtube, google, various porno sites
> etc) who include per-view unique query parameters in the URL and
> responding with a cachable response.
>
> Because of this I suggest that we add back the cache deny rule in the
> recommended config, but leave the refresh_pattern change as-is.
>
> People running reverse proxies or combating these cache busting sites
> using store rewrites know how to change the cache rules, while many
> users running general proxy servers are quite negatively impacted by
> these sites if caching of query urls is allowed.

Hm, that's kind of interesting actually. What's it displacing from the
cache? Is the drop in hit ratio due to the removal of other cachable
large objects, or other cachable small objects? Is it -just- flash
video that's exhibiting this behaviour?

Are you able to put up some examples and statistics? I really think
the right thing to do here is look at what various sites are doing and
try to open a dialogue with them. Chances are they don't really know
exactly how to (ab)use HTTP to get the semantics they want whilst
retaining control over their content.



Adrian


Re: omit to loop-forever processing some regex acls

2008-11-26 Thread Adrian Chadd
G'day!

If these are patches against Squid-2 then please put them into the
Squid bugzilla so we don't lose them.

There's a different process for Squid-3 submissions.

Thanks!


Adrian


2008/11/26 Matt Benjamin <[EMAIL PROTECTED]>:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
>
>
>
> - --
>
> Matt Benjamin
>
> The Linux Box
> 206 South Fifth Ave. Suite 150
> Ann Arbor, MI  48104
>
> http://linuxbox.com
>
> tel. 734-761-4689
> fax. 734-769-8938
> cel. 734-216-5309
>
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.7 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFJLYaAJiSUUSaRdSURCNBMAJ90xJm8VjlLJuubuxqi2drt8plR7QCdHXDs
> zBhdg5Gf8JScY8BdXqMZf8I=
> =Kd5i
> -END PGP SIGNATURE-
>


Re: Rv: Why not BerkeleyDB based object store?

2008-11-26 Thread Adrian Chadd
I thought about it a while ago but I'm just out of time, to be honest.
Writing objects to disk only if they're popular (or if you need the RAM
back to handle concurrent accesses to large objects) would probably
improve disk performance enormously, as the amount of writing would
drop drastically.

Sponsorship for investigating and developing this is gladly accepted :)


Adrian


2008/11/26 Mark Nottingham <[EMAIL PROTECTED]>:
> Just a tangental thought; has there been any investigation into reducing the
> amount of write traffic with the existing stores?
>
> E.g., establishing a floor for reference count; if it doesn't have n refs,
> don't write to disk? This will impact hit rate, of course, but may mitigate
> in situations where disk caching is desirable, but writing is the
> bottleneck...
>
>
> On 26/11/2008, at 9:14 AM, Kinkie wrote:
>
>> On Tue, Nov 25, 2008 at 10:23 PM, Pablo Rosatti
>> <[EMAIL PROTECTED]> wrote:
>>>
>>> Amazon uses BerkeleyDB for several critical parts of its website. The
>>> Chicago Mercatile Exchange uses BerkeleyDB for backup and recovery of its
>>> trading database. And Google uses BerkeleyDB to process Gmail and Google
>>> user accounts. Are you sure BerkeleyDB is not a good idea to replace the
>>> Squid filesystems even COSS?
>>
>> Squid3 uses a modular storage backend system, so you're more than
>> welcome to try to code it up and see how it compares.
>> Generally speaking, the needs of a data cache such as squid are very
>> different from those of a general-purpose backend storage.
>> Among the other key differences:
>> - the data in the cache has little or no value.
>>  it's important to know whether a file was corrupted, but it can
>> always be thrown away and fetched from the origin server at a
>> relatively low cost
>> - workload is mostly writes
>>  a well-tuned forward proxy will have a hit-rate of roughly 30%,
>> which means 3 writes for every read on average
>> - data is stored in incremental chunks
>>
>> Given these characteristics, a long list of mechanisms database-like
>> systems have such as journaling, transactions etc. are a  waste of
>> resources.
>> COSS is explicitly designed to handle a workload of this kind. I would
>> not trust any valuable data to it, but it's about as fast as it gets
>> for a cache.
>>
>> IMHO BDB might be much more useful as a metadata storage engine, as
>> those have a very different access pattern than a general-purpose
>> cache store.
>> But if I had any time to devote to this, my priority would be in
>> bringing 3.HEAD COSS up to speed with the work Adrian has done in 2.
>>
>> --
>>   /kinkie
>
> --
> Mark Nottingham   [EMAIL PROTECTED]
>
>
>


Re: Associating accesses with cache.log entries

2008-11-26 Thread Adrian Chadd
I like the idea too.

2008/11/27 Kinkie <[EMAIL PROTECTED]>:
> On Thu, Nov 27, 2008 at 4:21 AM, Mark Nottingham <[EMAIL PROTECTED]> wrote:
>> I've been playing around with associating specific requests with the debug
>> output they generate, with a simple patch to _db_print along these lines:
>>
>>if (Config.Log.accesslogs && Config.Log.accesslogs->logfile) {
>>  seqnum = LOGFILE_SEQNO(Config.Log.accesslogs->logfile);
>>}
>>snprintf(f, BUFSIZ, "%s %i| %s",
>>debugLogTime(squid_curtime),
>>seqnum,
>>format);
>>
>> This leverages the sequence number that's available in custom access logs
>> (%sn).
>>
>> It's really useful for debugging requests that are causing problems, etc;
>> rather than having to correlate times and URLs, you can just correlate
>> sequence numbers. It also makes it possible to automate debug output (which
>> is the direction I want to take this in).
>
> Looks interesting to me.
>
>> beyond the obvious cleanup that needs to happen (e.g., outputting '-' or
>> blank instead of 0 if there isn't an access log line associated, a few
>> questions;
>>
>> * How do people feel about putting this in cache.log all the time? I don't
>> think it'll break any scripts (there aren't many, and those that are tend to
>> grep for specific phrases, rather than do an actual parse, AFAICT). Is the
>> placement above appropriate?
>
> I'd avoid the | character, but apart from that it makes sense to me
>
>> * The sequence number mechanism doesn't guarantee uniqueness in the log
>> file; if squid is started between rotates, it will reset the counters. Has
>> fixing this been discussed?
>
> I don't think that uniqueness has much value, correlating seqnum with
> the timestamp will address any uncertain cases.
>
>> * Is it reasonable to hardcode this to associate the numbers with the first
>> configured access_log?
>>
>> * To make this really useful, it would be necessary to be able to trigger
>> debug_options (or just all debugging) based upon an ACL match. However, this
>> looks like it would require changing how debug is #defined. Any comments on
>> this?
>
> YES! It's something I've been thinking about for some time.
> Count me in.
>
> --
>/kinkie
>
>


Re: access_log acl not observing my_port

2008-11-13 Thread Adrian Chadd
g'day!

Just create a ticket in the Squid bugzilla and put the patch into there.

Thanks for your contribution!



Adrian


2008/11/13 Stephen Thorne <[EMAIL PROTECTED]>:
> G'day,
>
> I've been looking into a problem we've observed where this situation
> does not work as expected, this is in squid-2.7.STABLE4:
>
> acl direct myport 8080
> access_log /var/log/squid/direct_proxy.log common direct
>
> I did some tracing through the code and established that this chain of
> events occurs:
> httpRequestFree calls clientAclChecklistCreate calls aclChecklistCreate
>
> But aclChecklistCacheInit is the function that populates the
> checklist->my_port, which is required for a myport acl to work, and it
> isn't called.
>
> I have attached a patch that fixes this particular problem for me, which
> simply calls aclChecklistCacheInit in clientAclChecklistCreate.
>
> --
> Regards,
> Stephen Thorne
> Development Engineer
> NetBox Blue - 1300 737 060
>
> Scanned by the NetBox from NetBox Blue
> (http://netboxblue.com/)
>
>
> Scanned by the NetBox from NetBox Blue
> (http://netboxblue.com/)
>
>


delayed forwarding is in Squid-2.HEAD

2008-10-16 Thread Adrian Chadd
G'day,

I've just committed the delayed forwarding stuff into Squid-2.HEAD.

Thanks,



Adrian


Re: [PATCH] Check half-closed descriptors at most once per second.

2008-09-24 Thread Adrian Chadd
2008/9/25 Alex Rousskov <[EMAIL PROTECTED]>:

> This revision resurrects 1 check/sec limit, but hopefully with fewer
> bugs. In my limited tests, CPU usage seems to be back to normal.

Woo, thanks!

> The DescriptorSet class has O(1) complexity for search, insertion,
> and deletion. It uses about 2*sizeof(int)*MaxFD bytes. Splay tree that
> used to store half-closed descriptors previously uses less RAM for small
> number of descriptors but has O(log n) complexity.
>
> The DescriptorSet code should probably get its own .h and .cc files,
> especially if it is going to be used by deferred reads.

Could you do that sooner rather than later? I'd like to try using this
code for deferred reads and delay pools.
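
(For anyone following along: 2*sizeof(int)*MaxFD with O(1) operations
suggests the classic dense-array-plus-position-map set, roughly like
the sketch below -- illustrative names, not the actual DescriptorSet
code. It assumes pos[] is initialised to -1.)

#define MAX_FD 1024

static int fds[MAX_FD];   /* dense member list */
static int pos[MAX_FD];   /* fd -> index into fds[], or -1 if absent */
static int count;

static void
set_add(int fd)
{
    if (pos[fd] >= 0)
        return;                   /* already a member */
    fds[count] = fd;
    pos[fd] = count++;
}

static void
set_del(int fd)
{
    int i = pos[fd];
    if (i < 0)
        return;                   /* not a member */
    fds[i] = fds[--count];        /* move the last member into the hole */
    pos[fds[i]] = i;
    pos[fd] = -1;
}

static int
set_has(int fd)
{
    return pos[fd] >= 0;
}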

Thanks!



Adrian


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-23 Thread Adrian Chadd
2008/9/24 Martin Langhoff <[EMAIL PROTECTED]>:

> Good hint, thanks! If we did have such a control, what is the wired
> memory that squid will use for each entry? In an email earlier I
> wrote...

sizeof(StoreEntry) per index entry, basically.


>  - Each index entry takes between 56 bytes and 88 bytes, plus
> additional, unspecificed overhead. Is 1KB per entry a reasonable
> conservative estimate?

1KB per entry is pretty conservative. The per-object overhead includes
the StoreEntry, the couple of structures for the memory/disk
replacement policies, plus the MD5 URL hash for the index, and whatever
other stuff hangs off MemObject for in-memory objects.
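
To put rough numbers on that (back-of-envelope, assuming ~100 bytes of
index overhead per on-disk object rather than the worst case): a
million cached objects costs on the order of 100MB of index RAM, and
the conservative 1KB/entry figure would put it at 1GB. A 24MB budget
therefore means keeping the total object count down in the low hundreds
of thousands at the very most.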

You'll find that the RAM requirements grow a bit more for things like
in-memory cache objects as the full reply headers stay in memory, and
are copied whenever anyone wants to request it.

>  - Discussions about compressing or hashing the URL in the index are
> recurrent - is the uncompressed URL there? That means up to 4KB per
> index entry?

The uncompressed URL and headers are in memory during:

* request/reply handling
* in-memory objects (objects with a MemObject allocated); on-disk
entries just have the MD5 URL hash per StoreEntry.

HTH,

Oh, and I'll be in the US from October for a few months; I can always
do a side-trip out to see you guys if there's enough interest.


Adrian


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-23 Thread Adrian Chadd
2008/9/23 Martin Langhoff <[EMAIL PROTECTED]>:

> Any way we can kludge our way around it for the time being? Does squid
> take any signal that gets it to shed its index?

It'd be pretty trivial to write a few cachemgr hooks to implement that
kind of behaviour. 'flush memory cache', 'flush disk cache entirely',
etc.

The trouble is that the index is -required- at the moment for the disk
cache. If you flush the index, you flush the disk cache entirely.

>> There's no "hard limit" for squid and squid (any version) handles
>> memory allocation failures very very poorly (read: crashes.)
>
> Is it relatively sane to run it with a tight rlimit and restart it
> often? Or just monitor it and restart it?

It probably won't like that very much if you decide to also use disk caching.

>> You can limit the amount of cache_mem which limits the memory cache
>> size; you could probably modify the squid codebase to start purging
>> objects at a certain object count rather than based on the disk+memory
>> storage size. That wouldn't be difficult.
>
> Any chance of having patches that do this?

I could probably do that in a week or so once I've finished my upcoming travel.
Someone could try beating me to it..

>
>> The big problem: you won't get Squid down to 24meg of RAM with the
>> current tuning parameters. Well, I couldn't; and I'm playing around
>
> Hmmm...
>
>> with Squid on OLPC-like hardware (SBC with 500mhz geode, 256/512mb
>> RAM.) Its something which will require quite a bit of development to
>> "slim" some of the internals down to scale better with restricted
>> memory footprints. Its on my personal TODO list (as it mostly is in
>> line with a bunch of performance work I'm slowly working towards) but
>> as the bulk of that is happening in my spare time, I do not have a
>> fixed timeframe at the moment.
>
> Thanks for that -- at whatever pace, progress is progress. I'll stay
> tuned. I'm not on squid-devel, but generally interested in any news on
> this track; I'll be thankful if you CC me or rope me into relevant
> threads.

Ok.

> Is there interest within the squid dev team in moving towards a memory
> allocation model that is more tunable and/or relies more on the
> abilities of modern kernels to do memory mgmt? Or an alternative
> approach to handle scalability (both down to small devices and up to
> huge kit) more dynamically and predictably?

You'll generally find the squid dev team happy to move in whatever
directions make sense. The problem isn't direction as so much as the
coding to make it happen. Making Squid operate well in small memory
footprints turns out to be quite relevant to higher performance and
scalability; the problem is in the "doing".

I'm hoping to start work on some stuff to reduce the memory footprint
in my squid-2 branch (cacheboy) once the current round of IPv6
preparation is completed and stable. The developers working on Squid-3
are talking about similar stuff.


Adrian


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-22 Thread Adrian Chadd
G'day,

I've looked into this a bit (and have a couple of OLPC laptops to do
testing with) and .. well, its going to take a bit of effort to make
squid "fit".

There's no "hard limit" for squid and squid (any version) handles
memory allocation failures very very poorly (read: crashes.)

You can limit the amount of cache_mem which limits the memory cache
size; you could probably modify the squid codebase to start purging
objects at a certain object count rather than based on the disk+memory
storage size. That wouldn't be difficult.
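
Roughly speaking, the idea is no more than this (a sketch against
squid-2-ish internals; Config.max_objects and the policy lookup are
hypothetical names, while storeRelease() is the existing release path):

/* Sketch: purge by object count instead of byte size. */
static void
storeMaintainObjectCount(void)
{
    while (memInUse(MEM_STOREENTRY) > Config.max_objects) {
        StoreEntry *e = policy_get_removal_candidate();
        if (e == NULL)
            break;
        storeRelease(e);
    }
}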

The big problem: you won't get Squid down to 24meg of RAM with the
current tuning parameters. Well, I couldn't; and I'm playing around
with Squid on OLPC-like hardware (SBC with 500mhz geode, 256/512mb
RAM.) Its something which will require quite a bit of development to
"slim" some of the internals down to scale better with restricted
memory footprints. Its on my personal TODO list (as it mostly is in
line with a bunch of performance work I'm slowly working towards) but
as the bulk of that is happening in my spare time, I do not have a
fixed timeframe at the moment.


Adrian


2008/9/23 Martin Langhoff <[EMAIL PROTECTED]>:
> Hi!
>
> I am working on the School Server (aka XS: a Fedora 9 spin, tailored
> to run on fairly limited hw), I'm preparing the configuration settings
> for it. It's a somewhat new area for me -- I've setup Squid before on
> mid-range hardware... but this is... different.
>
> So I'm interested in understanding more aobut the variables affecting
> memory footprint and how I can set a _hard limit_ on the wired memory
> that squid allocates.
>
> In brief:
>
>  - The workload is relatively "light" - 3K clients is the upper bound.
>
>  - The XS will (in some locations) be hooked to *very* unreliable
> power... uncontrolled shutdowns are the norm. Is this ever a problem with 
> Squid?
>
>  - After a bad shutdown, graceful recovery is the most important
> aspect. If a few cached items are lost, we can cope...
>
>  - The XS hardware runs many services (mostly webbased), so Squid gets
> only a limited slice of memory. To make matters worse, I *really*
> don't want the core working set (Squid, Pg, Apache/PHP) to get paged
> out. So I am interested in pegging the max memory Squid will take to itself.
>
>  - The XS hw is varied. In small schools it may have 256MB RAM (likely
> to be running on XO hardware + usb-connected ext hard-drive).
> Medium-to-large schools will have the recommended 1GB RAM and a cheap
> SATA disk. A few very large schools will be graced with more RAM (2 or
> 4GB).
>
> .. so RAM allocation for Squid will prob range between 24MB at the
> lower-end and 96MB at the 1GB "recommended" RAM.
>
> My main question is: how would you tune Squid 3 so that
>
>  - it does not allocate directly more than 24MB / 96MB? (Assume that
> the linux kernel will be smart about mmapped stuff, and aggressive
> about caching -- I am talking about the memory Squid will claim to
> itself).
>
>  - still gives us good thoughput? :-)
>
>
>
> So far Google has turned up very little info, and it seems to be
> rather old. What I've found can be summarised as follows:
>
>  - The index is malloc'd, so the number of entries in the index will
> be the dominant concern WRT memory footprint.
>
>  - Each index entry takes between 56 bytes and 88 bytes, plus
> additional, unspecificed overhead. Is 1KB per entry a reasonable
> conservative estimate?
>
>  - Discussions about compressing or hashing the URL in the index are
> recurrent - is the uncompressed URL there? That means up to 4KB per
> index entry?
>
>  - The index does not seem to be mmappable or otherwise
>
> We can rely on the (modern) linux kernel doing a fantastic job at
> caching disk IO and shedding those cached entries when under memory
> pressure, so I am likely to set Squid's own cache to something really
> small. Everything I read points to the index being my main concern -
> is there a way to limit (a) the total memory the index is allowed to
> take or (b) the number of index entries allowed?
>
> Does the above make sense in general? Or am I barking up the wrong tree?
>
>
> cheers,
>
>
>
> martin
> --
>  [EMAIL PROTECTED]
>  [EMAIL PROTECTED] -- School Server Architect
>  - ask interesting questions
>  - don't get distracted with shiny stuff - working code first
>  - http://wiki.laptop.org/go/User:Martinlanghoff
> ___
> Server-devel mailing list
> [EMAIL PROTECTED]
> http://lists.laptop.org/listinfo/server-devel
>
>


Re: Strategy

2008-09-21 Thread Adrian Chadd
"only focus" should really have been "our main focus at that short
period of time", not "the only thing we care about."

Sheesh. :P



Adrian

2008/9/22 Alex Rousskov <[EMAIL PROTECTED]>:
> On Mon, 2008-09-22 at 10:36 +0800, Adrian Chadd wrote:
>> Put this stuff on hold, get Squid-3.1 out of the way, sort out the
>> issues surrounding that before you start throwing more code into
>> Squid-3 trunk, and -then- have this discussion.
>
> If "this stuff" is WordList, then "put this stuff on hold" is my
> suggestion as well.
>
> If "this stuff" is String, then I think the basic design choices can be
> discussed now, but waiting is even better for me, so I am happy to
> follow your suggestion :-).
>
> If "this stuff" is how we improve "teamwork", then I am happy to
> continue any _constructive_ discussions since releasing 3.1 can benefit
> from teamwork as well.
>
>> We can sort this stuff out in a short period of time if it's our only focus.
>
> The only focus? You must be dreaming :-).
>
> Alex.
>
>
>> 2008/9/22 Amos Jeffries <[EMAIL PROTECTED]>:
>> >> On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote:
>> >>> Alex Rousskov wrote:
>> >>>
>> >>> > * Look for simpler warts with localized impact. We have plenty of them
>> >>> > and your energy would be well spent there. If you have a choice, do not
>> >>> > try to improve something as fundamental and as critical as String.
>> >>> > Localized single-use code should receive a lot less scrutiny than
>> >>> > fundamental classes.
>> >>> >
>> >>>
>> >>> Agreed, but that said: like you, kinkie, picking one of the hard ones
>> >>> causes a thorough discussion, as String has, and comes up with a good
>> >>> API. That's not just a step in the right direction but a giant leap. And
>> >>> worth doing if you can spare the time (months in some cases).
>> >>> The follow-on effects will be better and easier code in other areas
>> >>> depending on it.
>> >>
>> >> Amos,
>> >>
>> >> I think the above work-long-enough-and-you-will-make-it analysis and
>> >> a few other related comments do not account for one important factor:
>> >> cost (and the limited resources this project has). Please compare the
>> >> following estimates (all numbers are very approximate, of course):
>> >>
>> >>  Kinkie's time to draft a String class:   2 weeks
>> >>  Kinkie's time to fix the String class:   6 weeks
>> >>  Reviewers' time to find bugs and
>> >>   convince Kinkie that they are bugs: 2 weeks
>> >>  Total:  10 weeks
>> >>
>> >>  Reviewer's time to write a String class: 3 weeks
>> >>  Total:   3 weeks
>> >>
>> >
>> > Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the
>> > reviewers gain 1 week themselves. So I stand by, if he feels strongly
>> > enough to do it.
>> >
>> >> If you add to the above that one reviewer cannot review and work on
>> >> something else at the same time, the waste goes well above 200%.
>> >
>> > Which is wrong. We can review one thing and work on another project.
>> >
>> >>
>> >> Compare the above with a regular project that does not require writing
>> >> complex or fundamental classes (again, numbers are approximate):
>> >>
>> >> Kinkie's time to complete a regular project:   1 week
>> >> Reviewer's time to complete a regular project: 1 week
>> >
>> > After which both face the hard project again. Which remains hard and could
>> > have cut off 5 days of the regular project.
>> >
>> >>
>> >> If we want Squid code to continue to be a playground for half-finished
>> >> code and ideas, then we should abandon the review process. Let's just
>> >> commit everything that compiles and that the committer is happy with.
>> >
>> > I assume you are being sarcastic.
>> >
>> >> Otherwise, let's do our best to find a project for everyone, without
>> >> sacrificing the quality of the output or wasting resources. For example,
>> >> if a person wants String to implement his pet project, but cannot make a
>> >> good String, it may be possible to trade String implementation for a few
>> >> other pet projects that the person can do.

Re: Strategy

2008-09-21 Thread Adrian Chadd
And in the meantime, if someone (eg kinkie) wants to work on this
stuff some more, I suggest sitting down and writing some of the
support code which would use it.

Write a HTTP parser, HTTP response builder, do some benchmarking,
perhaps glue it to something like libevent or some other comm
framework and do some benchmarking there.
See how it performs, how it behaves, see if it does everything y'all
want cleanly. _Then_ have this discussion.



Adrian

2008/9/22 Adrian Chadd <[EMAIL PROTECTED]>:
> Put this stuff on hold, get Squid-3.1 out of the way, sort out the
> issues surrounding that before you start throwing more code into
> Squid-3 trunk, and -then- have this discussion.
>
> We can sort this stuff out in a short period of time if it's our only focus.
>
>
>
> Adrian
>
> 2008/9/22 Amos Jeffries <[EMAIL PROTECTED]>:
>>> On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote:
>>>> Alex Rousskov wrote:
>>>>
>>>> > * Look for simpler warts with localized impact. We have plenty of them
>>>> > and your energy would be well spent there. If you have a choice, do not
>>>> > try to improve something as fundamental and as critical as String.
>>>> > Localized single-use code should receive a lot less scrutiny than
>>>> > fundamental classes.
>>>> >
>>>>
>>>> Agreed, but that said: like you, kinkie, picking one of the hard ones
>>>> causes a thorough discussion, as String has, and comes up with a good
>>>> API. That's not just a step in the right direction but a giant leap. And
>>>> worth doing if you can spare the time (months in some cases).
>>>> The follow-on effects will be better and easier code in other areas
>>>> depending on it.
>>>
>>> Amos,
>>>
>>> I think the above work-long-enough-and-you-will-make-it analysis and
>>> a few other related comments do not account for one important factor:
>>> cost (and the limited resources this project has). Please compare the
>>> following estimates (all numbers are very approximate, of course):
>>>
>>>  Kinkie's time to draft a String class:   2 weeks
>>>  Kinkie's time to fix the String class:   6 weeks
>>>  Reviewers' time to find bugs and
>>>   convince Kinkie that they are bugs: 2 weeks
>>>  Total:  10 weeks
>>>
>>>  Reviewer's time to write a String class: 3 weeks
>>>  Total:   3 weeks
>>>
>>
>> Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the
>> reviewers gain 1 week themselves. So I stand by, if he feels strongly
>> enough to do it.
>>
>>> If you add to the above that one reviewer cannot review and work on
>>> something else at the same time, the waste goes well above 200%.
>>
>> Which is wrong. We can review one thing and work on another project.
>>
>>>
>>> Compare the above with a regular project that does not require writing
>>> complex or fundamental classes (again, numbers are approximate):
>>>
>>> Kinkie's time to complete a regular project:   1 week
>>> Reviewer's time to complete a regular project: 1 week
>>
>> After which both face the hard project again. Which remains hard and could
>> have cut off 5 days of the regular project.
>>
>>>
>>> If we want Squid code to continue to be a playground for half-finished
>>> code and ideas, then we should abandon the review process. Let's just
>>> commit everything that compiles and that the committer is happy with.
>>
>> I assume you are being sarcastic.
>>
>>> Otherwise, let's do our best to find a project for everyone, without
>>> sacrificing the quality of the output or wasting resources. For example,
>>> if a person wants String to implement his pet project, but cannot make a
>>> good String, it may be possible to trade String implementation for a few
>>> other pet projects that the person can do.
>>
>> Then that trade needs to be discussed with the person before they start.
>> I get the idea you are trying to manage this FOSS like you would a company
>> project. That approach has been tried and failed miserably in FOSS.
>>
>>> This will not be smooth and
>>> easy, but it is often doable because most of us share the goal of making
>>> the best open source proxy.
>>>
>>>> > * When assessing the impact of your changes, do not just compare the
>>> > old code with the one submitted for review. Consider how your classes
>>> > stand on their own and how they _will_ be used. Providing a poor but
>>> > easier-to-abuse interface is often a bad idea even if that interface is,
>>> > in some aspects, better than the old hard-to-use one.

Re: Strategy

2008-09-21 Thread Adrian Chadd
Put this stuff on hold, get Squid-3.1 out of the way, sort out the
issues surrounding that before you start throwing more code into
Squid-3 trunk, and -then- have this discussion.

We can sort this stuff out in a short period of time if it's our only focus.



Adrian

2008/9/22 Amos Jeffries <[EMAIL PROTECTED]>:
>> On Sun, 2008-09-21 at 23:36 +1200, Amos Jeffries wrote:
>>> Alex Rousskov wrote:
>>>
>>> > * Look for simpler warts with localized impact. We have plenty of them
>>> > and your energy would be well spent there. If you have a choice, do not
>>> > try to improve something as fundamental and as critical as String.
>>> > Localized single-use code should receive a lot less scrutiny than
>>> > fundamental classes.
>>> >
>>>
>>> Agreed, but that said: like you, kinkie, picking one of the hard ones
>>> causes a thorough discussion, as String has, and comes up with a good
>>> API. That's not just a step in the right direction but a giant leap. And
>>> worth doing if you can spare the time (months in some cases).
>>> The follow-on effects will be better and easier code in other areas
>>> depending on it.
>>
>> Amos,
>>
>> I think the above work-long-enough-and-you-will-make-it analysis and
>> a few other related comments do not account for one important factor:
>> cost (and the limited resources this project has). Please compare the
>> following estimates (all numbers are very approximate, of course):
>>
>>  Kinkie's time to draft a String class:   2 weeks
>>  Kinkie's time to fix the String class:   6 weeks
>>  Reviewers' time to find bugs and
>>   convince Kinkie that they are bugs: 2 weeks
>>  Total:  10 weeks
>>
>>  Reviewer's time to write a String class: 3 weeks
>>  Total:   3 weeks
>>
>
> Which shows that if Kinkie wants to work on it, he is out 8 weeks, and the
> reviewers gain 1 week themselves. So I stand by, if he feels strongly
> enough to do it.
>
>> If you add to the above that one reviewer cannot review and work on
>> something else at the same time, the waste goes well above 200%.
>
> Which is wrong. We can review one thing and work on another project.
>
>>
>> Compare the above with a regular project that does not require writing
>> complex or fundamental classes (again, numbers are approximate):
>>
>> Kinkie's time to complete a regular project:   1 week
>> Reviewer's time to complete a regular project: 1 week
>
> After which both face the hard project again. Which remains hard and could
> have cut off 5 days of the regular project.
>
>>
>> If we want Squid code to continue to be a playground for half-finished
>> code and ideas, then we should abandon the review process. Let's just
>> commit everything that compiles and that the committer is happy with.
>
> I assume you are being sarcastic.
>
>> Otherwise, let's do our best to find a project for everyone, without
>> sacrificing the quality of the output or wasting resources. For example,
>> if a person wants String to implement his pet project, but cannot make a
>> good String, it may be possible to trade String implementation for a few
>> other pet projects that the person can do.
>
> Then that trade needs to be discussed with the person before they start.
> I get the idea you are trying to manage this FOSS like you would a company
> project. That approach has been tried and failed miserably in FOSS.
>
>> This will not be smooth and
>> easy, but it is often doable because most of us share the goal of making
>> the best open source proxy.
>>
>>> > * When assessing the impact of your changes, do not just compare the old
>>> > code with the one submitted for review. Consider how your classes stand
>>> > on their own and how they _will_ be used. Providing a poor but
>>> > easier-to-abuse interface is often a bad idea even if that interface is,
>>> > in some aspects, better than the old hard-to-use one.
>>> >
>>> >> No one else is tackling the issues that I'm working on. Should they be
>>> >> left alone? Or should I aim for the "perfect" solution each time?
>>>
>>> Perfect varies, and will change as the baseline 'worst' code in Squid
>>> improves. The perfect API this year may need changing later. Aim for the
>>> best you can find to do, and see if it's good enough for inclusion.
>>
>> Right. The problems come when it is not good enough, and you cannot fix
>> it on your own. I do not know how to avoid these ugly situations.
>
> Teamwork. Which I thought we were starting to get in the String API after
> earlier attempts at solo by whoever wrote SquidString and myself on the
> BetterString mk1, mk2, mk3.
>
> I doubt any of us could do a good job of something so deep without help.
> Even you needed Henrik to review and find issues with AsyncCalls, maybe
> others I don't know about before that.
>
> The fact remains these things NEED someone to kick us into a team and work
> on it.
>
>>
>>> for example, Alex had no issues with wordlist when it first came out.
>>
>> This

Re: [MERGE] Connection pinning patch

2008-09-21 Thread Adrian Chadd
2008/9/22 Alex Rousskov <[EMAIL PROTECTED]>:

>
> It would help if there was a document describing what connection pinning
> is and what are the known pitfalls. Do we have such a document? Is RFC
> 4559 enough?

I'll take another read. I think we should look at documenting these
sorts of features somewhere else though.

> If not, Christos, can you write one and have Adrian and others
> contribute pitfalls? It does not have to be long -- just a few
> paragraphs describing the basics of the feature. We can add that
> description to code documentation too.

I'd be happy to help trawl over the 2.X code and see what it's doing.
Henrik and Steven know the code better than I do; I've just spent some
time figuring out how it interplays with load balancing to peers and
such.

> ICAP and eCAP do not care about HTTP connections or custom headers. Is
> connection pinning more than connection management via some custom
> headers?

Nope; it just changes the semantics a little and some code may assume
things work a certain way.

> Sine NTLM authentication forwarding appears to be a required feature for
> many and since connection pinning patch is not trivial (but is not huge
> either), I would rather see it added now (after the proper review
> process, of course). It could be the right icing on 3.1 cake for many
> users. I do realize that, like any 900-line patch, it may cause problems
> even if it is reviewed and tested.

*nodnod* I'm just making sure the reasons for pushing it through are
recorded somewhere during the process.



Adrian


Re: [MERGE] Connection pinning patch

2008-09-21 Thread Adrian Chadd
It's a 900-odd line patch; granted, a lot of it is boilerplate for
config parsing and management, but I recall the issues connection
pinning had when it was introduced and I'd hate to be the one
debugging whatever crazy stuff pops up in 3.1 combined with the
changes to the workflow that connection pinning introduces.

I don't pretend to completely understand the implications for ICAP
either. Is there any documentation for how connection pinning should
behave with ICAP and friends?

Is there any particular rush to get this in for this release at such a
late point in the release cycle?

Could we hold off of it until the next release, and just focus on
getting whats currently in 3.HEAD released and stable?



Adrian


2008/9/21 Tsantilas Christos <[EMAIL PROTECTED]>:
> Hi all,
>
> This patch fixes the bug 1632
> (http://www.squid-cache.org/bugs/show_bug.cgi?id=1632)
> It is based on the original squid2.5 connection pinning patch developed by
> Henrik (http://devel.squid-cache.org/projects.html#pinning) and the related
> squid 2.6 connection pinning code.
>
> Although I spent many hours looking at pinned connections I am still not
> absolutely sure that it does not have bugs. However the code is very similar
> to that in squid2.6 (where the pinning code has run for years) and I hope it
> will be easy to fix problems and bugs.
>
> Regards,
>Christos
>


Re: SBuf review

2008-09-18 Thread Adrian Chadd
2008/9/19 Amos Jeffries <[EMAIL PROTECTED]>:

> I kind of fuzzily disagree, the point of this is to replace MemBuf + String
> with SBuf. Not implement both again independently duplicating stuff.

I'll say it again - ignore MemBuf. Ignore MemBuf for now. Leave it as
a NUL-terminated dynamic buffer with some printf-append-like
semantics.

When you've implemented a non-NUL-terminated, ref-counted memory
region and layered some basic string semantics on top of it, you can
slowly convert or eliminate the bulk of the MemBuf users over.
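
To be concrete about the shape I mean, here's a hand-waved C sketch -
not a proposed API; every name in it is made up:

  #include <assert.h>
  #include <stddef.h>

  /* refcounted, non-NUL-terminated memory region */
  typedef struct {
      char *mem;       /* backing storage; no NUL terminator */
      size_t size;     /* bytes allocated */
      int refcount;    /* region is freed when this drops to zero */
  } MemRegion;

  /* a "string" is just a window onto a shared region */
  typedef struct {
      MemRegion *region;
      size_t offset;   /* where this string starts in the region */
      size_t len;      /* its length */
  } StrView;

  /* taking a substring is O(1): bump the refcount, shrink the window */
  static StrView
  strview_substr(StrView s, size_t off, size_t len)
  {
      StrView sub;
      assert(off + len <= s.len);
      s.region->refcount++;
      sub.region = s.region;
      sub.offset = s.offset + off;
      sub.len = len;
      return sub;
  }

Parsing HTTP headers then becomes cheap slicing of a single read
buffer instead of a pile of copies.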

You're going to find plenty of places where the string handling is
plain old horrible. Don't try to cater for those situations with
things like "NULL strings". I tried that, its ugly. Aim to implement
something which'll cater to something narrow to begin with - like
parsing HTTP headers - and look to -rewrite- larger parts of the code
later on. Don't try to invent things which will somehow seamlessly fit
into the existing code and provide the same semantics. Some of said
semantics is plain shit.

I still don't get why this is again becoming so freakishly complicated.



Adrian


Re: [MERGE] WCCPv2 Config Cleanup

2008-09-13 Thread Adrian Chadd
2008/9/13 Amos Jeffries <[EMAIL PROTECTED]>:

> This one was easy and isolated, so I went and did it early.
> It's back-compatible, so people don't have to use the new names if they
> like. But its clearer for the newbies until the big cleanup you mention
> below is stable.

Well, the newbies still need to know about the different kinds of
redirection/assignment methods; what would be nice is if it were
mostly autonegotiated per-host per-service group, and if wccp2d could
set up and tear down the GRE tunnels as required.

>> The WCCPv2 stuff works fine (for what it does); it could do with some
>> better documentation but what it really needs is to be broken out from
>> Squid itself and run as a separate daemon.
>>
>
> I've been waiting most of a year for your work on that direction in Squid-2
> to be ported over. There does not appear to be any sign of it happening in
> time for 3.1.
> The rest of us are largely concentrating on cleaning other components.

I still haven't done all that much with the WCCPv2 stuff yet. I'll be
breaking out the source code in Cacheboy after I finish the next set
of IPv6 changes; the wccp2d code will then use the config registry
type stuff we've discussed and reuse the core code for comms,
debugging, logging, etc.



Adrian


Re: squid-2.HEAD:

2008-09-12 Thread Adrian Chadd
Have you dumped this into bugzilla?

Thanks!

2008/9/3 Alexander V. Lukyanov <[EMAIL PROTECTED]>:
> Hello!
>
> I have noticed lots of 'impossible keep-alive' messages in the log.
> It appears that httpReplyBodySize incorrectly returns -1 for "304 Not
> Modified" replies. Patch to fix it is attached.
>
> --
>   Alexander.
>


Re: squid-2.HEAD: fwdComplete/Fail before comm_close

2008-09-12 Thread Adrian Chadd
Hiya,

Could you please verify this is still a problem in the latest 2.HEAD
and if so lodge a bugzilla bug report with the patch?

Thanks!


Adrian


2008/8/5 Alexander V. Lukyanov <[EMAIL PROTECTED]>:
> Hello!
>
> Some time ago I had core dumps just after these messages:
>Short response from ...
>httpReadReply: Excess data from ...
>
> I believe this patch fixes these problems.
>
> Index: http.c
> ===
> RCS file: /squid/squid/src/http.c,v
> retrieving revision 1.446
> diff -u -p -r1.446 http.c
> --- http.c  25 Jun 2008 22:11:20 -  1.446
> +++ http.c  5 Aug 2008 06:05:29 -
> @@ -755,6 +757,7 @@ httpAppendBody(HttpStateData * httpState
> /* Is it a incomplete reply? */
> if (httpState->chunk_size > 0) {
>debug(11, 2) ("Short response from '%s' on port %d. Expecting %" PRINTF_OFF_T " octets more\n", storeUrl(entry), comm_local_port(fd), httpState->chunk_size);
> +   fwdFail(httpState->fwd, errorCon(ERR_INVALID_RESP, HTTP_BAD_GATEWAY, httpState->fwd->request));
>comm_close(fd);
>return;
> }
> @@ -774,6 +777,7 @@ httpAppendBody(HttpStateData * httpState
>("httpReadReply: Excess data from \"%s %s\"\n",
>RequestMethods[orig_request->method].str,
>storeUrl(entry));
> +   fwdComplete(httpState->fwd);
>comm_close(fd);
>return;
> }
>
>


Re: squid-2.HEAD: storeCleanup and -F option (foreground rebuild)

2008-09-12 Thread Adrian Chadd
I've committed a slightly modified version of this - store_rebuild.c
r1.80. Take a look and see if it works for you.

Thanks!



Adrian

2008/8/5 Alexander V. Lukyanov <[EMAIL PROTECTED]>:
> Hello!
>
> I use squid in transparent mode, so I don't want degraded performance
> during rebuild and cleanup. Here is a patch I use to make storeCleanup
> do all the work at once before squid starts processing requests, when
> -F option is specified on command line.
>
> Index: store_rebuild.c
> ===
> RCS file: /squid/squid/src/store_rebuild.c,v
> retrieving revision 1.80
> diff -u -p -r1.80 store_rebuild.c
> --- store_rebuild.c 1 Sep 2007 23:09:32 -   1.80
> +++ store_rebuild.c 5 Aug 2008 05:51:43 -
> @@ -68,7 +68,8 @@ storeCleanup(void *datanotused)
> hash_link *link_ptr = NULL;
> hash_link *link_next = NULL;
> validnum_start = validnum;
> -while (validnum - validnum_start < 500) {
> +int limit = opt_foreground_rebuild ? 1 << 30 : 500;
> +while (validnum - validnum_start < limit) {
>if (++bucketnum >= store_hash_buckets) {
>debug(20, 1) ("  Completed Validation Procedure\n");
>debug(20, 1) ("  Validated %d Entries\n", validnum);
> @@ -147,8 +148,8 @@ storeRebuildComplete(struct _store_rebui
> debug(20, 1) ("  Took %3.1f seconds (%6.1f objects/sec).\n", dt,
>(double) counts.objcount / (dt > 0.0 ? dt : 1.0));
> debug(20, 1) ("Beginning Validation Procedure\n");
> -eventAdd("storeCleanup", storeCleanup, NULL, 0.0, 1);
> safe_free(RebuildProgress);
> +storeCleanup(0);
>  }
>
>  /*
>
>


Re: [MERGE] WCCPv2 Config Cleanup

2008-09-12 Thread Adrian Chadd
Amos, why are you pushing through changes to the WCCP configuration
stuff at this point in the game?

The WCCPv2 stuff works fine (for what it does); it could do with some
better documentation but what it really needs is to be broken out from
Squid itself and run as a separate daemon.




Adrian

2008/9/13 Henrik Nordstrom <[EMAIL PROTECTED]>:
> With the patch the code uses WCCP2_METHOD_.. in some places (config
> parsing/dumping) and the context specific ones in other places. This is
> even more confusing.
>
> Very minor detail in any case.
>
>
> On lör, 2008-09-13 at 09:49 +0800, Adrian Chadd wrote:
>> The specification defines them as separate entities and using them in
>> this fashion makes it clearer for people working on the code.
>>
>>
>>
>> Adrian
>>
>> 2008/9/13 Henrik Nordstrom <[EMAIL PROTECTED]>:
>> > On fre, 2008-09-12 at 20:39 +1200, Amos Jeffries wrote:
>> >
>> >> +#define WCCP2_FORWARDING_METHOD_GRE  WCCP2_METHOD_GRE
>> >> +#define WCCP2_FORWARDING_METHOD_L2   WCCP2_METHOD_L2
>> >
>> >> +#define WCCP2_PACKET_RETURN_METHOD_GRE   WCCP2_METHOD_GRE
>> >> +#define WCCP2_PACKET_RETURN_METHOD_L2WCCP2_METHOD_L2
>> >
>> > Do we still need these? Why not use WCCP2_METHOD_ everywhere if they are
>> > the same value?
>> >
>> > Regards
>> > Henrik
>> >
>> >
>
>


Re: [MERGE] WCCPv2 Config Cleanup

2008-09-12 Thread Adrian Chadd
The specification defines them as separate entities and using them in
this fashion makes it clearer for people working on the code.



Adrian

2008/9/13 Henrik Nordstrom <[EMAIL PROTECTED]>:
> On fre, 2008-09-12 at 20:39 +1200, Amos Jeffries wrote:
>
>> +#define WCCP2_FORWARDING_METHOD_GRE  WCCP2_METHOD_GRE
>> +#define WCCP2_FORWARDING_METHOD_L2   WCCP2_METHOD_L2
>
>> +#define WCCP2_PACKET_RETURN_METHOD_GRE   WCCP2_METHOD_GRE
>> +#define WCCP2_PACKET_RETURN_METHOD_L2WCCP2_METHOD_L2
>
> Do we still need these? Why not use WCCP2_METHOD_ everywhere if they are
> the same value?
>
> Regards
> Henrik
>
>


Australian Development Meetup 2008 - Notes

2008-09-11 Thread Adrian Chadd
G'day,

I've started publishing the notes from the presentations and developer
discussions that we held at the Yahoo!7 offices last month.
You can find them at
http://www.squid-cache.org/Conferences/AustraliaMeeting2008/ .

I'm going to try and make sure any further
mini-conferences/discussions/etc which happen go up there so people
get more of an idea of what's going on.

Who knows, eventually there may be enough interest to hold a
reasonably formal Squid conference somewhere.. :)



Adrian


Re: Where to document APIs?

2008-09-11 Thread Adrian Chadd
2008/9/11 Alex Rousskov <[EMAIL PROTECTED]>:

>> To clarify:
>>
>> Longer API documents, .dox file in docs/, or maybe src/ next to the .cc
>>
>> Basic rules the code need to fulfill, or until the API documentation
>> grows large, in the .h or .cc file.
>
> You all have seen the current API notes for Comm and AsyncCalls. Do you
> think they should go into a .dox or .h file?
>
> I think they are big enough (and growing) to justify a .dox file. I will
> probably add those files to trunk (next to the corresponding .h files)
> unless there are better ideas.

What's wrong with inline documentation again?



Adrian


Re: Comm API notes

2008-09-10 Thread Adrian Chadd
2008/9/11 Alex Rousskov <[EMAIL PROTECTED]>:
> Here is a replacement text:
>
>  The comm_close API will be used exclusively for "stop future I/O,
>  schedule a close callback call, and cancel all other callbacks"
>  purposes. New user code should not use comm_close for the purpose of
>  immediately ending a job via a close handler call.

Yup.

(As part of another email) I'd also make it completely clear that the
underlying socket and IO may not be immediately closed via a
comm_close() until pending scheduled IO events occur; and that the
callers should be prepared for the situation where the underlying
buffer(s) and other resources must stay immutable until the completion
of the kernel-side stuff.

This is partially why I wanted explicit notification, cancellation or
not, so the owners of things like buffers would know when they were
able to modify/reuse them again - or the "immutable" semantics must be
enforced some other way.
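
To illustrate the lifetime rule I'm after, here's a self-contained toy
model (it models the semantics only; it is not Squid's comm code):

  #include <stdio.h>
  #include <stdlib.h>

  typedef void close_handler(void *buf);

  typedef struct {
      int pending_io;           /* outstanding kernel-side operations */
      int close_requested;      /* our comm_close() analogue was called */
      close_handler *on_close;  /* the only safe buffer release point */
      void *buf;                /* must stay immutable while IO pends */
  } Conn;

  static void conn_close(Conn *c) {
      c->close_requested = 1;
      if (c->pending_io == 0)
          c->on_close(c->buf);  /* nothing in flight: release now */
  }

  /* invoked when a kernel-side read/write finally completes */
  static void conn_io_done(Conn *c) {
      if (--c->pending_io == 0 && c->close_requested)
          c->on_close(c->buf);  /* only now may the buffer be reused */
  }

  static void release(void *buf) {
      free(buf);
      puts("buffer released");
  }

  int main(void) {
      Conn c = { 1, 0, release, malloc(16) };
      conn_close(&c);    /* close requested, but one IO is in flight */
      conn_io_done(&c);  /* completion arrives; release happens here */
      return 0;
  }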



Adrian


Re: Comm API notes

2008-09-10 Thread Adrian Chadd
2008/9/11 Alex Rousskov <[EMAIL PROTECTED]>:
> * I/O cancellation.
>
>  To cancel an interest in a read operation, call comm_read_cancel()
>  with an AsyncCall object. This call guarantees that the passed Call
>  will be canceled (see the AsyncCall API for call cancellation
>  definitions and details). Naturally, the code has to store the
>  original read callback Call pointer to use this interface. This call
>  does not guarantee that the read operation has not already happened.
>  This call guarantees that the read operation will not happen.

As I said earlier, you can't guarantee that with asynchronous IO. The
call may be in progress and not completed. I'm assuming you'd count
"in progress" as "has already happened" but unlike the latter, you
can't cancel it at the OS level.

As long as the API keeps all the relevant OS related structures in
place to allow the IO to complete, and callers to the cancellation
function are prepared to handle the case where the IO is happening
versus has already happened, then I'm happy.
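
A toy model of the distinction being specified (a sketch of the
semantics only, not the AsyncCall implementation):

  #include <stdio.h>

  typedef struct {
      void (*fn)(void *);
      void *data;
      int canceled;  /* set by cancel, honoured at dispatch time */
      int fired;     /* set once the callback has actually run */
  } Call;

  /* cancel promises "will not fire from now on"; it cannot undo a
   * call that already ran, nor stop the OS-level IO behind it */
  static int call_cancel(Call *c) {
      if (c->fired)
          return 0;  /* too late: it already happened */
      c->canceled = 1;
      return 1;
  }

  static void call_dispatch(Call *c) {
      if (c->canceled)
          return;    /* swallowed: guaranteed never to fire */
      c->fired = 1;
      c->fn(c->data);
  }

  static void on_read(void *data) { printf("read: %s\n", (char *) data); }

  int main(void) {
      Call a = { on_read, "already happened", 0, 0 };
      Call b = { on_read, "never happens", 0, 0 };
      call_dispatch(&a);                            /* fires */
      printf("cancel a -> %d\n", call_cancel(&a));  /* 0 */
      printf("cancel b -> %d\n", call_cancel(&b));  /* 1 */
      call_dispatch(&b);                            /* skipped */
      return 0;
  }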

>  You cannot reliably cancel an interest in read operation using the old
>  comm_read_cancel call that uses a function pointer. The handler may
>  get even called after old comm_read_cancel was called. This old API
>  will be removed.

I really did think I had fixed removing the pending callbacks from the
callback queue when I implemented this. (Ie, I thought I implemented
enough for the POSIX read/write API but not enough for
overlapped/POSIX IO.) What were people seeing pre-AsyncCalls?

>  It is OK to call comm_read_cancel (both old and new) at any time as
>  long as the descriptor has not been closed and there is either no read
>  interest registered or the passed parameters match the registered
>  ones. If the descriptor has been closed, the behavior is undefined.
>  Otherwise, if parameters do not match, you get an assertion.
>
>  To cancel other operations, close the descriptor with comm_close.

I'm still not happy with comm_close() being used in that way; it seems
you aren't either and are stipulating new user code aborts jobs via
alternative paths.

I'm also not happy with the idea of close handlers to unwind state
associated with it; how "deep" do close handlers actually get? Would
we be better off in the long run by stipulating a more rigid shutdown
process (eg - shutting down a client-side fd would not involve
comm_close(fd); but ConnStateData::close() which would handle clearing
the clientHttpRequests and such, then itself + fd?)

>  Raw socket descriptors may be replaced with unique IDs or small
>  objects that help detect stale descriptor/socket usage bugs and
>  encapsulate access to socket-specific information. New user code
>  should treat descriptor integers as opaque objects.

I do agree with this. As Henrik said, this makes Windows porting a bit
easier. There are still other problems to tackle to properly abuse
overlapped IO in any sensible fashion, mostly surrounding IO
scheduling and callback scheduling..



adrian


Re: [MERGE] Config cleanups

2008-09-10 Thread Adrian Chadd
You have the WCCPv2 stuff around the wrong way.

The redirection has nothing to do with the assignment method.

You can and do have L2 redirection with hash assignment. You probably
won't have GRE redirection with mask assignment though, but I think
it's entirely possible.

Keep the options separate, and named whatever they are in the wccp2 draft.
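
i.e. something like this in squid.conf, using the draft's names (one
valid combination shown; the exact tokens accepted depend on the Squid
version - older releases take numeric codes instead):

  wccp2_router 192.0.2.1
  # redirection: how the router hands packets to the cache
  wccp2_forwarding_method l2
  # assignment: how buckets are spread across the caches
  wccp2_assignment_method hash
  # return path for packets the cache declines
  wccp2_return_method gre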

I'd also suggest committing each chunk that's "different" separately -
ie, the wccp stuff separate, the ACL tidyup separate, the default
storage stuff separate, etc. That makes backing out patches easier if
needed.

2c,



Adrian

2008/9/10 Amos Jeffries <[EMAIL PROTECTED]>:
> This update removes several magic number options in the WCCPv2
> configuration. Replacing them with user-friendly text options.
>
> This should help with a lot of config confusion where these are needed until
> they are obsoleted properly.
>
> # Bazaar merge directive format 2 (Bazaar 0.90)
> # revision_id: [EMAIL PROTECTED]
> # target_branch: file:///src/squid/bzr/trunk/
> # testament_sha1: 7b319238106ae2926697f85b2ec58c3476abc121
> # timestamp: 2008-09-11 03:50:49 +1200
> # base_revision_id: [EMAIL PROTECTED]
> #   q5rnfdpug13p94fl
> #
> # Begin patch
> === modified file 'src/cf.data.depend'
> --- src/cf.data.depend  2008-04-03 05:31:29 +
> +++ src/cf.data.depend  2008-09-10 15:22:08 +
> @@ -47,6 +47,7 @@
>  tristate
>  uri_whitespace
>  ushort
> +wccp2_method
>  wccp2_service
>  wccp2_service_info
>  wordlist
>
> === modified file 'src/cf.data.pre'
> --- src/cf.data.pre 2008-08-09 06:24:33 +
> +++ src/cf.data.pre 2008-09-10 15:47:36 +
> @@ -831,8 +831,8 @@
>
>  NOCOMMENT_START
>  #Allow ICP queries from local networks only
> -icp_access allow localnet
> -icp_access deny all
> +#icp_access allow localnet
> +#icp_access deny all
>  NOCOMMENT_END
>  DOC_END
>
> @@ -856,8 +856,8 @@
>
>  NOCOMMENT_START
>  #Allow HTCP queries from local networks only
> -htcp_access allow localnet
> -htcp_access deny all
> +#htcp_access allow localnet
> +#htcp_access deny all
>  NOCOMMENT_END
>  DOC_END
>
> @@ -883,7 +883,7 @@
>  NAME: miss_access
>  TYPE: acl_access
>  LOC: Config.accessList.miss
> -DEFAULT: none
> +DEFAULT: allow all
>  DOC_START
>Use to force your neighbors to use you as a sibling instead of
>a parent.  For example:
> @@ -897,11 +897,6 @@
>
>By default, allow all clients who passed the http_access rules
>to fetch MISSES from us.
> -
> -NOCOMMENT_START
> -#Default setting:
> -# miss_access allow all
> -NOCOMMENT_END
>  DOC_END
>
>  NAME: ident_lookup_access
> @@ -1555,9 +1550,7 @@
>
>  icp-port:  Used for querying neighbor caches about
> objects.  To have a non-ICP neighbor
> -specify '7' for the ICP port and make sure the
> -neighbor machine has the UDP echo port
> -enabled in its /etc/inetd.conf file.
> +specify '0' for the ICP port.
>NOTE: Also requires icp_port option enabled to send/receive
>  requests via this method.
>
> @@ -1955,7 +1948,7 @@
>  NAME: maximum_object_size_in_memory
>  COMMENT: (bytes)
>  TYPE: b_size_t
> -DEFAULT: 8 KB
> +DEFAULT: 512 KB
>  LOC: Config.Store.maxInMemObjSize
>  DOC_START
>Objects greater than this size will not be attempted to kept in
> @@ -2124,7 +2117,7 @@
>which can be changed with the --with-coss-membuf-size=N configure
>option.
>  NOCOMMENT_START
> -cache_dir ufs @DEFAULT_SWAP_DIR@ 100 16 256
> +# cache_dir ufs @DEFAULT_SWAP_DIR@ 100 16 256
>  NOCOMMENT_END
>  DOC_END
>
> @@ -2291,7 +2284,7 @@
>  NAME: access_log cache_access_log
>  TYPE: access_log
>  LOC: Config.Log.accesslogs
> -DEFAULT: none
> +DEFAULT: @DEFAULT_ACCESS_LOG@ squid
>  DOC_START
>These files log client request activities. Has a line every HTTP or
>ICP request. The format is:
> @@ -2314,9 +2307,9 @@
>
>And priority could be any of:
>err, warning, notice, info, debug.
> -NOCOMMENT_START
> -access_log @DEFAULT_ACCESS_LOG@ squid
> -NOCOMMENT_END
> +
> +   Default:
> +   access_log @DEFAULT_ACCESS_LOG@ squid
>  DOC_END
>
>  NAME: log_access
> @@ -2342,14 +2335,17 @@
>
>  NAME: cache_store_log
>  TYPE: string
> -DEFAULT: @DEFAULT_STORE_LOG@
> +DEFAULT: none
>  LOC: Config.Log.store
>  DOC_START
>Logs the activities of the storage manager.  Shows which
>objects are ejected from the cache, and which objects are
> -   saved and for how long.  To disable, enter "none". There are
> -   not really utilities to analyze this data, so you can safely
> +   saved and for how long.  To disable, enter "none" or remove the
> line.
> +   There are not really utilities to analyze this data, so you can
> safely
>disable it.
> +NOCOMMENT_START
> +# cache_store_log @DEFAULT_STORE_LOG@
> +NOCOMMENT_END
>  DOC_END
>
>  NAME: cache_swap_state cache_swap_log
> @@ -3085,7 +3081,7 @@
>  NAME: request_heade

Squid-2.HEAD URL regression with CONNECT

2008-09-09 Thread Adrian Chadd
G'day,

Squid-2.HEAD doesn't seem to handle CONNECT URLs anymore; I get something like:

[start]
The requested URL could not be retrieved

While trying to retrieve the URL: www.gmail.com:443

The following error was encountered:

* Invalid URL
[end]

Benno, could you please double/triple check that your method- and
URL-related changes to Squid-2.HEAD didn't break CONNECT?

Thanks!


Adrian


Re: How to buffer a POST request

2008-09-09 Thread Adrian Chadd
Well, I've got a proof of concept which works well but it's -very-
ugly. This is one of those things that may have been slightly easier
to do in Squid-3 with Alex's BodyPipe changes. I haven't stared at the
BodyPipe code to know whether it's doing all the right kinds of
buffering for this application.

The problem is that Squid-2's request body data pipeline doesn't do
any of its own buffering - it doesn't do anything at all until a
consumer says "give me some more request body data please", at which
point the data is copied out of conn->in.buf (the client-side incoming
socket buffer), consumed, and passed on to the caller.

I thought about a "clean" implementation which would involve the
request body pipeline code consuming socket buffer data until a
certain threshold is reached, then feeding that back up to the request
body consumer, but I decided that was too difficult for this particular
contract.

Instead, the "hack" here is to just keep reading data into the
client-side socket buffer - its already doing double duty as a request
body buffer anyway - until an ACL match fires to begin forwarding. Its
certainly not clean but it seems to work in local testing. I haven't
yet tested connection aborts and such to make sure that connections
are properly cleaned up.
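
Stripped of the Squid plumbing, the control flow amounts to the
standalone sketch below (illustrative only - the real thing works
against conn->in.buf and the ACL machinery, not raw fds):

  #include <stdlib.h>
  #include <sys/types.h>
  #include <unistd.h>

  /* append the request body into one growing buffer; only once
   * 'threshold' bytes (or EOF) have arrived do we touch the upstream
   * side and replay everything buffered so far */
  static ssize_t
  buffer_then_forward(int client_fd, int server_fd, size_t threshold)
  {
      size_t cap = 4096, used = 0, off = 0;
      char *buf = malloc(cap);
      if (!buf)
          return -1;
      while (used < threshold) {
          ssize_t n;
          if (used == cap) {            /* grow, as conn->in.buf does */
              char *tmp = realloc(buf, cap * 2);
              if (!tmp) { free(buf); return -1; }
              buf = tmp;
              cap *= 2;
          }
          n = read(client_fd, buf + used, cap - used);
          if (n < 0) { free(buf); return -1; }
          if (n == 0)
              break;                    /* EOF before the threshold */
          used += (size_t) n;
      }
      while (off < used) {              /* forwarding starts only here */
          ssize_t n = write(server_fd, buf + off, used - off);
          if (n <= 0) { free(buf); return -1; }
          off += (size_t) n;
      }
      free(buf);
      return (ssize_t) used;
  }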

I'll look at posting a patch to squid-dev in a day or two once my
client has had a look at it.

Thanks,



Adrian


2008/8/8 Adrian Chadd <[EMAIL PROTECTED]>:
> Well I'm still going through the process of planning out what changes
> need to happen.
>
> I know what changes need to happen long-term but this project doesn't
> have that sort of scope..
>
>
>
> Adrian
>
> 2008/8/8 Mark Nottingham <[EMAIL PROTECTED]>:
>> You said you were doing it :)
>>
>>
>> On 08/08/2008, at 4:40 PM, Adrian Chadd wrote:
>>
>>> Way to dob me in!
>>>
>>>
>>> Adrian
>>>
>>> 2008/8/8 Mark Nottingham <[EMAIL PROTECTED]>:
>>>>
>>>> I took at stab at:
>>>> http://wiki.squid-cache.org/Features/RequestBuffering
>>>>
>>>>
>>>> On 22/07/2008, at 4:40 PM, Henrik Nordstrom wrote:
>>>>
>>>>> It's not a bug. A feature request in the wiki is more appropriate.
>>>>>
>>>>> wiki.squid-cache.org/Features/
>>>>>
>>>>> Regards
>>>>> Henrik
>>>>>
>>>>> On mån, 2008-07-21 at 17:50 -0700, Mark Nottingham wrote:
>>>>>>
>>>>>> I couldn't find an open bug for this, so I opened
>>>>>> http://www.squid-cache.org/bugs/show_bug.cgi?id=2420
>>>>>>
>>>>>>
>>>>>> On 11/06/2008, at 3:29 AM, Henrik Nordstrom wrote:
>>>>>>
>>>>>>> On ons, 2008-06-11 at 12:51 +0300, Mikko Kettunen wrote:
>>>>>>>
>>>>>>>> Yes, I read something about this on squid-users list, there seems
>>>>>>>> to be
>>>>>>>> 8kB buffer for this if I understood right.
>>>>>>>
>>>>>>> The buffer is bigger than that. But not unlimited.
>>>>>>>
>>>>>>> The big change needed is that there currently isn't anything delaying
>>>>>>> forwarding of the request headers until sufficient amount of the
>>>>>>> request
>>>>>>> body has been buffered.
>>>>>>>
>>>>>>> Regards
>>>>>>> Henrik
>>>>>>
>>>>>> --
>>>>>> Mark Nottingham   [EMAIL PROTECTED]
>>>>>>
>>>>
>>>> --
>>>> Mark Nottingham   [EMAIL PROTECTED]
>>>>
>>>>
>>>>
>>
>> --
>> Mark Nottingham   [EMAIL PROTECTED]
>>
>>
>>
>


Re: /bzr/squid3/trunk/ r9176: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address.

2008-09-08 Thread Adrian Chadd
Hah, Amos just exposed my early-onset short-term memory loss!

(Time to get a bigger whiteboard..)



Adrian

2008/9/9 Amos Jeffries <[EMAIL PROTECTED]>:
>> I've been thinking about doing exactly this after I've been knee-deep
>> in the DNS code.
>> It may not be a bad idea to have generic udp/tcp incoming/outgoing
>> addresses which can then be over-ridden per-"protocol".
>>
>
> WTF? We discussed this months ago and came to the conclusion it would be
> good to have a two-layered outgoing address/port assignment.
>
> a) base default of random system-assigned outbound address port.
>
> b) override per-component/protocol  in/out bound address/port with
> individual config options.
>
> Amos
>
>>
>> Adrian
>>
>> 2008/9/9 Amos Jeffries <[EMAIL PROTECTED]>:
 
>>>> revno: 9176
>>>> committer: Alex Rousskov <[EMAIL PROTECTED]>
>>>> branch nick: trunk
>>>> timestamp: Mon 2008-09-08 17:52:06 -0600
>>>> message:
>>>>   Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming
>>>> address.
>>>> modified:
>>>>   src/htcp.cc

>>>
>>> I think this is one of those cleanup situations where we wanted to split
>>> the protocol away from generic udp_*_address and make it an
>>> htcp_outgoing_address. Yes?
>>>
>>> Amos
>>>
>>>
>>>
>>
>
>
>


Re: /bzr/squid3/trunk/ r9176: Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming address.

2008-09-08 Thread Adrian Chadd
I've been thinking about doing exactly this after I've been knee-deep
in the DNS code.
It may not be a bad idea to have generic udp/tcp incoming/outgoing
addresses which can then be overridden per-"protocol".
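
Config-wise, something like this (udp_incoming_address and
udp_outgoing_address exist today; the htcp_outgoing_address override
is the hypothetical per-protocol layer being discussed):

  # generic layer: defaults for all UDP-based protocols
  udp_incoming_address 192.0.2.10
  udp_outgoing_address 192.0.2.10
  # per-protocol layer: hypothetical override for HTCP only
  htcp_outgoing_address 192.0.2.11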



Adrian

2008/9/9 Amos Jeffries <[EMAIL PROTECTED]>:
>> 
>> revno: 9176
>> committer: Alex Rousskov <[EMAIL PROTECTED]>
>> branch nick: trunk
>> timestamp: Mon 2008-09-08 17:52:06 -0600
>> message:
>>   Fixed typo: Config.Addrs.udp_outgoing was used for the HTCP incoming
>> address.
>> modified:
>>   src/htcp.cc
>>
>
> I think this is one of those cleanup situations where we wanted to split
> the protocol away from generic udp_*_address and make it an
> htcp_outgoing_address. Yes?
>
> Amos
>
>
>


Re: squid-2.HEAD: some changes to client_side.c for invalid requests.

2008-09-08 Thread Adrian Chadd
G'day,

Please make sure you put these patches into bugzilla so they're not lost.



adrian

2008/9/8 Alexander V. Lukyanov <[EMAIL PROTECTED]>:
> On Mon, Sep 08, 2008 at 02:49:50PM +0400, Alexander V. Lukyanov wrote:
>> 3. create method object even for invalid requests (this fixes null pointer
>> dereferences in many other places).
>
> I also suggest this patch to detect attempts to create method-less requests.
>
> --
>   Alexander.
>


Re: [PATCH] Send 407 on url_rewrite_access/storeurl_access

2008-09-07 Thread Adrian Chadd
Thanks! Don't forget to bug me if it's not sorted out in the next week or so.



Adrian

2008/9/8 Diego Woitasen <[EMAIL PROTECTED]>:
> http://www.squid-cache.org/bugs/show_bug.cgi?id=2455
>
> On Sun, Sep 07, 2008 at 09:28:30AM +0800, Adrian Chadd wrote:
>> It looks fine; could you dump it into bugzilla for the time being?
>> (We're working on the Squid-2 -> bzr merge stuff at the moment!)
>>
>>
>>
>> Adrian
>>
>> 2008/9/7 Diego Woitasen <[EMAIL PROTECTED]>:
>> > This patch apply to Squid 2.7.STABLE4.
>> >
>> > If we use a proxy_auth acl on {storeurl,url_rewrite}_access and the user
>> > isn't authenticated previously, send 407.
>> >
>> > regards,
>> >Diego
>> >
>> >
>> > diff --git a/src/client_side.c b/src/client_side.c
>> > index 23c4274..4f75ea0 100644
>> > --- a/src/client_side.c
>> > +++ b/src/client_side.c
>> > @@ -448,19 +448,71 @@ clientFinishRewriteStuff(clientHttpRequest * http)
>> >
>> >  }
>> >
>> > -static void
>> > -clientAccessCheckDone(int answer, void *data)
>> > +void
>> > +clientSendErrorReply(clientHttpRequest * http, int answer)
>> >  {
>> > -clientHttpRequest *http = data;
>> > err_type page_id;
>> > http_status status;
>> > ErrorState *err = NULL;
>> > char *proxy_auth_msg = NULL;
>> > +
>> > +proxy_auth_msg = authenticateAuthUserRequestMessage(http->conn->auth_user_request ? http->conn->auth_user_request : http->request->auth_user_request);
>> > +
>> > +int require_auth = (answer == ACCESS_REQ_PROXY_AUTH || aclIsProxyAuth(AclMatchedName)) && !http->request->flags.transparent;
>> > +
>> > +debug(33, 5) ("Access Denied: %s\n", http->uri);
>> > +debug(33, 5) ("AclMatchedName = %s\n",
>> > +   AclMatchedName ? AclMatchedName : "");
>> > +debug(33, 5) ("Proxy Auth Message = %s\n",
>> > +   proxy_auth_msg ? proxy_auth_msg : "");
>> > +
>> > +/*
>> > + * NOTE: get page_id here, based on AclMatchedName because
>> > + * if USE_DELAY_POOLS is enabled, then AclMatchedName gets
>> > + * clobbered in the clientCreateStoreEntry() call
>> > + * just below.  Pedro Ribeiro <[EMAIL PROTECTED]>
>> > + */
>> > +page_id = aclGetDenyInfoPage(&Config.denyInfoList, AclMatchedName, answer != ACCESS_REQ_PROXY_AUTH);
>> > +http->log_type = LOG_TCP_DENIED;
>> > +http->entry = clientCreateStoreEntry(http, http->request->method,
>> > +   null_request_flags);
>> > +if (require_auth) {
>> > +   if (!http->flags.accel) {
>> > +   /* Proxy authorisation needed */
>> > +   status = HTTP_PROXY_AUTHENTICATION_REQUIRED;
>> > +   } else {
>> > +   /* WWW authorisation needed */
>> > +   status = HTTP_UNAUTHORIZED;
>> > +   }
>> > +   if (page_id == ERR_NONE)
>> > +   page_id = ERR_CACHE_ACCESS_DENIED;
>> > +} else {
>> > +   status = HTTP_FORBIDDEN;
>> > +   if (page_id == ERR_NONE)
>> > +   page_id = ERR_ACCESS_DENIED;
>> > +}
>> > +err = errorCon(page_id, status, http->orig_request);
>> > +if (http->conn->auth_user_request)
>> > +   err->auth_user_request = http->conn->auth_user_request;
>> > +else if (http->request->auth_user_request)
>> > +   err->auth_user_request = http->request->auth_user_request;
>> > +/* lock for the error state */
>> > +if (err->auth_user_request)
>> > +   authenticateAuthUserRequestLock(err->auth_user_request);
>> > +err->callback_data = NULL;
>> > +errorAppendEntry(http->entry, err);
>> > +
>> > +}
>> > +
>> > +static void
>> > +clientAccessCheckDone(int answer, void *data)
>> > +{
>> > +clientHttpRequest *http = data;
>> > +
>> > debug(33, 2) ("The request %s %s is %s, because it matched '%s'\n",
>> >RequestMethods[http->request->method].str, http->uri,
>> >answer == ACCESS_ALLOWED ? "ALLOWED" : "DENIED",
>> >AclMatchedName ? AclMatchedName : "
