Re: Squid-2.7 being moved to old releases

2011-06-24 Thread Mark Nottingham
I'd really like to see some of the features in a release, and am happy to help 
where I can.


On 21/06/2011, at 9:08 AM, Henrik Nordström wrote:

 With the release of Squid-3.2 it's about time we drop Squid-2.7 as a
 supported release. Maintenance of Squid-2 has already fallen way
 behind, not receiving anywhere near the attention a supported release
 deserves, with all currently active developers mainly focusing on
 Squid-3.
 
 Squid-3 has in the meantime also gained a lot of interesting
 functionality, both ported from Squid-2 and brand new, considerably
 closing the upgrade feature gap. There is still some feature gap left,
 but it is getting smaller and smaller.
 
 Put another way: if you or someone you know is stuck on Squid-2.7 for
 some reason, then it's about time to look into solving that now. In
 reality that should have been done years ago.
 
 In practice it does not mean that much as maintenance of Squid-2.7 is
 already practically stopped with last release over a year ago. Partly
 because there isn't very much left to fix in the release, but also
 because there simply isn't any project resources actively working on
 Squid-2 maintenance.
 
 There are some bug fixes pending in the Squid-2 merge queue, sufficient
 to compose a final release, but if you know of any more Squid-2 bugs
 that really should be fixed, please speak up now.
 
 Then there is some feature work in Squid-2.HEAD, but I think it will
 remain there. I do not see sufficient momentum for branching a Squid-2.8
 release.
 
 See http://www.squid-cache.org/Versions/v2/2.HEAD/changesets/merge.html
 for lists of both pending merges to 2.7 and 2.HEAD specific features.
 
 So at the moment it looks like there will be one final Squid-2.7 release
 collecting some pending bug fixes, after which maintenance of the
 Squid-2 tree will cease completely, enabling the project to focus
 entirely on Squid-3.
 
 Regards
 Henrik
 

--
Mark Nottingham   m...@yahoo-inc.com




FYI: Forwarded-For

2011-04-11 Thread Mark Nottingham
There's currently a proposal floating around to formalise a Forwarded-For 
header, with the same semantics as X-Forwarded-For, but with IPv6 support and 
perhaps more.

See:
  http://tools.ietf.org/html/draft-petersson-forwarded-for-00

Currently being discussed a fair amount on the HTTP WG list, starting at:
  http://lists.w3.org/Archives/Public/ietf-http-wg/2011AprJun/0024.html

Cheers,


--
Mark Nottingham   m...@yahoo-inc.com




Bug 2956 - Collapsed forwarding breaks large varying responses

2011-04-04 Thread Mark Nottingham
http://bugs.squid-cache.org/show_bug.cgi?id=2956

Has anyone looked into this, or even confirmed it? 

It's kind of a major breakage, and I'm surprised no one else has run into it...

Cheers,


--
Mark Nottingham   m...@yahoo-inc.com




FYI: http timeout headers

2011-03-10 Thread Mark Nottingham
http://tools.ietf.org/html/draft-thomson-hybi-http-timeout-00

In a nutshell, this draft introduces two new headers: Request-Timeout, which is 
an end-to-end declaration of how quickly the client wants the response, and 
Connection-Timeout, which is a hop-by-hop declaration of how long an idle conn 
can stay open.
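
As a rough sketch of the hop-by-hop half (the clamping policy and the limit values below are my assumptions, not anything the draft mandates), an intermediary honouring Connection-Timeout might do something like:

```python
# Sketch of how an intermediary might honour the proposed hop-by-hop
# Connection-Timeout header from draft-thomson-hybi-http-timeout-00.
# The clamping policy and the limit values are illustrative assumptions,
# not anything the draft mandates.

MIN_IDLE = 5       # admin-configured floor (seconds)
MAX_IDLE = 120     # admin-configured ceiling (seconds)
DEFAULT_IDLE = 30  # used when the header is absent or malformed

def idle_timeout(headers):
    """Choose an idle timeout for a connection from its headers."""
    value = headers.get("Connection-Timeout")
    if value is None:
        return DEFAULT_IDLE
    try:
        requested = int(value)
    except ValueError:
        return DEFAULT_IDLE
    # Never let a peer push the timeout outside the configured limits.
    return max(MIN_IDLE, min(requested, MAX_IDLE))

print(idle_timeout({"Connection-Timeout": "600"}))  # clamped to 120
```

The interesting policy question, touched on below, is whether clients should be able to raise the timeout at all, or only lower it.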

I'm going to give feedback to Martin about this (I can see a few places where 
there may be issues, e.g., it doesn't differentiate between a read timeout on 
an open request and an idle timeout), but I wanted to get a sense of what Squid 
developers thought; in particular -

1) is this interesting enough that you'd implement if it came out?

2) if someone submitted a patch for this, would you include it?

3) do you see any critical issues, esp. regarding performance impact?

Thanks,

--
Mark Nottingham   m...@yahoo-inc.com




Re: FYI: http timeout headers

2011-03-10 Thread Mark Nottingham
Right. I think the authors hope that intermediaries (proxies and gateways) will 
adapt their policies (within configured limits) based upon what they see in 
incoming connection-timeout headers, and rewrite the outgoing 
connection-timeout headers appropriately.

I'm not sure whether that will happen, hence my question. I'm even less sure 
about the use cases for request-timeout, for the reasons you mention.

Cheers,


On 10/03/2011, at 2:10 PM, Robert Collins wrote:

 On Fri, Mar 11, 2011 at 6:22 AM, Mark Nottingham m...@yahoo-inc.com wrote:
 http://tools.ietf.org/html/draft-thomson-hybi-http-timeout-00
 
 In a nutshell, this draft introduces two new headers: Request-Timeout, which 
 is an end-to-end declaration of how quickly the client wants the response, 
 and Connection-Timeout, which is a hop-by-hop declaration of how long an 
 idle conn can stay open.
 
 I'm going to give feedback to Martin about this (I can see a few places 
 where there may be issues, e.g., it doesn't differentiate between a read 
 timeout on an open request and an idle timeout), but I wanted to get a sense 
 of what Squid developers thought; in particular -
 
 1) is this interesting enough that you'd implement if it came out?
 
 Not personally, because the sites I'm involved with set timeouts on
 the backend as policy: dropping reads early won't save backend
 computing overhead (because request threads aren't cleanly
 interruptible), and permitting higher timeouts would need guards to
 prevent excessive resource consumption from being granted
 inappropriately.
 
 2) if someone submitted a patch for this, would you include it?
 
 If it was clean, sure. But again there will be an interaction with
 site policy, e.g. can a client ask for a higher timeout than the squid
 admin has configured, or can they only lower it to give themselves
 snappier responsiveness? (And if the latter, why not just drop the
 connection if they don't get an answer soon enough?)
 
 3) do you see any critical issues, esp. regarding performance impact?
 
 I would worry a little about backend interactions: unless this header
 is honoured all the way through it would be easy for multiple backend
 workers to be calculating expensive resources for the same client
 repeatedly trying something with an inappropriately low timeout.
 
 I guess this just seems a bit odd overall: servers generally have a
 very good idea of the urgency of the things they can serve, based on
 what they are delivering.
 
 -Rob

--
Mark Nottingham   m...@yahoo-inc.com




dedup fs's?

2011-02-02 Thread Mark Nottingham
Has anyone tried squid with lessfs or ZFS in dedup mode?

Just curious; I read a paper once that implied that Traffic Server used to do 
dedup internally, but apparently it doesn't any more...


--
Mark Nottingham   m...@yahoo-inc.com




Re: Unparseable HTTP header field DNS weirdness

2011-01-25 Thread Mark Nottingham
Do you have an upstream proxy configured?

Cheers,


On 21/01/2011, at 3:29 AM, Alex Ray wrote:

 This might be nothing, but I notice the following errors in my build
 of squid 3.HEAD:
 kid1| ctx: enter level  0:
 'http://trailers.apple.com/trailers/global/styles/ipad_black.css
 kid1| WARNING: unparseable HTTP header field {: , 1.1 cup-www-cache01: 80}
 kid1| WARNING: unparseable HTTP header field {: , 1.1 cup-www-cache02: 80}
 kid1| ctx: exit level  0

--
Mark Nottingham   m...@yahoo-inc.com




Re: [PATCH] HTTP Compliance: improve age calculation

2010-09-29 Thread Mark Nottingham
How does this interact with 
http://trac.tools.ietf.org/wg/httpbis/trac/ticket/29?

While you're at it, any thoughts about 
http://trac.tools.ietf.org/wg/httpbis/trac/ticket/212?

Cheers,


On 28/09/2010, at 3:01 PM, Alex Rousskov wrote:

 HTTP Compliance: improve entity age calculation.
 
 Account for response delay in entity age calculation as described in RFC 
 2616 section 13.2.3.
 
 Co-Advisor test cases:
 test_case/rfc2616/ageCalc-none-7-none
 test_case/rfc2616/ageCalc-5400-4-5
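 
 The RFC 2616 section 13.2.3 calculation, including the response-delay term this patch accounts for, works out as follows (a Python illustration of the spec's algorithm, not the Squid code; all times in seconds since the epoch):

```python
# Age calculation per RFC 2616 section 13.2.3. The response_delay term
# (response_time - request_time) is the part the patch accounts for.
# All times are in seconds since the epoch.

def current_age(date_value, age_value, request_time, response_time, now):
    apparent_age = max(0, response_time - date_value)
    corrected_received_age = max(apparent_age, age_value)
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    resident_time = now - response_time
    return corrected_initial_age + resident_time

# A response carrying Age: 4 that took 2 seconds to fetch and has been
# held for 10 seconds has a current age of 4 + 2 + 10 = 16 seconds:
print(current_age(date_value=1000, age_value=4,
                  request_time=1000, response_time=1002, now=1012))  # 16
```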
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: Does no-store in request imply no-cache?

2010-09-22 Thread Mark Nottingham
Strictly, as a request directive it means you can't store the response to this 
request -- it says nothing about whether or not you can satisfy the request 
from a cache.

However, I imagine most people would interpret it as implying no-cache; 
you're still conformant if you do.

See also:
  http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-11#section-3.2.1
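
A minimal sketch of the distinction (Cache-Control parsing is deliberately simplified, and the function names are mine, not Squid's):

```python
# no-store in a *request* forbids storing the response, but does not by
# itself forbid answering from an existing cached entry; that is what the
# request directive no-cache is for. Simplified sketch: real Cache-Control
# parsing handles quoted strings, directive parameters, etc.

def request_directives(cache_control):
    return {d.strip().lower() for d in cache_control.split(",")}

def may_serve_from_cache(cache_control):
    return "no-cache" not in request_directives(cache_control)

def may_store_response(cache_control):
    return "no-store" not in request_directives(cache_control)

cc = "no-store"
print(may_serve_from_cache(cc))  # True: a hit is still strictly conformant
print(may_store_response(cc))    # False: the response must not be stored
```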


On 23/09/2010, at 4:27 AM, Alex Rousskov wrote:

 Hello,
 
 One interpretation of RFC 2616 allows the proxy to serve hits when 
 the request contains Cache-Control: no-store. Do you think such an 
 interpretation is valid?
 
   no-store
   The purpose of the no-store directive is to prevent the
   inadvertent release or retention of sensitive information (for
   example, on backup tapes). The no-store directive applies to the
   entire message, and MAY be sent either in a response or in a
   request. If sent in a request, a cache MUST NOT store any part of
   either this request or any response to it.
 
 Thank you,
 
 Alex.

--
Mark Nottingham   m...@yahoo-inc.com




Re: Does no-store in request imply no-cache?

2010-09-22 Thread Mark Nottingham

On 23/09/2010, at 9:47 AM, Alex Rousskov wrote:
 
 Hi Mark,
 
 Let's assume the above is correct and Squid satisfied the no-store 
 request from the cache. Should Squid purge the cached response afterwards?
 
 If Squid does not purge, the next regular request will get the same
 cached response as the no-store request got, kind of violating the "MUST NOT
 store any response to it" no-store requirement.

Sort of, but not really. I agree this could be worded better; we'll work on it.

 If Squid purges, it is kind of silly because earlier requests could have 
 gotten the same sensitive information before the no-store request came and 
 declared the already cached information sensitive.

Agreed. 

This has been discussed in the WG before (I can't remember the ref); basically, 
it boiled down to each request being independent; you don't want requests 
affecting other ones (apart from anything else, it's a security issue if you 
allow clients to purge your cache indiscriminately). 

--
Mark Nottingham   m...@yahoo-inc.com




Re: FYI: github

2010-08-15 Thread Mark Nottingham
Absolutely. The only reason I didn't do it is that I'm not as familiar with 
bazaar...

Cheers,


On 16/08/2010, at 8:22 AM, Robert Collins wrote:

 It's fine by me; we could push squid3 up as well using bzr-git, if folk
 are interested.
 
 -Rob

--
Mark Nottingham   m...@yahoo-inc.com




Re: FYI: github

2010-08-15 Thread Mark Nottingham
Yep. I didn't hit the 'publicise' button to make that public, tho; figured I'd 
leave that to you.

Cheers,


On 16/08/2010, at 12:52 PM, Henrik Nordström wrote:

 Fri 2010-08-13 at 18:54 -0500, Mark Nottingham wrote:
 
 P.S., if any other squid-dev people are on github, we can add you to the 
 group, FWIW, although like I said, this is read-only...
 
 I have a github account. hno as mostly everywhere else. But I see you
 already noticed that.
 
 Regards
 Henrik
 

--
Mark Nottingham   m...@yahoo-inc.com




FYI: github

2010-08-13 Thread Mark Nottingham
I've noticed a few people creating Squid2 trees using git. The problem with 
this is that when they do so, they get a snapshot of squid at that time.

To make it easier for them to track HEAD, I've created a mirror of the squid2 
source on github:
  http://github.com/squid-cache/squid2

This is semi-automatically updated from HEAD (and will be automatic once I get 
my cron jobs in order). Now, people can fork that project and more easily 
integrate updates. Note that it's read-only; i.e., patches won't be accepted 
there (although it should be easy to take patches from a forked version back to 
CVS).

I asked on IRC if anyone minded this, and no one seemed to, but if it's a big 
problem I'm happy to delete the repository.

Some may be interested in this visualisation (scroll to the right):
  http://github.com/squid-cache/squid2/graphs/impact

Cheers,

P.S., if any other squid-dev people are on github, we can add you to the group, 
FWIW, although like I said, this is read-only...



--
Mark Nottingham   m...@yahoo-inc.com




Re: [MERGE] SMP implementation, part 1

2010-07-01 Thread Mark Nottingham

On 30/06/2010, at 6:46 PM, Alex Rousskov wrote:

   ICP and HTCP _servers_: share listening sockets.


Am I right that currently, it's effectively a random process that handles the 
query or CLR, and no state / effects are shared among processes?

Cheers,


--
Mark Nottingham   m...@yahoo-inc.com




Proxy-Connection

2010-06-30 Thread Mark Nottingham
FYI, I've asked Mozilla to stop sending Proxy-Connection:
  https://bugzilla.mozilla.org/show_bug.cgi?id=570283

Any additional thoughts here? Should Squid stop using it, to encourage its 
demise (I won't use the word "early", as it's been around far too long already)?

Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




Re: Caching of the POST messages

2010-06-24 Thread Mark Nottingham
...although there is a patch for this in squid2-HEAD. It buffers to memory, 
though, not disk.


On 24/06/2010, at 6:51 AM, Henrik Nordström wrote:

 Wed 2010-06-23 at 07:36 -0500, Sandeep Kuttal wrote:
 
 I am looking at changing the Squid code a little bit to cache POST
 messages.
 
 caching POST messages in Squid is hard to implement due to Squid not
 buffering the POST body before forwarding the request.
 
 Regards
 Henrik
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: problems collapsing large responses

2010-06-16 Thread Mark Nottingham
This is now filed as http://bugs.squid-cache.org/show_bug.cgi?id=2956. 

Can anyone else reproduce this? I'm a bit surprised it hasn't bitten someone 
else.


On 16/02/2010, at 5:45 PM, Mark Nottingham wrote:

  - it *does* hit VARY_REFRESH first
  - threshold for the vary problem is much lower
  - evident since (at least) Squid 2.7STABLE1

--
Mark Nottingham   m...@yahoo-inc.com




Bug 2957 - only-if-cached shouldn't count when we're not caching

2010-06-16 Thread Mark Nottingham
Any thoughts about this one?

http://bugs.squid-cache.org/show_bug.cgi?id=2957


--
Mark Nottingham   m...@yahoo-inc.com




Re: Bug 2957 - only-if-cached shouldn't count when we're not caching

2010-06-16 Thread Mark Nottingham
Ah, no -- the cache ACL has to be explicitly applied, e.g.,

cache deny all

Cheers,


On 17/06/2010, at 12:58 PM, Robert Collins wrote:

 Well it sounds totally fine in principle; I'm wondering (without
 reading the patch) how you define 'we are not caching' - just no
 cachedirs? That excludes mem-only caching (or perhaps that's not
 supported now).
 
 -Rob

--
Mark Nottingham   m...@yahoo-inc.com




Re: Joining squid-dev List

2010-06-16 Thread Mark Nottingham
Has anything been applied to 3 yet? I'd like to apply this patch, but don't 
want conflicting / slightly different configuration or implementations.

Cheers,


On 22/05/2010, at 1:53 PM, Mark Nottingham wrote:

 Bug w/ patch for 2.HEAD at:
  http://bugs.squid-cache.org/show_bug.cgi?id=2931
 
 
 On 18/05/2010, at 4:33 PM, Henrik Nordström wrote:
 
 Tue 2010-05-18 at 15:12 +1000, Mark Nottingham wrote:
 
 /*
  * The 'need_validation' flag is used to prevent forwarding
  * loops between siblings.  If our copy of the object is stale,
  * then we should probably only use parents for the validation
  * request.  Otherwise two siblings could generate a loop if
  * both have a stale version of the object.
  */
 r->flags.need_validation = 1;
 
 Is the code in Squid3 roughly the same?
 
 Should be.
 
 I'm tempted to get rid of the need_validation flag, as there are other
 ways that Squid does loop suppression (e.g., only-if-cached on peer
 requests, icp_stale_hit). What do people think of this? Is this how you
 addressed it?
 
 Don't get rid of the flag, but an option to not skip siblings based on
 it unless the sibling is configured with allow-miss
 (peer->options.allow_miss) is fine.
 
 When using ICP or Digests, forwarding loop conditions are quite common,
 triggered by clients sending their own freshness requirements or slight
 differences in configuration between the siblings.
 
 Regards
 Henrik
 
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: Joining squid-dev List

2010-05-21 Thread Mark Nottingham
Bug w/ patch for 2.HEAD at:
  http://bugs.squid-cache.org/show_bug.cgi?id=2931


On 18/05/2010, at 4:33 PM, Henrik Nordström wrote:

 Tue 2010-05-18 at 15:12 +1000, Mark Nottingham wrote:
 
  /*
   * The 'need_validation' flag is used to prevent forwarding
   * loops between siblings.  If our copy of the object is stale,
   * then we should probably only use parents for the validation
   * request.  Otherwise two siblings could generate a loop if
   * both have a stale version of the object.
   */
  r->flags.need_validation = 1;
 
 Is the code in Squid3 roughly the same?
 
 Should be.
 
 I'm tempted to get rid of the need_validation flag, as there are other
 ways that Squid does loop suppression (e.g., only-if-cached on peer
 requests, icp_stale_hit). What do people think of this? Is this how you
 addressed it?
 
 Don't get rid of the flag, but an option to not skip siblings based on
 it unless the sibling is configured with allow-miss
 (peer->options.allow_miss) is fine.
 
 When using ICP or Digests, forwarding loop conditions are quite common,
 triggered by clients sending their own freshness requirements or slight
 differences in configuration between the siblings.
 
 Regards
 Henrik
 



Re: Joining squid-dev List

2010-05-18 Thread Mark Nottingham
OK.

Any thoughts about moving where refreshCheck is called for HTCP?

Cheers,


On 18/05/2010, at 4:33 PM, Henrik Nordström wrote:

 Tue 2010-05-18 at 15:12 +1000, Mark Nottingham wrote:
 
  /*
   * The 'need_validation' flag is used to prevent forwarding
   * loops between siblings.  If our copy of the object is stale,
   * then we should probably only use parents for the validation
   * request.  Otherwise two siblings could generate a loop if
   * both have a stale version of the object.
   */
  r->flags.need_validation = 1;
 
 Is the code in Squid3 roughly the same?
 
 Should be.
 
 I'm tempted to get rid of the need_validation flag, as there are other
 ways that Squid does loop suppression (e.g., only-if-cached on peer
 requests, icp_stale_hit). What do people think of this? Is this how you
 addressed it?
 
 Don't get rid of the flag, but an option to not skip siblings based on
 it unless the sibling is configured with allow-miss
 (peer->options.allow_miss) is fine.
 
 When using ICP or Digests, forwarding loop conditions are quite common,
 triggered by clients sending their own freshness requirements or slight
 differences in configuration between the siblings.
 
 Regards
 Henrik
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: Joining squid-dev List

2010-05-18 Thread Mark Nottingham
Well, refreshCheckHTCP() is called when a peer receives a HTCP request, rather 
than when the HTCP response is evaluated on the sending peer.

RFC2756 defines the semantics as 

RESPONSE codes for TST are as follows:
 
0   entity is present in responder's cache
1   entity is not present in responder's cache

which to me says that it's simply an indication of whether it's in-cache or 
not, not whether it's stale. Since the response headers come back on the HTCP 
response, the querying peer can figure out whether or not it's fresh -- 
according to its local rules, rather than possibly disparate configuration on 
the peer being queried.
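
A sketch of what judging freshness on the querying side could look like, using the headers that come back on the TST response (the max-age-only logic and the names below are my simplification, not the Squid implementation):

```python
# Sketch: the querying peer judges the freshness of a sibling's copy from
# the headers returned on the HTCP TST response, applying its *own* rules
# rather than the responder's. Judging by max-age alone (and treating the
# Date value as pre-parsed epoch seconds) is a deliberate simplification.

def is_fresh(response_headers, now):
    date = response_headers.get("date")
    cache_control = response_headers.get("cache-control", "")
    max_age = None
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            max_age = int(value)
    if date is None or max_age is None:
        return False  # no local basis to call the copy fresh
    return (now - date) < max_age

print(is_fresh({"date": 1000, "cache-control": "max-age=60"}, now=1030))  # True
```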

This isn't an urgent issue for me (usually, my peers are configured 
identically), it just surprised me a bit when I came across it in the code.

Cheers,


On 19/05/2010, at 5:42 AM, Henrik Nordström wrote:

 Tue 2010-05-18 at 22:59 +1000, Mark Nottingham wrote:
 
 Any thoughts about moving where refreshCheck is called for HTCP?
 
 Haven't looked at it. What is the problem?
 
 Regards
 Henrik
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: Joining squid-dev List

2010-05-17 Thread Mark Nottingham
Hi,

I'm looking at the same issue in Squid 2.HEAD (fortuitously, one of my users 
complained about this at the same time).

It seems like the root of this (at least in 2) is in neighbors.c's 
peerAllowedToUse(), which tests request->flags.need_validation.

clientCacheHit's refreshcheck block sets it with this comment:

/*
 * We hold a stale copy; it needs to be validated
 */
/*
 * The 'need_validation' flag is used to prevent forwarding
 * loops between siblings.  If our copy of the object is stale,
 * then we should probably only use parents for the validation
 * request.  Otherwise two siblings could generate a loop if
 * both have a stale version of the object.
 */
r->flags.need_validation = 1;

Is the code in Squid3 roughly the same?

I'm tempted to get rid of the need_validation flag, as there are other ways 
that Squid does loop suppression (e.g., only-if-cached on peer requests, 
icp_stale_hit). What do people think of this? Is this how you addressed it?

Also, I'm mostly interested in the HTCP case, where I *believe* that there's 
enough information sent in the request to avoid a forwarding loop, as long as 
their refresh_patterns are the same.

As an aside to the Squid devs -- I was somewhat surprised to see the HTCP 
refreshCheck being done on the HTCP server side, rather than the client side; 
wouldn't it be better to have the freshness decision made where it's going to 
be applied?

Cheers,



On 17/05/2010, at 11:19 PM, senad.ci...@thomsonreuters.com wrote:

 Thank you Amos. Currently the new option can be set globally. I can see
 some advantages of having it set on cache_peer per-peer basis - do you
 think that would be better option for this patch? I can look into
 changing it, shouldn't take much more effort...
 
 Thanks again,
 -Senad
 
 -Original Message-
 From: Amos Jeffries [mailto:squ...@treenet.co.nz] 
 Sent: Saturday, May 15, 2010 6:09 AM
 To: squid-dev@squid-cache.org
 Subject: Re: Joining squid-dev List
 
 senad.ci...@thomsonreuters.com wrote:
 Hi,
 
 I'm following directions at
 http://www.squid-cache.org/Support/mailing-lists.dyn in joining this
 list... My main motivation is to submit a patch for the minor changes
 I
 made to the squid source described below:
 
 We're in the process of implementing squid 3.1.1 version for reverse
 proxying. The issue we ran into is related to running a cluster of
 Squids in sibling mode. Problem was that for stale (cached but
 expired)
 resources Squids always go to the backend servers to verify freshness
 instead of contacting their sibling Squids (this was the case for ICP,
 HTCP, and cache digests).
 
 Changes I made include adding new squid.conf directive (on|off option)
 that makes this behavior configurable. By default, it will behave as
 it
 is in current Squid version, but if turned on it will verify freshness
 of stale resources with its siblings (ICP, HTCP, and cache digests)
 prior to contacting backend servers.
 
 I will work on re-formatting changed code to match Squid3 coding
 standards and will afterwards follow the process to submit patch
 request.
 
 Thanks,
 Senad
 
 Greetings and Welcome Senad,
 
  I look forward to seeing the patch when it comes through. Are you 
 looking at an option that can be set on cache_peer per-peer? or
 globally?
 
 FYI, Some administrative details to be aware of:
 
  There is about 2 months to go before 3.2 goes into beta. I'm hoping 
 for July 1st, depending on the SMP project work. Config file features 
 should aim to be done and approved by that point.
 
  3.1 is officially closed to new features etc., though private patches 
 are always a possibility.
 
 
 Amos
 -- 
 Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.3
 

--
Mark Nottingham   m...@yahoo-inc.com




problems collapsing large responses

2010-02-15 Thread Mark Nottingham
I'm seeing some strange behaviour when using collapsed_forwarding on large 
responses in squid-2.7 and squid2-HEAD.

Two separate symptoms:
  1) large responses not being cached when collapsed
  2) large responses not being completely sent; i.e., part of the response is 
sent, then it 'locks up'

#2 is more worrisome.

To recreate:
  - compile a squid-2.7 or HEAD, configure with collapsed_forwarding on
  - serve content with a script like this:

---8<---
#!/usr/bin/env python

import sys, time
time.sleep(2)
print "Status: 200 OK"
print "Content-Type: text/plain"
print "Cache-Control: max-age=45"
print "Vary: Accept-Encoding"
print
for i in range(1024):
    print "abcdefghij" * 12
---8<---

and drive traffic like this:
  httperf --server localhost --port 3128 --hog --num-calls 1 --num-conns 10 
--rate 2 --uri `cat urls.txt`
or this:
  http_load -rate 2 -seconds 20 -proxy localhost:3128 urls.txt
ensuring that the cache is empty first.

See access.log as well as load generation results.

Observations:
  - there seems to be a threshold response size of somewhere around 110K that 
triggers this
  - does not appear to rely on value of maximum_object_size_in_memory
  - does not appear to be specific to disk or null disk caching
  - problem #2 seems to be caused by a Vary header
  - does not appear to be related to VARY_RESTART; clientCacheHit: Vary MATCH!

Does this seem familiar to anyone? I'll file a bug, but thought I'd check and 
see if it was a known issue.

Cheers,


--
Mark Nottingham   m...@yahoo-inc.com




Re: problems collapsing large responses

2010-02-15 Thread Mark Nottingham
A bit more;

  - it *does* hit VARY_REFRESH first
  - threshold for the vary problem is much lower
  - evident since (at least) Squid 2.7STABLE1


On 15/02/2010, at 8:59 PM, Mark Nottingham wrote:

 I'm seeing some strange behaviour when using collapsed_forwarding on large 
 responses in squid-2.7 and squid2-HEAD.
 
 Two separate symptoms:
  1) large responses not being cached when collapsed
  2) large responses not being completely sent; i.e., part of the response is 
 sent, then it 'locks up'
 
 #2 is more worrisome.
 
 To recreate:
  - compile a squid-2.7 or HEAD, configure with collapsed_forwarding on
  - serve content with a script like this:
 
 ---8<---
 #!/usr/bin/env python
 
 import sys, time
 time.sleep(2)
 print "Status: 200 OK"
 print "Content-Type: text/plain"
 print "Cache-Control: max-age=45"
 print "Vary: Accept-Encoding"
 print
 for i in range(1024):
     print "abcdefghij" * 12
 ---8<---
 
 and drive traffic like this:
  httperf --server localhost --port 3128 --hog --num-calls 1 --num-conns 10 
 --rate 2 --uri `cat urls.txt`
 or this:
  http_load -rate 2 -seconds 20 -proxy localhost:3128 urls.txt
 ensuring that the cache is empty first.
 
 See access.log as well as load generation results.
 
 Observations:
  - there seems to be a threshold response size of somewhere around 110K that 
 triggers this
  - does not appear to rely on value of maximum_object_size_in_memory
  - does not appear to be specific to disk or null disk caching
  - problem #2 seems to be caused by a Vary header
  - does not appear to be related to VARY_RESTART; clientCacheHit: Vary MATCH!
 
 Does this seem familiar to anyone? I'll file a bug, but thought I'd check and 
 see if it was a known issue.
 
 Cheers,
 
 
 --
 Mark Nottingham   m...@yahoo-inc.com
 
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: Squid-2 maintenance update

2010-02-13 Thread Mark Nottingham


* Make storeurl_rewriter work with Vary - bug 2678
* Make miss_access a slow lookup - bug 2688
* Don't collapse only-if-cached requests - bug 2861



On 13/02/2010, at 8:36 AM, Henrik Nordström wrote:

 New 2.7 & 2.6 releases with patches for the recent security issues are
 currently being prepared.
 
 If I have overlooked any other patches you think should have been
 included, then speak up now. You have approximately 16 hours before the
 releases are frozen.
 
 Regards
 Henrik
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: Squid-3 and HTTP/1.1

2010-01-28 Thread Mark Nottingham
FWIW, I have an XSLT stylesheet that can format the results pleasantly; it 
could be a starting point for something automated.


On 29/01/2010, at 12:34 AM, Amos Jeffries wrote:

 Robert Collins wrote:
 On Wed, 2010-01-27 at 22:49 -0700, Alex Rousskov wrote:
 c) Co-Advisor currently only tests MUST-level requirements. Old Robert's
 checklist contained some SHOULD-level requirements as well. I see that
 Sheet1 on the spreadsheet has SHOULDs. Are we kind of ignoring them (and
 Sheet1) for now, until all MUSTs on Sheet2 are satisfied?
 
 d) I do not know who created the spreadsheet. Whoever it was, thank you!
 Is there a script that takes Co-Advisor results and produces a
 spreadsheet column for cut-and-pasting?
 It looks nice. It might be based on the xls spreadsheet I made, but I
 don't know ;)
 I would not worry about SHOULD's until the MUSTs are done (but if a
 SHOULD is in reach while doing a MUST, doing it would be good).
 -Rob
 
 Spreadsheet by me. Item format + naming by the Co-Advisor authors. Data by Y! 
 testing + me for the estimated column.
 
 I tried to base it on yours Rob for a historical view of the 2.x support. But 
 the item naming and crossover was too different and too much work to be 
 reliable and easy. half-sorry ;)
 
 Alex: the current form is effectively a XLS dump cross-test of the Co-Advisor 
 results. Plus manual estimations for the guess column.
 
 I've forgotten what was on the Sheet2. So yes it's missed a few rounds of 
 updates.
 
 Amos
 -- 
 Please be using
  Current Stable Squid 2.7.STABLE7 or 3.0.STABLE21
  Current Beta Squid 3.1.0.15

--
Mark Nottingham   m...@yahoo-inc.com




Squid 2.8 Roadmap

2010-01-22 Thread Mark Nottingham
Now that Adrian has moved his work to Lusca, it looks like the Squid 2.8 
roadmap http://wiki.squid-cache.org/RoadMap/Squid2 isn't reflecting reality 
(please correct me if I'm wrong, adri!).

Although I don't want to accelerate 2.x work, nor get in the way of 3.x work, I 
think it would be good to solidify a lot of the improvements in 2.HEAD into a 
proper release, so that people don't have to run with a lot of patches from 
HEAD.

In particular, I'd like to see the following patches in 2.8 (or 2.7STABLE, but 
AIUI that's not appropriate, in most cases).

* Logging rewritten URLs - bug 2406
* Make PEER_TCP_MAGIC_COUNT configurable - bug 2377
* hier_code ACL - bug 2390
* HTCP / extension method patches - by benno, including 1235[3-5], 12358, 
12364, 1236[7,8], 12427, 1245[5,6] patches
* 64bit crash with PURGE and HTCP - bug 2799
* Add old entry back to async object - bug 2832
* CLR segfault - bug 2788
* Direct peer monitoring - bug 2643
* Adjustable latency stats - bug 2345
* Adjustable collapsed forwarding timeouts - bug 2504
* Idempotent start - bug 2599
* Configurable forward max tries - bug 2632
* Request body buffering - bug 2420
* HTCP logging - bug 2627
* ignore must-revalidate - bug 2645
* Aggressive caching - bug 2631
* Don't make fatal() dump core - bug 2673
* Make storeurl_rewriter work with Vary - bug 2678
* Make miss_access a slow lookup - bug 2688

I'm happy to help with documenting these, etc. as much as required, although 
I'm not really up to full release management. Any guidance, etc. would be 
helpful.

WRT the roadmap, is the best thing to do to remove the current information and 
start collecting a list of applicable bugs? Or can we just give them a 
Milestone of 2.8 in bugzilla?

Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




Re: Squid 2.8 Roadmap

2010-01-22 Thread Mark Nottingham
I'm not against that, as long as 2.8 isn't rigidly feature-locked. It would be 
a bit weird, IMO, but I can live with it.


On 23/01/2010, at 11:02 AM, Amos Jeffries wrote:

 Mark Nottingham wrote:
 Now that Adrian has moved his work to Lusca, it looks like the Squid 2.8 
 roadmap http://wiki.squid-cache.org/RoadMap/Squid2 isn't reflecting 
 reality (please correct me if I'm wrong, adri!).
 Although I don't want to accelerate 2.x work, nor get in the way of 3.x 
 work, I think it would be good to solidify a lot of the improvements in 
 2.HEAD into a proper release, so that people don't have to run with a lot of 
 patches from HEAD.
 In particular, I'd like to see the following patches in 2.8 (or 2.7STABLE, 
 but AIUI that's not appropriate, in most cases).
 * Logging rewritten URLs - bug 2406
 * Make PEER_TCP_MAGIC_COUNT configurable - bug 2377
 * hier_code ACL - bug 2390
 * HTCP / extension method patches - by benno, including 1235[3-5], 12358, 
 12364, 1236[7,8], 12427, 1245[5,6] patches
 * 64bit crash with PURGE and HTCP - bug 2799
 * Add old entry back to async object - bug 2832
 * CLR segfault - bug 2788
 * Direct peer monitoring - bug 2643
 * Adjustable latency stats - bug 2345
 * Adjustable collapsed forwarding timeouts - bug 2504
 * Idempotent start - bug 2599
 * Configurable forward max tries - bug 2632
 * Request body buffering - bug 2420
 * HTCP logging - bug 2627
 * ignore must-revalidate - bug 2645
 * Aggressive caching - bug 2631
 * Don't make fatal() dump core - bug 2673
 * Make storeurl_rewriter work with Vary - bug 2678
 * Make miss_access a slow lookup - bug 2688
 I'm happy to help with documenting these, etc. as much as required, although 
 I'm not really up to full release management. Any guidance, etc. would be 
 helpful.
 WRT the roadmap, is the best thing to do to remove the current information 
 and start collecting a list of applicable bugs? Or can we just give them a 
 Milestone of 2.8 in bugzilla?
 Cheers,
 
 I know I don't have a lot of say in this, but here is my 2c anyway...
 
 If Henrik and you agree that 2.HEAD is stable enough for production use I won't 
 object. Even while reaching a point that 2.8 might happen saddens me, I can 
 see that it might be needed.
 
 I'm happy with simply renaming 2.HEAD -> 2.8 formally. But not really with 
 branching a new release. Opening HEAD again for a possible 2.9 is IMO a bad 
 idea.
 
 Making 2.8 formally the terminal 2.x release while allowing the possibility 
 that its feature set is not as stone-fixed as earlier 2.x.
 
 Amos
 -- 
 Please be using
  Current Stable Squid 2.7.STABLE7 or 3.0.STABLE21
  Current Beta Squid 3.1.0.15

--
Mark Nottingham   m...@yahoo-inc.com




Re: Squid 2.8 Roadmap

2010-01-22 Thread Mark Nottingham
For that matter, we could get all of this into 2.7, if we relax it...


On 23/01/2010, at 11:02 AM, Amos Jeffries wrote:

 Mark Nottingham wrote:
 Now that Adrian has moved his work to Lusca, it looks like the Squid 2.8 
 roadmap http://wiki.squid-cache.org/RoadMap/Squid2 isn't reflecting 
 reality (please correct me if I'm wrong, adri!).
 Although I don't want to accelerate 2.x work, nor get in the way of 3.x 
 work, I think it would be good to solidify a lot of the improvements in 
 2.HEAD into a proper release, so that people don't have to run with a lot of 
 patches from HEAD.
 In particular, I'd like to see the following patches in 2.8 (or 2.7STABLE, 
 but AIUI that's not appropriate, in most cases).
 * Logging rewritten URLs - bug 2406
 * Make PEER_TCP_MAGIC_COUNT configurable - bug 2377
 * hier_code ACL - bug 2390
 * HTCP / extension method patches - by benno, including 1235[3-5], 12358, 
 12364, 1236[7,8], 12427, 1245[5,6] patches
 * 64bit crash with PURGE and HTCP - bug 2799
 * Add old entry back to async object - bug 2832
 * CLR segfault - bug 2788
 * Direct peer monitoring - bug 2643
 * Adjustable latency stats - bug 2345
 * Adjustable collapsed forwarding timeouts - bug 2504
 * Idempotent start - bug 2599
 * Configurable forward max tries - bug 2632
 * Request body buffering - bug 2420
 * HTCP logging - bug 2627
 * ignore must-revalidate - bug 2645
 * Aggressive caching - bug 2631
 * Don't make fatal() dump core - bug 2673
 * Make storeurl_rewriter work with Vary - bug 2678
 * Make miss_access a slow lookup - bug 2688
 I'm happy to help with documenting these, etc. as much as required, although 
 I'm not really up to full release management. Any guidance, etc. would be 
 helpful.
 WRT the roadmap, is the best thing to do to remove the current information 
 and start collecting a list of applicable bugs? Or can we just give them a 
 Milestone of 2.8 in bugzilla?
 Cheers,
 
 I know I don't have a lot of say in this, but here is my 2c anyway...
 
 If Henrik and you agree that 2.HEAD is stable enough for production use I won't 
 object. Even while reaching a point that 2.8 might happen saddens me, I can 
 see that it might be needed.
 
 I'm happy with simply renaming 2.HEAD -> 2.8 formally. But not really with 
 branching a new release. Opening HEAD again for a possible 2.9 is IMO a bad 
 idea.
 
 Making 2.8 formally the terminal 2.x release while allowing the possibility 
 that its feature set is not as stone-fixed as earlier 2.x.
 
 Amos
 -- 
 Please be using
  Current Stable Squid 2.7.STABLE7 or 3.0.STABLE21
  Current Beta Squid 3.1.0.15

--
Mark Nottingham   m...@yahoo-inc.com




Re: Associating accesses with cache.log entries

2009-12-17 Thread Mark Nottingham
I've put a basic proof-of-concept into a bug; see 
http://bugs.squid-cache.org/show_bug.cgi?id=2835.

It only logs client FDs, but gives output like this (with debug_options ALL,2):

2009/12/17 22:08:24 client_fd=13 ctx: enter level  0: 'client_fd=13'
2009/12/17 22:08:24 client_fd=13 Parser: retval 1: from 0-47: method 0-2; url 4-36; version 38-46 (1/1)
2009/12/17 22:08:24 client_fd=13 The request GET http://www.mnot.net/test/ is ALLOWED, because it matched 'localhost'
2009/12/17 22:08:24 client_fd=13 clientCacheHit: refreshCheckHTTPStale returned 1
2009/12/17 22:08:24 client_fd=13 peerSourceHashSelectParent: Calculating hash for 127.0.0.1
2009/12/17 22:08:28 client_fd=17 ctx: exit level  0
2009/12/17 22:08:28 client_fd=17 ctx: enter level  0: 'client_fd=17'
2009/12/17 22:08:28 client_fd=17 Parser: retval 1: from 0-35: method 0-2; url 4-24; version 26-34 (1/1)
2009/12/17 22:08:28 client_fd=17 The request GET http://www.apple.com/ is ALLOWED, because it matched 'localhost'
2009/12/17 22:08:28 client_fd=17 clientCacheHit: refreshCheckHTTPStale returned 0
2009/12/17 22:08:28 client_fd=17 clientCacheHit: HIT
2009/12/17 22:08:28 client_fd=17 The reply for GET http://www.apple.com/ is ALLOWED, because it matched 'all'
2009/12/17 22:08:29  ctx: exit level  0
2009/12/17 22:08:29  The reply for GET http://www.mnot.net/test/slow.cgi is ALLOWED, because it matched 'all'

Feedback / sanity checking appreciated.



On 25/02/2009, at 6:40 PM, Henrik Nordstrom wrote:

 ons 2009-02-25 klockan 12:10 +1100 skrev Mark Nottingham:
 
 What am I missing? The most straightforward way that I can see to do  
 this is to add an identifier to clientHttpRequest and pass that to  
 debug where available...
 
 That is what ctx_enter is about... There is not a single location where
 ctx_enter needs to be called, there is many..
 
 Remember that Squid is a big bunch of event driven state machines, doing
 a little bit of processing at a time interleaved with many other
 unrelated things. ctx_enter indicates which state transition is
 currently being processed, ctx_leave when that state transition has
 completed waiting for next event (even if still at the same state..)
 
 So you need ctx_enter in quite many places, providing a reasonable trace
 of the processing within the state machine so far, based on whatever
 identifier the current small step is about. Each time the processing
 returns to the comm loop you are back at ctx level 0 with no context.
 Sometimes the ctx level may be quite high, having many loosely related
 state transitions in the trace, sometimes even almost completely
 unrelated requests.
 
 Most of the time the state machine starts with something directly
 related to a specific request (read/write on http sockets) however, but
 there is also many other kinds of state transitions like DNS, timers
 etc.
 
 Regards
 Henrik
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: Associating accesses with cache.log entries

2009-12-17 Thread Mark Nottingham
I've made some progress on this, now adding a request identifier and putting 
that in the context (see updated patch in bug).

I'm willing to see this through, but I'd like a bit of feedback first -- 
especially around whether this (i.e., sprinkling ctx_* calls throughout the 
codebase wherever we have a callback) is the right approach, first.

Cheers,



On 17/12/2009, at 10:24 PM, Mark Nottingham wrote:

 I've put a basic proof-of-concept into a bug; see 
 http://bugs.squid-cache.org/show_bug.cgi?id=2835.
 
 It only logs client FDs, but gives output like this (with debug_options 
 ALL,2):
 
 2009/12/17 22:08:24 client_fd=13 ctx: enter level  0: 'client_fd=13'
 2009/12/17 22:08:24 client_fd=13 Parser: retval 1: from 0-47: method 0-2; url 4-36; version 38-46 (1/1)
 2009/12/17 22:08:24 client_fd=13 The request GET http://www.mnot.net/test/ is ALLOWED, because it matched 'localhost'
 2009/12/17 22:08:24 client_fd=13 clientCacheHit: refreshCheckHTTPStale returned 1
 2009/12/17 22:08:24 client_fd=13 peerSourceHashSelectParent: Calculating hash for 127.0.0.1
 2009/12/17 22:08:28 client_fd=17 ctx: exit level  0
 2009/12/17 22:08:28 client_fd=17 ctx: enter level  0: 'client_fd=17'
 2009/12/17 22:08:28 client_fd=17 Parser: retval 1: from 0-35: method 0-2; url 4-24; version 26-34 (1/1)
 2009/12/17 22:08:28 client_fd=17 The request GET http://www.apple.com/ is ALLOWED, because it matched 'localhost'
 2009/12/17 22:08:28 client_fd=17 clientCacheHit: refreshCheckHTTPStale returned 0
 2009/12/17 22:08:28 client_fd=17 clientCacheHit: HIT
 2009/12/17 22:08:28 client_fd=17 The reply for GET http://www.apple.com/ is ALLOWED, because it matched 'all'
 2009/12/17 22:08:29  ctx: exit level  0
 2009/12/17 22:08:29  The reply for GET http://www.mnot.net/test/slow.cgi is ALLOWED, because it matched 'all'
 
 Feedback / sanity checking appreciated.
 
 
 
 On 25/02/2009, at 6:40 PM, Henrik Nordstrom wrote:
 
 ons 2009-02-25 klockan 12:10 +1100 skrev Mark Nottingham:
 
 What am I missing? The most straightforward way that I can see to do  
 this is to add an identifier to clientHttpRequest and pass that to  
 debug where available...
 
 That is what ctx_enter is about... There is not a single location where
 ctx_enter needs to be called, there is many..
 
 Remember that Squid is a big bunch of event driven state machines, doing
 a little bit of processing at a time interleaved with many other
 unrelated things. ctx_enter indicates which state transition is
 currently being processed, ctx_leave when that state transition has
 completed waiting for next event (even if still at the same state..)
 
 So you need ctx_enter in quite many places, providing a reasonable trace
 of the processing within the state machine so far, based on whatever
 identifier the current small step is about. Each time the processing
 returns to the comm loop you are back at ctx level 0 with no context.
 Sometimes the ctx level may be quite high, having many loosely related
 state transitions in the trace, sometimes even almost completely
 unrelated requests.
 
 Most of the time the state machine starts with something directly
 related to a specific request (read/write on http sockets) however, but
 there is also many other kinds of state transitions like DNS, timers
 etc.
 
 Regards
 Henrik
 
 
 --
 Mark Nottingham   m...@yahoo-inc.com
 
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: Assertion in clientProcessBody

2009-12-15 Thread Mark Nottingham

On 08/12/2009, at 4:12 PM, Henrik Nordstrom wrote:

 tis 2009-12-08 klockan 13:34 +1100 skrev Mark Nottingham:
 
 Any thoughts here? Should this really be >=, or should clientProcessBody 
 never get a 0 size_left?
 
 It's done when size_left == 0, and no further body processing handler
 should be active on this request at that time. Any data on the connection
 at this time is either surplus data (HTTP violation) or a pipelined
 request waiting to be processed.
 
 If you look a little further down (about one screen) in
 clientProcessBody you'll also see that the body reader gets unregistered
 when processing reaches 0.

is that this?

/* Remove request link if this is the last part of the body, as
 * clientReadRequest automatically continues to process next request */
if (conn->body.size_left <= 0 && request != NULL)
    requestUnregisterBody(request, clientReadBody, conn);

I think what's happening here is that conn->body.size_left is 0, but request 
*is* NULL, because it was a cache hit and clientWriteComplete decided to eat 
the request body. More verbose logging gave:

2009/12/15 17:09:15| clientReadBody: start fd=35 body_size=226 in.offset=211 cb=0x4562cd req=0xc52f60
2009/12/15 17:09:15| clientProcessBody: start fd=35 body_size=226 in.offset=211 cb=0x4562cd req=0xc52f60
2009/12/15 17:09:15| clientProcessBody: end fd=35 size=211 body_size=15 in.offset=0 cb=0x4562cd req=0xc52f60
2009/12/15 17:09:15| clientReadRequest: FD 35: reading request...
2009/12/15 17:09:15| clientReadRequest: FD 35: read 15 bytes
2009/12/15 17:09:15| clientWriteComplete: FD 35, sz 3705, err 0, off 4200, len 4200
2009/12/15 17:09:15| clientWriteComplete: FD 35 transfer is DONE
2009/12/15 17:09:15| clientWriteComplete: closing, but first we need to read the rest of the request
2009/12/15 17:09:15| clientProcessBody: start fd=35 body_size=15 in.offset=15 cb=0x42ed4c req=(nil)
2009/12/15 17:09:15| clientEatRequestBodyHandler: FD 35 Keeping Alive
2009/12/15 17:09:15| clientKeepaliveNextRequest: FD 35
2009/12/15 17:09:15| httpRequestFree: [url]
2009/12/15 17:09:15| clientKeepaliveNextRequest: FD 35 reading next req
2009/12/15 17:09:15| clientReadRequest: FD 35: reading request...
2009/12/15 17:09:15| clientReadRequest: FD 35: read -1 bytes
2009/12/15 17:09:15| clientProcessBody: end fd=35 size=15 body_size=0 in.offset=0 cb=0x42ed4c req=(nil)
2009/12/15 17:09:15| clientReadBody: start fd=35 body_size=0 in.offset=0 cb=0x4562cd req=0xc52f60
2009/12/15 17:09:15| clientProcessBody: start fd=35 body_size=0 in.offset=0 cb=0x4562cd req=0xc52f60
2009/12/15 17:09:15| clientReadRequest: FD 35: reading request...
2009/12/15 17:09:15| clientReadRequest: FD 35: read 311 bytes
2009/12/15 17:09:15| clientProcessBody: start fd=35 body_size=0 in.offset=311 cb=0x4562cd req=0xc52f60
2009/12/15 17:09:15| assertion failed: client_side.c:4471: "conn->body.size_left > 0"

am I on the right track here?

--
Mark Nottingham   m...@yahoo-inc.com




Re: Assertion in clientProcessBody

2009-12-08 Thread Mark Nottingham
debug_options 33,2 results in:

2009/12/08 18:34:24| clientTryParseRequest: 0xdbeac0: FD 54: request body is 226 bytes in size
2009/12/08 18:34:24| The request GET [URL] is ALLOWED, because it matched '[acl]'
2009/12/08 18:34:24| clientCacheHit: refreshCheckHTTPStale returned -2
2009/12/08 18:34:24| clientCacheHit: stale-while-revalidate needs revalidation
2009/12/08 18:34:24| The reply for GET [URL] is ALLOWED, because it matched 'all'
2009/12/08 18:34:24| clientProcessBody: start fd=54 body_size=226 in.offset=226 cb=0x42eac0 req=(nil)
2009/12/08 18:34:24| clientProcessBody: end fd=54 size=226 body_size=0 in.offset=0 cb=0x42eac0 req=(nil)
2009/12/08 18:34:24| clientReadBody: start fd=54 body_size=0 in.offset=0 cb=0x45602f req=0xdbeac0
2009/12/08 18:34:24| clientProcessBody: start fd=54 body_size=0 in.offset=0 cb=0x45602f req=0xdbeac0
2009/12/08 18:34:24| The request GET [URL2] is ALLOWED, because it matched '[acl]'
2009/12/08 18:34:24| clientCacheHit: refreshCheckHTTPStale returned -2
2009/12/08 18:34:24| clientCacheHit: stale-while-revalidate needs revalidation
2009/12/08 18:34:24| The reply for GET [URL2] is ALLOWED, because it matched 'all'
2009/12/08 18:34:24| clientProcessBody: start fd=54 body_size=0 in.offset=420 cb=0x45602f req=0xdbeac0
2009/12/08 18:34:24| assertion failed: client_side.c:4445: "conn->body.size_left > 0"



On 08/12/2009, at 5:23 PM, Mark Nottingham wrote:

 #2  0x00435749 in xassert (msg=0x4c02f1 "conn->body.size_left > 0", 
 file=0x4bd9d0 "client_side.c", line=4445) at debug.c:505
 No locals.
 #3  0x0042ece1 in clientProcessBody (conn=0xc270c8)
at client_side.c:4445
   valid = 1
   size = 0
   buf = 0x81024f0 
   cbdata = (void *) 0x51703568
   callback = (CBCB *) 0x45602f httpRequestBodyHandler
   request = (request_t *) 0x1ed3c80
 #4  0x0042e853 in clientReadRequest (fd=37, data=0xc270c8)
at client_side.c:4331
   conn = (ConnStateData *) 0xc270c8
   size = 422
   F = (fde *) 0x2a957a68b8
   len = 4095
   ret = 0
 #5  0x004346ad in comm_call_handlers (fd=37, read_event=1, 
write_event=0) at comm_generic.c:264
   hdl = (PF *) 0x42e4e3 clientReadRequest
   hdl_data = (void *) 0xc270c8
   do_read = 1
   F = (fde *) 0x2a957a68b8
   do_incoming = 1
 #6  0x00434f3e in do_comm_select (msec=579) at comm_epoll.c:195
   i = 0
   num = 1
   saved_errno = 11
 #7  0x00434a55 in comm_select (msec=579) at comm_generic.c:390
   last_timeout = 1260237328.0927789
   rc = 0
   start = 1260237328.5134871
 #8  0x0046722c in main (argc=3, argv=0x7fbfffd798) at main.c:862
   errcount = 0
 
 
 
 On 08/12/2009, at 4:12 PM, Henrik Nordstrom wrote:
 
 tis 2009-12-08 klockan 13:34 +1100 skrev Mark Nottingham:
 
 Any thoughts here? Should this really be >=, or should clientProcessBody 
 never get a 0 size_left?
 
 It's done when size_left == 0, and no further body processing handler
 should be active on this request at that time. Any data on the connection
 at this time is either surplus data (HTTP violation) or a pipelined
 request waiting to be processed.
 
 If you look a little further down (about one screen) in
 clientProcessBody you'll also see that the body reader gets unregistered
 when processing reaches 0.
 
 But it would not be harmful to make clientProcessBody gracefully handle
 size_left == 0 I guess.
 
 A backtrace would be nice.
 
 Regards
 Henrik
 
 
 --
 Mark Nottingham   m...@yahoo-inc.com
 
 

--
Mark Nottingham   m...@yahoo-inc.com




Assertion in clientProcessBody

2009-12-07 Thread Mark Nottingham
I've got a user tripping across the following assertion in clientProcessBody 
(2.7):

/* Some sanity checks... */
assert(conn->body.size_left > 0);

conn->body.size_left is 0, and the method is GET; content-length is 226. 
request_entities is on (obviously).

Any thoughts here? Should this really be >=, or should clientProcessBody never 
get a 0 size_left?

Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




Re: Assertion in clientProcessBody

2009-12-07 Thread Mark Nottingham
#2  0x00435749 in xassert (msg=0x4c02f1 "conn->body.size_left > 0", 
file=0x4bd9d0 "client_side.c", line=4445) at debug.c:505
No locals.
#3  0x0042ece1 in clientProcessBody (conn=0xc270c8)
at client_side.c:4445
valid = 1
size = 0
buf = 0x81024f0 
cbdata = (void *) 0x51703568
callback = (CBCB *) 0x45602f httpRequestBodyHandler
request = (request_t *) 0x1ed3c80
#4  0x0042e853 in clientReadRequest (fd=37, data=0xc270c8)
at client_side.c:4331
conn = (ConnStateData *) 0xc270c8
size = 422
F = (fde *) 0x2a957a68b8
len = 4095
ret = 0
#5  0x004346ad in comm_call_handlers (fd=37, read_event=1, 
write_event=0) at comm_generic.c:264
hdl = (PF *) 0x42e4e3 clientReadRequest
hdl_data = (void *) 0xc270c8
do_read = 1
F = (fde *) 0x2a957a68b8
do_incoming = 1
#6  0x00434f3e in do_comm_select (msec=579) at comm_epoll.c:195
i = 0
num = 1
saved_errno = 11
#7  0x00434a55 in comm_select (msec=579) at comm_generic.c:390
last_timeout = 1260237328.0927789
rc = 0
start = 1260237328.5134871
#8  0x0046722c in main (argc=3, argv=0x7fbfffd798) at main.c:862
errcount = 0



On 08/12/2009, at 4:12 PM, Henrik Nordstrom wrote:

 tis 2009-12-08 klockan 13:34 +1100 skrev Mark Nottingham:
 
 Any thoughts here? Should this really be >=, or should clientProcessBody 
 never get a 0 size_left?
 
 It's done when size_left == 0, and no further body processing handler
 should be active on this request at that time. Any data on the connection
 at this time is either surplus data (HTTP violation) or a pipelined
 request waiting to be processed.
 
 If you look a little further down (about one screen) in
 clientProcessBody you'll also see that the body reader gets unregistered
 when processing reaches 0.
 
 But it would not be harmful to make clientProcessBody gracefully handle
 size_left == 0 I guess.
 
 A backtrace would be nice.
 
 Regards
 Henrik
 

--
Mark Nottingham   m...@yahoo-inc.com




Re: httpMaybeRemovePublic and collapsed_forwarding

2009-10-29 Thread Mark Nottingham

:)

httpMaybeRemovePublic isn't called when collapsed forwarding is on.  
While (according to you 1.5 years ago) this makes sense for  
determining whether to remove something from storage, it doesn't work  
for HTCP CLRs, which are also done from here.



On 30/10/2009, at 11:49 AM, Henrik Nordstrom wrote:


Sorry, lost the thread a bit here over the 1.5 year that has passed.
What was this about?


fre 2009-10-30 klockan 11:26 +1100 skrev Mark Nottingham:

Since the HTCP purging is tied up in httpMaybeRemovePublic, I think
this needs to happen in storePurgeEntriesByUrl; e.g.,

if (neighbors_do_private_keys && !Config.onoff.collapsed_forwarding)
    storeRelease(e);

at the end.

Make sense?



On 24/06/2008, at 2:00 PM, Henrik Nordstrom wrote:


On tis, 2008-06-24 at 10:45 +1000, Benno Rice wrote:


Can someone fill me in on why this isn't called in the
collapsed_forwarding case?  I've got some ideas but I'm not confident
enough in my reading of the code to be sure that I'm right.  Mainly it
feels like we're very careful that the StoreEntry in use may not be
right in some way.  Is there some way I can tell whether it's safe to
run httpMaybeRemovePublic in the collapsed case?


The difference in collapsed forwarding is that the object has already
overwritten earlier content early on when using collapsed forwarding, so
in most cases the older content has already been invalidated.

Same thing when ICP peers do not support the query key parameter..

What's missing in this picture is variant invalidation..

Thinking.. I guess the easiest would be to move this logic down to
httpMaybeRemovePublic, for a starter making it not remove the object
itself which is the primary case this test is for..

Regards
Henrik


--
Mark Nottingham   m...@yahoo-inc.com





--
Mark Nottingham   m...@yahoo-inc.com




Re: WebSockets negotiation over HTTP

2009-10-21 Thread Mark Nottingham


On 22/10/2009, at 10:52 AM, Ian Hickson wrote:


Until the upgrade is complete, you're speaking HTTP and working with
HTTP implementations.


How so? A WebSocket client is always talking Web Socket, even if it  
might

also sound like HTTP.


Yes, but to someone examining the traffic -- whether in a debugger or  
an intermediary -- it looks, smells and quacks like HTTP. It declares  
itself to be HTTP/1.1 by using the HTTP/1.1 protocol identifier in the  
request-line. What's the point of doing that -- i.e., why not use  
WebSockets/1.0?




Have you verified that implementations (e.g., Apache module API) will
give you byte-level access to what's on the wire in the request, and
byte-level control over what goes out in the response?


On the server side, you don't need wire-level control over what's  
coming

in, only over what's going out.


Yes, you do, because section 5.2 specifies headers as whitespace-sensitive 
and header-names as case-sensitive.



There's already a WebSocket module for Apache, by the way:

  http://code.google.com/p/pywebsocket/


Cool.



Despite all of this, you say:

  The simplest method is to use port 80 to get a direct connection to a
  Web Socket server.  Port 80 traffic, however, will often be
  intercepted by HTTP proxies, which can lead to the connection failing
  to be established.


which I think is misleading; this is far from the simplest way to use
WebSockets, from a deployment perspective.


True. I've tried to reword this to avoid this possible ambiguity.


I see those changes in -50; looks good (and a very elegant change).  
Thanks.




This looks an awful lot like a redirect.


There's no redirection involved here. It's just confirming the  
opened URL,

as part of the handshake. The TCP connection is not closed (unless the
handshake fails, and then it's not reopened).

I see now that you have the client-side fail a connection where the  
URL

doesn't match, but that's really not obvious in 5.1. Please put some
context in there and reinforce that the URL has to be the URL of the
current script, not just any script.


Ok, I've added a note at the end of that section explaining that the user
agent will fail the connection if the strings don't match what the UA
sent. Please let me know if you'd like anything else clarified; I don't
really know exactly what should be made clearer.


Did this get into -50? Don't see anything in the diff...

The most effective way of doing this would be to actually define the  
new headers' semantics in your draft; Websocket-Location, for example,  
is only defined as client-side and server-side behaviours. I know this  
is probably intentional, but people will read all sorts of things into  
this header (especially since its name is so similar to other HTTP  
headers) unless you give some indication of what it means.



--
Mark Nottingham   m...@yahoo-inc.com




Re: WebSockets negotiation over HTTP

2009-10-13 Thread Mark Nottingham


On 13/10/2009, at 10:23 PM, Ian Hickson i...@hixie.ch wrote:

I want to just use port 80, and I want to make it possible for a suitably
configured HTTP server to pass connections over to WebSocket servers. It
seems to me that using something that looks like an HTTP Upgrade is better
than just having a totally unrelated handshake, but I guess maybe we
should just reuse port 80 without doing anything HTTP-like at all.


To be clear, upgrade is appropriate for changing an existing  
connection over to a new protocol (ie reusing it). To pass a request  
over to a different server, a redirect would be more appropriate (and  
is facilitated by the new uri scheme).


(Ian I don't have your draft in front of me, so this isn't a comment on  
it necessarily, just a general statement).


Cheers,


Re: WebSockets negotiation over HTTP

2009-10-13 Thread Mark Nottingham


Catching up to Ian's -48 draft*, I don't think there's much of a  
problem here -- or at least the spec can be brought into alignment  
with HTTP with a few small changes. However, the comment about upgrade  
vs. redirect stands (see below).


Section 4.1 describes the handshake from the client side. It requires  
the client to send a request that's a subset of HTTP; this doesn't  
conflict with HTTP in and of itself. It also constrains the responses  
to the request that the client can expect, but that's OK because at  
this point we shouldn't be talking HTTP any more.


It would be nice if clients were explicitly allowed to send other  
headers, e.g., Referer or User-Agent, but it's not critical. Also, by  
its nature this protocol is going to be fragile on non-CONNECTed HTTP  
connections, but Ian has already acknowledged this.


Section 5.1 describes the handshake from the server side. It doesn't  
place any requirements on the bytes received from the client, only on  
those sent by the server, so again this is a proper subset of HTTP.


Section 5.2 does constrain the bytes the server accepts from the  
client, thereby conflicting with HTTP, but only in some small details.  
In particular, it makes HTTP header field-names case-sensitive, and  
requires certain arrangements of whitespace in them.


Ian, if you can address these small things in section 5.2 it would help.

The other aspect here is that you're really not using Upgrade in an  
appropriate fashion; as mentioned before, its intended use is to  
upgrade *this* TCP connection, not redirect to another one. If you  
really want to just redirect all of the time, you'd be much better off  
doing a normal 3xx redirect to something with a ws: or wss: URL scheme  
-- it would avoid a lot of the fragility we've been concerned about on  
the HTTP side.


Cheers,


* Ian, are you just trying to exceed 100 drafts, thereby crashing the  
IETF? :)



On 14/10/2009, at 10:07 AM, Robert Collins wrote:


On Wed, 2009-10-14 at 09:59 +1100, Mark Nottingham wrote:

On 13/10/2009, at 10:23 PM, Ian Hickson i...@hixie.ch wrote:


I want to just use port 80, and I want to make it possible for a
suitably
configured HTTP server to pass connections over to WebSocket
servers. It
seems to me that using something that looks like an HTTP Upgrade is
better
than just having a totally unrelated handshake, but I guess maybe we
should just reuse port 80 without doing anything HTTP-like at all.


To be clear, upgrade is appropriate for changing an existing
connection over to a new protocol (ie reusing it). To pass a request
over to a different server, a redirect would be more appropriate (and
is facilitated by the new uri scheme).


Yup; and the major issue here is that websockets *does not want* the
initial handshake to be HTTP. Rather it wants to be something not-quite
HTTP, specifically reject a number of behaviours and headers that are
legitimate HTTP.

-Rob


--
Mark Nottingham   m...@yahoo-inc.com




Re: Fun with Squid2 and Clang

2009-09-02 Thread Mark Nottingham
 g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT AsyncCall.lo -MD -MP -MF .deps/AsyncCall.Tpo -c AsyncCall.cc -o AsyncCall.o
mv -f .deps/AsyncCall.Tpo .deps/AsyncCall.Plo
/bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT AsyncJob.lo -MD -MP -MF .deps/AsyncJob.Tpo -c -o AsyncJob.lo AsyncJob.cc
 g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT AsyncJob.lo -MD -MP -MF .deps/AsyncJob.Tpo -c AsyncJob.cc -o AsyncJob.o
mv -f .deps/AsyncJob.Tpo .deps/AsyncJob.Plo
/bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT AsyncCallQueue.lo -MD -MP -MF .deps/AsyncCallQueue.Tpo -c -o AsyncCallQueue.lo AsyncCallQueue.cc
 g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT AsyncCallQueue.lo -MD -MP -MF .deps/AsyncCallQueue.Tpo -c AsyncCallQueue.cc -o AsyncCallQueue.o
mv -f .deps/AsyncCallQueue.Tpo .deps/AsyncCallQueue.Plo
/bin/sh ../../libtool --tag=CXX --mode=link g++ -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -g -o libbase.la AsyncCall.lo AsyncJob.lo AsyncCallQueue.lo -lm -lresolv
mkdir .libs
ar cru .libs/libbase.a AsyncCall.o AsyncJob.o AsyncCallQueue.o
ranlib .libs/libbase.a
creating libbase.la
(cd .libs && rm -f libbase.la && ln -s ../libbase.la libbase.la)
Making all in acl
/bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT Acl.lo -MD -MP -MF .deps/Acl.Tpo -c -o Acl.lo Acl.cc
 g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT Acl.lo -MD -MP -MF .deps/Acl.Tpo -c Acl.cc -o Acl.o
mv -f .deps/Acl.Tpo .deps/Acl.Plo
/bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT Checklist.lo -MD -MP -MF .deps/Checklist.Tpo -c -o Checklist.lo Checklist.cc
 g++ -DHAVE_CONFIG_H -I../.. -I../../include -I../../src -I../../include -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -MT Checklist.lo -MD -MP -MF .deps/Checklist.Tpo -c Checklist.cc -o Checklist.o
mv -f .deps/Checklist.Tpo .deps/Checklist.Plo
/bin/sh ../../libtool --tag=CXX --mode=link g++ -I/usr/include/libxml2 -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments -D_REENTRANT -g -O2 -g -o libapi.la Acl.lo Checklist.lo -lm -lresolv
mkdir .libs
ar cru .libs/libapi.a Acl.o Checklist.o
ranlib .libs/libapi.a
creating libapi.la




On 02/09/2009, at 1:38 PM, Kinkie wrote:

On Wed, Sep 2, 2009 at 4:23 AM, Mark Nottingham m...@yahoo-inc.com
wrote:

Seeing the fun new tools in Snow Leopard's XCode, I dug a bit and ran
Clang's static analyser http://clang-analyzer.llvm.org/ on squid2-HEAD;
see

 http://www.mnot.net/test/squid-scan/

for results.

If this is interesting/useful, I can do a quick run on squid3 as well;
it's pretty easy.


Please, do it if it's not too much work. 3.0, 3.1 and HEAD would be  
interesting.


--
   /kinkie


--
Mark Nottingham   m...@yahoo-inc.com




Fun with Squid2 and Clang

2009-09-01 Thread Mark Nottingham
Seeing the fun new tools in Snow Leopard's XCode, I dug a bit and ran
Clang's static analyser http://clang-analyzer.llvm.org/ on squid2-HEAD;
see


  http://www.mnot.net/test/squid-scan/

for results.

If this is interesting/useful, I can do a quick run on squid3 as well;  
it's pretty easy.


Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




Re: A little help with cbdata and requestLink

2009-08-06 Thread Mark Nottingham
If I parse that correctly, I think there are no such cases; i.e., all  
of the paths where fwdContinue doesn't go to peerSelect don't go  
forward, they just error out.


E.g., if I read you correctly, it's not necessary to unlink the  
request here:


case PROTO_INTERNAL:
    cbdataFree(fwdState);
    internalStart(r, e);
    return;

correct?

See modified patch below.



miss_access_slow.patch
Description: Binary data




Thanks again,



On 29/06/2009, at 11:12 AM, Alex Rousskov wrote:


On 06/29/2009 12:30 AM, Mark Nottingham wrote:

OK.

WRT requestlink, I was unsure what would happen if the request didn't go
forward; will moving requestLink at the bottom up to here:

 fwdState->request = r; /* requestLink? */

do it?


Yes, provided you adjust the code to call unlink if the request  
did go
forward but not all the way through to where fwdState currently  
unlinks.


Alex.



On 27/06/2009, at 3:05 AM, Alex Rousskov wrote:


On 06/16/2009 09:57 PM, Mark Nottingham wrote:

Thanks. Anybody else have a second to look?


Please s/fwdStartFoo/fwdContinue/ and document what it is. Since  
this is

Squid2 you do not have to do it, of course.

Your cbdata and request manipulations appear technically correct  
to me.

IMHO, the temporary lack of requestLink is poor style that will be
dangerous for future code modifications.

Cheers,

Alex.




On 11/06/2009, at 11:28 PM, Amos Jeffries wrote:


Mark Nottingham wrote:

Would someone mind taking a quick look at this patch:
http://www.squid-cache.org/bugs/attachment.cgi?id=1989
and telling me if I've royally stuffed up with managing  
fwdState and

request linking?
It's to make miss_access a slow lookup...


Looks okay to these uneducated eyes.  Probably best to wait for
someone else to double-check before a HEAD commit, but IMO it  
looks

good enough for patching.

This one is long-awaited by many. Thanks.

Amos
--
Please be using
Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1


--
Mark Nottingham   m...@yahoo-inc.com





--
Mark Nottingham   m...@yahoo-inc.com





--
Mark Nottingham   m...@yahoo-inc.com




Re: HTTP-MPLEX

2009-08-04 Thread Mark Nottingham

I think we need to get him in touch with the HTTP-over-SCTP folks...


On 04/08/2009, at 12:50 PM, Henrik Nordstrom wrote:


---- Forwarded message ----

From: Robert Mattson r.matt...@latrobe.edu.au
To: l...@lists.mozilla.org, i...@squid-cache.org,
us...@httpd.apache.org, www-t...@w3.org
Subject: HTTP-MPLEX
Date: Mon, 3 Aug 2009 17:29:32 +1000

Dear Firefox, Squid, Apache and W3 communities,

Apologies for cross-posting, hopefully at the end of this email it  
will

be understood that it is not my intention to annoy people.

My recent PhD research focused on improving page and object retrieval
performance in the context of a congested network and a significant  
part
of this research was the development of HTTP-MPLEX. I would like to  
let

the word out about this protocol. The protocol is designed to improve
page and object retrieval time in bandwidth asymmetric (ADSL) network
environments, which are common in Australia. HTTP-MPLEX is based on  
HTTP

and is designed to be both transparent and backwards compatible.

At this time, all of my work on HTTP-MPLEX is in the public domain  
and
links to the individual publications are listed on my homepage [1].  
Of

the documents available, the most current/up-to-date work is my PhD
thesis.

As my candidature is now over, I'm hoping that some value can be  
found

in this work by the Internet community.

Sincerely,
Rob Mattson

[1] - http://www.mattson.com.au/robert/index.php?Menu=Research

Please consider the environment - do you really need to print this
email?




--
Mark Nottingham   m...@yahoo-inc.com




Re: Hello from Mozilla

2009-07-17 Thread Mark Nottingham
I missed that Ian was still talking about using port 80. I think
that's broken / more trouble than it's worth, for the reasons Adrian is
going through.


If you have to tunnel using an existing port, use 443 (with null  
encryption if you're worried about overhead, but still want to  
authenticate the endpoint). Even then, Wifi hotspots are probably  
going to redirect you, but using 443 should be a last-gasp measure  
anyway.


Cheers,


On 17/07/2009, at 3:18 PM, Adrian Chadd wrote:


2009/7/17 Ian Hickson i...@hixie.ch:

That way you are still speaking HTTP right until the protocol  
change

occurs, so any and all HTTP compatible changes in the path(s) will
occur.


As mentioned earlier, we need the handshake to be very precisely  
defined
because otherwise people could trick unsuspecting servers into  
opting in,
or rather appearing to opt in, and could then send all kinds of  
commands

down to those servers.


Would you please provide an example of where an unsuspecting server is
tricked into doing something?


Ian, don't you see and understand the semantic difference between
speaking HTTP and speaking a magic bytecode that is intended to look
HTTP-enough to fool a bunch of things until the upgrade process occurs?
Don't you understand that the possible set of things that can go wrong
here is quite unbounded? Don't you understand the whole reason for
known ports and protocol descriptions in the first place?


Apparently not.


Ok. Look at this.

The byte sequence "GET / HTTP/1.0\r\nHost: foo\r\nConnection:
close\r\n\r\n" is not byte-equivalent to the sequence "GET /
HTTP/1.0\r\nConnection: close\r\nHost: foo\r\n\r\n".

The same two sequences, interpreted as HTTP protocol exchanges, are
equivalent.
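That header-reordering equivalence is easy to demonstrate (an illustrative Python sketch, not Squid code; the toy parser here is an assumption for illustration, not a real HTTP implementation):

```python
# Two requests that differ byte-for-byte but are the same message
# once parsed as HTTP headers.

def parse_headers(raw: bytes) -> dict:
    # Split off the header block and map lowercased field names to
    # values, skipping the request line.
    head, _, _ = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")[1:]
    return {k.strip().lower(): v.strip()
            for k, v in (line.split(b":", 1) for line in lines)}

req1 = b"GET / HTTP/1.0\r\nHost: foo\r\nConnection: close\r\n\r\n"
req2 = b"GET / HTTP/1.0\r\nConnection: close\r\nHost: foo\r\n\r\n"

assert req1 != req2                              # not byte-equivalent
assert parse_headers(req1) == parse_headers(req2)  # equivalent as HTTP
```

Any intermediary that parses and re-serialises the message is free to pick either byte form, which is exactly why byte comparison and HTTP comparison diverge.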


There's a mostly-expected understanding that what happens over port 80
is HTTP. The few cases where that has broken (specifically Shoutcast,
but I do see other crap on port 80 from time to time..) has been by
people who have implemented a mostly HTTP looking protocol, tested
that it mostly works via a few gateways/firewalls/proxies, and then
deployed it.

My suggestion is to completely toss the whole "pretend to be HTTP" thing
out of the window and look at extending or adding a new HTTP mechanism

for negotiating proper tunneling on port 80. If this involves making
CONNECT work on port 80 then so be it.


Redesigning HTTP is really much more work than I intend to take on  
here.
HTTP already has an Upgrade mechanism, reusing it seems the right  
thing to

do.


What you intend to take on here and what should be taken on here is
very relevant.
You're intending to do stuff over tcp/80 which looks like HTTP but
isn't HTTP. Everyone who implements anything HTTP gateway related (be
it a transparent proxy, a firewall, a HTTP router, etc) suddenly may
have to implement your websockets stuff as well. So all of a sudden
your attempt to not extend HTTP ends up extending HTTP.


The point is, there may be a whole lot of stuff going on with HTTP
implementations that you're not aware of.


Sure, but with the exception of man-in-the-middle proxies, this isn't
a big
deal -- the people implementing the server side are in control of  
what the

HTTP implementation is doing.


That may be your understanding of how the world works, but out here in
the rest of the world, the people who deploy the edge and the people
who deploy the core may not be the same people. There may be a dozen
layers of red tape, equipment lifecycle, security features, etc, that
need to be handled before websockets happy stuff can be deployed
everywhere it needs to.

Please don't discount man-in-the-middle -anything- as being easy  
to deal with.


In all cases except a man-in-the-middle proxy, this seems to be  
what we
do. I'm not sure how we can do anything in the case of such a  
proxy, since

by definition the client doesn't know it is present.


.. so you're still not speaking HTTP?

Ian, are you absolutely certain that everywhere you use the
internet, there is no man in the middle between you and the server
you're speaking to? Haven't you ever worked at any form of corporate
or enterprise environment? What about existing captive portal
deployments like wifi hotspots, some of which still use squid-2.5
(eww!) as their http firewall/proxy to control access to the internet?
That stuff is going to need upgrading sure, but I'd rather see the
upgrade happen once to a well thought out and reasonably well designed
protocol, versus having lots of little upgrades need to occur because
your "HTTP but not quite HTTP" exchange on port 80 isn't thought out
enough.




Adrian


--
Mark Nottingham   m...@yahoo-inc.com




Re: Hello from Mozilla

2009-07-16 Thread Mark Nottingham
Realise that server-side infrastructure often includes things like  
CDNs (even when the content is uncacheable), reverse proxies and L7  
load balancers -- all of which can make the changes we're talking about.


While it's true that these things are under control of the people who  
own the server, changing them is *much* harder than deploying an  
Apache module.


Cheers,


On 17/07/2009, at 9:53 AM, Ian Hickson wrote:



The point is, there may be a whole lot of stuff going on with HTTP
implementations that you're not aware of.


Sure, but with the exception of man-in-the-middle proxies, this isn't a
big
deal -- the people implementing the server side are in control of  
what the

HTTP implementation is doing.



--
Mark Nottingham   m...@yahoo-inc.com




Re: Hello from Mozilla

2009-07-16 Thread Mark Nottingham
Yep. Just think it's worth pointing out in the draft, since you go to  
such efforts to make this possible.


Cheers,


On 17/07/2009, at 10:28 AM, Ian Hickson wrote:


On Fri, 17 Jul 2009, Mark Nottingham wrote:


Realise that server-side infrastructure often includes things like  
CDNs

(even when the content is uncacheable), reverse proxies and L7 load
balancers -- all of which can make the changes we're talking about.

While it's true that these things are under control of the people who
own the server, changing them is *much* harder than deploying an  
Apache

module.


People with that level of complexity can easily just not share the  
port.


--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


--
Mark Nottingham   m...@yahoo-inc.com




Re: Hello from Mozilla

2009-07-15 Thread Mark Nottingham

Upgrade is hop-by-hop, so it's pretty limiting.

Ian, an example:

An intermediary (transparent, intercepting or whatever) can and often  
does add arbitrary headers; e.g., x-forwarded-for. This is completely  
legal in HTTP, where headers that are not understood are ignored, and  
additionally several headers have provisions to change values in  
headers as well (e.g., transfer-encoding, te, via, connection).


They're also allowed to change the formatting of the message; e.g.,
re-wrap headers, normalise whitespace.


Specifying a bit-for-bit handshake is incredibly fragile.
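To make the fragility concrete, here is a small Python sketch (assumed, simplified proxy behaviour; X-Forwarded-For is just one example of a modification HTTP permits):

```python
# A compliant intermediary may append a header such as X-Forwarded-For.
# The result is still valid HTTP, but any bit-for-bit check of the
# handshake on the receiving end now fails.

handshake = (b"GET /chat HTTP/1.1\r\n"
             b"Host: example.com\r\n"
             b"Upgrade: WebSocket\r\n\r\n")

def via_proxy(raw: bytes, client_ip: str) -> bytes:
    # Append one header to the header block, leaving the rest untouched.
    head, sep, body = raw.partition(b"\r\n\r\n")
    return head + b"\r\nX-Forwarded-For: " + client_ip.encode() + sep + body

forwarded = via_proxy(handshake, "192.0.2.1")
assert forwarded != handshake               # byte-exact comparison breaks
assert forwarded.startswith(b"GET /chat ")  # yet it is still well-formed
```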

Cheers,


On 15/07/2009, at 3:26 PM, Adrian Chadd wrote:


2009/7/15 Ian Hickson i...@hixie.ch:

On Tue, 14 Jul 2009, Alex Rousskov wrote:


WebSocket made the handshake bytes look like something Squid  
thinks it
understands. That is the whole point of the argument. You are  
sending an
HTTP-looking message that is not really an HTTP message. I think  
this is
a recipe for trouble, even though it might solve some problem in  
some

environments.


Could you elaborate on what bytes Squid thinks it should change in  
the

WebSocket handshake?


Anything which it can under the HTTP/1.x RFCs.

Maybe I missed it - why exactly again aren't you just talking HTTP on
the HTTP port(s), and doing a standard HTTP upgrade?


Adrian


--
Mark Nottingham   m...@yahoo-inc.com




Re: Hello from Mozilla

2009-07-15 Thread Mark Nottingham

On 15/07/2009, at 5:34 PM, Ian Hickson wrote:


On Wed, 15 Jul 2009, Mark Nottingham wrote:


Upgrade is hop-by-hop, so it's pretty limiting.


Do man-in-the-middle proxies count as a hop for the purposes of  
HTTP? As

far as I can tell from the HTTP spec, the client is supposed to know
whether it is speaking to a proxy or not, so man-in-the-middle proxies
don't affect the hop-by-hop semantics... but it's not entirely clear.


Intercepting proxies (often called "transparent", although that's
confusing because HTTP defines "semantically transparent" as something
completely different) do count as separate hops (to be clear, the hops
are on the wire, not in the devices, although sometimes there might be
a virtual hop inside a device).


There are a few L7 load balancers that don't act as a full
intermediary in the HTTP sense (or at least, what they do is muddy
enough that it's not clear), but do fiddle with headers (e.g., this is
why you see things like "Coenection" instead of "Connection").


The important thing to remember is that interception *only* happens on  
port 80, except in the most pathological of networks (and you won't be  
able to do anything about them anyway).


Notice that I'm making a distinction between intercepting and  
firewalling here; all bets are off when a firewall does stateful  
inspection of your protocol, unless you can convince it to pass an  
encrypted stream (the TLS-over-443 solution).




They're also allowed to change the formatting of the message; e.g.,
re-wrap headers, normalise whitespace.


Sure, but that's why we have the TLS-over-port-443 option. In the  
cases
where there is uncooperative network infrastructure, the client can  
just
switch to that, and then there's no way the connection can be  
affected.


As long as you're using TLS, yes. That's going to limit scaling, of  
course.




Specifying a bit-for bit handshake is incredibly fragile.


Not doing so is unacceptably insecure for this use case, IMHO. We  
can't
run the risk of Web pages hitting SMTP servers and sending spam, or  
poking

at intranet servers, or who knows what else.



Let me put that a slightly different way; specifying a bit-for-bit  
handshake is fragile *when you expect it to pass through HTTP  
infrastructure that has no reason to preserve those bits exactly as  
they are*.


Can you remind me why you need the handshake to look like valid HTTP  
again? I think that's the crux here.


Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




Re: Hello from Mozilla

2009-07-15 Thread Mark Nottingham


On 16/07/2009, at 12:28 PM, Ian Hickson wrote:


On Thu, 16 Jul 2009, Mark Nottingham wrote:


As an alternative, why not:

1) specify a new port for normal WebSocket operation, and
2) specify that if there's a proxy configured, ask the proxy to  
CONNECT to

your new port, and
3) specify that if #2 fails, they can CONNECT to 443 and ask for an  
Upgrade to

WebSockets in the first HTTP request.


We don't want to ever have a spec-mandated switch from port to port  
(since

that implies an origin change),


Ah - that makes sense.


but basically, that's what the spec
currently says, except it requires the script to detect the failure  
at #2
and requires the script to explicitly try again on port 443. (And  
except
that the Upgrade has to be precisely constrained, not arbitrary  
HTTP, so
that we never get to a situation where the client thinks it has done  
an
upgrade but really the other side was tricked into sending the right  
bytes

for that, letting the script speak to a poor unsuspecting server that
didn't intentionally opt in.)


So, to be clear, the only time the byte-for-byte HTTP handshake is  
used is when it's over a TLS tunnel via CONNECT (i.e., it's not used  
to set up the tunnel, but only once it's established)?


If that's the case, it should be no problem. A bit weird, though;
speaking two protocols on the same port isn't really good practice...


Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




Fwd: squid -I

2009-07-08 Thread Mark Nottingham
Upon request, Dave Dykstra provided a BSD licensed version of the  
script he uses for running multiple instances of Squid listening to a  
single port.


Can we get this included in contrib?

Cheers,

Begin forwarded message:


From: Dave Dykstra d...@fnal.gov
Date: 7 July 2009 6:35:19 AM
To: Mark Nottingham m...@yahoo-inc.com
Subject: Re: squid -I experiences

On Fri, Jul 03, 2009 at 01:52:43PM +1000, Mark Nottingham wrote:
I think you need to provide a license for it, so that it can be  
included

there...

If you do that I'll be happy to push for its inclusion.


Oh, ok.  I put in a BSD License.  The changed versions are below.  I
didn't make any other changes.

Thanks,

- Dave


-- multisquid ---
#!/usr/bin/perl -w
#
# run multiple squids.
#  If the command line options are for starting up and listening on a
#  socket, first open a socket for them to share with squid -I.
#  If either one results in an error exit code, return the first error code.
#
# Written by Dave Dykstra, July 2007
#
# Copyright (c) 2007, Fermi National Accelerator Laboratory
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
#     * Redistributions of source code must retain the above copyright
#       notice, this list of conditions and the following disclaimer.
#     * Redistributions in binary form must reproduce the above copyright
#       notice, this list of conditions and the following disclaimer in the
#       documentation and/or other materials provided with the distribution.
#     * Neither the name of the Fermi National Accelerator Laboratory nor
#       the names of its contributors may be used to endorse or promote
#       products derived from this software without specific prior written
#       permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
# IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
# TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
# TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#

use strict;
use Socket;
use IO::Handle;
use Fcntl;

if ($#ARGV < 2) {
    print STDERR "Usage: multisquid squidprefix numsquids http_port [squid_args ...]\n";
    exit 2;
}

my $prefix = shift(@ARGV);
my $numsquids = shift(@ARGV);
my $port = shift(@ARGV);
my $proto = getprotobyname('tcp');

if (!(($#ARGV >= 0) && (($ARGV[0] eq "-k") || ($ARGV[0] eq "-z")))) {
    #open the socket for both squids to listen on if not doing an
    # operation that doesn't use the socket (that is, -k or -z)
    close STDIN;
    my $fd;
    socket($fd, PF_INET, SOCK_STREAM, $proto)    || die "socket: $!";
    setsockopt($fd, SOL_SOCKET, SO_REUSEADDR, 1) || die "setsockopt: $!";
    bind($fd, sockaddr_in($port, INADDR_ANY))    || die "bind of port $port: $!";
}

my $childn;
for ($childn = 0; $childn < $numsquids; $childn++) {
    if (fork() == 0) {
        exec "$prefix/sbin/squid -f $prefix/etc/.squid-$childn.conf -I @ARGV" || die "exec: $!";
    }
    # get them to start at different times so they're identifiable by
    # squidclient
    sleep 2;
}

my $exitcode = 0;
while (wait() > 0) {
    if (($? != 0) && ($exitcode == 0)) {
        # Take the first non-zero exit code and ignore the other one.
        # exit expects a byte, but the exit code from wait() has signal
        #  numbers in low byte and exit code in high byte.  Combine them.
        $exitcode = ($? >> 8) | ($? & 255);
    }
}

exit $exitcode;
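The core trick in multisquid, one parent opening the listening socket before forking children that inherit it, can be sketched in Python (an illustration of the technique only, not a replacement for Dave's script):

```python
import socket

def shared_listener(port: int) -> socket.socket:
    # Open and bind the listening socket once, in the parent; any child
    # forked afterwards inherits the descriptor and can accept() on it,
    # which is what squid -I relies on.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", port))
    s.listen(5)
    return s

lsock = shared_listener(0)          # port 0: let the OS pick a free port
assert lsock.getsockname()[1] > 0   # a concrete port was assigned
lsock.close()
```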
-- init-squid ---
#!/bin/bash
# This script will work with one squid or up to 4 squids on the same http port.
# The number of squids is determined by the existence of cache directories
# as follows.  The main path to the cache directories is determined by the
# cache_dir option in squid.conf.  To run multiple squids, create directories
# of the form
#   `dirname $cache_dir`/$N/`basename $cache_dir`
# where N goes from 0 to the number of squids minus 1.  Also create
# cache_log directories of the same form.  Note that the cache_log option
# in squid.conf is a file, not a directory, so the $N is up one level:
#   cache_log_file=`basename $cache_log`
#   cache_log_dir=`dirname $cache_log`
#   cache_log_dir=`dirname $cache_log_dir`/$N/`basename $cache_log_dir`
# The access_log should be in the same directory as the cache_log, and
# the pid_filename also needs to be in similarly named directories (the
# same directories

Re: Hello from Mozilla

2009-07-08 Thread Mark Nottingham


If you want to tunnel over proxies, what Websockets (and any of the  
other hybi proposals) needs to specify is:
  - when (and only when) there is an explicitly configured proxy, use  
CONNECT *to* the configured proxy
  - otherwise, use a port other than 80 to avoid intercepting  
proxies, and do NOT use CONNECT
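For illustration, the CONNECT step might look like this (a hedged Python sketch; the host and port are hypothetical, and a real client would send these bytes to the configured proxy and then tunnel the WebSocket exchange over the resulting connection):

```python
def connect_request(target_host: str, target_port: int) -> bytes:
    # The request a client sends *to the proxy* to ask for a raw tunnel
    # through to target_host:target_port.
    authority = f"{target_host}:{target_port}"
    return (f"CONNECT {authority} HTTP/1.1\r\n"
            f"Host: {authority}\r\n\r\n").encode()

req = connect_request("chat.example.com", 8000)  # hypothetical endpoint
assert req.startswith(b"CONNECT chat.example.com:8000 HTTP/1.1\r\n")
```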


Regarding ports, your choice will be to use 443 and pretend to be SSL,  
or to register a new port for websockets with IANA (as already  
suggested).


Note that if you don't actually use SSL on port 443, there's no  
guarantee that a stateful firewall (e.g., CheckPoint, although I  
haven't used their products in a while) won't inspect the stream and  
decide it's bogus.


OTOH if you use a non-443 port, as you've seen it may be blocked.  
AFAIK this is quite common, both in existing Squid installs (of which  
there are many that squid-dev can't change) and other products.


As mentioned, the best thing is probably to register a new port; that  
way, vendors and administrators will know what it is and can account  
for it in their products/deployments. It's a lot easier to justify  
opening up a port if it's for a well-defined, identified protocol.


BTW, Websockets isn't an IETF protocol -- it's a draft that's under  
discussion along with a number of other proposals. It may (and  
probably will) change, and may never be published as an RFC.  
Implementing it now exposes you to the risk of these changes. But you  
already know that :)


Cheers,


On 08/07/2009, at 9:01 AM, Alex Rousskov wrote:



The reason I ask is because we're looking to take a patch that
implements the IETF websockets protocol:

   http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-17

I noticed that in section 3.1.3 the spec relies implicitly on CONNECT
being allowed to arbitrary ports.  But this is not the case for  
default
installs of squid, and thus I fear that the general approach may be  
flawed.


--
Mark Nottingham   m...@yahoo-inc.com




Re: A little help with cbdata and requestLink

2009-06-29 Thread Mark Nottingham

OK.

WRT requestlink, I was unsure what would happen if the request didn't  
go forward; will moving requestLink at the bottom up to here:

  fwdState->request = r; /* requestLink? */
do it?

Thanks,


On 27/06/2009, at 3:05 AM, Alex Rousskov wrote:


On 06/16/2009 09:57 PM, Mark Nottingham wrote:

Thanks. Anybody else have a second to look?


Please s/fwdStartFoo/fwdContinue/ and document what it is. Since  
this is

Squid2 you do not have to do it, of course.

Your cbdata and request manipulations appear technically correct to  
me.

IMHO, the temporary lack of requestLink is poor style that will be
dangerous for future code modifications.

Cheers,

Alex.




On 11/06/2009, at 11:28 PM, Amos Jeffries wrote:


Mark Nottingham wrote:

Would someone mind taking a quick look at this patch:
http://www.squid-cache.org/bugs/attachment.cgi?id=1989
and telling me if I've royally stuffed up with managing fwdState  
and

request linking?
It's to make miss_access a slow lookup...


Looks okay to these uneducated eyes.  Probably best to wait for
someone else to double-check before a HEAD commit, but IMO it looks
good enough for patching.

This one is long-awaited by many. Thanks.

Amos
--
Please be using
Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1


--
Mark Nottingham   m...@yahoo-inc.com





--
Mark Nottingham   m...@yahoo-inc.com




Re: [squid-users] NONE/411 Length Required

2009-06-18 Thread Mark Nottingham
Yeah, saw that --- but I observed the same behaviour in 2, and can't  
figure out why (from a quick look, at least).


Cheers,


On 18/06/2009, at 2:19 PM, Amos Jeffries wrote:

On Thu, 18 Jun 2009 10:20:09 +1000, Mark Nottingham m...@yahoo-inc.com 


wrote:

Weird. The code that I'm assuming generates the 411 is (squid2-HEAD):


Reporter is using 3.0. The header parse and validation code is  
similar only

regarding flow and design.
Particular lines and if-else are very different in spots between  
each of

the 3 current release of Squid.




if (!clientCheckContentLength(request) || httpHeaderHas(&request->header, HDR_TRANSFER_ENCODING)) {
    err = errorCon(ERR_INVALID_REQ, HTTP_LENGTH_REQUIRED, request);
    http->al.http.code = err->http_status;
    http->log_type = LOG_TCP_DENIED;
    http->entry = clientCreateStoreEntry(http, request->method,
        null_request_flags);
    errorAppendEntry(http->entry, err);
    return -1;
}

but clientCheckContentLength doesn't look like it's triggering it:

static int
clientCheckContentLength(request_t * r)
{
    switch (r->method->code) {
    case METHOD_GET:
    case METHOD_HEAD:
        /* We do not want to see a request entity on GET/HEAD requests */
        return (r->content_length <= 0 || Config.onoff.request_entities);
    default:
        /* For other types of requests we don't care */
        return 1;
    }
    /* NOT REACHED */
}

because the request method is POST. However, the request headers  
don't

have Transfer-Encoding...

What am I missing? I know Bijayant is using Squid-3, but I'm  
observing

the same behaviour in my build of 2...



On 17/06/2009, at 4:00 PM, Mark Nottingham wrote:


[ moving to squid-dev ]

From what I can see, the site is using JavaScript to do autocomplete
on a search field. The autocomplete requests use POST, but without a
body.

With Firefox, this results in a POST request without a body; i.e.,
it doesn't have transfer-encoding *or* content-length.

Such a POST request is legal (although atypical; Safari and I think
others will include a Content-Length: 0 to signal no body
explicitly). See


http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-06#section-4.3



.


I think the right thing to do here is for Squid to only 411 when
there's a transfer-encoding present; if there's no content-length,
it's safe to assume 0 length.

Cheers,


On 17/06/2009, at 2:07 PM, Bijayant Kumar wrote:



Bijayant Kumar


--- On Mon, 15/6/09, Bijayant Kumar bijayan...@yahoo.com wrote:


From: Bijayant Kumar bijayan...@yahoo.com
Subject: Re: [squid-users] NONE/411 Length Required
To: squid users squid-us...@squid-cache.org
Date: Monday, 15 June, 2009, 6:48 PM


--- On Mon, 15/6/09, Amos Jeffries squ...@treenet.co.nz
wrote:


From: Amos Jeffries squ...@treenet.co.nz
Subject: Re: [squid-users] NONE/411 Length Required
To: Bijayant Kumar bijayan...@yahoo.com
Cc: squid users squid-us...@squid-cache.org
Date: Monday, 15 June, 2009, 6:06 PM
Bijayant Kumar wrote:

Hello list,

I have Squid version 3.0.STABLE 10 installed on

Gentoo

linux box. All things are working fine, means caching
proxying etc. There is a problem with some sites. When

I am

accessing one of those sites, in access.log I am

getting


NONE/411 3692 POST
http://.justdial.com/autosuggest_category_query_main.php?

- NONE/- text/html


And on the webpage I am getting whole error page

of

squid. Actually its a search related page. In the

search

criteria field as soon as I am typing after two words

I am

getting this error. The website in a question is http://justdial.com



. But it works without the Squid.



I tried to capture the http headers also which

are as

below




http://.justdial.com/autosuggest_category_query_main.php?city=Bangalore&search=Ka




POST



/autosuggest_category_query_main.php?city=Bangalore&search=Ka

HTTP/1.1


Host: .justdial.com

User-Agent: Mozilla/5.0 (X11; U; Linux i686;

en-US;

rv:1.8.1.16) Gecko/20080807 Firefox/2.0.0.16


Accept:



text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5


Accept-Language: en-us,en;q=0.7,hi;q=0.3

Accept-Encoding: gzip,deflate

Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

Keep-Alive: 300

Connection: keep-alive

Referer: http://.justdial.com/

Cookie:

PHPSESSID=d1d12004187d4bf1f084a1252ec46cef;



__utma=79653650.2087995718.1245064656.1245064656.1245064656.1;

__utmb=79653650; __utmc=79653650;


__utmz=79653650.1245064656.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none);

CITY=Bangalore


Pragma: no-cache

Cache-Control: no-cache



HTTP/1.x 411 Length Required

Server: squid/3.0.STABLE10

Mime-Version: 1.0

Date: Mon, 15 Jun 2009 11:18:10 GMT

Content-Type: text/html

Content-Length: 3287

Expires: Mon, 15 Jun 2009 11:18:10 GMT

X-Squid-Error: ERR_INVALID_REQ 0

X-Cache: MISS from bijayant.kavach.blr

X-Cache-Lookup: NONE from

bijayant.kavach.blr:3128


Via: 1.0 bijayant.kavach.blr

Re: [squid-users] NONE/411 Length Required

2009-06-17 Thread Mark Nottingham

[ moving to squid-dev ]

From what I can see, the site is using JavaScript to do autocomplete  
on a search field. The autocomplete requests use POST, but without a  
body.


With Firefox, this results in a POST request without a body; i.e., it  
doesn't have transfer-encoding *or* content-length.


Such a POST request is legal (although atypical; Safari and I think  
others will include a Content-Length: 0 to signal no body explicitly).  
See http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-06#section-4.3 
.


I think the right thing to do here is for Squid to only 411 when  
there's a transfer-encoding present; if there's no content-length,  
it's safe to assume 0 length.
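The suggested rule reduces to a small predicate (an illustrative Python sketch of the proposal, not Squid's actual code path):

```python
def needs_411(headers: dict) -> bool:
    # Reply 411 only when Transfer-Encoding is present; a request with
    # neither Transfer-Encoding nor Content-Length is treated as having
    # a zero-length body instead of being rejected.
    names = {k.lower() for k in headers}
    return "transfer-encoding" in names

assert needs_411({"Transfer-Encoding": "chunked"})
assert not needs_411({"Content-Length": "0"})
assert not needs_411({})  # bodyless POST: assume length 0, don't 411
```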


Cheers,


On 17/06/2009, at 2:07 PM, Bijayant Kumar wrote:



Bijayant Kumar


--- On Mon, 15/6/09, Bijayant Kumar bijayan...@yahoo.com wrote:


From: Bijayant Kumar bijayan...@yahoo.com
Subject: Re: [squid-users] NONE/411 Length Required
To: squid users squid-us...@squid-cache.org
Date: Monday, 15 June, 2009, 6:48 PM


--- On Mon, 15/6/09, Amos Jeffries squ...@treenet.co.nz
wrote:


From: Amos Jeffries squ...@treenet.co.nz
Subject: Re: [squid-users] NONE/411 Length Required
To: Bijayant Kumar bijayan...@yahoo.com
Cc: squid users squid-us...@squid-cache.org
Date: Monday, 15 June, 2009, 6:06 PM
Bijayant Kumar wrote:

Hello list,

I have Squid version 3.0.STABLE 10 installed on

Gentoo

linux box. All things are working fine, meaning caching,
proxying, etc. There is a problem with some sites. When

I am

accessing one of those sites, in access.log I am

getting


NONE/411 3692 POST http://.justdial.com/autosuggest_category_query_main.php?

- NONE/- text/html


And on the webpage I am getting whole error page

of

squid. Actually it's a search-related page. In the

search

criteria field as soon as I am typing after two words

I am
getting this error. The website in question is http://justdial.com. But it works without Squid.



I tried to capture the http headers also which

are as

below


http://.justdial.com/autosuggest_category_query_main.php?city=Bangalore&search=Ka



POST



/autosuggest_category_query_main.php?city=Bangalore&search=Ka

HTTP/1.1


Host: .justdial.com

User-Agent: Mozilla/5.0 (X11; U; Linux i686;

en-US;

rv:1.8.1.16) Gecko/20080807 Firefox/2.0.0.16


Accept:


text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5


Accept-Language: en-us,en;q=0.7,hi;q=0.3

Accept-Encoding: gzip,deflate

Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

Keep-Alive: 300

Connection: keep-alive

Referer: http://.justdial.com/

Cookie:

PHPSESSID=d1d12004187d4bf1f084a1252ec46cef;



__utma=79653650.2087995718.1245064656.1245064656.1245064656.1;

__utmb=79653650; __utmc=79653650;

__utmz=79653650.1245064656.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none);

CITY=Bangalore


Pragma: no-cache

Cache-Control: no-cache



HTTP/1.x 411 Length Required

Server: squid/3.0.STABLE10

Mime-Version: 1.0

Date: Mon, 15 Jun 2009 11:18:10 GMT

Content-Type: text/html

Content-Length: 3287

Expires: Mon, 15 Jun 2009 11:18:10 GMT

X-Squid-Error: ERR_INVALID_REQ 0

X-Cache: MISS from bijayant.kavach.blr

X-Cache-Lookup: NONE from bijayant.kavach.blr:3128


Via: 1.0 bijayant.kavach.blr (squid/3.0.STABLE10)


Proxy-Connection: close

Please suggest me what could be the reason and

how to

resolve this. Any help/pointer would be very helpful

for me.




Bijayant Kumar


 Get your new

Email

address!

Grab the Email name you've always wanted before

someone else does!

http://mail.promotions.yahoo.com/newdomains/aa/



NONE - no upstream source.
411  - Content-Length missing

HTTP requires a Content-Length: header on POST

requests.




How do I resolve this issue? The website is on the internet and
works fine without Squid. When I bypass the proxy, I do not get
any error.


Can't this website be accessed through the Squid?



Amos
--
Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
  Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1





 New Email addresses available on
Yahoo!
Get the Email name you've always wanted on the new
@ymail and @rocketmail.
Hurry before someone else does!
http://mail.promotions.yahoo.com/newdomains/aa/






--
Mark Nottingham   m...@yahoo-inc.com




Re: [squid-users] NONE/411 Length Required

2009-06-17 Thread Mark Nottingham

Weird. The code that I'm assuming generates the 411 is (squid2-HEAD):

	if (!clientCheckContentLength(request) || httpHeaderHas(&request->header, HDR_TRANSFER_ENCODING)) {
	    err = errorCon(ERR_INVALID_REQ, HTTP_LENGTH_REQUIRED, request);
	    http->al.http.code = err->http_status;
	    http->log_type = LOG_TCP_DENIED;
	    http->entry = clientCreateStoreEntry(http, request->method, null_request_flags);
	    errorAppendEntry(http->entry, err);
	    return -1;
	}

but clientCheckContentLength doesn't look like it's triggering it:

static int
clientCheckContentLength(request_t * r)
{
    switch (r->method->code) {
    case METHOD_GET:
    case METHOD_HEAD:
	/* We do not want to see a request entity on GET/HEAD requests */
	return (r->content_length <= 0 || Config.onoff.request_entities);
    default:
	/* For other types of requests we don't care */
	return 1;
    }
    /* NOT REACHED */
}

because the request method is POST. However, the request headers don't  
have Transfer-Encoding...


What am I missing? I know Bijayant is using Squid-3, but I'm observing  
the same behaviour in my build of 2...




On 17/06/2009, at 4:00 PM, Mark Nottingham wrote:


[ moving to squid-dev ]

From what I can see, the site is using JavaScript to do autocomplete  
on a search field. The autocomplete requests use POST, but without a  
body.


With Firefox, this results in a POST request without a body; i.e.,  
it doesn't have transfer-encoding *or* content-length.


Such a POST request is legal (although atypical; Safari and I think  
others will include a Content-Length: 0 to signal no body  
explicitly). See http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-06#section-4.3.


I think the right thing to do here is for Squid to only 411 when  
there's a transfer-encoding present; if there's no content-length,  
it's safe to assume 0 length.


Cheers,


On 17/06/2009, at 2:07 PM, Bijayant Kumar wrote:



Bijayant Kumar


--- On Mon, 15/6/09, Bijayant Kumar bijayan...@yahoo.com wrote:


From: Bijayant Kumar bijayan...@yahoo.com
Subject: Re: [squid-users] NONE/411 Length Required
To: squid users squid-us...@squid-cache.org
Date: Monday, 15 June, 2009, 6:48 PM


--- On Mon, 15/6/09, Amos Jeffries squ...@treenet.co.nz
wrote:


From: Amos Jeffries squ...@treenet.co.nz
Subject: Re: [squid-users] NONE/411 Length Required
To: Bijayant Kumar bijayan...@yahoo.com
Cc: squid users squid-us...@squid-cache.org
Date: Monday, 15 June, 2009, 6:06 PM
Bijayant Kumar wrote:

Hello list,

I have Squid version 3.0.STABLE 10 installed on

Gentoo

linux box. All things are working fine, meaning caching,
proxying, etc. There is a problem with some sites. When

I am

accessing one of those sites, in access.log I am

getting


NONE/411 3692 POST http://.justdial.com/autosuggest_category_query_main.php?

- NONE/- text/html


And on the webpage I am getting whole error page

of

squid. Actually it's a search-related page. In the

search

criteria field as soon as I am typing after two words

I am
getting this error. The website in question is http://justdial.com. But it works without Squid.



I tried to capture the http headers also which

are as

below


http://.justdial.com/autosuggest_category_query_main.php?city=Bangalore&search=Ka



POST



/autosuggest_category_query_main.php?city=Bangalore&search=Ka

HTTP/1.1


Host: .justdial.com

User-Agent: Mozilla/5.0 (X11; U; Linux i686;

en-US;

rv:1.8.1.16) Gecko/20080807 Firefox/2.0.0.16


Accept:


text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5


Accept-Language: en-us,en;q=0.7,hi;q=0.3

Accept-Encoding: gzip,deflate

Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

Keep-Alive: 300

Connection: keep-alive

Referer: http://.justdial.com/

Cookie:

PHPSESSID=d1d12004187d4bf1f084a1252ec46cef;



__utma=79653650.2087995718.1245064656.1245064656.1245064656.1;

__utmb=79653650; __utmc=79653650;

__utmz=79653650.1245064656.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none);

CITY=Bangalore


Pragma: no-cache

Cache-Control: no-cache



HTTP/1.x 411 Length Required

Server: squid/3.0.STABLE10

Mime-Version: 1.0

Date: Mon, 15 Jun 2009 11:18:10 GMT

Content-Type: text/html

Content-Length: 3287

Expires: Mon, 15 Jun 2009 11:18:10 GMT

X-Squid-Error: ERR_INVALID_REQ 0

X-Cache: MISS from bijayant.kavach.blr

X-Cache-Lookup: NONE from bijayant.kavach.blr:3128


Via: 1.0 bijayant.kavach.blr (squid/3.0.STABLE10)


Proxy-Connection: close

Please suggest me what could be the reason and

how to

resolve this. Any help/pointer would be very helpful

for me.




Bijayant Kumar





NONE - no upstream source.
411  - Content-Length missing

HTTP requires a Content-Length: header on POST

requests.




How to resolve

Re: A little help with cbdata and requestLink

2009-06-16 Thread Mark Nottingham

Thanks. Anybody else have a second to look?


On 11/06/2009, at 11:28 PM, Amos Jeffries wrote:


Mark Nottingham wrote:

Would someone mind taking a quick look at this patch:
 http://www.squid-cache.org/bugs/attachment.cgi?id=1989
and telling me if I've royally stuffed up with managing fwdState  
and request linking?

It's to make miss_access a slow lookup...


Looks okay to these uneducated eyes.  Probably best to wait for  
someone else to double-check before a HEAD commit, but IMO it looks  
good enough for a patching.


This one is long-awaited by many. Thanks.

Amos
--
Please be using
 Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
 Current Beta Squid 3.1.0.8 or 3.0.STABLE16-RC1


--
Mark Nottingham   m...@yahoo-inc.com




Re: Problem with cached entries w/ETag and request without If-None-Match header

2009-06-15 Thread Mark Nottingham
Selecting request headers are specified by Vary; If-None-Match is a  
conditional request header.


Cheers,


On 16/06/2009, at 12:44 AM, Jason Noble wrote:


From RFC 2616 13.6:
...
When the cache receives a subsequent request whose Request-URI specifies one or
more cache entries including a Vary header field, the cache MUST NOT use such a
cache entry to construct a response to the new request unless all of the
selecting request-headers present in the new request match the corresponding
stored request-headers in the original request. ...


For the case in question, the selecting request headers do not all
match the stored request headers. Therefore, the cache must not use
the stored entry to construct a response.


--Jason

Mark Nottingham wrote:

What requirement in RFC2616 does this violate?

On 13/06/2009, at 3:02 AM, Jason Noble wrote:

I recently ran into a bug on Squid 2.7 regarding cached content  
with ETags.  Currently, if all cached entries for a URL include  
ETags, and a request is received for said URL with no If-None-Match
header, Squid will serve a cached entry.  This behavior does
not follow RFC 2616.  I have attached a patch that prevents Squid  
from serving the cached entries in said case here:  http://www.squid-cache.org/bugs/show_bug.cgi?id=2677


I would appreciate any feedback regarding this patch.

Thanks,
Jason


--
Mark Nottingham   m...@yahoo-inc.com




--
Mark Nottingham   m...@yahoo-inc.com




Re: Problem with cached entries w/ETag and request without If-None-Match header

2009-06-12 Thread Mark Nottingham

What requirement in RFC2616 does this violate?

On 13/06/2009, at 3:02 AM, Jason Noble wrote:

I recently ran into a bug on Squid 2.7 regarding cached content with  
ETags.  Currently, if all cached entries for a URL include ETags,  
and a request is received for said URL with no If-None-Match header,  
Squid will serve a cached entry.  This behavior does not follow RFC  
2616.  I have attached a patch that prevents Squid from serving the  
cached entries in said case here:  http://www.squid-cache.org/bugs/show_bug.cgi?id=2677


I would appreciate any feedback regarding this patch.

Thanks,
Jason


--
Mark Nottingham   m...@yahoo-inc.com




A little help with cbdata and requestLink

2009-06-10 Thread Mark Nottingham

Would someone mind taking a quick look at this patch:
  http://www.squid-cache.org/bugs/attachment.cgi?id=1989

and telling me if I've royally stuffed up with managing fwdState and  
request linking?


It's to make miss_access a slow lookup...

Thanks,


--
Mark Nottingham   m...@yahoo-inc.com




storeurl_rewrite and vary

2009-05-24 Thread Mark Nottingham
Can those familiar with the ins and outs of the Vary implementation  
have a quick look at

  http://www.squid-cache.org/bugs/show_bug.cgi?id=2678
to see if I'm on the right track?

If so, I'll come up with a patch, as well as one for 1947 (which looks  
to be quite similar).


Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




Re: Is it really necessary for fatal() to dump core?

2009-05-20 Thread Mark Nottingham

Patch at:
  http://www.squid-cache.org/bugs/show_bug.cgi?id=2673


On 19/05/2009, at 1:50 PM, Mark Nottingham wrote:


tools.c:fatal() dumps core because it calls abort.

Considering that the core can be quite large (esp. on a 64bit  
system), and that there's fatal_dump() as well if you really want  
one, can we just make fatal() exit(1) instead of abort()ing?


Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




--
Mark Nottingham   m...@yahoo-inc.com




Re: More patches for squid2-HEAD

2009-05-20 Thread Mark Nottingham

On 2-HEAD.


On 14/05/2009, at 6:36 PM, Mark Nottingham wrote:



On 23/04/2009, at 10:38 AM, Mark Nottingham wrote:


http://www.squid-cache.org/bugs/show_bug.cgi?id=2643



http://www.squid-cache.org/bugs/show_bug.cgi?id=2631




These are the last two remaining. There's been some discussion on  
them, but I believe the issues have been resolved in the most recent  
patches attached to them; if I don't hear otherwise soon, I'll go  
ahead and apply to 2-HEAD.


Cheers,


--
Mark Nottingham   m...@yahoo-inc.com




--
Mark Nottingham   m...@yahoo-inc.com




Re: Is it really necessary for fatal() to dump core?

2009-05-19 Thread Mark Nottingham
I'm going to push back on that; the administrator doesn't really have  
any need to get a core when, for example, append_domain doesn't start  
with '.'.


Squid.conf is bloated as it is; if there are cases where a core could  
be conceivably useful, they should be converted to fatal_dump. From  
what I've seen they'll be a small minority at best...





On 19/05/2009, at 2:25 PM, Adrian Chadd wrote:


just make that behaviour configurable?

core_on_fatal {on|off}



Adrian

2009/5/19 Mark Nottingham m...@yahoo-inc.com:

tools.c:fatal() dumps core because it calls abort.

Considering that the core can be quite large (esp. on a 64bit  
system), and
that there's fatal_dump() as well if you really want one, can we  
just make

fatal() exit(1) instead of abort()ing?

Cheers,

--
Mark Nottingham   m...@yahoo-inc.com





--
Mark Nottingham   m...@yahoo-inc.com




Re: Is it really necessary for fatal() to dump core?

2009-05-19 Thread Mark Nottingham
Yeah; looking through, it looks to me like a few in the auth and fs  
modules need to be changed to fatal_dump; the rest (over a hundred)  
don't really need a core -- they're things like config parsing errors  
and self-evident error states.


I'll come up with a patch that does that and adds a FATAL_DUMPS  
compile-time flag with a 0 default.



On 20/05/2009, at 11:28 AM, Mark Nottingham wrote:

The case that triggered this for me was when the log daemon dies;  
doesn't make much sense to core there either.


I'll take a look through and report back.


On 20/05/2009, at 11:24 AM, Amos Jeffries wrote:

I'm going to push back on that; the administrator doesn't really  
have
any need to get a core when, for example, append_domain doesn't  
start

with '.'.

Squid.conf is bloated as it is; if there are cases where a core  
could

be conceivably useful, they should be converted to fatal_dump. From
what I've seen they'll be a small minority at best...



That I agree with. Grep the code for all fatals and see what falls  
out.
I think you will find it's only the config parser that can get away  
with

non-core fatals.

Amos





On 19/05/2009, at 2:25 PM, Adrian Chadd wrote:


just make that behaviour configurable?

core_on_fatal {on|off}



Adrian

2009/5/19 Mark Nottingham m...@yahoo-inc.com:

tools.c:fatal() dumps core because it calls abort.

Considering that the core can be quite large (esp. on a 64bit
system), and
that there's fatal_dump() as well if you really want one, can we
just make
fatal() exit(1) instead of abort()ing?

Cheers,

--
Mark Nottingham   m...@yahoo-inc.com





--
Mark Nottingham   m...@yahoo-inc.com








--
Mark Nottingham   m...@yahoo-inc.com




--
Mark Nottingham   m...@yahoo-inc.com




Re: Is it really necessary for fatal() to dump core?

2009-05-19 Thread Mark Nottingham

Fine by me.

On 20/05/2009, at 12:25 PM, Amos Jeffries wrote:


Yeah; looking through, it looks to me like a few in the auth and fs
modules need to be changed to fatal_dump; the rest (over a hundred)
don't really need a core -- they're things like config parsing errors
and self-evident error states.

I'll come up with a patch that does that and adds a FATAL_DUMPS
compile-time flag with a 0 default.


What do you mean by that last?
I think if you are going to correct the fatal() and fatal_dump()  
usage it

won't be needed.

Amos




On 20/05/2009, at 11:28 AM, Mark Nottingham wrote:


The case that triggered this for me was when the log daemon dies;
doesn't make much sense to core there either.

I'll take a look through and report back.


On 20/05/2009, at 11:24 AM, Amos Jeffries wrote:


I'm going to push back on that; the administrator doesn't really
have
any need to get a core when, for example, append_domain doesn't
start
with '.'.

Squid.conf is bloated as it is; if there are cases where a core
could
be conceivably useful, they should be converted to fatal_dump.  
From

what I've seen they'll be a small minority at best...



That I agree with. Grep the code for all fatals and see what falls
out.
I think you will find it's only the config parser that can get away
with
non-core fatals.

Amos





On 19/05/2009, at 2:25 PM, Adrian Chadd wrote:


just make that behaviour configurable?

core_on_fatal {on|off}



Adrian

2009/5/19 Mark Nottingham m...@yahoo-inc.com:

tools.c:fatal() dumps core because it calls abort.

Considering that the core can be quite large (esp. on a 64bit
system), and
that there's fatal_dump() as well if you really want one, can we
just make
fatal() exit(1) instead of abort()ing?

Cheers,

--
Mark Nottingham   m...@yahoo-inc.com





--
Mark Nottingham   m...@yahoo-inc.com








--
Mark Nottingham   m...@yahoo-inc.com




--
Mark Nottingham   m...@yahoo-inc.com








--
Mark Nottingham   m...@yahoo-inc.com




Re: [RFC] Translate and Unless-Modified-Since headers

2009-05-18 Thread Mark Nottingham
Sorry to be blunt, but shouldn't these sites be securing themselves?  
Having Squid strip this header hardly closes any significant attack  
vectors off... and doing so creates yet another special case for  
people to work around.


-1 on Translate (default strip; registering it, I suppose, although  
it's a vendor-specific extension header that they haven't bothered to  
register; I'd rather the focus be on those headers that people have  
actually tried to do the right thing for -- especially when they have  
*not* said they'll license patents for this specification).


WRT Unless-Modified-Since -- IIRC this is a very old, pre-2068 version  
of If-Range. /me looks around...

see: 
http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15847a-s96/web/draft-luotonen-http-url-byterange-02.txt

What's the issue with it? Amusingly, MSFT thinks it's a response header:
  http://msdn.microsoft.com/en-us/library/aa917918.aspx



On 18/05/2009, at 9:05 PM, Amos Jeffries wrote:


Both of these are non-standard headers created by microsoft.

These are both weird ones. We seem to need them, but only because  
they need to be stripped away in certain circumstances.


The Translate: header is the trickiest. After reading the docs it  
appears we should be always stripping it away for security. It's  
entire purpose is to perform code disclosure 'attacks' on targeted  
dynamic sites. With perhapse a fast-ACL to allow admins to use it  
and control the requests using it when they really need to.


Pending any objections I'll add as registered headers in 3.0 and the  
above handling for Translate in 3.1.


Amos
--
Please be using
 Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
 Current Beta Squid 3.1.0.7


--
Mark Nottingham   m...@yahoo-inc.com




Is it really necessary for fatal() to dump core?

2009-05-18 Thread Mark Nottingham

tools.c:fatal() dumps core because it calls abort.

Considering that the core can be quite large (esp. on a 64bit system),  
and that there's fatal_dump() as well if you really want one, can we  
just make fatal() exit(1) instead of abort()ing?


Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




Re: More patches for squid2-HEAD

2009-05-14 Thread Mark Nottingham


On 23/04/2009, at 10:38 AM, Mark Nottingham wrote:


http://www.squid-cache.org/bugs/show_bug.cgi?id=2643



http://www.squid-cache.org/bugs/show_bug.cgi?id=2631




These are the last two remaining. There's been some discussion on  
them, but I believe the issues have been resolved in the most recent  
patches attached to them; if I don't hear otherwise soon, I'll go  
ahead and apply to 2-HEAD.


Cheers,


--
Mark Nottingham   m...@yahoo-inc.com




Re: One final (?) set of patches for 2.HEAD

2009-05-11 Thread Mark Nottingham

All applied to 2-HEAD.

On 09/05/2009, at 12:43 PM, Amos Jeffries wrote:


Mark Nottingham wrote:

Just a few more;
HTCP logging
 http://www.squid-cache.org/bugs/show_bug.cgi?id=2627


+1.


ignore-must-revalidate
 http://www.squid-cache.org/bugs/show_bug.cgi?id=2645


+1.


Create request methods consistently
 http://www.squid-cache.org/bugs/show_bug.cgi?id=2646


+1.


Do override-* before stale-while-revalidate, stale-if-error
 http://www.squid-cache.org/bugs/show_bug.cgi?id=2647


Yeesh. +1 if tested true.

Amos
--
Please be using
 Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
 Current Beta Squid 3.1.0.7


--
Mark Nottingham   m...@yahoo-inc.com




Re: [squid-users] CARP Failover behavior - multiple parents chosen for URL

2009-05-11 Thread Mark Nottingham

A patch to make PEER_TCP_MAGIC_COUNT configurable is on 2-HEAD;
   http://www.squid-cache.org/Versions/v2/HEAD/changesets/12208.patch

Cheers,


On 12/05/2009, at 1:15 PM, Chris Woodfield wrote:

1. Adjust PEER_TCP_MAGIC_COUNT from 10 to 1, so that a parent is  
marked DEAD after only one failure. This may be overly sensitive  
however.


--
Mark Nottingham   m...@yahoo-inc.com




One final (?) set of patches for 2.HEAD

2009-05-07 Thread Mark Nottingham

Just a few more;

HTCP logging
  http://www.squid-cache.org/bugs/show_bug.cgi?id=2627

ignore-must-revalidate
  http://www.squid-cache.org/bugs/show_bug.cgi?id=2645

Create request methods consistently
  http://www.squid-cache.org/bugs/show_bug.cgi?id=2646

Do override-* before stale-while-revalidate, stale-if-error
  http://www.squid-cache.org/bugs/show_bug.cgi?id=2647


--
Mark Nottingham   m...@yahoo-inc.com




Re: [PATCH] Cache-Control overwriting hack

2009-05-05 Thread Mark Nottingham
I'm neutral on this. Chris, the what that Robert talks about is what I  
had originally thought, but I'm not dead-set either way...


On 05/05/2009, at 12:25 PM, Robert Collins wrote:


On Tue, 2009-05-05 at 12:17 +1000, Mark Nottingham wrote:

The functionality? Very much so; I've been thinking about adding this
sort of thing for a while. Very useful if you're running an  
accelerator.


No, a rewrite of the approach - seems to me that a functional version
many things support  a new version that few things support.

That said, I did have one concern - I think its clearer to say:
'surrogates use this header, clients get the original cache-control',
than to say:
'surrogates use cache-control, and if there is a header X they replace
cache-control with X'.

The latter will be harder to debug by network traces I think.

-Rob


--
Mark Nottingham   m...@yahoo-inc.com




Re: [PATCH] Cache-Control overwriting hack

2009-05-04 Thread Mark Nottingham
Yeah, but if I had a do-over, I'd make it simpler. This is *much*  
simpler...



On 05/05/2009, at 9:46 AM, Robert Collins wrote:


surrogate-control is an existing, defined header that should do this
cleanly, and squid-3 already supports.

-Rob


--
Mark Nottingham   m...@yahoo-inc.com




Re: [PATCH] Cache-Control overwriting hack

2009-05-04 Thread Mark Nottingham
The functionality? Very much so; I've been thinking about adding this  
sort of thing for a while. Very useful if you're running an accelerator.




On 05/05/2009, at 12:12 PM, Robert Collins wrote:


On Tue, 2009-05-05 at 11:57 +1000, Mark Nottingham wrote:

Yeah, but if I had a do-over, I'd make it simpler. This is *much*
simpler...


I think you are entitled to that... but is it needed?

-Rob


--
Mark Nottingham   m...@yahoo-inc.com




Re: More patches for squid2-HEAD

2009-04-22 Thread Mark Nottingham


On 21/04/2009, at 10:44 AM, Mark Nottingham wrote:

http://www.squid-cache.org/bugs/show_bug.cgi?id=2642


Seems uncontroversial, applying.



http://www.squid-cache.org/bugs/show_bug.cgi?id=2643


Amos, any thoughts about the revised patch (monitor-direct)?



http://www.squid-cache.org/bugs/show_bug.cgi?id=2631


Amos, does this make sense now?



http://www.squid-cache.org/bugs/show_bug.cgi?id=2632


I think we cleared up confusion in discussion; applying unless I hear  
otherwise.




http://www.squid-cache.org/bugs/show_bug.cgi?id=2515


Patch seems to work; applying.


--
Mark Nottingham   m...@yahoo-inc.com




Re: More patches for squid2-HEAD

2009-04-22 Thread Mark Nottingham


On 23/04/2009, at 2:11 PM, Amos Jeffries wrote:



On 21/04/2009, at 10:44 AM, Mark Nottingham wrote:

snip



http://www.squid-cache.org/bugs/show_bug.cgi?id=2643


Amos, any thoughts about the revised patch (monitor-direct)?



I still don't agree that this is anything close to the Right Way to  
do it.
Easy, yes, but thats all. Please hold off until Henrik can get a  
good look

at it and voice his opinion.


OK. Henrik?

FWIW, I think this is the right way to do it -- a flag saying that  
monitoring should be direct is backwards-compatible, easy for users to  
understand, and addresses the use case.




http://www.squid-cache.org/bugs/show_bug.cgi?id=2631


Amos, does this make sense now?



http://www.squid-cache.org/bugs/show_bug.cgi?id=2632


I think we cleared up confusion in discussion; applying unless I hear
otherwise.



Well, Henrik may have other opinions since he has to manage it. But  
I'm no

longer objecting to it as a temporary measure.


Assuming that this applies to both. H?


--
Mark Nottingham   m...@yahoo-inc.com




Re: More patches for squid2-HEAD

2009-04-20 Thread Mark Nottingham


Responses inline, and a couple more:

http://www.squid-cache.org/bugs/show_bug.cgi?id=2642
http://www.squid-cache.org/bugs/show_bug.cgi?id=2643


On 20/04/2009, at 4:46 PM, Amos Jeffries wrote:


Mark Nottingham wrote:
Unless I hear otherwise, I'm going to apply the patches attached to  
the following bugs:

http://www.squid-cache.org/bugs/show_bug.cgi?id=2631


response in bugzilla.


Likewise.



http://www.squid-cache.org/bugs/show_bug.cgi?id=2632


IMO, this should be number of times squid tries _each_ available  
forwarding method before giving up on it. With a default of  
something lower, 1 or 2 may be reasonable in most of todays internet.


+1 on the configurable idea though.


Sorry, could you explain what you mean by each method? Is it direct  
vs. peer?



Definitely relevant to squid-3, if you commit this for 2 before it  
gets to 3, please just comment commited to squid-2 and bump target  
milestone to 3.HEAD


Ack.



http://www.squid-cache.org/bugs/show_bug.cgi?id=2515


Looks good to me. Mind the formatting though.


Yeah. Still can't get the proper version of indent running on OSX, so  
I have to shove it to another box to indent before submission...



Is it relevant to squid-3 parser?


Don't think so; StringToInt64 doesn't look at errno.


--
Mark Nottingham   m...@yahoo-inc.com





Re: More patches for squid2-HEAD

2009-04-20 Thread Mark Nottingham
Yeah, this came up in another bug as well, don't remember where, but  
really this whole section needs to be reworked pretty extensively;  
this is just a way to fine-tune the current behaviour until we figure  
out what the right thing to do should be (and I suspect that's not a  
trivial task).


BTW, it's not exactly as you describe; it's not 10x attempts per  
route, it's 10 routes, AFAICT.


Cheers,


On 21/04/2009, at 1:08 PM, Amos Jeffries wrote:

Sorry I'm wandering in the vague area between access methods and  
routing

directions here. What I mean is an aggregate of all that.

At present we have:
DIRECT via IP #1
DIRECT via IP #2
... repeat for all possible IPs.
PEER #1
PEER #2
PEER # ... 64

Doing each of those within a 1 minute timeout and 10x attempts per  
route

causes unreasonably long delays and false-failures. A few hacks reduce
this somewhat by dropping the timeout, doing one connect when 1 IPs
found, and only trying one peer per request, using netdb to improve  
the

peers chances of working, but still hitting the problem.


--
Mark Nottingham   m...@yahoo-inc.com




Re: More patches for squid2-HEAD

2009-04-20 Thread Mark Nottingham


On 21/04/2009, at 1:24 PM, Amos Jeffries wrote:



Responses inline, and a couple more:

http://www.squid-cache.org/bugs/show_bug.cgi?id=2642


I can't tell from the patch which one is being remove.
+1 if its the one directly in mainReconfigure()


Yep.




peerMonitorInit() should probably check for duplicate calls somehow  
too.

But this is good for a quick fix.


http://www.squid-cache.org/bugs/show_bug.cgi?id=2643


-1 right now. see bug report.


Responded there.

Cheers,


--
Mark Nottingham   m...@yahoo-inc.com




Re: Introducing myself

2009-04-20 Thread Mark Nottingham

Hi!

Take a look at stale-while-revalidate (cache_peer option in 2.7); it  
may do what you need.


Cheers,


On 21/04/2009, at 2:10 PM, Alistair Reay wrote:


Hi everyone,

I'd like to introduce myself to the dev team and start helping out.

My name is Alistair Reay and I'm a system engineer at a large New
Zealand broadcaster that uses Squid and other open-source software
extensively. Using squid we've built the nations largest and cheapest
commercial CDN for our VOD offering so I've got a vested interest in
helping Squid kick more ass. Although I'm not a professional  
developer,

I have a lot of interest in contributing code to this project and I've
created a production-ready load balancer project in SourceForge called
Octopus http://sourceforge.net/projects/octopuslb/ that works really
well behind Squid.

Anyway, the first thing I'd like to do is investigate how
refresh_stale_hit works and try to improve it. I searched the squid-users

mail list and found this thread of conversation which is what I'd like
to implement in Squid2.7. If you'll have me, I'll subscribe to this
mailing list and make a new topic about this feature request then  
start

work.

http://www.squid-cache.org/mail-archive/squid-users/200609/0162.html
User's query/request (also what I'd like to be able to do)
http://www.squid-cache.org/mail-archive/squid-users/200609/0167.html
Henrik's response

Cheers
Al
==
For more information on the Television New Zealand Group, visit us
online at tvnz.co.nz
==
CAUTION: This e-mail and any attachment(s) contain information that
is intended to be read only by the named recipient(s). This information
is not to be used or stored by any other person and/or organisation.



--
Mark Nottingham   m...@yahoo-inc.com




More patches for squid2-HEAD

2009-04-19 Thread Mark Nottingham
Unless I hear otherwise, I'm going to apply the patches attached to the
following bugs:


http://www.squid-cache.org/bugs/show_bug.cgi?id=2631
http://www.squid-cache.org/bugs/show_bug.cgi?id=2632
http://www.squid-cache.org/bugs/show_bug.cgi?id=2515


--
Mark Nottingham   m...@yahoo-inc.com




FYI: rewritten HTTP caching section

2009-03-24 Thread Mark Nottingham
As you all hopefully know, the IETF HTTPbis WG is in the process of
rewriting HTTP to clarify it, address interop issues, and generally
increase specification quality.


One of the first things we did was to break the document up into seven
separate sections, listed at:

  http://tools.ietf.org/wg/httpbis/

Recently, we did a substantial rewrite of part 6 -- caching -- to
improve its readability and reduce the number of conflicts (since
several requirements were stated multiple times).


You can see this work at:
  http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-06
the previous version being at (for reference; this should be
essentially equivalent to RFC 2616):

  http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-05

We'd very much like feedback from cache implementers (you). To be
clear, the intent of this effort is that no existing, conformant
implementation should become non-conformant, so we're particularly
interested if you feel this is the case.


Note that the rewrite is not yet complete; in particular, there are a
number of notes interspersed in the spec outlining remaining work.


See also:
  http://www.w3.org/mid/250db66e-111e-4b2e-ad30-3d9ec49ee...@mnot.net

Cheers,

--
Mark Nottingham   m...@yahoo-inc.com




online configuration manuals messed up?

2009-03-04 Thread Mark Nottingham

Browsing through
  http://www.squid-cache.org/Versions/v2/2.7/cfgman/
and the 2.6 manual, it seems like the argument types for all config
directives are missing... has something changed?



--
Mark Nottingham   m...@yahoo-inc.com




Re: Feature: quota control

2009-02-26 Thread Mark Nottingham

Isn't "requests" really just an external acl helper?


On 27/02/2009, at 8:36 AM, Adrian Chadd wrote:


I'm looking at implementing this as part of a contract for squid-2.

I was going to take a different approach - that is, I'm not going to
implement quota control or management in squid; I'm going to provide
the hooks to squid to allow external controls to handle the quota.



adrian

2009/2/21 Pieter De Wit pie...@insync.za.net:

Hi Guys,

I would like to offer my time in working on this feature - I have not
done any squid dev, but since I would like to see this feature in
Squid, I thought I would take it on.

I have briefly contacted Amos off list and we agreed that there is no
set-in-stone way of doing this. I would like to propose that we then
start throwing around some ideas and let's see if we can get this into
squid :)


Some ideas that Amos quickly suggested:

  - Based on delay pools
  - Use of external helpers to track traffic


The way I see this happening is that a quota is like a pool that
empties based on 2 classes - bytes and requests. Requests will be for
things like the number of requests, i.e. a person is only allowed to
download 5 exe's per day, or 5 requests of 1meg, or something like that
(it just popped into my head :) )

Bytes is a pretty straightforward one: the user is only allowed x
amount of bytes per y amount of time.

Anyways - let the ideas fly :)

Cheers,

Pieter




--
Mark Nottingham   m...@yahoo-inc.com




Re: Feature: quota control

2009-02-26 Thread Mark Nottingham
Honestly, if I wanted to do byte-based quotas today, I'd have an
external ACL helper talking to an external logging helper; that way,
you can just log the response sizes to a daemon, and then another
daemon would use that information to make a decision at access time.
The only even mildly hard part about this is sharing state between the
daemons, but if you don't need the decisions to be real-time, it's not
that bad (especially considering that in any serious deployment, you'll
have state issues between multiple boxes anyway).


Squid modifications that would help this (and similar use cases):
1) Allow multiple, different external loggers to be defined and used.
2) Harmonise the interfaces/configuration of *all* external helpers, so
that you can pass arbitrary args to each, using the same syntax. I'm
looking at you, redirectors.
3) Reduce the overhead of using external helpers, if possible.

A *lot* of the customisation that I do for Squid is based on uses of
helpers like this; they're actually very powerful and flexible, if you
know what you're doing with them. I would very much like to see Squid
turn them into more of a strength...
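
As a concrete illustration of the external-ACL half of this scheme, here
is a minimal Python sketch. Everything in it is hypothetical: the 500 MB
limit, the in-memory usage table (which a real deployment would populate
from the logging daemon), and the assumption that each lookup line is a
bare username (e.g. %LOGIN). It follows the external ACL helper
convention of reading one lookup per line and answering OK (match) or
ERR (no match):

```python
import sys

DAILY_LIMIT = 500 * 1024 * 1024  # hypothetical 500 MB/day byte quota


def over_quota(user, usage, limit=DAILY_LIMIT):
    """True once the user's recorded byte count reaches the quota."""
    return usage.get(user, 0) >= limit


def helper_loop(usage, inp=sys.stdin, out=sys.stdout):
    """Speak the external ACL helper protocol: one lookup per line.
    OK means the acl matches (user is over quota), ERR means no match."""
    for line in inp:
        user = line.strip()  # e.g. the %LOGIN token sent by Squid
        out.write("OK\n" if over_quota(user, usage) else "ERR\n")
        out.flush()
```

Wired up via an external_acl_type line in squid.conf, an http_access
deny rule on the resulting acl would then enforce the quota.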



On 27/02/2009, at 9:30 AM, Robert Collins wrote:


Key things I'd look for:
- quotas shouldn't be lost when squid restarts
- people should be able to use external quota systems (they may have
  e.g. netflow or other systems tracking direct b/w use, and squid
  is pulling from those allowances when downloads are caused by a
  given user).

Both of which are nicely solved by Adrian's suggestion of making sure
there are appropriate hooks in squid to let other software actually do
quotas.

It would be nice to ship a default client for those hooks that does
something simple (like monthly quotas with a simple script to report on
current totals/reset). But even that is probably too much in core - it
should probably be a parallel project, like the reference icap client
is.

(And isn't ICAP enough to do quotas? Or is it too heavyweight? Perhaps
look at the in-squid ICAP-alike Alex has been working on?)

-Rob


--
Mark Nottingham   m...@yahoo-inc.com




Re: Feature: quota control

2009-02-26 Thread Mark Nottingham
I was talking about request-number quotas (e.g., you can make n
requests in m minutes).
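
A request-count quota of that shape is easy to sketch. The class below
(names and structure invented for illustration; a real helper would
answer OK/ERR on stdin/stdout) keeps a sliding window of timestamps per
user and refuses the request once n are inside the last m minutes:

```python
import time
from collections import deque


class RequestQuota:
    """Allow at most n requests per user in any m-minute window."""

    def __init__(self, n, m_minutes, clock=time.monotonic):
        self.n = n
        self.window = m_minutes * 60.0
        self.clock = clock  # injectable for testing
        self.seen = {}      # user -> deque of request timestamps

    def allow(self, user):
        now = self.clock()
        q = self.seen.setdefault(user, deque())
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.n:
            return False  # a helper would answer ERR here
        q.append(now)
        return True       # ...and OK here
```

Because all the state is in one dict, this only works per-process; a
multi-box deployment would need the shared-state arrangement discussed
in the other message.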


Regarding byte quotas -- see subsequent message -- I disagree that you
need eCAP :)



On 27/02/2009, at 10:00 AM, Kinkie wrote:

On Thu, Feb 26, 2009 at 11:23 PM, Mark Nottingham m...@yahoo-inc.com wrote:

Isn't "requests" really just an external acl helper?


Not really... an external ACL helper would need to do real-time parsing
of the logs to really know how much each client downloaded, as AFAIK
both request and reply ACLs are evaluated BEFORE the actual transfer
takes place.
Only eCAP probably has the right hooks.

--
   /kinkie


--
Mark Nottingham   m...@yahoo-inc.com




Re: Associating accesses with cache.log entries

2009-02-24 Thread Mark Nottingham

Not sure I follow.

ctx_enter is called relatively late, in httpProcessReplyHeader and
destroy_MemObject; I'd think for this case we need to have a request id
associated quite early, probably around parseHttpRequest.


Also, AFAICT Ctx_Descrs and Ctx_Current_Level aren't really a reliable
way to identify what the context of a particular debug statement is.


What am I missing? The most straightforward way that I can see to do
this is to add an identifier to clientHttpRequest and pass that to
debug where available...



On 28/11/2008, at 12:02 PM, Henrik Nordstrom wrote:


On Fri, 2008-11-28 at 10:34 +1100, Mark Nottingham wrote:


Agreed. How would you pass it into debug()? It looks like _db_print
already takes variable length args,


By adding it to the context already used by _db_print.. i.e. by
extending ctx_enter to take the sequence number in addition to url.

Regards
Henrik



--
Mark Nottingham   m...@yahoo-inc.com




Applying patches to 2.HEAD

2009-02-22 Thread Mark Nottingham
I'd like to see the patches attached to the following bugs applied to
2.HEAD:


http://www.squid-cache.org/bugs/show_bug.cgi?id=2482 (deep ctx errors)

http://www.squid-cache.org/bugs/show_bug.cgi?id=2566 (send CLR even
when PURGE isn't a HIT)


http://www.squid-cache.org/bugs/show_bug.cgi?id=2567 (assertion in
method patches)


http://www.squid-cache.org/bugs/show_bug.cgi?id=2599 (idempotent start)

http://www.squid-cache.org/bugs/show_bug.cgi?id=2602 (raise MAX_URL)

Any objections to doing so?


--
Mark Nottingham   m...@yahoo-inc.com




Re: Making start/stop idempotent

2009-02-16 Thread Mark Nottingham

http://www.squid-cache.org/bugs/show_bug.cgi?id=2599

On 31/10/2008, at 3:59 AM, Alex Rousskov wrote:


On Thu, 2008-10-30 at 00:58 +0100, Henrik Nordstrom wrote:

On tor, 2008-10-30 at 10:03 +1100, Mark Nottingham wrote:

Hmm, good point. My aim was just start and stop. Seems reasonable to
just limit it to those two?


Yes. And maybe rotate.


Perhaps rotate should continue to fail. Those who do not like that can
always do something like !check || rotate in their cron jobs, right?

In general, an operation that cannot accomplish its goal should fail.
I am not going to fight against shutdown pretending that it has shut
down Squid, but faking rotate is more dangerous than convenient, IMO.

Thanks,

Alex.




--
Mark Nottingham   m...@yahoo-inc.com




collapsed_forwarding and ICP

2009-02-05 Thread Mark Nottingham
If I have a peer and it has collapsed_forwarding on, at what point will
it return an ICP_HIT to me? E.g.,


1) As soon as there's an outstanding (therefore collapsed) request for
it?

2) As soon as there's a cacheable response in-flight for it?
3) Only when the entire response is in-cache?

My reading of the code is that it's #1. Do I have this right?

If that's the case:
- On the plus side, this helps collapse requests across peers.
- On the down side, it seems like there's the potential for requests to
go to peers, only to find that the response is uncacheable.


Thanks,

--
Mark Nottingham   m...@yahoo-inc.com




Re: collapsed_forwarding and ICP

2009-02-05 Thread Mark Nottingham

Thanks. I may have a look into putting those knobs in...


On 06/02/2009, at 11:18 AM, Henrik Nordstrom wrote:


On Fri, 2009-02-06 at 10:07 +1100, Mark Nottingham wrote:

If I have a peer and it has collapsed_forwarding on, at what point
will it return an ICP_HIT to me? E.g.,

1) As soon as there's an outstanding (therefore collapsed) request for
it?
2) As soon as there's a cacheable response in-flight for it?
3) Only when the entire response is in-cache?

My reading of the code is that it's #1. Do I have this right?


Probably.


If that's the case:
- On the plus side, this helps collapse requests across peers.


Yes.


- On the down side, it seems like there's the potential for requests
to go to peers, only to find that the response is uncacheable.


Yes, and the slower the origin is to respond, the more you'll see of
this.

Should be trivial to add a tuning knob to ICP for this. Objects not yet
known to be cacheable have a special KEY_EARLY_PUBLIC flag set. See
icpCheckUdpHit() for a suitable spot; htcpCheckHit would also need the
same.

Regards
Henrik



--
Mark Nottingham   m...@yahoo-inc.com




Re: diff file for 302 loop bug when using storeurl

2009-02-05 Thread Mark Nottingham

Shouldn't that be for 301, 303 and 307 as well?


On 05/02/2009, at 8:54 PM, Henrik Nordstrom wrote:

Was seen on info@, forwarded to squid-dev@ in case it's still relevant.


 Forwarded message 

From: chudy fernandez chudy_fernan...@yahoo.com
To: adr...@squid-cache.org
Subject: diff file for 302 loop bug when using storeurl
Date: Wed, 5 Nov 2008 08:04:48 -0800 (PST)

I just wanna know if this has disadvantages.





302loopfixed.diff


--
Mark Nottingham   m...@yahoo-inc.com




Re: Introductions

2009-01-03 Thread Mark Nottingham

Welcome!

I've been thinking about this a bit recently. A few random ideas which
may (or may not) be interesting to discuss:

- Allowing other sample periods (beyond 5min and 60min) in cachemgr
stats; e.g., 1m. Perhaps adjustable or even dynamic periods.
stats; e.g., 1m. Perhaps even adjustable or even dynamic periods.


- More stats for service times; e.g., from request initiation to
headers complete, to body complete, to response initiation, to response
headers complete, to response body complete. There was a thread about
this on dev a while back; if you're interested I can dig it up.


- Allowing a dump of info for a specific URL in the cachemgr.

Cheers,



On 30/12/2008, at 2:16 PM, Regardt van de Vyver wrote:


Hi Dev Team.

My name is Regardt van de Vyver, a technology enthusiast who tinkers
with squid on a regular basis. I've been involved in development for
around 12 years and am an active participant in numerous open source
projects.


Right now I'm focused on improving and extending performance metrics
for squid, specifically related to SNMP and the cache manager.


I'd like to take a more active role in the coming year from a dev
perspective, and feel the 1st step here is to at least get my butt onto
the dev mailing list ;-)


I look forward to getting involved.

Regards,

Regardt van de Vyver



--
Mark Nottingham   m...@yahoo-inc.com




Re: X-Vary-Options support

2008-12-20 Thread Mark Nottingham
I agree. My impression was that it's pretty specific to their
requirements, not a good general solution.



On 17/12/2008, at 4:17 PM, Henrik Nordstrom wrote:


On Wed, 2008-12-17 at 16:12 -0500, Adrian Chadd wrote:


So is there any reason whatsoever that it can't be committed to
Squid-2.HEAD as-is, and at least backported (but not committed to
start with) to squid-2.7?


I am a bit uneasy about adding features known to be flawed in design.
Once the header format is added it becomes hard to change.

Sorry, don't remember the details right now what was flawed. See
archives for earlier discussion.

Regards
Henrik



--
Mark Nottingham   m...@yahoo-inc.com




Re: The cache deny QUERY change... partial rollback?

2008-12-01 Thread Mark Nottingham
Hmm. Given that heap GDSF out-performs LRU in the common case, and
there's a crashing bug in LRU at the moment anyway, maybe the best
thing to do is to change the default replacement policy -- and always
compile in the heap algorithms?
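
For reference, the switch being proposed here amounts to a couple of
squid.conf lines (shown as an illustration; it assumes a build with the
heap removal policies compiled in, e.g. via --enable-removal-policies):

```
# Use heap GDSF instead of the default LRU for the disk cache.
cache_replacement_policy heap GDSF
# The in-memory object cache can use a heap policy as well.
memory_replacement_policy heap GDSF
```

GDSF favours keeping many small, frequently-hit objects, which is why
it tends to improve hit ratio over plain LRU in forward-proxy workloads.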



On 02/12/2008, at 2:05 AM, Henrik Nordstrom wrote:


On Mon, 2008-12-01 at 09:40 -0500, Adrian Chadd wrote:


Hm, that's kind of interesting actually. What's it displacing from the
cache? Is the drop in hit ratio due to the removal of other cachable
large objects, or other cachable small objects? Is it -just- flash
video that's exhibiting this behaviour?


The studied cache is using LRU, and these flash videos effectively
reduce the cache size by filling the cache with large objects that are
never referenced again.


Are you able to put up some examples and statistics?


I'll try.


I really think the right thing to do here is look at what various sites
are doing and try to open a dialogue with them. Chances are they don't
really know exactly how to (ab)use HTTP to get the semantics they want
whilst retaining control over their content.


Probably true. Based on the URL styles, there seem to be only two or
three of these authentication/session schemes.

Regards
Henrik


--
Mark Nottingham   [EMAIL PROTECTED]




HTCP documentation

2008-11-27 Thread Mark Nottingham
From http://www.squid-cache.org/Versions/v2/2.7/cfgman/cache_peer.html:

"use 'htcp' to send HTCP, instead of ICP, queries to the neighbor. You
probably also want to set the icp port to 4827 instead of 3130. You
must also allow this Squid htcp_access and http_access in the peer
Squid configuration."

Is the "You probably also want to set..." sentence still true? I.e.,
isn't htcp_port enough?
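
To make the question concrete, the setup under discussion looks
something like the following (host name and address are illustrative):

```
# Querying Squid: with 'htcp', the fifth field of cache_peer is the
# UDP port the peer answers HTCP on -- 4827 rather than ICP's 3130.
cache_peer peer.example.com sibling 3128 4827 htcp

# Peer Squid: listen for HTCP and permit the querying sibling.
htcp_port 4827
acl my_sibling src 192.0.2.10
htcp_access allow my_sibling
http_access allow my_sibling
```

The question is then whether an htcp_port line on the peer makes the
"set the icp port to 4827" advice in the manual redundant.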



--
Mark Nottingham   [EMAIL PROTECTED]




Re: Rv: Why not BerkeleyDB based object store?

2008-11-26 Thread Mark Nottingham
Just a tangential thought: has there been any investigation into
reducing the amount of write traffic with the existing stores?

E.g., establishing a floor for reference count: if an object doesn't
have n refs, don't write it to disk? This will impact hit rate, of
course, but may mitigate in situations where disk caching is desirable
but writing is the bottleneck...
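
The reference-count floor could be prototyped along these lines. This is
a toy sketch, not Squid code; the class, its fields, and the floor of 2
are all invented for illustration:

```python
class LazyDiskCache:
    """Memory-first cache that only writes an object to disk once it
    has been referenced at least `floor` times, trading some hit rate
    for fewer disk writes."""

    def __init__(self, floor=2):
        self.floor = floor
        self.refs = {}        # key -> reference count so far
        self.disk = {}        # stand-in for the on-disk store
        self.disk_writes = 0  # counter to observe the saved writes

    def store(self, key, obj):
        self.refs[key] = self.refs.get(key, 0) + 1
        # Skip the disk write until the object has proven it gets re-used.
        if self.refs[key] >= self.floor and key not in self.disk:
            self.disk[key] = obj
            self.disk_writes += 1
```

With floor=2, every object seen exactly once (the bulk of a long-tail
workload) never touches the disk at all.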



On 26/11/2008, at 9:14 AM, Kinkie wrote:


On Tue, Nov 25, 2008 at 10:23 PM, Pablo Rosatti
[EMAIL PROTECTED] wrote:
Amazon uses BerkeleyDB for several critical parts of its website. The
Chicago Mercantile Exchange uses BerkeleyDB for backup and recovery of
its trading database. And Google uses BerkeleyDB to process Gmail and
Google user accounts. Are you sure BerkeleyDB is not a good idea to
replace the Squid filesystems, even COSS?


Squid3 uses a modular storage backend system, so you're more than
welcome to try to code it up and see how it compares.
Generally speaking, the needs of a data cache such as squid are very
different from those of a general-purpose backend storage.
Among the other key differences:
- the data in the cache has little or no value.
 it's important to know whether a file was corrupted, but it can
always be thrown away and fetched from the origin server at a
relatively low cost
- workload is mostly writes
 a well-tuned forward proxy will have a hit-rate of roughly 30%,
which means 3 writes for every read on average
- data is stored in incremental chunks
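
The write/read arithmetic behind the second point can be checked under
the simplifying assumption (mine, not stated above) that every hit is
one store read and every miss is one store write:

```python
def writes_per_read(hit_rate, miss_cacheable=1.0):
    """Store writes per store read at a given hit rate, assuming each
    cacheable miss is written to the store and each hit is read back."""
    reads = hit_rate
    writes = (1.0 - hit_rate) * miss_cacheable
    return writes / reads

# At a 30% hit rate: 70 writes for every 30 reads, i.e. about 2.3
# writes per read -- the rough 3-to-1 figure quoted above.
```

If only some misses are cacheable, miss_cacheable scales the write side
down accordingly.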

Given these characteristics, a long list of mechanisms that
database-like systems have, such as journaling and transactions, is a
waste of resources.
COSS is explicitly designed to handle a workload of this kind. I would
not trust any valuable data to it, but it's about as fast as it gets
for a cache.

IMHO BDB might be much more useful as a metadata storage engine, as
those have a very different access pattern than a general-purpose
cache store.
But if I had any time to devote to this, my priority would be in
bringing 3.HEAD COSS up to speed with the work Adrian has done in 2.

--
   /kinkie


--
Mark Nottingham   [EMAIL PROTECTED]



