[ZWeb] Re: zope.org - serious caching issues

2007-05-21 Thread David Lawson
I just discussed this with Jim and he concurs that it's a problem  
within Zope, he gets the same results (old data) tunneling directly  
to the app servers.  I believe he's planning on investigating  
further, but he's unavailable at the moment.


--Dave
Systems Administrator
Zope Corp.
540-361-1722
[EMAIL PROTECTED]



___
Zope-web maillist  -  Zope-web@zope.org
http://mail.zope.org/mailman/listinfo/zope-web


Re: [ZWeb] Re: zope.org - serious caching issues

2007-05-21 Thread David Lawson



> Well, trying to figure out what's changed...that would be the first
> thing I'd check; just verify that the rules are indeed hitting the
> requests.  How long has this problem been evident?  When did the changes
> take place?  I presume awhile ago?


Most of the changes would've been some time ago, on the order of nine  
months to a year at least.  I did make some changes, recently, a few  
weeks ago, but mostly just management stuff.  I did find one refresh  
pattern that was commented out on the parents but not on the  
children, so I put that back in.  Can someone who understands the  
parameters of the issue better than I do check and see if it's still  
being problematic?  It's possible the change I just made fixed it,  
though I'd be somewhat surprised, since there's talk of things being  
cached for, like, several weeks and I don't see anything in the cache  
tier that would allow such a thing.


--Dave
Systems Administrator
Zope Corp.
540-361-1722
[EMAIL PROTECTED]





Re: [ZWeb] Re: zope.org - serious caching issues

2007-05-21 Thread Andrew Sawyers
On Mon, 2007-05-21 at 16:42 -0400, David Lawson wrote:

> 
> It has, but mostly only in layout and some streamlining of the  
> configuration.  The basic rules you established are still in place,  
> since we assumed you had good reasons for them.  I haven't been  
> following this discussion terribly closely, but I'll take a closer  
> look at things this evening.

Well, trying to figure out what's changed...that would be the first
thing I'd check; just verify that the rules are indeed hitting the
requests.  How long has this problem been evident?  When did the changes
take place?  I presume awhile ago?

Andrew



Re: [ZWeb] Re: zope.org - serious caching issues

2007-05-21 Thread David Lawson




> It looks to me like the cache tier has been changed; IIRC zope.org was
> not in the cache tier I set up for Managed Hosting and didn't have (at
> least) 4 cache servers in its request flow.


Yes.  The cache tier has changed significantly, but the basic  
configuration is still the same.




> It's been a long time, but IIRC we got rid of all caching for zope.org
> where a cache header wasn't explicitly being set.


This is still the case.  There's a refresh-pattern in place for the
entirety of *.zope.org that should keep anything from being cached
unless some subset of the proper cache-control headers is set.  A valid
subset would be at least a Last-Modified and either an Expires or a
Cache-Control: max-age.  None of these appears to be set, so that
refresh pattern should be in effect.
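
The rule described above can be sketched as a small check. This is only an illustration of the "valid subset" test, not the actual Squid configuration, which the thread does not show. It assumes Python 3; the function name `explicitly_cacheable` is hypothetical, and the sample headers are the ones from the `wget -S` output quoted elsewhere in this thread.

```python
def explicitly_cacheable(headers):
    """Return True if the response carries Last-Modified plus either
    Expires or a Cache-Control: max-age directive (the subset named above)."""
    # Header names are case-insensitive, so normalise them first.
    names = {name.lower(): value for name, value in headers.items()}
    has_last_modified = "last-modified" in names
    has_expires = "expires" in names
    cache_control = names.get("cache-control", "")
    has_max_age = any(
        directive.strip().lower().startswith("max-age=")
        for directive in cache_control.split(",")
    )
    return has_last_modified and (has_expires or has_max_age)


# Headers as reported for http://www.zope.org/news.rss in this thread.
news_rss_headers = {
    "Server": "Zope/(unreleased version, python 2.2.3, linux2) ZServer/1.1b1",
    "Date": "Mon, 21 May 2007 13:57:28 GMT",
    "Content-Length": "4011",
    "Content-Type": "text/xml",
}

print(explicitly_cacheable(news_rss_headers))
```

For the news.rss response the check fails, which is why the catch-all refresh pattern, rather than any header sent by Zope, would be expected to govern caching.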


> Were the cache servers changed around?  IIRC the zope.org tier only had 2
> measly caches, and if they changed, I bet the rules were not changed
> along with the cache servers.


It has, but mostly only in layout and some streamlining of the  
configuration.  The basic rules you established are still in place,  
since we assumed you had good reasons for them.  I haven't been  
following this discussion terribly closely, but I'll take a closer  
look at things this evening.


--Dave
Systems Administrator
Zope Corp.
540-361-1722
[EMAIL PROTECTED]





Re: [ZWeb] Re: zope.org - serious caching issues

2007-05-21 Thread Andrew Sawyers
On Mon, 2007-05-21 at 10:09 -0400, Jim Fulton wrote:
> I'm adding zope-web to the CC list.
> I wish you hadn't done that yet.  If we keep changing things, it will
> be hard to figure this out.
> 
> It would be helpful to show the results of, say wget -S, as in:
> 
> [EMAIL PROTECTED]:~/tmp$ wget -S http://www.zope.org/news.rss
> --09:58:05--  http://www.zope.org/news.rss
> => `news.rss'
> Resolving www.zope.org... 63.240.213.171
> Connecting to www.zope.org|63.240.213.171|:80... connected.
> HTTP request sent, awaiting response...
>   HTTP/1.0 200 OK
>   Server: Zope/(unreleased version, python 2.2.3, linux2) ZServer/1.1b1
>   Date: Mon, 21 May 2007 13:57:28 GMT
>   Content-Length: 4011
>   Content-Type: text/xml
>   Age: 4
>   X-Cache: HIT from parent-ng2.zmh.zope.net
>   X-Cache: MISS from cache2.zmh.zope.net
>   Connection: close
> Length: 4,011 (3.9K) [text/xml]
> 
> 100%[>] 4,011 --.--K/s
> 
> 09:58:05 (365.80 KB/s) - `news.rss' saved [4011/4011]

> 
> [EMAIL PROTECTED]:~/tmp$ wget --user jim --password xx -S http://www.zope.org/news.rss
> --10:00:45--  http://www.zope.org/news.rss
> => `news.rss.1'
> Resolving www.zope.org... 63.240.213.171
> Connecting to www.zope.org|63.240.213.171|:80... connected.
> HTTP request sent, awaiting response...
>   HTTP/1.0 200 OK
>   Server: Zope/(unreleased version, python 2.2.3, linux2) ZServer/1.1b1
>   Date: Mon, 21 May 2007 13:45:20 GMT
>   Content-Length: 4011
>   Content-Type: text/xml
>   X-Cache: HIT from parent-ng2.zmh.zope.net
>   Age: 892
>   X-Cache: HIT from cache4.zmh.zope.net
>   Connection: close
> Length: 4,011 (3.9K) [text/xml]
> 
> 100%[>] 4,011 --.--K/s
> 
> 10:00:45 (1.85 MB/s) - `news.rss.1' saved [4011/4011]
It looks to me like the cache tier has been changed; IIRC zope.org was
not in the cache tier I set up for Managed Hosting and didn't have (at
least) 4 cache servers in its request flow.
> 
> Note that the second request is authenticated (except with a  
> different password :)
That won't make any difference...they both resulted in a cache hit.  The
first time, you hit a front-side cache server that didn't have it
cached (cache2); the second time, you hit a cache server that did have
it cached (cache4).
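
To compare requests like these side by side, the `X-Cache` and `Age` lines can be pulled out of saved `wget -S` output. The following is a minimal sketch, assuming Python 3 and that the headers look exactly like the two responses quoted above; the function name `summarize_cache_headers` is hypothetical.

```python
import re

# Matches lines such as "X-Cache: HIT from parent-ng2.zmh.zope.net".
X_CACHE = re.compile(r"X-Cache:\s*(HIT|MISS)\s+from\s+(\S+)")
# Matches the Age header, e.g. "Age: 892" (seconds the object sat in cache).
AGE = re.compile(r"\bAge:\s*(\d+)")


def summarize_cache_headers(wget_output):
    """Return the Age in seconds and the (host, status) pair for each cache hop."""
    hops = [(host, status) for status, host in X_CACHE.findall(wget_output)]
    age_match = AGE.search(wget_output)
    age = int(age_match.group(1)) if age_match else None
    return {"age_seconds": age, "hops": hops}


FIRST_REQUEST = """
  HTTP/1.0 200 OK
  Age: 4
  X-Cache: HIT from parent-ng2.zmh.zope.net
  X-Cache: MISS from cache2.zmh.zope.net
"""

SECOND_REQUEST = """
  HTTP/1.0 200 OK
  X-Cache: HIT from parent-ng2.zmh.zope.net
  Age: 892
  X-Cache: HIT from cache4.zmh.zope.net
"""

for label, text in (("anonymous", FIRST_REQUEST), ("authenticated", SECOND_REQUEST)):
    print(label, summarize_cache_headers(text))
```

In both responses the parent cache reports a HIT; only the front-side cache differs, which matches the reading above.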
> 
> ...
> 
> >> The objects do not display any caching policy in ZMI, but the cache
> >> manager still shows the entries in different variations.
What cache manager?
> 
> Possibly because it doesn't know about the change.
> 
> 
> > It looks like an issue in Zope.
> 
> How so?  If you look at the wget output above, there don't seem to be  
> any cache headers set.  So, data would not be cached unless there is  
> an overriding policy in squid.
If the cache tier was changed and is relying on the (old) default settings,
then it's going to cache for a certain period of time.
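
For readers unfamiliar with how such defaults behave, here is a simplified sketch of the freshness heuristic that Squid's documentation describes for `refresh_pattern` (min, percent, max). It is an approximation for illustration only: real Squid also honours request headers and per-pattern options, and the numbers below are hypothetical since the actual rules are not shown in this thread. All times are in seconds here, whereas squid.conf expresses min and max in minutes. Python 3 is assumed.

```python
def squid_like_freshness(age, lm_age, min_s, percent, max_s, expires_in=None):
    """Classify a cached object as "FRESH" or "STALE".

    age        -- seconds since the object was fetched from the origin
    lm_age     -- seconds between Last-Modified and the fetch, or None
                  when the response had no Last-Modified header
    min_s      -- refresh_pattern min, in seconds
    percent    -- refresh_pattern percent (LM-factor threshold)
    max_s      -- refresh_pattern max, in seconds
    expires_in -- seconds until Expires, or None when no Expires was sent
    """
    # An explicit expiry time takes precedence over the heuristic.
    if expires_in is not None:
        return "FRESH" if expires_in > 0 else "STALE"
    if age > max_s:
        return "STALE"
    # LM-factor: how old the copy is relative to how old the content was
    # when it was fetched.
    if lm_age is not None and lm_age > 0 and age / lm_age < percent / 100.0:
        return "FRESH"
    if age < min_s:
        return "FRESH"
    return "STALE"


# Hypothetical rule with a 15-minute minimum and no cache headers from Zope.
print(squid_like_freshness(age=892, lm_age=None, min_s=900, percent=20, max_s=3600))
# Same object under a rule with min set to 0.
print(squid_like_freshness(age=892, lm_age=None, min_s=0, percent=20, max_s=3600))
```

With no Expires and no Last-Modified, the only thing keeping an object fresh in this model is the pattern's minimum, so a tier that fell back to a default pattern with a non-zero minimum would cache header-less responses for that long.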
> 
> > If you see both the child and the parent MISS,
> > then what you're getting is coming from the app server.
Yup.
> 
> I'm getting a hit from the parent.  Also note that both hits give me
> results for which the most recent entry is from March 29.  If I bust
> the cache with a query string, the most recent entry is for May 15.
> 
> 
> > That would also
> > explain differences based on roles. There is nothing in squid that
> > distinguishes if a user is authenticated, anonymous or manager.
I don't follow this.  What explains the difference?  There should be no
difference based on roles.
> 
> I *think* Andrew Sawyers did something to arrange that non-anonymous  
> users get non-cached results.  This doesn't seem to be working any  
> more. This is bad. I'm hoping that whoever got this working properly
> at some point can tell us what they did. :)
It's been a long time, but IIRC we got rid of all caching for zope.org
where a cache header wasn't explicitly being set.
> 
> Jim
Were the cache servers changed around?  IIRC the zope.org tier only had 2
measly caches, and if they changed, I bet the rules were not changed
along with the cache servers.

Andrew



[ZWeb] Re: zope.org - serious caching issues

2007-05-21 Thread Jim Fulton

I'm adding zope-web to the CC list.


On May 20, 2007, at 1:42 PM, Mark W. Alexander wrote:


> On Saturday 19 May 2007, Michael Haubenwallner wrote:


Michael, I'm sorry I dropped the ball on this.  I said I'd look into  
it and got distracted.



>> Mark W. Alexander wrote:
>>> On Friday 18 May 2007, you wrote:
>>>> Hi, I am experiencing problems with the news listing and rss feed on
>>>> zope.org:
>>>>
>>>> http://www.zope.org (right column)
>>>> http://www.zope.org/News/ (News listing)
>>>> http://www.zope.org/news.rss (news rss feed)
>>>> http://www.zope.org/products.rss (products rss feed)
>>>>
>>>> The pages / feeds are different for authenticated and anonymous users.
>>>> Refreshing (even forced) does not produce a correct page, same results
>>>> with non-browser based retrieval (wget, urllib).
>>>>
>>>> Adding a querystring to the URL returns updated data - but that for
>>>> logged in Users only.
>>>
>>> What do you mean by "different" for authenticated and anonymous users? It
>>> looks the same to me both ways.  Pages will cache for 15 minutes
>>> _per_cache_ so when you are making many changes you'll see differences
>>> depending on which cache you hit.
>>>
>>> Any query string will bust the cache once, but only once, as the
>>> url?string will produce a new, unique cache url.
>>>
>>> You can use wget's -S option to see the X-Cache headers for the caches
>>> the request is using as well as the Age (in seconds) of the cached
>>> object. That information may help your analysis.
>>>
>>> Mark


>> Checking again this morning I see no difference - there is still
>> 2-month-old data showing on the front page and in the rss feeds ...
>>
>> I've looked into the scripts that compute the data
>>
>> /zopeorg/news.rss
>> /zopeorg/products.rss
>> /zopeorg/latestContentBySubject
>>
>> In ZMI all three objects are cached by an 'Accelerated HTTP Cache
>> Manager'.
>>
>> I subsequently removed the 'Five minutes' cache from the objects and
>> checked that stats page for several minutes (see below)


I wish you hadn't done that yet.  If we keep changing things, it will
be hard to figure this out.


It would be helpful to show the results of, say wget -S, as in:

[EMAIL PROTECTED]:~/tmp$ wget -S http://www.zope.org/news.rss
--09:58:05--  http://www.zope.org/news.rss
   => `news.rss'
Resolving www.zope.org... 63.240.213.171
Connecting to www.zope.org|63.240.213.171|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.0 200 OK
  Server: Zope/(unreleased version, python 2.2.3, linux2) ZServer/1.1b1
  Date: Mon, 21 May 2007 13:57:28 GMT
  Content-Length: 4011
  Content-Type: text/xml
  Age: 4
  X-Cache: HIT from parent-ng2.zmh.zope.net
  X-Cache: MISS from cache2.zmh.zope.net
  Connection: close
Length: 4,011 (3.9K) [text/xml]

100%[>] 4,011 --.--K/s

09:58:05 (365.80 KB/s) - `news.rss' saved [4011/4011]

[EMAIL PROTECTED]:~/tmp$ wget --user jim --password xx -S http://www.zope.org/news.rss
--10:00:45--  http://www.zope.org/news.rss
   => `news.rss.1'
Resolving www.zope.org... 63.240.213.171
Connecting to www.zope.org|63.240.213.171|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.0 200 OK
  Server: Zope/(unreleased version, python 2.2.3, linux2) ZServer/1.1b1
  Date: Mon, 21 May 2007 13:45:20 GMT
  Content-Length: 4011
  Content-Type: text/xml
  X-Cache: HIT from parent-ng2.zmh.zope.net
  Age: 892
  X-Cache: HIT from cache4.zmh.zope.net
  Connection: close
Length: 4,011 (3.9K) [text/xml]

100%[>] 4,011 --.--K/s

10:00:45 (1.85 MB/s) - `news.rss.1' saved [4011/4011]

Note that the second request is authenticated (except with a  
different password :)


...


>> The objects do not display any caching policy in ZMI, but the cache
>> manager still shows the entries in different variations.


Possibly because it doesn't know about the change.



> It looks like an issue in Zope.


How so?  If you look at the wget output above, there don't seem to be  
any cache headers set.  So, data would not be cached unless there is  
an overriding policy in squid.



> If you see both the child and the parent MISS,
> then what you're getting is coming from the app server.


I'm getting a hit from the parent.  Also note that both hits give me
results for which the most recent entry is from March 29.  If I bust
the cache with a query string, the most recent entry is for May 15.
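
Busting the cache this way only requires a URL the caches have not seen before. A minimal sketch, assuming Python 3; the helper name `cache_busting_url` and the `nocache` parameter name are hypothetical, and any otherwise unused query parameter should behave the same as far as the cache key is concerned.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busting_url(url, token=None):
    """Append a throwaway query parameter so the URL becomes a new cache key."""
    if token is None:
        # Millisecond timestamp: different on each call made at a different time.
        token = str(int(time.time() * 1000))
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("nocache", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(cache_busting_url("http://www.zope.org/news.rss", token="1"))
```

As noted elsewhere in the thread, each such URL is itself cached once fetched, so a fresh token is needed for every comparison against the unmodified URL.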




> That would also
> explain differences based on roles. There is nothing in squid that
> distinguishes if a user is authenticated, anonymous or manager.


I *think* Andrew Sawyers did something to arrange that non-anonymous
users get non-cached results.  This doesn't seem to be working any
more. This is bad. I'm hoping that whoever got this working properly
at some point can tell us what they did. :)


Jim

--
Jim Fulton           mailto:[EMAIL PROTECTED]    Python Powered!
CTO                  (540) 361-1714              http://www.python.org
Zope Corporation     http://www.zope.com         http://www.zope.org