Re: [Zope] ZEO and a front end...

2000-07-21 Thread Toby Dickenson

On Wed, 19 Jul 2000 10:07:30 -0600, Bill Anderson [EMAIL PROTECTED]
wrote:

Toby Dickenson wrote:
 
 On Tue, 18 Jul 2000 16:08:48 -0600, Bill Anderson [EMAIL PROTECTED]
 wrote:
 
  I might be reading more into his words than was intended, but I think
  this demonstrates the problem. Distributing multiple requests for one
  section across multiple servers is (what I consider to be)
  undesirable.
 
 You can actually do it either way. Curtis (AIUI) complained that the
 method described meant your site depended upon each of the section's
 servers being up, that there was no redundancy. So I described a way of
 doing it with redundancy.
 
 What you described doesn't scale up to having 1000's of sections
 (which I was assuming, and I think Curtis was too).  If this isn't a
 problem, then your solution is great.

I don't understand why you think it doesn't. DNS has clearly
demonstrated the ability to handle 'thousands', and the entire
scalability of a cluster is the addition of machines. You appear to be
desirous of having a machine handle a section. Thus, for thousands of
sections, you have thousands of machines.

DNS scales up to one machine per section, but a typical budget doesn't.

Fortunately it doesn't need to. Even if we have 1000's of sections, I
would expect only 10's to be active over a period of a few minutes.

Another way of looking at the issue is that it is similar to using
in-memory Sessions. You have to ensure that each user's requests are
routed to the machine that holds their session. The main difference is
that it is a performance, not correctness issue.

I don't want to think about handling Sessions using DNS and one
machine per user ;-)

 EddieWare does do 'intelligent' caching
 
 eddieware is on my list of options to try out next month... I'll keep
 you posted

Cool.


Toby Dickenson
[EMAIL PROTECTED]

___
Zope maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )




Re: [Zope] ZEO and a front end...

2000-07-21 Thread Bill Anderson

Toby Dickenson wrote:
 
 On Wed, 19 Jul 2000 10:07:30 -0600, Bill Anderson [EMAIL PROTECTED]
 wrote:
 
 Toby Dickenson wrote:
 
  On Tue, 18 Jul 2000 16:08:48 -0600, Bill Anderson [EMAIL PROTECTED]
  wrote:
 
   I might be reading more into his words than was intended, but I think
   this demonstrates the problem. Distributing multiple requests for one
   section across multiple servers is (what I consider to be)
   undesirable.
  
  You can actually do it either way. Curtis (AIUI) complained that the
  method described meant your site depended upon each of the section's
  servers being up, that there was no redundancy. So I described a way of
  doing it with redundancy.
 
  What you described doesn't scale up to having 1000's of sections
  (which I was assuming, and I think Curtis was too).  If this isn't a
  problem, then your solution is great.
 
 I don't understand why you think it doesn't. DNS has clearly
 demonstrated the ability to handle 'thousands', and the entire
 scalability of a cluster is the addition of machines. You appear to be
 desirous of having a machine handle a section. Thus, for thousands of
 sections, you have thousands of machines.
 
 DNS scales up to one machine per section, but a typical budget doesn't.
 
 Fortunately it doesn't need to. Even if we have 1000's of sections, I
 would expect only 10's to be active over a period of a few minutes.

You can have multiple sections per machine, as well. :^)
sec1.libc.org and sec2.libc.org can be on the same machine (heck, they
_could_ be different ZServers on the same machine!).

Real time analysis of section use compared to user browsing by url
analysis would, IMO, induce more overhead than you would save by doing
it based upon overall site usage patterns.
 
 Another way of looking at the issue is that it is similar to using
 in-memory Sessions. You have to ensure that each user's requests are
 routed to the machine that holds their session. The main difference is
 that it is a performance, not correctness issue.

Ah, but if you encoded the session information in the url, you get no
practical differences ;^)

 I don't want to think about handling Sessions using DNS and one
 machine per user ;-)

ygh, me either!



--
Do not meddle in the affairs of sysadmins, for they are easy to annoy,
and have the root password.





Re: [Zope] ZEO and a front end...

2000-07-20 Thread Bill Anderson

Curtis Maloney wrote:
 
 On Thu, 20 Jul 2000, Bill Anderson wrote:
  Toby Dickenson wrote:
   On Tue, 18 Jul 2000 16:08:48 -0600, Bill Anderson [EMAIL PROTECTED]
  
   wrote:
I might be reading more into his words than was intended, but I think
this demonstrates the problem. Distributing multiple requests for one
section across multiple servers is (what I consider to be)
undesirable.
   
   You can actually do it either way. Curtis (AIUI) complained that the
   method described meant your site depended upon each of the section's
   servers being up, that there was no redundancy. So I described a way of
   doing it with redundancy.
  
   What you described doesn't scale up to having 1000's of sections
   (which I was assuming, and I think Curtis was too).  If this isn't a
   problem, then your solution is great.
 
  I don't understand why you think it doesn't. DNS has clearly
  demonstrated the ability to handle 'thousands', and the entire
  scalability of a cluster is the addition of machines. You appear to be
  desirous of having a machine handle a section. Thus, for thousands of
  sections, you have thousands of machines. Again, with a ZEO cluster the
  bottleneck/SPOF would be the ZSS, but that _could_ be worked around, and
  has nothing to do with 'sections' of a website.
 
 Bill,
 
 Whilst the structures you've described are very effective, your example of
 libc.org required one thing in particular that I'm not sure is available:
 prior knowledge of which sections will be hit hardest.

You start with the most likely suspects, and then after a given time
interval, you adjust as needed. *most* site admins have a good idea of a
given section being more popular or frequented when the site is built.
That is as good a start as any other, if not better.

 
 Essentially, your setup allows any 'server' to become a 'server cluster' for
 scaling purposes.  Great!  So, if for now on we assume 'server' can mean
 'single or cluster of servers'

A logical assumption.
 
 The desire isn't for fixed server-section relationship.  Instead, a
 'preference' for that section to go to a particular server, so that the
 request 'hopefully' goes the server with the greatest chance of having the
 relevant objects in cache.

I see that it may not have been clear, but my example provided just
that. A preference is indicated by the weight given to servers and
sections. Let us say I have three servers. For the whole site, two get a
weight of 2, whilst a third gets a weight of 1. This third one, however,
gets a weight of 2 for the members section, whilst the other two get a
weight of 1. This provides a preference for server3 to serve up the
members section, though it is not a direct-only mapping. How does this
not fit the 'hopefully' desire?

If you _wanted_ a direct-only, you simply remove servers 1 and 2 from
the list of the members section. The really neat thing about this is
that it can be done at runtime.
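
A minimal sketch of the weighting described above, written in Python only
for illustration. The table layout and the function names (pick_server,
make_direct_only) are assumptions, not part of any particular load
balancer; the weights are the ones from the three-server example.

```python
import random

# Site-wide weights: two servers at 2, the third at 1 (from the example).
SITE_WEIGHTS = {"server1": 2, "server2": 2, "server3": 1}

# Per-section overrides: server3 is preferred for the members section.
SECTION_WEIGHTS = {
    "members": {"server1": 1, "server2": 1, "server3": 2},
}


def pick_server(section, rng=random):
    """Pick a server at random, biased by the weights for this section."""
    weights = SECTION_WEIGHTS.get(section, SITE_WEIGHTS)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def make_direct_only(section, server):
    """Runtime change: drop every other server from the section's list."""
    SECTION_WEIGHTS[section] = {server: SECTION_WEIGHTS[section][server]}
```

With these tables server3 should see roughly half of the members traffic
and the other two a quarter each; after make_direct_only("members",
"server3") it sees all of it.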
 
 In fact, with the further information provided, what you really want is for
 requests from a particular client to go to the same server.  This would be
 better served with a redirection to a server specific domain name
 (serverN.mysite.com).  However, for the initial request, your best choice is
 to go to the server that last served those pages.
 
 Since dynamically tracking this info would be onerous, by encouraging
 requests for one section toward a particular server, you improve the chances
 of it holding the relevant objects in cache, with merely a fraction of the
 processing/data overheads.

Right. I agree that tracking all of this would be onerous, which is why
I said I don't think it is worth the effort, and would cost more than it
saved. The scenario I described gives a preference for sections to go to
a particular server, thus giving you the 'encouragement'. :^)


 
  Beyond that, your bottleneck would be networking. Whether your
  individual BE servers responded directly to the web browser, or whether
  they were channeled through a single/multiple FrontEnd servers. The
  decision to implement a BE-Client vs. a BE-FE-Client topology has not
  been discussed, as it is irrelevant to the discussion.
 
 Ah, topology.  (I'm leaving it there.  I really don't have time to get into
 this fully :)

Yeah, topology is where the umm ... electrons hit the wire.
 
Mebbe I'll post this stuff to the Wiki ... the question is ... which
one?

--
Do not meddle in the affairs of sysadmins, for they are easy to annoy,
and have the root password.





Re: [Zope] ZEO and a front end...

2000-07-20 Thread Curtis Maloney

On Thu, 20 Jul 2000, Bill Anderson wrote:
 Curtis Maloney wrote:
[snip]
  Bill,
 
  Whilst the structures you've described are very effective, your
  example of libc.org required one thing in particular that I'm not sure is
  available: prior knowledge of which sections will be hit hardest.

 You start with the most likely suspects, and then after a given time
 interval, you adjust as needed. *most* site admins have a good idea of a
 given section being more popular or frequented when the site is built.
 That is as good a start as any other, if not better.

Ah... in my revision of this e-mail (scary, but i do that when i'm writing :) 
I must have dropped out the bit about tuning... (o8


  Essentially, your setup allows any 'server' to become a 'server
  cluster' for scaling purposes.  Great!  So, if for now on we assume
  'server' can mean 'single or cluster of servers'

 A logical assumption.

  The desire isn't for fixed server-section relationship. 
  Instead, a 'preference' for that section to go to a particular server, so
  that the request 'hopefully' goes the server with the greatest chance of
  having the relevant objects in cache.

 I see that it may not have been clear, but my example provided just
 that. A preference is indicated by the weight given to servers and
 sections. Let us say I have three servers. For the whole site, two get a
 weight of 2, whilst a third gets a weight of 1. This third one, however,
 gets a weight of 2 for the members section, whilst the other two get a
 weight of 1. This provides a preference for server3 to serve up the
 members section, though it is not a direct-only mapping. How does this
 not fit the 'hopefully' desire?

Ah... well... in your previous e-mails I don't recall you mentioning multiple 
weightings for a single server.  In this case, yes, your solution fits well.

  Ah, topology.  (I'm leaving it there.  I really don't have time to get
  into this fully :)

 Yeah, topology is where the umm ... electrons hit the wire.

hehehe

 Mebbe I'll post this stuff to the Wiki ... the question is ... which
 one?

Don't look at me... I've never even SEEN a wiki. (o8

Curtis





Re: [Zope] ZEO and a front end...

2000-07-19 Thread Toby Dickenson

On Tue, 18 Jul 2000 16:08:48 -0600, Bill Anderson [EMAIL PROTECTED]
wrote:

 I might be reading more into his words than was intended, but I think
 this demonstrates the problem. Distributing multiple requests for one
 section across multiple servers is (what I consider to be)
 undesirable.

You can actually do it either way. Curtis (AIUI) complained that the
method described meant your site depended upon each of the section's
servers being up, that there was no redundancy. So I described a way of
doing it with redundancy. 

What you described doesn't scale up to having 1000's of sections
(which I was assuming, and I think Curtis was too).  If this isn't a
problem, then your solution is great.

EddieWare does do 'intelligent' caching

eddieware is on my list of options to try out next month... I'll keep
you posted


Toby Dickenson
[EMAIL PROTECTED]





Re: [Zope] ZEO and a front end...

2000-07-19 Thread Bill Anderson

Toby Dickenson wrote:
 
 On Tue, 18 Jul 2000 16:08:48 -0600, Bill Anderson [EMAIL PROTECTED]
 wrote:
 
  I might be reading more into his words than was intended, but I think
  this demonstrates the problem. Distributing multiple requests for one
  section across multiple servers is (what I consider to be)
  undesirable.
 
 You can actually do it either way. Curtis (AIUI) complained that the
 method described meant your site depended upon each of the section's
 servers being up, that there was no redundancy. So I described a way of
 doing it with redundancy.
 
 What you described doesn't scale up to having 1000's of sections
 (which I was assuming, and I think Curtis was too).  If this isn't a
 problem, then your solution is great.

I don't understand why you think it doesn't. DNS has clearly
demonstrated the ability to handle 'thousands', and the entire
scalability of a cluster is the addition of machines. You appear to be
desirous of having a machine handle a section. Thus, for thousands of
sections, you have thousands of machines. Again, with a ZEO cluster the
bottleneck/SPOF would be the ZSS, but that _could_ be worked around, and
has nothing to do with 'sections' of a website. 

Beyond that, your bottleneck would be networking. Whether your
individual BE servers responded directly to the web browser, or whether
they were channeled through a single/multiple FrontEnd servers. The
decision to implement a BE-Client vs. a BE-FE-Client topology has not
been discussed, as it is irrelevant to the discussion.

In fact, come to think of it, I have noticed many sites redirect a
/foo/bar URL to a foo.domain.com or bar.domain.com.
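
A rough sketch of that kind of redirect, in Python purely for
illustration. The mapping table, the host names and the function name
redirect_target are hypothetical; whether the section name is dropped
from the path on the new host is a design choice this sketch simply
assumes.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from the first path segment to a section host.
SECTION_HOSTS = {"foo": "foo.domain.com"}


def redirect_target(url):
    """Return the section-host URL for a mapped path, or None to serve as-is."""
    parts = urlsplit(url)
    segments = parts.path.lstrip("/").split("/", 1)
    host = SECTION_HOSTS.get(segments[0])
    if host is None or parts.netloc == host:
        return None  # unmapped section, or already on the section host
    rest = "/" + segments[1] if len(segments) > 1 else "/"
    return urlunsplit((parts.scheme, host, rest, parts.query, parts.fragment))
```

A front end would send the returned URL in a redirect response and serve
the request itself when the function returns None.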

 
 EddieWare does do 'intelligent' caching
 
 eddieware is on my list of options to try out next month... I'll keep
 you posted

Cool.


--
Do not meddle in the affairs of sysadmins, for they are easy to annoy,
and have the root password.





Re: [Zope] ZEO and a front end...

2000-07-19 Thread Curtis Maloney

On Thu, 20 Jul 2000, Bill Anderson wrote:
 Toby Dickenson wrote:
  On Tue, 18 Jul 2000 16:08:48 -0600, Bill Anderson [EMAIL PROTECTED]
 
  wrote:
   I might be reading more into his words than was intended, but I think
   this demonstrates the problem. Distributing multiple requests for one
   section across multiple servers is (what I consider to be)
   undesirable.
  
  You can actually do it either way. Curtis (AIUI) complained that the
  method described meant your site depended upon each of the section's
  servers being up, that there was no redundancy. So I described a way of
  doing it with redundancy.
 
  What you described doesn't scale up to having 1000's of sections
  (which I was assuming, and I think Curtis was too).  If this isn't a
  problem, then your solution is great.

 I don't understand why you think it doesn't. DNS has clearly
 demonstrated the ability to handle 'thousands', and the entire
 scalability of a cluster is the addition of machines. You appear to be
 desirous of having a machine handle a section. Thus, for thousands of
 sections, you have thousands of machines. Again, with a ZEO cluster the
 bottleneck/SPOF would be the ZSS, but that _could_ be worked around, and
 has nothing to do with 'sections' of a website.

Bill,

Whilst the structures you've described are very effective, your example of 
libc.org required one thing in particular that I'm not sure is available: 
prior knowledge of which sections will be hit hardest.

Essentially, your setup allows any 'server' to become a 'server cluster' for 
scaling purposes.  Great!  So, if for now on we assume 'server' can mean 
'single or cluster of servers'

The desire isn't for fixed server-section relationship.  Instead, a 
'preference' for that section to go to a particular server, so that the 
request 'hopefully' goes the server with the greatest chance of having the 
relevant objects in cache.

In fact, with the further information provided, what you really want is for 
requests from a particular client to go to the same server.  This would be 
better served with a redirection to a server specific domain name 
(serverN.mysite.com).  However, for the initial request, your best choice is 
to go to the server that last served those pages.

Since dynamically tracking this info would be onerous, by encouraging 
requests for one section toward a particular server, you improve the chances 
of it holding the relevant objects in cache, with merely a fraction of the 
processing/data overheads.

 Beyond that, your bottleneck would be networking. Whether your
 individual BE servers responded directly to the web browser, or whether
 they were channeled through a single/multiple FrontEnd servers. The
 decision to implement a BE-Client vs. a BE-FE-Client topology has not
 been discussed, as it is irrelevant to the discussion.

Ah, topology.  (I'm leaving it there.  I really don't have time to get into 
this fully :)


 In fact, come to think of it, I have noticed many sites redirect a
 /foo/bar URL to a foo.domain.com or bar.domain.com.

  EddieWare does do 'intelligent' caching
 
  eddieware is on my list of options to try out next month... I'll keep
  you posted

 Cool.

Have a better one,
Curtis

<dtml-var standard_work_disclaimer>





Re: [Zope] ZEO and a front end...

2000-07-18 Thread Bill Anderson

Curtis Maloney wrote:
 
 On Tue, 18 Jul 2000, ethan mindlace fremen wrote:
  Curtis Maloney wrote:
   Yes, however his point is that by having each Zope instance
   'predominantly' serving one portion of the site, its cache will contain
   more objects relevant, and thus be just that little bit faster.
  
   Personally, I find this such a simple idea that it MUST be good. (o8
   So much so, in fact, that I've decided to have a crack at writing just
   such a redirector.  I feel the Zope world (and others, most likely) could
   benefit from a 'preferential' redirector.
 
  The way I would do this is have
 
  section1.contrived-example.com
  section2.contrived-example.com
  section3.contrived-example.com
 
  with siteAccess, and then each zope would serve it according to its IP
  (though each "could" serve each site).  Then you can use whatever IP/DNS
  load balancing tool your heart desires.
 
 I think most people seem to be missing the point here.
 
 The idea is that ALL servers can serve ALL content.  HOWEVER, the 'load
 balancer' will opt for a certain server for a certain URL, in order to
 improve cache hits.
 
 So, for www.contrived-example.com/dir1  it will first try server1, but if
 it's busy (or down) it will try others.  This way, the cache on server1 is
 more likely to contain objects relevant to /dir1  and thus have a higher hit
 rate, therefore improving performance.

No, I understand what is being discussed, I doubt the problem. :-)

Given an equal distribution*, then all the back-end (BE) servers will
have a fairly consistent cache content from server to server. You are
_equally_ likely to hit a server with that object in cache. The more
requests you have for a given object, the greater odds you'll see it in
the caches of all BE servers.  

* Now, not all systems are equal, this is true. However, in an
intelligent load balancing system, you 'weight' the faster/better
performing machines, such that they are hit more often. Since these
machines will be used more frequently, they will have the best chance to
have what you want in cache already. I just don't see that the
additional effort is worth it. The job is already done, and the
additional overhead would seem to outweigh any perceived increases in
performance.  See below.

 
 An enforced 'mapping', as you were suggesting, removes ALL redundancy from
 the site, but would likely provide even better cache hits.

How so?

http://my.site.com/sec1 is mapped to: sec1.site.com, which is load
balanced across as many machines as possible, using ZEO and a load
balancing tool. Any of the machines in the pool known as sec1 (nobody
said it had to be a single machine) could respond. since these machines
serve out sec1 predominantly (they can also participate in the general
site load balancing servers), these would have a better cache hit rate
on sec1 stuff than the primary BE servers.

Perhaps this can help:

www.libc.org (real site, fictional setup :) is a ZEO cluster.
 o The site's primary ZEO Clients number 5.
 o My load balancing tool lets me weight some servers over others.

/Members is a heavily trafficked section, so I want it to be separated
out using a rewrite tool (SiteAccess, Roxen, Apache mod_rewrite,
whatever) to send all /Members urls to members.libc.org. 

I set up two ZEO clients, M1 and M2. These two talk to the same ZSS as
the other 5, and respond to members.libc.org.

So, when you go to www.libc.org/Members, you will wind up on either M1
or M2. These machines are set up as low-weighted primary site servers
(bringing the total up to 7), so they will have a cache that is biased
towards /Members, but still can serve up any part of www.libc.org 

If M1 or M2 goes down, you stay up.

For added redundancy, you can add the other 5 primary servers as
low-weighted servers for  members.libc.org, such that if both M1 and M2
die, or get heavily loaded, one or more of the other 5 can pick up the
overage, just as M1 and M2 can for the 5 primary servers for the main
site.

Now you have 'preferred' machines, to improve cache-hit-rate for certain
heavily trafficked sections of your site, and maintain (or even improve)
overall performance and redundancy of the system.  Of course, you still
have ZSS as a SPOF, but even that can be gotten around with good design
and planning. :^)
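
The failover side of this setup can be sketched as below, in Python for
illustration only. M1 and M2 are the machines from the example; the names
P1 to P5 for the five primary ZEO clients and the weight values 10 and 1
are assumptions, and this sketch treats a weight as a plain preference
order rather than a traffic share.

```python
# Hypothetical names for the five primary ZEO clients.
PRIMARIES = ["P1", "P2", "P3", "P4", "P5"]
MEMBERS = ["M1", "M2"]

# (server, weight) per virtual host; a higher weight means more preferred.
POOLS = {
    "www.libc.org": [(s, 10) for s in PRIMARIES] + [(s, 1) for s in MEMBERS],
    "members.libc.org": [(s, 10) for s in MEMBERS] + [(s, 1) for s in PRIMARIES],
}


def candidates(host, up):
    """Return the live servers for a host, most preferred first."""
    live = [(s, w) for s, w in POOLS[host] if s in up]
    # sorted() is stable, so servers of equal weight keep their listed order.
    return [s for s, _ in sorted(live, key=lambda sw: -sw[1])]
```

With everything up, members.libc.org is answered by M1 or M2 first; with
both of them down, the five primaries take over, and every machine still
talks to the same ZSS.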

If that isn't enough, you can throw eddieware into the mix, which
*already* has the ability to redirect based upon the URL.

And-yes,-McGuyver-is-my-hero-ly y'rs Bill

--
Do not meddle in the affairs of sysadmins, for they are easy to annoy,
and have the root password.





Re: [Zope] ZEO and a front end...

2000-07-18 Thread Toby Dickenson

On Tue, 18 Jul 2000 04:22:16 -0600, Bill Anderson [EMAIL PROTECTED]
wrote:

 I think most people seem to be missing the point here.
 
 The idea is that ALL servers can serve ALL content.  HOWEVER, the 'load
 balancer' will opt for a certain server for a certain URL, in order to
 improve cache hits.
 
 So, for www.contrived-example.com/dir1  it will first try server1, but if
 it's busy (or down) it will try others.  This way, the cache on server1 is
 more likely to contain objects relevant to /dir1  and thus have a higher hit
 rate, therefore improving performance.

No, I understand what is being discussed, I doubt the problem. :-)

You are right, there's no problem in the scenario you described.

I'll fill in some more details about the fictional example for which I
still can't see an easy solution

Zope is used to store books. Each book object contains:
1. The text of the books, each page in a separate object
2. Images and diagrams for the book.
3. A ZCatalog full-text-index of the book.
Each book object allows:
1. Searching, viewing pages, etc.
2. Dynamically rendering a range of pages as pdf, postscript, etc.

The whole database stores 10,000 books, and is served by a cluster of
many identical Zope servers.

A typical usage pattern might be:
a. User searches through a book to find the interesting pages
b. He browses the pdf version of those pages
c. He tweaks the page range, and double-checks the pdf version.
d. then downloads a postscript version of that page range for printing

Assume that no one has accessed this book recently, so it's not in any
caches.

The cache has to be filled at step b. This transfers a lot of data -
possibly the whole content of the book - and introduces a noticeable
delay.

The possibility for optimisation comes at steps c and d. There is one
cache already filled with the right data - if the requests from c and
d can be directed to the same server as the original then the
cache-filling delay can be avoided.

This extra delay might not have a great impact on actual site
performance, but I've found a catastrophic effect on perceived
performance in some usability tests. Users seem happy to accept a
delay when they first access their data, but not if it is repeated in a
subsequent request.

Bill wrote...

 http://my.site.com/sec1 is mapped to: sec1.site.com, which
 is load balanced across as many machines as possible

I might be reading more into his words than was intended, but I think
this demonstrates the problem. Distributing multiple requests for one
section across multiple servers is (what I consider to be)
undesirable.

I want to move load balancing up one level of abstraction -
distributing sections across machines (rather than connections).
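
One possible way to get that kind of section-to-machine mapping without
tracking any state is rendezvous (highest-random-weight) hashing. This is
a swapped-in technique for illustration, not something proposed in this
thread; the function names and the use of SHA-1 are assumptions.

```python
import hashlib


def _score(section, server):
    """Deterministic pseudo-random score for a (section, server) pair."""
    digest = hashlib.sha1(f"{section}|{server}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def preferred_servers(section, servers):
    """Order servers for a section: the first is its home, the rest fallbacks."""
    return sorted(servers, key=lambda s: _score(section, s), reverse=True)


def route(section, servers, up):
    """Send the section to its most preferred server that is currently up."""
    for server in preferred_servers(section, servers):
        if server in up:
            return server
    return None  # nothing is up
```

Every request for the same section lands on the same machine while that
machine is up, so steps c and d would reuse the cache filled at step b,
and losing a machine only moves the sections that lived on it.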

If that isn't enough, you can throw eddieware into the mix, which
*already* has the ability to redirect based upon the URL.

I've not seen eddieware before - so it looks like I've got some reading
to do.

At a first glance it doesn't have any integrated http caching
(although it seems to have everything else ;-) and there's no obvious
place to hang squid. In my example above, I really want to be able to
cache the rendered pdf files.



Toby Dickenson
[EMAIL PROTECTED]





Re: [Zope] ZEO and a front end...

2000-07-18 Thread Bill Anderson

Toby Dickenson wrote:
 
 On Tue, 18 Jul 2000 04:22:16 -0600, Bill Anderson [EMAIL PROTECTED]
 wrote:
 
  I think most people seem to be missing the point here.
 
  The idea is that ALL servers can serve ALL content.  HOWEVER, the 'load
  balancer' will opt for a certain server for a certain URL, in order to
  improve cache hits.
 
  So, for www.contrived-example.com/dir1  it will first try server1, but if
  it's busy (or down) it will try others.  This way, the cache on server1 is
  more likely to contain objects relevant to /dir1  and thus have a higher hit
  rate, therefore improving performance.
 
 No, I understand what is being discussed, I doubt the problem. :-)
 
 You are right, there's no problem in the scenario you described.
 
 I'll fill in some more details about the fictional example for which I
 still can't see an easy solution
 
 Zope is used to store books. Each book object contains:
 1. The text of the books, each page in a separate object
 2. Images and diagrams for the book.
 3. A ZCatalog full-text-index of the book.
 Each book object allows:
 1. Searching, viewing pages, etc.
 2. Dynamically rendering a range of pages as pdf, postscript, etc.
 
 The whole database stores 10,000 books, and is served by a cluster of
 many identical Zope servers.
 
 A typical usage pattern might be:
 a. User searches through a book to find the interesting pages
 b. He browses the pdf version of those pages
 c. He tweaks the page range, and double-checks the pdf version.
 d. then downloads a postscript version of that page range for printing
 
 Assume that no one has accessed this book recently, so it's not in any
 caches.
 
 The cache has to be filled at step b. This transfers a lot of data -
 possibly the whole content of the book - and introduces a noticeable
 delay.
 
 The possibility for optimisation comes at steps c and d. There is one
 cache already filled with the right data - if the requests from c and
 d can be directed to the same server as the original then the
 cache-filling delay can be avoided.
 
 This extra delay might not have a great impact on actual site
 performance, but I've found a catastrophic effect on perceived
 performance in some usability tests. Users seem happy to accept a
 delay when they first access their data, but not if it is repeated in a
 subsequent request.
 
 Bill wrote...
 
  http://my.site.com/sec1 is mapped to: sec1.site.com, which
  is load balanced across as many machines as possible
 
 I might be reading more into his words than was intended, but I think
 this demonstrates the problem. Distributing multiple requests for one
 section across multiple servers is (what I consider to be)
 undesirable.

You can actually do it either way. Curtis (AIUI) complained that the
method described meant your site depended upon each of the section's
servers being up, that there was no redundancy. So I described a way of
doing it with redundancy. 
 
 I want to move load balancing up one level of abstraction -
 distributing sections across machines (rather than connections).

That's easier :) Make sec1.site.com a single machine, and all requests
for my.site.com/sec1 go to this machine, thus the cache will have it
loaded if it has been accessed at all. The downside, like Curtis
mentioned, is that if sec1 dies, you lose that part of the site.

 
 If that isn't enough, you can throw eddieware into the mix, which
 *already* has the ability to redirect based upon the URL.
 
 I've not seen eddieware before - so it looks like I've got some reading
 to do.
 
 At a first glance it doesn't have any integrated http caching
 (although it seems to have everything else ;-) and there's no obvious
 place to hang squid. In my example above, I really want to be able to
 cache the rendered pdf files.

EddieWare does do 'intelligent' caching, allowing you to separate out
sections of a site to a server (for example, all images come from this
machine, and text from that one, etc.), and it works at the IP Address
level. You simply plug in squid wherever, AIUI.



--
Do not meddle in the affairs of sysadmins, for they are easy to annoy,
and have the root password.





Re: [Zope] ZEO and a front end...

2000-07-18 Thread ethan mindlace fremen

Curtis Maloney wrote:
 I think most people seem to be missing the point here.

While I think Bill addressed this, I am not missing your point.  By subdomaining
areas, you can assign those subdomains an IP address, which can be primarily
served by a Zope Client.

 The idea is that ALL servers can serve ALL content.  HOWEVER, the 'load
 balancer' will opt for a certain server for a certain URL, in order to
 improve cache hits.

Because you're using SiteAccess, every node can access the objects that the
subdomain-primary serves, so you can do loadbalancing or failover.  There might
be some delay as the secondaries draw objects from the siteserver.

 Have a better one,

My life has been so good lately I'm almost afraid to think of what that would be
like.

ethan mindlace fremen
Zopatistas Unite!





Re: [Zope] ZEO and a front end...

2000-07-17 Thread Toby Dickenson

On Mon, 17 Jul 2000 05:45:49 -0600, Bill Anderson [EMAIL PROTECTED]
wrote:

  I'm wondering if anyone can suggest something good to run in front of
  2 zopes talking to a zeo server - for failover and load balancing.  I


 One disadvantage of that solution is that each Zope will have poor
 locality-of-reference within the object database. I think I can avoid
 that using a squid redirector (www.squid-cache.org). I'll post any
 news.

What do you mean?

Suppose your zope site www.contrived-example.com is comprised of many
largely independent sections,
http://www.contrived-example.com/section1,
http://www.contrived-example.com/section2 etc.

Your multiple Zopes can all serve all of these sections, however
there's not enough storage for each machine to hold all the sections
simultaneously.

You can make better use of ZODB's in-memory cache and the ZEO pickle
cache if the requests for /section1 usually go to the same server.

However, you don't want to hardwire this relationship since any other
machine should handle /section1 if its 'home' machine goes down, or is
busy.


This problem can't be solved without parsing HTTP headers, so
low-level solutions such as www.linux-ha.com are not a complete solution.
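
For anyone who hasn't written one: a squid redirector is just a helper
program. Squid writes one request per line to its stdin (the URL first,
then client address, ident and method) and reads back a rewritten URL, or
a blank line meaning "leave it alone". A rough sketch in Python follows.
The backend names are made up, and hashing the first path segment is only
my assumption of one simple way to give each section a stable 'home'. It
does no failover yet, which is the part that still needs solving.

```python
import sys
import zlib
from urllib.parse import urlsplit, urlunsplit

# Hypothetical Zope ZEO clients sitting behind squid.
BACKENDS = ["zope1.example.com:8080", "zope2.example.com:8080"]


def home_backend(path, backends=BACKENDS):
    """Pick a stable 'home' backend from the first path segment."""
    segments = [s for s in path.split("/") if s]
    section = segments[0] if segments else ""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return backends[zlib.crc32(section.encode("utf-8")) % len(backends)]


def rewrite(line, backends=BACKENDS):
    """Rewrite one redirector input line; '' means 'do not change the URL'."""
    fields = line.split()
    if not fields:
        return ""
    parts = urlsplit(fields[0])
    if parts.scheme != "http":
        return ""
    target = home_backend(parts.path, backends)
    return urlunsplit(("http", target, parts.path, parts.query, ""))


def main():
    """Answer squid one line at a time until stdin is closed."""
    for line in sys.stdin:
        sys.stdout.write(rewrite(line) + "\n")
        sys.stdout.flush()  # squid waits for exactly one reply per request
```

Calling main() from a small wrapper script would make this usable as the
redirector program; everything for /section1 then lands on the same Zope
as long as the backend list does not change.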



Toby Dickenson
[EMAIL PROTECTED]





Re: [Zope] ZEO and a front end...

2000-07-17 Thread Bill Anderson

Toby Dickenson wrote:
 
 On Mon, 17 Jul 2000 05:45:49 -0600, Bill Anderson [EMAIL PROTECTED]
 wrote:
 
   I'm wondering if anyone can suggest something good to run in front of
   2 zopes talking to a zeo server - for failover and load balancing.  I
 
  One disadvantage of that solution is that each Zope will have poor
  locality-of-reference within the object database. I think I can avoid
  that using a squid redirector (www.squid-cache.org). I'll post any
  news.
 
 What do you mean?
 
 Suppose your zope site www.contrived-example.com is comprised of many
 largely independent sections,
 http://www.contrived-example.com/section1,
 http://www.contrived-example.com/section2 etc.
 
 Your multiple Zopes can all serve all of these sections, however
 there's not enough storage for each machine to hold all the sections
 simultaneously.

As I understand ZEO, each machine _doesn't_ hold the site. The ZEO
clients (servers) communicate with a central ZSS (Zope Storage Server).
So in this contrived example, the problem is non-extant. ;-)


 You can make better use of ZODB's in-memory cache and the ZEO pickle
 cache if the requests for /section1 usually go to the same server.

AIUI (I'm no ZEO expert, I just use it ;) ), you can have each server
cache a certain amount, thus ameliorating the problem somewhat.

 
--
Do not meddle in the affairs of sysadmins, for they are easy to annoy,
and have the root password.





Re: [Zope] ZEO and a front end...

2000-07-17 Thread Curtis Maloney

On Tue, 18 Jul 2000, Bill Anderson wrote:
  Your multiple Zopes can all serve all of these sections, however
  there's not enough storage for each machine to hold all the sections
  simultaneously.

 As I understand ZEO, each machine _doesn't_ hold the site. The ZEO
 clients (servers) communicate with a central ZSS (Zope Storage Server).
 So in this contrived example, the problem is non-extant. ;-)

  You can make better use of ZODB's in-memory cache and the ZEO pickle
  cache if the requests for /section1 usually go to the same server.

 AIUI (I'm no ZEO expert, I just use it ;) ), you can have each server
 cache a certain amount, thus ameliorating the problem somewhat.

Yes, however his point is that by having each Zope instance 'predominantly' 
serving one portion of the site, its cache will contain more objects 
relevant, and thus be just that little bit faster.
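
To put a rough number on it, here is a toy simulation (not a measurement
of ZEO, just an illustration). It assumes each server has a small LRU
cache and that requests are spread uniformly over the objects; the
section counts, cache size and request count are all invented for the
example.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache that only tracks hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, load the key and evict if full."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used
        return False


def random_route(section, rng, servers):
    """Plain load balancing: any server may get any request."""
    return rng.randrange(servers)


def affinity_route(section, rng, servers):
    """Preferential routing: a section always goes to its 'home' server."""
    return section % servers


def hit_rate(route, sections=4, objects_per_section=50, servers=4,
             capacity=50, requests=20000, seed=0):
    """Fraction of requests served from a server's cache under `route`."""
    rng = random.Random(seed)
    caches = [LRUCache(capacity) for _ in range(servers)]
    hits = 0
    for _ in range(requests):
        section = rng.randrange(sections)
        obj = (section, rng.randrange(objects_per_section))
        server = route(section, rng, servers)
        hits += caches[server].access(obj)
    return hits / requests


if __name__ == "__main__":
    print("random routing  :", hit_rate(random_route))
    print("section affinity:", hit_rate(affinity_route))
```

With these made-up numbers each cache can hold exactly one section, so
under affinity routing only the first touch of each object misses, while
under random routing every cache is fought over by all four sections.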

Personally, I find this such a simple idea that it MUST be good. (o8
So much so, in fact, that I've decided to have a crack at writing just such a 
redirector.  I feel the Zope world (and others, most likely) could benefit 
from a 'preferential' redirector.

Watch this space. (o8

Have a better one,
Curtis Maloney.





Re: [Zope] ZEO and a front end...

2000-07-17 Thread Bill Anderson

Curtis Maloney wrote:
 
 On Tue, 18 Jul 2000, Bill Anderson wrote:
   Your multiple Zopes can all serve all of these sections, however
   there's not enough storage for each machine to hold all the sections
   simultaneously.
 
  As I understand ZEO, each machine _doesn't_ hold the site. The ZEO
  clients (servers) communicate with a central ZSS (Zope Storage Server).
  So in this contrived example, the problem is non-extant. ;-)
 
   You can make better use of ZODB's in-memory cache and the ZEO pickle
   cache if the requests for /section1 usually go to the same server.
 
  AIUI (I'm no ZEO expert, I just use it ;) ), you can have each server
  cache a certain amount, thus ameliorating the problem somewhat.
 
 Yes, however his point is that by having each Zope instance 'predominantly'
 serving one portion of the site, its cache will contain more objects
 relevant, and thus be just that little bit faster.

Which is why I said 'somewhat' ;-) Now, depending on whether you set up a
shared FS across the systems, this cache could theoretically be shared. If,
for example, you used afs/codafs, you could have each server using the same
FS, reading the same cache, unless something in ZEO prevents two
processes from using the cache, of course. Now, AFS/Codafs are not for
the faint of heart, but I get the feeling that someone willing to try an
IPVS setup is likely made of sturdier stuff. ;-)

 Personally, I find this such a simple idea that it MUST be good. (o8
 So much so, in fact, that I've decided to have a crack at writing just such a
 redirector.  I feel the Zope world (and others, most likely) could benefit
 from a 'preferential' redirector.

Roxen can do this to some extent. Not sure how well, since I am not
using that aspect of it, but I do see it in the modules. I do know that
eddieware can also do this:
"""
The Eddie Intelligent HTTP Gateway package allows specialised functions
to be allocated to specific Back End Servers. As each user request
arrives, it is parsed by the Front End Server which then: 
o Splits multiple requests within HTTP 1.1 
  persistent connections into a number of 
  individual requests. 

o Sends the individual requests to Back End 

This allows a system administrator to, for example, dedicate certain
Back End Servers to be CGI processing engines, while other machines may
be dedicated to acting as image repositories. The individual machines
may then be tuned to optimise their performance to these specific tasks. 
"""

I have considered trying out eddieware, but haven't yet. 


--
Do not meddle in the affairs of sysadmins, for they are easy to annoy,
and have the root password.





Re: [Zope] ZEO and a front end...

2000-07-17 Thread ethan mindlace fremen

Curtis Maloney wrote:
 Yes, however his point is that by having each Zope instance 'predominantly'
 serving one portion of the site, its cache will contain more objects
 relevant, and thus be just that little bit faster.
 
 Personally, I find this such a simple idea that it MUST be good. (o8
 So much so, in fact, that I've decided to have a crack at writing just such a
 redirector.  I feel the Zope world (and others, most likely) could benefit
 from a 'preferential' redirector.

The way I would do this is have 

section1.contrived-example.com
section2.contrived-example.com
section3.contrived-example.com

with SiteAccess, and then each zope would serve it according to its IP
(though each "could" serve each site).  Then you can use whatever IP/DNS
load balancing tool your heart desires.

a thought,
-- 
ethan mindlace fremen
Zopatista Community Liason
Abnegate I!





Re: [Zope] ZEO and a front end...

2000-07-17 Thread Curtis Maloney

On Tue, 18 Jul 2000, ethan mindlace fremen wrote:
 Curtis Maloney wrote:
  Yes, however his point is that by having each Zope instance
  'predominantly' serving one portion of the site, its cache will contain
  more objects relevant, and thus be just that little bit faster.
 
  Personally, I find this such a simple idea that it MUST be good. (o8
  So much so, in fact, that I've decided to have a crack at writing just
  such a redirector.  I feel the Zope world (and others, most likely) could
  benefit from a 'preferential' redirector.

 The way I would do this is have

 section1.contrived-example.com
 section2.contrived-example.com
 section3.contrived-example.com

 with SiteAccess, and then each zope would serve it according to its IP
 (though each "could" serve each site).  Then you can use whatever IP/DNS
 load balancing tool your heart desires.

I think most people seem to be missing the point here.

The idea is that ALL servers can serve ALL content.  HOWEVER, the 'load
balancer' will opt for a certain server for a certain URL, in order to
improve cache hits.

So, for www.contrived-example.com/dir1  it will first try server1, but if
it's busy (or down) it will try others.  This way, the cache on server1 is
more likely to contain objects relevant to /dir1  and thus have a higher hit
rate, therefore improving performance.

An enforced 'mapping', as you were suggesting, removes ALL redundancy from
the site, but would likely provide even better cache hits.
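
In code, the selection rule I have in mind is roughly the following. This
is only a sketch: the server names, the preference table and the
is_available check are placeholders for whatever the real redirector
would use to decide that a machine is busy or down.

```python
def choose_server(path, servers, preferred, is_available):
    """Return the preferred server for this path if it is usable,
    otherwise the first other server that is; None if all are out."""
    segments = [s for s in path.split("/") if s]
    first = preferred.get(segments[0]) if segments else None
    # Preferred server goes to the front; every other server stays eligible.
    candidates = ([first] if first in servers else []) + \
                 [s for s in servers if s != first]
    for server in candidates:
        if is_available(server):
            return server
    return None


# Hypothetical setup: three Zopes, /dir1 'lives' on server1.
SERVERS = ["server1", "server2", "server3"]
PREFERRED = {"dir1": "server1", "dir2": "server2"}
```

So choose_server("/dir1/index_html", SERVERS, PREFERRED, check) gives
server1 while it is healthy, and quietly falls through to server2 or
server3 when it is not. The cache benefit is kept in the normal case
without giving up the redundancy.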

 a thought,

Have a better one,
Curtis
