Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-26 Thread Bob Ippolito
On 8/25/07, Cliff Wells [EMAIL PROTECTED] wrote:

> On Sat, 2007-08-25 at 20:24 +0300, Pekka Jääskeläinen wrote:
> > On 8/25/07, Ben Bangert [EMAIL PROTECTED] wrote:
> > > I'd highly suggest memcached rather than database backend.
> > > It's easy to setup, and of course, very fast. :)
> >
> > Yes, but as far as I know, memcached objects can be always replaced,
> > that is, you cannot define an object to be persistent, can you? Thus,
> > one also needs to back up the session data to a persistent storage so
> > it does not get replaced when the cache fills up.
>
> This is a good point and is reiterated here:
>
> http://www.socialtext.net/memcached/index.cgi?sessions


Well, if you have a cache that fills up with active sessions then you
have a pretty huge problem, because your performance characteristics
are going to be pretty bad at that point. You shouldn't really need
db-backed ephemeral sessions, unless you need the flexibility
to restart your memcached servers without interruption.

However, if you're not doing a lot with sessions I'd suggest
storing your state directly in cookies, since they're usually
equivalent to sessions but don't require any server-side state.
Though of course you should use HMAC or the like to guarantee that
your server generated that cookie's contents.
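
A minimal sketch of the signed-cookie idea above, using only Python's
standard hmac and hashlib modules. The names SECRET_KEY, sign_cookie and
verify_cookie, and the "value|signature" layout, are assumptions made for
illustration; none of this is Pylons or Beaker API.

```python
import hashlib
import hmac

# Hypothetical secret: every server process would need to load the same
# value from its configuration for verification to work across backends.
SECRET_KEY = b"change-me"


def sign_cookie(value, key=SECRET_KEY):
    """Return 'value|hexdigest' so the server can verify the value later."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return value + "|" + digest


def verify_cookie(cookie, key=SECRET_KEY):
    """Return the original value if the signature matches, else None."""
    value, sep, digest = cookie.rpartition("|")
    if not sep:
        return None  # no signature part at all
    expected = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None
```

Note that this only proves the server issued the value. The contents stay
readable by the client, so nothing secret should be stored this way.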

-bob

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
pylons-discuss group.
To post to this group, send email to pylons-discuss@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/pylons-discuss?hl=en
-~--~~~~--~~--~--~---



Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-26 Thread Pekka Jääskeläinen
On 8/26/07, Bob Ippolito [EMAIL PROTECTED] wrote:

> Well if you have a cache that fills up with active sessions then you

The cache is also used to store other objects, so it will eventually fill
up. Sure, if I have a separate memcached instance only for session data,
this should not be a problem.

> However, if you're not doing a lot with sessions I'd have to suggest
> storing your state directly in cookies, since they're usually
> equivalent with sessions but don't require any server-side state.
> Though of course you should use HMAC or the like to guarantee that
> your server generated that cookie's contents.

This is good advice. I'll try to design future web apps with this in
mind.

Thanks,

-- 
--PJ




Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-26 Thread Pekka Jääskeläinen
On 8/26/07, Bob Ippolito [EMAIL PROTECTED] wrote:

> It will eventually fill up, but data in memcached is a combination of
> (opt-in) expiration and LRU. Sessions shouldn't be least recently
> used, and they only get expired if you gave them an expiration time
> (which probably makes sense to do).

Right. But still, this is a rough, should-work-in-most-cases solution,
as there is a very real chance of losing session data when a user idles
for a while on a busy system. Whether that is serious depends on the
system, of course.
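
The behaviour discussed above (optional expiration plus least-recently-used
eviction) can be illustrated with a toy cache. This is a simplified sketch
and not how memcached is actually implemented; the TinyCache class and its
key names are hypothetical.

```python
import time
from collections import OrderedDict


class TinyCache:
    """Toy cache: fixed capacity, LRU eviction, optional per-key TTL."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock
        self._items = OrderedDict()  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self.clock() + ttl
        self._items[key] = (value, expires_at)
        self._items.move_to_end(key)         # a write counts as a use
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self.clock() >= expires_at:
            del self._items[key]             # expired: treat as a miss
            return None
        self._items.move_to_end(key)         # a read refreshes recency
        return value


cache = TinyCache(capacity=2)
cache.set("session:alice", {"user": "alice"})
cache.set("page:/home", "<html>...</html>")
cache.set("page:/about", "<html>...</html>")  # pushes out the idle session
```

After the third write the idle session is gone even though it never had an
expiration time, which is the scenario described here: other traffic in a
shared cache can push out a session that has not been touched for a while.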


-- 
--PJ




Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-25 Thread Ben Bangert

On Aug 24, 2007, at 4:15 PM, Cliff Wells wrote:


> Memcached seems the easiest (and probably best) solution.  As Philip
> mentions, however, it isn't well-documented how to use Beaker with
> Memcached.  If you decide to go this route, maybe update the wiki?

I'd highly suggest memcached rather than a database backend. It's easy
to set up, and of course, very fast. :)


Here's the setup info for memcached:

beaker.session.type = ext:memcached
beaker.session.url = 192.147.32.2

The url can also be a comma-separated list of IPs/hostnames, should
you be using multiple memcached servers.


Cheers,
Ben



Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-25 Thread Pekka Jääskeläinen
On 8/25/07, Ben Bangert [EMAIL PROTECTED] wrote:

> I'd highly suggest memcached rather than database backend. It's easy
> to setup, and of course, very fast. :)

Yes, but as far as I know, memcached objects can always be replaced,
that is, you cannot define an object to be persistent, can you? Thus, one
also needs to back up the session data to persistent storage so it does
not get replaced when the cache fills up.

> Here's the setup info for memcached:
>
> beaker.session.type = ext:memcached
> beaker.session.url = 192.147.32.2
>
> The url can be a comma separated list of IP/hostnames as well should
> you be using multiple memcached servers.

Thank you. This helps, but I still need to define persistent storage
in addition to memcached, for the reason mentioned above.


-- 
--PJ




Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-25 Thread Cliff Wells

On Sat, 2007-08-25 at 20:24 +0300, Pekka Jääskeläinen wrote:
> On 8/25/07, Ben Bangert [EMAIL PROTECTED] wrote:
> > I'd highly suggest memcached rather than database backend.
> > It's easy to setup, and of course, very fast. :)
>
> Yes, but as far as I know, memcached objects can be always replaced,
> that is, you cannot define an object to be persistent, can you? Thus,
> one also needs to back up the session data to a persistent storage so
> it does not get replaced when the cache fills up.
 

This is a good point and is reiterated here:

http://www.socialtext.net/memcached/index.cgi?sessions


Regards,
Cliff






implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Pekka Jääskeläinen
Hello,

In our new project we want to implement the web application from the
beginning to be easily scalable to 1) multiple cores (on the same
server) and to 2) multiple separate servers.

Due to the infamous GIL ruining multithreading scalability of Python,
the only sensible way to implement both 1) and 2) seems to be to run
multiple instances of the server (we plan to use Paster to serve the
app) and use a separate load balancer (possibly some Apache mod... any
recommendations?) to redirect requests to each of the server instances,
running either on the same machine (to take advantage of 1) or on
separate servers (to implement 2).

Of course, in this setup there's no real difference between 1) and 2),
which is kind of nice.

However, we started to think about the practical issues with this in
Pylons. In principle, making this work reliably means distributing the
session data so that all server processes can access each session's
data. For this we plan to store the session data in the database and
reduce its overhead using memcached.

How can this be implemented reliably on Pylons? The first thing that
pops into my mind is to add code in __call__() of the base controller to
load the session data (from memcached or from the DB). But how about
saving? This would be best implemented in session.save() so there's no
useless saving (which invalidates the memcached entry) if nothing has
been changed. Is there a way to do this nicely without poking at the
Pylons code?
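
The load-from-cache-then-DB and save-only-when-changed idea described above
can be sketched roughly as follows. This is an illustration under stated
assumptions, not Pylons or Beaker code: TrackedSession and SessionStore are
hypothetical names, and plain dicts stand in for memcached and the database.

```python
from collections import UserDict


class TrackedSession(UserDict):
    """Dict-like session that remembers whether it was modified."""

    def __init__(self, initial=None):
        super().__init__(initial or {})
        self.dirty = False  # loading initial data does not count as a change

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.dirty = True

    def __delitem__(self, key):
        super().__delitem__(key)
        self.dirty = True


class SessionStore:
    """Read-through cache in front of a persistent store.

    `cache` and `db` are plain dicts standing in for memcached and a
    database table (assumptions made for this sketch).
    """

    def __init__(self, cache, db):
        self.cache = cache
        self.db = db
        self.db_writes = 0

    def load(self, session_id):
        data = self.cache.get(session_id)
        if data is None:                     # cache miss: fall back to the DB
            data = self.db.get(session_id, {})
            self.cache[session_id] = dict(data)
        return TrackedSession(data)

    def save(self, session_id, session):
        if not session.dirty:                # nothing changed: skip the write
            return False
        data = dict(session.data)
        self.db[session_id] = data           # persistent copy first
        self.cache[session_id] = dict(data)  # then refresh the cache entry
        self.db_writes += 1
        session.dirty = False
        return True
```

One limitation of this approach: changes made inside a nested mutable value
(for example appending to a list stored in the session) are not detected,
so such values would have to be reassigned to mark the session as changed.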

Any ideas and comments concerning this kind of scalable Pylons
implementation are welcome.

Thanks,

-- 
--PJ




Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Philip Jenvey


On Aug 24, 2007, at 4:41 AM, Pekka Jääskeläinen wrote:

> Hello,
>
> In our new project we want to implement the web application from
> the beginning to be easily scalable to 1) multiple cores (on the
> same server) and to 2) multiple separate servers.
>
> Due to the infamous GIL ruining multithreading scalability of
> Python, the only sensible way to implement both 1) and 2) seems to
> be to run multiple instances of the server (we plan to use the
> Paster to serve the app) and use a separate load-balancer (possibly
> some Apache mod... any recommendations?) to redirect requests to
> each of the server instances running either on the same machine (to
> take advantage of 1) or to separate servers (to implement 2).

Apache 2.2 has a mod_proxy_balancer. If performance is a concern, you  
should go with the CherryPy WSGI server.

use = egg:PasteScript#cherrypy

instead of

use = egg:Paste#http


> Of course, in this setup there's no real difference between 1) and
> 2) which is kind of nice.
>
> However, we started to think the practical issues with this in
> Pylons. In principle, making this work reliably means to distribute
> the session data so all server processes can access each session's
> data. For this we plan to store the session data to the database
> and reduce its overhead using memcached.

Beaker has support for using a database or a memcached backend,  
though the docs on how to do this seem to be currently lacking.

--
Philip Jenvey






Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Cliff Wells

On Fri, 2007-08-24 at 14:41 +0300, Pekka Jääskeläinen wrote:

> However, we started to think the practical issues with this in Pylons.
> In principle, making this work reliably means to distribute the session
> data so all server processes can access each session's data.

I'm curious about this too.  I've been actually doing it already for
some time (albeit not on any heavily loaded sites) using Nginx and a
default Pylons setup and quite frankly I've not had any issues despite
taking no precautions.

My only possible explanation is that either a) Nginx makes some attempt
to track sessions itself and always passes the same IP back to the same
backend Pylons process or b) Pylons automagically makes it work.

Either way I'd like to feel a little more certain about this.

Regards,
Cliff





Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Cliff Wells

On Fri, 2007-08-24 at 15:16 -0700, Cliff Wells wrote:
> On Fri, 2007-08-24 at 14:41 +0300, Pekka Jääskeläinen wrote:
>
> > However, we started to think the practical issues with this in Pylons.
> > In principle, making this work reliably means to distribute the
> > session data so all server processes can access each session's data.
>
> I'm curious about this too.  I've been actually doing it already for
> some time (albeit not on any heavily loaded sites) using Nginx and a
> default Pylons setup and quite frankly I've not had any issues despite
> taking no precautions.
>
> My only possible explanation is that either a) Nginx makes some attempt
> to track sessions itself and always passes the same IP back to the same
> backend Pylons process or b) Pylons automagically makes it work.
>
> Either way I'd like to feel a little more certain about this.

Just to clarify: I'm not load balancing across multiple servers, just
multiple Pylons backends on the same machine.

Regards,
Cliff





Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Bob Ippolito
On 8/24/07, Cliff Wells [EMAIL PROTECTED] wrote:

> On Fri, 2007-08-24 at 15:16 -0700, Cliff Wells wrote:
> > On Fri, 2007-08-24 at 14:41 +0300, Pekka Jääskeläinen wrote:
> >
> > > However, we started to think the practical issues with this in Pylons.
> > > In principle, making this work reliably means to distribute the
> > > session data so all server processes can access each session's data.
> >
> > I'm curious about this too.  I've been actually doing it already for
> > some time (albeit not on any heavily loaded sites) using Nginx and a
> > default Pylons setup and quite frankly I've not had any issues despite
> > taking no precautions.
> >
> > My only possible explanation is that either a) Nginx makes some attempt
> > to track sessions itself and always passes the same IP back to the same
> > backend Pylons process or b) Pylons automagically makes it work.
> >
> > Either way I'd like to feel a little more certain about this.
>
> Just to clarify: I'm not load balancing across multiple servers, just
> multiple Pylons backends on the same machine.


Nginx definitely doesn't do that unless you're using the hashing thing...

We just excised sessions from our app altogether; we only really
stored authentication data in there, so we moved it to a cookie. Now we
just randomly send requests to any server that's up, and that works
quite well.
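
"The hashing thing" is not spelled out in this thread; it presumably means
picking the backend from a hash of the client address, as Nginx's ip_hash
upstream directive does. A rough sketch of that idea, with hypothetical
backend addresses and a hypothetical backend_for helper (not Nginx's actual
algorithm):

```python
import hashlib

# Hypothetical backends, e.g. several paster processes on one machine.
BACKENDS = ["127.0.0.1:5000", "127.0.0.1:5001", "127.0.0.1:5002"]


def backend_for(client_ip, backends=BACKENDS):
    """Map a client IP to one backend, deterministically."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    # Use the first four bytes of the digest as an integer bucket number.
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]
```

The same client always lands on the same process, so process-local session
data keeps working. The price is that adding or removing a backend remaps
most clients, which is one reason to prefer shared or cookie-based state.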

-bob




Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Pekka Jääskeläinen

> Apache 2.2 has a mod_proxy_balancer. If performance is a concern, you
> should go with the CherryPy WSGI server.
>
> use = egg:PasteScript#cherrypy
>
> instead of
>
> use = egg:Paste#http



This didn't work. I changed the line

use = egg:Paste#http
to
use = egg:Paste#cherrypy

Even though I just installed CherryPy, I got this error:

LookupError: Entry point 'cherrypy' not found in egg 'Paste' (dir:
/usr/lib/python2.5/site-packages/Paste-1.4-py2.5.egg; protocols:
paste.server_factory, paste.server_runner; entry_points: )

when trying to start paster serve. I've had the same trouble when
applying any kind of filter(?) (for example the profiler or the thread
watcher) with Paste: the modules are not found.

How do I add more search paths for Paster in the Pylons conf?

> Beaker has support for using a database or a memcached backend,
> though the docs on how to do this seem to be currently lacking.

Sounds good, but a bit useless if there are no directions for how to take
advantage of the feature. Should I take a look at the Beaker sources and
maybe write the howto myself? Can you point me to the right spot in the
source code?

-- 
--PJ




Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Pekka Jääskeläinen
On 8/25/07, Cliff Wells [EMAIL PROTECTED] wrote:

> My only possible explanation is that either a) Nginx makes some attempt
> to track sessions itself and always passes the same IP back to the same
> backend Pylons process or b) Pylons automagically makes it work.

...or c) you don't use the session data storage much and you've been lucky.

On the other hand... if session data is always stored on disk in the
same directory and loaded on each request, multiple server processes
accessing the same data should actually work quite fine. To scale this
to multiple server machines one would need a networked file system
mount point in which to store the session data.
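
A minimal sketch of why a shared directory makes this work: every process
reads and writes the same files, so any of them can serve any session. The
FileSessionStore class, the JSON format and the file naming are assumptions
for illustration and are not how Beaker stores its session files.

```python
import json
import os
import tempfile


class FileSessionStore:
    """Stores each session as one JSON file in a shared directory."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, session_id):
        # Session ids are assumed to be server-generated alphanumeric
        # strings; anything else is rejected so an id cannot point
        # outside the session directory.
        if not session_id.isalnum():
            raise ValueError("invalid session id")
        return os.path.join(self.directory, session_id + ".json")

    def load(self, session_id):
        try:
            with open(self._path(session_id), encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return {}  # unknown session: start empty

    def save(self, session_id, data):
        target = self._path(session_id)
        # Write to a temporary file in the same directory, then rename it
        # over the target so readers never see a half-written file.
        fd, tmp = tempfile.mkstemp(dir=self.directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
        os.replace(tmp, target)
```

There is no locking here, so two processes saving the same session at the
same time end up with whichever write finished last. On a networked file
system the rename guarantees may also be weaker than on a local disk.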

However, I'd like to avoid using the disk as much as possible.

-- 
--PJ




Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Cliff Wells

On Sat, 2007-08-25 at 01:46 +0300, Pekka Jääskeläinen wrote:
> On 8/25/07, Cliff Wells [EMAIL PROTECTED] wrote:
> > My only possible explanation is that either a) Nginx makes some attempt
> > to track sessions itself and always passes the same IP back to the same
> > backend Pylons process or b) Pylons automagically makes it work.
>
> ...or c) you don't use much the session data storage and you've been
> lucky.

I think I'd have had to have been *much* luckier than I can take credit
for ;-)

> On the other hand... if session data is always stored to the disk to
> the same directory and location and loaded on each request, multiple
> server processes accessing the same data should actually work quite
> fine.

This seems the likely answer.

> To scale this to multiple server machines one would need a networked
> file system mount point in which to store the session data.
>
> However, I'd like to avoid using the disk as much as possible.

Memcached seems the easiest (and probably best) solution.  As Philip
mentions, however, it isn't well-documented how to use Beaker with
Memcached.  If you decide to go this route, maybe update the wiki?

Regards,
Cliff





Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Cliff Wells

On Sat, 2007-08-25 at 01:42 +0300, Pekka Jääskeläinen wrote:
> > Apache 2.2 has a mod_proxy_balancer. If performance is a concern, you
> > should go with the CherryPy WSGI server.
> >
> > use = egg:PasteScript#cherrypy
> >
> > instead of
> >
> > use = egg:Paste#http
>
> This didn't work, I changed the line
>
> use = egg:Paste#http
> to
> use = egg:Paste#cherrypy

This is wrong AFAIK.  I'm using CP's wsgiserver (which is a standalone
app and included with Paste, so you don't actually need to install CP3,
although you certainly can), and this is my entry:

[server:main]
# use = egg:Paste#http
use = egg:PasteScript#cherrypy

Note PasteScript vs Paste.

As an aside, my light testing with ab showed CP3's wsgiserver to be
considerably faster than Paste's http server, but also seemed to fail
under load slightly more often (more failed requests, not crashes).

Regards,
Cliff







Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Pekka Jääskeläinen
On 8/25/07, Cliff Wells [EMAIL PROTECTED] wrote:

> This is wrong AFAIK.  I'm using CP's wsgiserver (which is a standalone
> app and included with Paste, so you don't actually need to install CP3,
> although you certainly can), and this is my entry:
>
> [server:main]
> # use = egg:Paste#http
> use = egg:PasteScript#cherrypy
>
> Note PasteScript vs Paste.

This works. Thank you.

> As an aside, my light testing with ab showed CP3's wsgiserver to be
> considerably faster than Paste's http server, but also seemed to fail
> under load slightly more often (more failed requests, not crashes).

OK. Then it's a no-go. Reliability first.

BTW, have you used the profiling middleware with paster? Last
time I tried, the profiling decorator didn't work.

-- 
--PJ




Re: implementing a scalable (to multiple processors and multiple servers) Pylons webapp

2007-08-24 Thread Peter Hansen

On 8/24/07, Pekka Jääskeläinen [EMAIL PROTECTED] wrote:

> > Apache 2.2 has a mod_proxy_balancer. If performance is a concern, you
> > should go with the CherryPy WSGI server.
> >
> > use = egg:PasteScript#cherrypy
> >
> > instead of
> >
> > use = egg:Paste#http
>
> This didn't work, I changed the line
>
> use = egg:Paste#http
> to
> use = egg:Paste#cherrypy
>
> Even though I just installed CherryPy, I got error:

Note it's egg:PasteScript in the above, not just egg:Paste.

-Peter
