Cliff Wells wrote:
On Sat, 2007-04-28 at 08:07 -0400, Dan wrote:
  
IMHO, and I have never implemented memcache, the distributed memory
concept is *not* by itself a scalable architecture.  The key is session
data.  Typically (as in default Pylons setup) session data is tied to
physical hardware... your web server.  To make use of memcache, one
would first need to implement scalable architecture, such as the
'share nothing' approach.   
    

And this is what Memcached does.  Simply store your session data in
Memcached.
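
As a rough sketch of what storing sessions in Memcached buys you, the Python below uses an in-process stand-in for a memcached client. FakeMemcache, WebServer, and the "session:" key prefix are hypothetical names made up for illustration; they are not part of Pylons or of any memcached library. The sketch only assumes that a real client offers similar get/set calls.

```python
import uuid


class FakeMemcache:
    """In-process stand-in for a memcached client (hypothetical, dict-backed)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        # Like a cache miss, an unknown key yields None.
        return self._data.get(key)


class WebServer:
    """One web server in the farm; it keeps no session state of its own."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache

    def login(self, username):
        # Store the session in the shared cache, not in local memory.
        session_id = uuid.uuid4().hex
        self.cache.set("session:" + session_id, {"user": username})
        return session_id

    def whoami(self, session_id):
        session = self.cache.get("session:" + session_id)
        return session["user"] if session else None


shared = FakeMemcache()
web1 = WebServer("web1", shared)
web2 = WebServer("web2", shared)

sid = web1.login("dan")     # first request lands on web1
print(web2.whoami(sid))     # next request lands on web2; prints: dan
```

Because both servers talk to the same store, either one can handle the next request. The stand-in is a plain dict, so, like Memcached itself, it forgets everything on restart.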
  
Data that you want to keep eventually has to be written to disk.  The longer you keep data only in memory, the longer you run the risk of losing it.  Memcached is a *tool* to speed up access to data that lives on disk.
Again, I'm not claiming to be an expert on the 'share nothing'
architecture, but my understanding is that 'share nothing' stores all
data in the database... even session data.
    

But of course, "database" doesn't have to mean "SQL database".
Memcached provides a "hash in the sky" quite ideal for this type of
problem.
  
I somewhat agree.  A database is simply a way of storing data on a disk.  This can be a flat file with XML or an RDBMS with SQL.  A 'hash in the sky' is not a database: even when it spills into virtual memory (on a disk), it's volatile.

  
  This way, it doesn't matter which web server in your farm takes a user's
requests.  They will all ask the database (the db will be your
bottleneck) but at least they will *mostly* know how to access and
store data. 
    

Again, this is exactly what Memcached is meant to do.  Memcached can
share data transparently across servers and will typically be *much*
faster than database access (and much, *much* easier to implement -
distributed databases are not fun [or cheap]).
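
To make "share data transparently across servers" a bit more concrete: memcached clients typically pick a server by hashing the key, so every web server computes the same location for the same key without the cache servers having to coordinate. The sketch below is a simplified illustration of that idea using plain modulo hashing. The server addresses and the server_for helper are hypothetical, and real clients may use a different scheme (consistent hashing, for example).

```python
import zlib

# Hypothetical cache server addresses, for illustration only.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]


def server_for(key, servers=SERVERS):
    """Map a key to one server by hashing it (simple modulo scheme)."""
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


# Any web server running this code picks the same cache server for a key.
for key in ("session:abc", "session:xyz", "user:42"):
    print(key, "->", server_for(key))
```

One drawback of plain modulo hashing is that adding or removing a server remaps most keys, which is one reason other schemes exist.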
  
  
I'd agree that Memcached is faster, as it's holding data from disk in memory.  RAM is always faster than disk I/O.  However, byte for byte, hard disk drives are cheaper than RAM.
If you'll recall, Memcached was developed to help solve LiveJournal's
scalability issues, so it's quite well-tested:

http://www.linuxjournal.com/article/7451
  
Thanks for the link.  I haven't read it yet, but I plan on it.
Also, if your architecture consists of storing your data in a single
database, you will certainly *not* scale beyond a certain point.  The
database itself will become a bottleneck.  In fact, if you read the
above article, you'll find that Memcached was in fact developed to help
remove load from the LiveJournal databases.

  
Agreed.  It removes load and speeds up access, but it does not eliminate the database (disk).  Whenever data has to be read from or written to it, the slower speed of a physical disk will still be the bottleneck.
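
A minimal sketch of that relationship, assuming a simple "check the cache first, fall back to the database" pattern. SlowDatabase and CachedStore are hypothetical names, and two dicts stand in for the real disk-backed database and for Memcached.

```python
class SlowDatabase:
    """Stand-in for the disk-backed database; counts how often it is read."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class CachedStore:
    """Reads try the cache first; writes always go to the database."""

    def __init__(self, db):
        self.db = db
        self.cache = {}  # stand-in for Memcached: fast but volatile

    def get(self, key):
        if key in self.cache:
            return self.cache[key]      # cache hit: no database load
        value = self.db.read(key)       # cache miss: go to "disk"
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        self.db.write(key, value)       # the durable copy lives in the database
        self.cache.pop(key, None)       # drop any stale cached copy

    def lose_cache(self):
        """Simulate a cache restart: everything in memory is gone."""
        self.cache.clear()


db = SlowDatabase()
store = CachedStore(db)
store.put("user:1", "dan")
store.get("user:1")         # miss, reads the database
store.get("user:1")         # hit, served from memory
store.lose_cache()
print(store.get("user:1"))  # still "dan", because the database kept it
print(db.reads)             # 2: one read per cache miss
```

The cache absorbs repeated reads, but the data survives the simulated restart only because every write went to the database.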

  
So the basic point to my post is that scalable architecture is more
about theory than specific tools.   The theory behind building
scalable web applications is a growing subject in system engineering
that I find interesting.  You can check out a decent article about
myspace.com's struggles at:
http://www.baselinemag.com/article2/0,1540,2084131,00.asp

    
I found the articles on scaling Wikipedia interesting, although I can't
find the link to them at the moment...

Despite the fact that MySpace obviously scales to a large degree, I put
little trust in their technical abilities (they can't even do CSS or
search properly).  I suspect their scalability is largely derived from
millions of dollars in investments rather than actual architecture.
*Anyone* can scale if they can throw unlimited money at it ;-)

  
Agreed

  
'memcache' is a tool that, when applied to a scalable architecture,
could provide performance improvements.
    

Distributed information is a cornerstone of a scalable architecture
(sessions are simply information in this context).  Memcached provides
exactly this and is therefore more than just a "tool".  It's a
foundation point for building a scalable architecture.
  
Somewhat agreed.  I'll give you that Memcached is an important tool, but it's still just a tool.  A database is a tool.  The fiber cable connecting your memcached servers is a tool.

Scalable architecture is a system.  A system is a set of tools.  Again, my point is that one needs to understand the big picture and not just say 'wow, I implemented memcached so I'm scalable'.
Of course, for most of us, this is interesting but purely theoretical.
For most of the sites I do, the required scalability of a Pylons app
consists of running multiple instances of the app behind a load balancer
on the same server ;-)  Of course, the same problems and solutions apply
but on a much smaller scale.  Planned right, scaling from multiple
processes on a single machine to hundreds of processes across multiple
machines really is a matter of scale rather than architecture.

  
I hope that one of your projects takes off and that I get to read about your growing pains!

As for myself, I get paid as a network engineer and I know more Cisco CLI than SQL.  I simply enjoy building web applications in Pylons in my spare time.  Who knows, with a little luck maybe one of my side projects will get put to the test.
Regards,
Cliff



  
