Rob Nagler wrote:
>>A session is useful for very limited things, like remembering if this 
>>user is logged in and linking him to a user_id.
> 
> 
> We store this information in the cookie.  I don't see how it could be
> otherwise.  It's the browser that maintains the "login" state.

My preferred design for this is to set one cookie that lasts forever and 
serves as a browser ID.  If that user logs in, you can associate a user 
ID with that browser ID, on the server side.  You never need to send 
another cookie after the very first time someone hits your site.  If you 
decide to attach new kinds of state information to the browser, you 
still don't need to send a new cookie.
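
Concretely, something like this (a minimal Python sketch; the cookie
name and the surrounding plumbing are made up, not anything from a
particular framework):

    # Sketch: issue a permanent browser ID cookie on the first visit.
    import uuid
    from http import cookies

    def get_or_set_browser_id(cookie_header, response_headers):
        jar = cookies.SimpleCookie(cookie_header or "")
        if "browser_id" in jar:
            return jar["browser_id"].value   # returning browser, no new cookie
        browser_id = uuid.uuid4().hex        # first visit: mint an ID
        c = cookies.SimpleCookie()
        c["browser_id"] = browser_id
        c["browser_id"]["max-age"] = 10 * 365 * 24 * 3600  # effectively forever
        c["browser_id"]["path"] = "/"
        response_headers.append(("Set-Cookie", c["browser_id"].OutputString()))
        return browser_id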

Many sites need to keep track of state information (like what's in your 
shopping cart) for anonymous users who haven't logged in.  Having this 
unique browser ID (or session ID, if you prefer to give out a new one 
each time someone comes to the site) lets you track this for 
unregistered users.
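
On the server side everything just keys off that ID.  A toy version (an
in-memory dict standing in for the database):

    # Sketch: server-side state keyed by browser ID.
    STATE = {}  # browser_id -> {"user_id": ..., "cart": [...]}

    def record_login(browser_id, user_id):
        STATE.setdefault(browser_id, {"cart": []})["user_id"] = user_id

    def add_to_cart(browser_id, sku):
        # Works for anonymous users too; no login, no new cookie.
        STATE.setdefault(browser_id, {"cart": []})["cart"].append(sku)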

> Consider the following scenario:
> 
> * User logs in.
> * Site Admin decides to delete the user.
> * In our stateless servers, the user_id is invalidated immediately.
> * Next request from User, he's implicitly logged out, because the user_id
>   is verified on every request.
> 
> In the case of a session-based server, you have to delete the user and
> invalidate any sessions which the user owns.

I don't see that as a big deal.  You'd have to delete lots of other data 
associated with a user too.  Actually deleting a user is something I've 
never seen happen anywhere.

>>Although Oracle can be fast, some data models and application 
>>requirements make it hard to do live queries every time and still have 
>>decent performance.  This is especially true as traffic starts to
>>climb.
> 
> 
> I've tried to put numbers on some of this.  I've never worked on a
> 1M/day site, so I don't know if this is the point where you need
> sessions.  What sites other than eToys need this type of session
> caching?

Well, eToys handled more than 2.5 million pages per hour, but caching 
can be important for much smaller sites in some situations.  It's not 
necessarily session caching, although we did cache session data in a 
local write-through cache on each server.
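
A write-through cache just means writes go to the shared store and to a
local copy at the same time, so later reads can be served locally.  A
minimal sketch in Python (names are made up; the backing store stands
in for whatever shared session storage you use):

    # Sketch: per-server write-through cache in front of a shared store.
    class WriteThroughCache:
        def __init__(self, backing_store):
            self.local = {}             # this server's copy
            self.store = backing_store  # shared database or cache

        def set(self, key, value):
            self.store.set(key, value)  # write through to the shared store
            self.local[key] = value     # and keep a local copy

        def get(self, key):
            if key not in self.local:   # on a miss, fall back to the store
                self.local[key] = self.store.get(key)
            return self.local[key]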

We knew that the database would probably be the bottleneck in scaling 
our application, and it was.  We took pains to move as much work as 
possible off the database, so that it could spend its resources on 
handling things that can't be cached, like user-submitted data and orders.

Here's a situation where a small site could need caching: suppose you 
have a typical hierarchical catalog site, with a tree of categories that 
contain products.  Now suppose that the requirements for the site make 
it necessary to do a pretty hairy query to get the list of products in a 
category, because you have some sort of indirect association based on 
product attributes or something and you have to account for start and 
end dates on every product and various availability statuses, etc. 
Categories should only be shown if they have products in them or if 
their child categories have products in them.  Keep piling on business 
rules.  Then the UI design calls for the front page to have a 
Yahoo-style display showing multiple levels of the category hierarchy, 
maybe 70 categories or so.

Sure, you get your DBAs to tune the SQL and put all the indexes in 
place, and it gets the results for a single category pretty fast, in 
0.08 seconds, but you have 70 of them!  That's 5.6 seconds of database 
time for a single rendering of the homepage.  When you throw multiple 
users into the mix, all executing these queries every time they hit the 
homepage, your database server will burn a hole through the floor.

Or you can take advantage of your domain knowledge that the data used 
in generating this page only changes every 6 hours or so, and just 
cache the page, or part of the page, or the underlying data, for an hour.
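
The mechanics are trivial.  Something like this in Python (hand-rolled
just to show the idea; build_category_page is a made-up stand-in for
the expensive queries and rendering):

    # Sketch: cache an expensive result for a fixed time-to-live.
    import time

    _cache = {}  # key -> (expires_at, value)

    def cached(key, ttl_seconds, compute):
        entry = _cache.get(key)
        if entry and entry[0] > time.time():
            return entry[1]                 # still fresh, serve the copy
        value = compute()                   # expensive queries/rendering
        _cache[key] = (time.time() + ttl_seconds, value)
        return value

    # homepage = cached("homepage", 3600, build_category_page)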

Maybe I just have bad luck, but I always seem to end up at companies 
where they give me requirements like these.  And then they say to make 
it really fast and handle a billion users.  They are happy to trade 
slightly stale data for very good performance, and part of the 
requirements gathering process involves finding out how often various 
kinds of data change and how much it matters if they are out of date. 
(For example, inventory data for products changes often and needs to be 
much more current than, say, user comments on that product.)
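
That process boils down to assigning a staleness budget to each kind of
data.  In code it can be as simple as a table like this (the numbers
are invented, for illustration only):

    # Sketch: tolerable staleness per data type, from requirements gathering.
    TTL_SECONDS = {
        "category_tree": 6 * 3600,   # changes every 6 hours or so
        "inventory":     60,         # needs to stay close to current
        "user_comments": 24 * 3600,  # day-old comments bother no one
    }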

- Perrin
