Hi David,

Thanks a lot! This makes a lot of sense to me.

Regards,
Markus

"The best way to predict the future is to invent it" -- Alan Kay

On Mon, Nov 30, 2009 at 11:24 PM, David Pollak <[email protected]> wrote:

> On Mon, Nov 30, 2009 at 2:00 PM, Markus Kohler <[email protected]> wrote:
>
> > > So, that means that each year, there will be 36,000M (36B) mailbox
> > > entries.
> >
> > I don't understand why we would need to store all entries in a cache,
> > instead of only keeping the last n entries for each user based on some
> > heuristics such as the last 3 days or something. I would somehow expect
> > that the probability that a user wants to see a message is exponentially
> > decreasing with the message's age. For example, the chance that someone
> > wants to see a message that is the 1000th newest in his timeline is
> > probably almost zero.
>
> Some people mine their timelines for information. I agree that some aging
> policy is necessary as 36B entries will consume a lot of storage in RAM or
> on disk, but the last 1,000 is likely too few based on what I have seen of
> actual user behavior.
>
> In terms of an aging policy in an RDBMS, the cost of aging out old entries
> is likely to be an index scan or something on that order (DELETE FROM
> mailbox WHERE date < xxx or a user-by-user DELETE WHERE id IN (SELECT
> messages > 1000 in mailbox))
>
> > > During peak load, we will need to prioritize which Users are processing
> > > messages/actions such that the system retains responsiveness and can
> > > drain the load. Put another way, knowing which Users have associated
> > > long-lived sessions allows us to prioritize the message processing for
> > > those Users. We allow more threads to drain the message queues for those
> > > Users while providing fewer threads for session-less Users. Yeah, we
> > > could prioritize on other heuristics, but long-lived session is dead
> > > simple and will cost us 5K bytes per logged in user. Not a huge cost and
> > > lots of benefit.
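The prioritization described in the quoted paragraph above (more threads for Users with a long-lived session, fewer for session-less Users) could be sketched roughly as follows. This is only an illustration in Python, not ESME's actual code: the class `PrioritizedDrainer`, its method names, and the pool sizes are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class PrioritizedDrainer:
    """Route message handling to a large or a small thread pool.

    Hypothetical sketch: users with a live session get the larger pool,
    session-less users share the smaller one. Pool sizes are assumptions.
    """

    def __init__(self, online_workers=8, offline_workers=2):
        self.online_users = set()
        self._online_pool = ThreadPoolExecutor(max_workers=online_workers)
        self._offline_pool = ThreadPoolExecutor(max_workers=offline_workers)

    def session_started(self, user):
        # The existence of a session is the only signal that a user is online.
        self.online_users.add(user)

    def session_ended(self, user):
        self.online_users.discard(user)

    def pool_for(self, user):
        if user in self.online_users:
            return self._online_pool
        return self._offline_pool

    def submit(self, user, handler, message):
        # handler(user, message) runs on the pool chosen for this user.
        return self.pool_for(user).submit(handler, user, message)

    def shutdown(self):
        self._online_pool.shutdown(wait=True)
        self._offline_pool.shutdown(wait=True)
```

With this shape, the only per-user cost of prioritization is membership in `online_users`, which matches the point that the session acts as the signal rather than as the place where state lives.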
> >
> > I have no issue with some session state and 5K is really low, and
> > therefore this is not an issue. I don't get why it has to be in the
> > session's state because you could as well use the information that a user
> > is online as a guidance, even if the state would be stored somewhere out
> > of the session. Wouldn't make a difference I guess and storing it in the
> > session looks natural.
>
> The state itself is not in the session. The session is the guide that the
> user is online. The session contains a listener that is attached to the
> User. The only real state that resides in the session is the state
> necessary to batch up any messages that the User has forwarded to the
> listener in between the HTTP polling requests. If there is an HTML front
> end, state about that front end will reside in the session as well, but
> that's a different issue.
>
> > > So, the fact that long-lived-session long polling is more efficient
> > > than short-lived-session repeated polling, together with the upcoming
> > > need for message prioritization, indicates that long-lived sessions are
> > > the right design choice.
> > >
> > > Also, I hope that the above discussion makes it clear why I am insistent
> > > on message-oriented APIs rather than document/REST oriented APIs. ESME's
> > > design is not traditional and there are fewer tools helping us get the
> > > implementation right. On the other hand, implementing ESME on top of a
> > > relational/REST model cannot be done. Let's keep our design consistent
> > > from the APIs back.
> >
> > I'm really not religious about REST, but I would somehow assume that in
> > an Enterprise context it could be a requirement to send a link to someone
> > else pointing to a specific potentially old message in a certain Pool.
>
> Yes. That's perfectly reasonable. That message is like a static file on
> disk. Once it's written, it remains unchanged until it's deleted.
> This is an ideal application of a REST-style approach. That's why I've
> advocated for a "message based" approach first, but a REST/static approach
> when the message based approach doesn't make sense. What I am opposed to is
> a "try to make everything fit the REST model" approach to API design.
>
> > That sounds to me like a requirement for some kind of REST API.
> > Would it be costly in your model to get the message nr. X (+ n older
> > messages) in a user's timeline?
>
> A message will exist outside of a timeline. There exists a cache of
> recently accessed messages. Sometimes there will be a historic message that
> is referenced and that will be materialized from backing store and rendered.
> It will likely fall out of cache if it's historical and not accessed again.
>
> Thanks,
>
> David
>
> > Regards,
> > Markus
> >
> > > Thanks,
> > >
> > > David
> > >
> > > --
> > > Lift, the simply functional web framework http://liftweb.net
> > > Beginning Scala http://www.apress.com/book/view/1430219890
> > > Follow me: http://twitter.com/dpp
> > > Surf the harmonics
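The cache behavior described in the reply above (recently accessed messages stay in memory, a historic message is materialized from the backing store on demand and later falls out again) could look roughly like the sketch below. The least-recently-used eviction rule is an assumption, since the thread does not name a policy, and `MessageCache`, `load_from_store`, and `capacity` are hypothetical names.

```python
from collections import OrderedDict


class MessageCache:
    """Keep recently accessed messages in memory, oldest access first.

    Hypothetical sketch: load_from_store is any callable that fetches a
    message by id from the backing store. LRU eviction is an assumption.
    """

    def __init__(self, load_from_store, capacity=1000):
        self._load = load_from_store
        self._capacity = capacity
        self._entries = OrderedDict()

    def __contains__(self, message_id):
        return message_id in self._entries

    def get(self, message_id):
        if message_id in self._entries:
            # Cache hit: mark the message as most recently used.
            self._entries.move_to_end(message_id)
            return self._entries[message_id]
        # Cache miss: materialize the message from the backing store.
        message = self._load(message_id)
        self._entries[message_id] = message
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return message
```

Because a message never changes once written, the cache needs no invalidation logic; a historic message that is linked to once simply ages out when newer messages are read.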
