> > In a case like this, I don't trust a cache miss to indicate presence > or absence of state. I prefer to keep separate key spaces: one for the > positive case and one for the negative. Keep a representation of The > Truth in media with a stronger guarantee of data integrity [1]; as > states change, write through to memcached when recording The Truth and > upon cache misses, consult The Truth. > > > [1] This could be an RDBMS, a Berkeley DB or something else. Which one > is correct depends on your scale, performance and concurrency > requirements.
I'm already using a DB to store the members list. If you're proposing to use another layer of DB, wouldn't it be better to just do a DB query in the first place? On Nov 5, 2007 10:16 PM, Ian Kallen <[EMAIL PROTECTED]> wrote: > In a case like this, I don't trust a cache miss to indicate presence > or absence of state. I prefer to keep separate key spaces: one for the > positive case and one for the negative. Keep a representation of The > Truth in media with a stronger guarantee of data integrity [1]; as > states change, write through to memcached when recording The Truth and > upon cache misses, consult The Truth. > > > [1] This could be an RDBMS, a Berkeley DB or something else. Which one > is correct depends on your scale, performance and concurrency > requirements. > > On 11/5/07, K J <[EMAIL PROTECTED]> wrote: > > > > > However, memcached semantics don't quite give you what you want. > > > Depending on whether you can reasonably get a configuration to do what > > > you want, it might be easier to think of memcached as a bloom filter > > > than as a set in this case. That is, if you negatively cache things > > > that *aren't* part of your list, then the presence of a key will tell > > > you for certain that a particular key is not a member, but the absence > > > of a key would mean that you don't know (or perhaps memcached *did* > > > know, but had since forgotten). > > > > > > I had thought of doing a bloom filter on it as well. The problem here > is, > > the membership list might change sometimes, and reading info on bloom > > filters, it's not really well-suited for dynamically changing lists? > > > > > You could optionally preload objects that are likely to be used if > > > you think the natural population wouldn't do it effectively (you can > > > measure this with stats). > > > > > > > Suppose I cache 10,000 recently-logged in members. Also, suppose 50% of > > traffic actually come from these users. Then, this cache would have a > high > > hit ratio when testing for membership. > > > > However, what about the non-members? For instance let's say 40% of the > > traffic come from non-members. This would mean there'd need to be a > full > > listing of members to check? > > > > Hmmm an interesting thought did just come across my mind. Let me hear > your > > thoughts: > > > > Cache 10,000 most recently logged-in members > > Bloom filter on the entire list > > This way, you can test for negatives (bloom filter), and if there's a > > positive, check the 10,000 most recently logged-in users. If that still > > yields nothing, then do a database query. In effect, only a small > minority > > of checks would require a trip to the DB. > > > > > > > > > > > > > > > > > > > > > > On Nov 4, 2007, at 23:32, J A wrote: > > > > > > > I have a fairly large members list that I want to keep in memcache. > > > > What I do with this list is query it against particular user IDs to > > > > see if they are a member of that list or not. If they are they get > > > > certain priviledges. > > > > > > > > The problem is, this list has gotten to the point of saturating the > > > > PHP's memory when fetching the MySQL query the first time. > > > > > > > > Is there a way to do this more effectively, for instance, > > > > partitioning the list into separate smaller lists, grouped by time > > > > of login? I'm thinking of this, as users who have logged in in the > > > > past 3 months are more likely to be in the list anyway. > > > > > > > > > It'd be easier to not think of it as a list if you're just > testing > > > for membership. All you want to know is if a particular object is an > > > element of a particular set. You could do this by key convention if > > > you batch populate the records. > > > > > > However, memcached semantics don't quite give you what you > want. > > > Depending on whether you can reasonably get a configuration to do what > > > you want, it might be easier to think of memcached as a bloom filter > > > than as a set in this case. That is, if you negatively cache things > > > that *aren't* part of your list, then the presence of a key will tell > > > you for certain that a particular key is not a member, but the absence > > > of a key would mean that you don't know (or perhaps memcached *did* > > > know, but had since forgotten). > > > > > > You could, of course, record the status either way so as to > tell > > the > > > difference between not knowing and knowing whether it's a member or > > > not. This is probably best suited to your needs. > > > > > > You could optionally preload objects that are likely to be used > if > > > you think the natural population wouldn't do it effectively (you can > > > measure this with stats). > > > > > > -- > > > Dustin Sallings > > > > > > > > > > > > > > > > >
