Hi all, I have some questions regarding memcached updates and would appreciate your comments. I am a newbie here (in case there is a separate list this should go to, please let me know).
** Source: Wikipedia **

 1 function get_foo (int userid) {
 2     result = memcached_fetch("userrow:" + userid);
 3     if (!result) {
 4         result = db_select("SELECT * FROM users WHERE userid = ?", userid);
 5         memcached_add("userrow:" + userid, result);
 6     }
 7     return result;
 8 }
 9 function update_foo(int userid, string dbUpdateString) {
10     result = db_execute(dbUpdateString);
11     if (result) {
12         data = createUserDataFromDBString(dbUpdateString);
13         memcached_set("userrow:" + userid, data);
14     }
15 }

*******

Now imagine a table that gets queried on two columns, say userid and username.

Q1: Suppose 100 processes each execute get_foo() and memcached does not yet have the key. Because of the delay between executing Line 2 and Line 5, at least dozens of those processes would query the DB and then execute Line 5, creating an additional bottleneck on the memcached server. How does this scale (now imagine a million processes being triggered)? I understand this is a cold-cache / initial-load issue, but how do you take it into account when starting up the memcached servers? (I have put a rough sketch of the kind of mitigation I mean in a P.S. at the bottom.)

Q2: Now imagine 100 processes again querying the key, of which 50 execute get_foo() and 50 execute update_foo(), and the key is not on the memcached server. Say T1 does a select followed by T2 doing an update: T1 is at Line 4 doing the select and is *going* to add the key to the cache, while T2 goes ahead, updates the DB, and executes Line 13 (i.e. updates the cache). If T1 now executes Line 5 it would cache stale results, except that memcached_add fails when the key already exists. Is that a sufficient guarantee that such a case can never arise? (See the gets/cas sketch in the P.S. below.)

Q3: Now we have two queries, say:

    select * from users where userid = abc;
    select * from users where username = xyz;

against the table Users |userid|username|userinfo|, and I want memcached to improve the query performance. I see two approaches:

1. Cache1: key = userid, value = User_Object;  Cache2: key = username, value = userid
2. Cache1: key = userid, value = User_Object;  Cache2: key = username, value = User_Object

Do you see potential flaws in either of these approaches? I tried to trace the flaws in the first one using various DB call sequences, but would still like to know if you have seen this before. (A sketch of the first approach is in the P.S. below.)

I would also like to understand in detail how the memcached server handles queueing of these requests and the atomicity of individual requests. If there are any posts or documentation on this, please let me know.

Thanks
-J
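
P.S. regarding Q1: this is the kind of mitigation I was wondering about, i.e. using the atomicity of add itself as a short-lived lock so only one process hits the DB on a cold key. It is only a rough Python sketch under my own assumptions: the client object mc is assumed to expose get/add/set/delete like the pseudocode above (python-memcached's memcache.Client has these methods), and db.select, the ":fill_lock" suffix and the timings are just placeholder names and values I made up.

    import time

    def get_foo_with_lock(mc, db, userid):
        key = "userrow:%s" % userid
        result = mc.get(key)
        if result is not None:
            return result

        lock_key = key + ":fill_lock"
        # add() is atomic on the memcached server, so at most one of the
        # stampeding processes wins the lock and queries the DB.
        if mc.add(lock_key, 1, time=10):      # 10 s TTL so a crash cannot wedge the lock
            try:
                result = db.select("SELECT * FROM users WHERE userid = %s", userid)
                mc.set(key, result)
            finally:
                mc.delete(lock_key)
            return result

        # The losers poll the cache briefly instead of piling onto the DB.
        for _ in range(5):
            time.sleep(0.05)
            result = mc.get(key)
            if result is not None:
                return result
        # Last resort: go to the DB anyway, but this should now be a small minority.
        return db.select("SELECT * FROM users WHERE userid = %s", userid)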
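
P.S. regarding Q2: in case add() alone is not a sufficient guarantee (e.g. the key could expire or be evicted between T2's set and T1's add), I imagine the gets/cas (check-and-set) commands are the intended way to make the write path conditional. A rough sketch of what I mean, assuming the python-memcached client (its cas support needs cache_cas=True); refresh_user_cache and new_data are my own placeholder names:

    import memcache

    # cache_cas=True makes the client remember the CAS token returned by gets().
    mc = memcache.Client(["127.0.0.1:11211"], cache_cas=True)

    def refresh_user_cache(userid, new_data):
        key = "userrow:%s" % userid
        current = mc.gets(key)            # fetch value plus CAS token
        if current is None:
            # Nothing cached: add() only succeeds if the key is still absent.
            mc.add(key, new_data)
        elif not mc.cas(key, new_data):
            # cas() reports failure when someone else wrote the key after our
            # gets(); simplest recovery is to drop it and let a reader reload.
            mc.delete(key)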
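
P.S. regarding Q3: this is roughly how I picture approach 1, with Cache2 holding only the userid so the User_Object itself is stored once; the key prefixes, mc/db objects and column names are again just my placeholder assumptions.

    def get_user_by_username(mc, db, username):
        userid = mc.get("username:%s" % username)
        if userid is None:
            # Miss on the index key: load by username and fill both caches.
            row = db.select("SELECT * FROM users WHERE username = %s", username)
            if row is None:
                return None
            mc.add("username:%s" % username, row["userid"])
            mc.add("userrow:%s" % row["userid"], row)
            return row

        user = mc.get("userrow:%s" % userid)
        if user is None:
            # Index key survived but the row itself was evicted.
            user = db.select("SELECT * FROM users WHERE userid = %s", userid)
            if user is not None:
                mc.add("userrow:%s" % userid, user)
        return user

The trade-off I can see is an extra cache round trip on the username path here, versus approach 2 having to keep two copies of the same object consistent on every update.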