Hi All,

I have some questions regarding memcached updates and would like your comments.
I am a newbie here (if there is a separate list these should be posted to,
please let me know).

** Source Wikipedia **

1 function get_foo (int userid) {
2    result = memcached_fetch("userrow:" + userid);
3    if (!result) {
4        result = db_select("SELECT * FROM users WHERE userid = ?", userid);
5        memcached_add("userrow:" + userid,  result);
6    }
7    return result;
8 }

9 function update_foo(int userid, string dbUpdateString) {
10    result = db_execute(dbUpdateString);
11    if (result) {
12        data = createUserDataFromDBString(dbUpdateString);
13        memcached_set("userrow:" + userid, data);
14    }
15 }

*******

Now imagine the users table being queried on two columns, say userid and username.

Q1:
If we have 100 processes each executing get_foo(), and let's say memcached does
not have the key: because there is a delay between executing Line 2 and Line 5,
at least dozens of processes would query the DB and then execute Line 5,
creating a further bottleneck on the memcached server. How does this scale
(imagine a million processes being triggered)?
I understand this is mainly an initial-load problem, but how do you take it
into account when starting up the memcached servers?
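
To make the Q1 scenario concrete, here is a minimal Python sketch of it. It is
only an illustration under assumptions: FakeCache is a hypothetical in-memory
stand-in (not a real memcached client), the DB is replaced by a counter, and a
barrier is used to force the gap between Line 2 and Line 5 so that every worker
misses before any of them populates the cache.

```python
import threading


class FakeCache:
    """Hypothetical in-memory stand-in for a memcached server."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def add(self, key, value):
        # Mirrors memcached "add" semantics: store only if the key is absent.
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True


def simulate_stampede(num_workers):
    """Run get_foo() in num_workers threads and count the DB queries."""
    cache = FakeCache()
    db_queries = []
    db_lock = threading.Lock()
    barrier = threading.Barrier(num_workers)

    def get_foo(userid):
        key = "userrow:%d" % userid
        result = cache.get(key)            # Line 2: cache lookup
        barrier.wait()                     # forced delay between Line 2 and Line 5
        if result is None:
            with db_lock:
                db_queries.append(userid)  # Line 4: stand-in for the SELECT
            result = {"userid": userid}
            cache.add(key, result)         # Line 5: only the first add succeeds
        return result

    threads = [threading.Thread(target=get_foo, args=(1,))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(db_queries)
```

In this sketch every worker that missed at Line 2 goes on to hit the DB, so the
number of DB queries equals the number of concurrent workers.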

Q2:
Now imagine 100 processes again accessing the same key, of which 50 execute
get_foo() and 50 execute update_foo(). Let's say the key is not on the
memcached server. Imagine T1 doing a select, followed by T2 doing an update.
T1 is at Line 4 doing the select and is *about to* add the key to the cache,
while T2 goes ahead, updates the DB, and executes Line 13 (i.e. updates the
cache). If T1 now executes Line 5, it would be writing stale results. (In this
case memcached_add simply fails, but is that a sufficient guarantee that stale
data never ends up in the cache?)
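
The interleaving I have in mind can be replayed step by step. This is a minimal
single-threaded Python sketch, assuming a hypothetical FakeCache class that
mimics the add (store only if absent) and set (always store) semantics; it
covers only this one ordering of T1 and T2, not every possible one.

```python
class FakeCache:
    """Hypothetical in-memory stand-in for a memcached server."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def add(self, key, value):
        # Store only if the key is absent, like memcached "add".
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def set(self, key, value):
        # Store unconditionally, like memcached "set".
        self._data[key] = value
        return True


def replay_q2_interleaving():
    db = {1: "old row"}                # stand-in for the users table
    cache = FakeCache()
    key = "userrow:1"

    t1_cached = cache.get(key)         # T1, Line 2: miss (returns None)
    t1_row = db[1]                     # T1, Line 4: reads the old row
    db[1] = "new row"                  # T2, Line 10: updates the DB
    cache.set(key, "new row")          # T2, Line 13: updates the cache
    t1_added = cache.add(key, t1_row)  # T1, Line 5: add fails, key now exists

    return t1_cached, t1_added, cache.get(key), t1_row
```

For this particular ordering the cache keeps T2's value because T1's add is
rejected, although T1 itself still returns the old row to its caller.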

Q3:
Now say we have two queries:
select * from users where userid = abc;
select * from users where username = xyz;

Users
|userid|username|userinfo|

and I want to use memcached to improve query performance.

I have two approaches in mind:
1. Cache1: Key=userid Value=User_Object
   Cache2: Key=username Value=userid

2. Cache1: Key=userid Value=User_Object
   Cache2: Key=username Value=User_Object
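
For concreteness, approach 1 could be sketched roughly as below. This is a
Python sketch under assumptions: FakeCache, the USERS dict standing in for the
table, the db_select_* helpers, and the "userid:" / "username:" key prefixes are
all hypothetical names chosen for illustration.

```python
class FakeCache:
    """Hypothetical in-memory stand-in for a memcached server."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


# Hypothetical stand-in for the Users table, keyed by userid.
USERS = {"abc": {"userid": "abc", "username": "xyz", "userinfo": "info"}}


def db_select_by_userid(userid):
    row = USERS.get(userid)
    return dict(row) if row is not None else None


def db_select_by_username(username):
    for row in USERS.values():
        if row["username"] == username:
            return dict(row)
    return None


def get_by_userid(cache, userid):
    # Cache1: Key=userid, Value=User_Object
    key = "userid:" + userid
    user = cache.get(key)
    if user is None:
        user = db_select_by_userid(userid)
        if user is not None:
            cache.set(key, user)
    return user


def get_by_username(cache, username):
    # Cache2: Key=username, Value=userid, then resolve through Cache1.
    key = "username:" + username
    userid = cache.get(key)
    if userid is None:
        row = db_select_by_username(username)
        if row is None:
            return None
        userid = row["userid"]
        cache.set(key, userid)
    return get_by_userid(cache, userid)
```

In this sketch a lookup by username can cost two cache round trips, but the
user object is stored only once; in approach 2 the username lookup is a single
round trip, at the price of two copies of the object to keep in sync.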

Do you see potential flaws in either of these approaches? I tried to trace the
flaws in the first one by walking through various DB calls, but I would still
like to ask whether you have seen this before.

I would also like to know in detail how the memcached server handles queueing
of these requests and the atomicity of individual requests. If there are any
posts or other information on this, please let me know.

Thanks

-J



      
