Ok, I'll have a look at TokyoTyrant. Do you have numbers comparing the write
performance of memcached and tt?

Cheers,
Martin


On Sun, Mar 14, 2010 at 2:20 AM, Adam Lee <[email protected]> wrote:

> If your goal is only to make memcached into a reliable datastore, then I
> think you are perhaps going about it in the wrong way.  The memcached server
> is extremely well written and tuned and does its job incredibly well and
> very efficiently.  If you want to ensure that it is deterministic, I think
> that you should put your logic on the client side rather than on the server
> side.
>
> We, for example, did some work in the past to store our user data (we don't
> really use sessions in the traditional sense of the word, but this is
> probably the closest thing we have outside of our cookie) in memcached
> because the load on our primary database was just too high.  In order to
> make it deterministic, we wrote our own client and did a special setup.
>
> We had several servers (started with 3, ended up growing it to 5 before we
> replaced it with TokyoTyrant) that had identical configurations, such that
> each server had more than enough memory to fit the entire dataset.  We then
> wrote a client that had the following behavior:
>
> - Writes were sent to every server
> - All updates to the database had to also be written to memcached in order
> to be considered a success
> - Reads were performed on a randomly selected server
>
> We also wrote a populate-user-cache script that could fill a new server
> with the required data. Since we have about 30 million users, this job took
> quite a while, so we also built in the idea of an "is populated" flag.  This
> flag would not be set by the populate script until it was totally finished
> replicating the data.  The client code was written such that it could write
> to a server that didn't have the "is populated" flag, but would never read
> from it.  This meant that we could bring up new servers and they would be
> populated with new data, but would only be used once they were accurate (the
> populate-user-cache script only issued add commands, making sure that it
> didn't clobber any data being written by actual traffic).
>
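The client rules and the "is populated" flag described above can be sketched roughly as follows. This is only an illustration, not the client Adam's team actually wrote: `FakeServer` is an in-memory stand-in for a memcached connection, and the names `ReplicatedClient`, `populate` and `POPULATED_FLAG` are hypothetical. The only memcached behaviour it relies on is that `add` stores a value only when the key is absent, while `set` always overwrites.

```python
import random

POPULATED_FLAG = "is_populated"  # hypothetical key name for the flag


class FakeServer:
    """In-memory stand-in for one memcached server (assumption for this sketch)."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        return True

    def add(self, key, value):
        # Mirrors memcached "add": store only if the key is not already present.
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)

    def get_multi(self, keys):
        return {k: self.data[k] for k in keys if k in self.data}


class ReplicatedClient:
    """Writes go to every server; reads go to one random populated server."""

    def __init__(self, servers):
        self.servers = list(servers)

    def write(self, key, value):
        # Send the write to all servers, populated or not, and report
        # success only if every single server accepted it.
        results = [server.set(key, value) for server in self.servers]
        return all(results)

    def readable_servers(self):
        # Never read from a server that has not set the flag yet.
        return [s for s in self.servers if s.get(POPULATED_FLAG)]

    def read_many(self, keys):
        candidates = self.readable_servers()
        if not candidates:
            raise RuntimeError("no populated server available")
        # Every server holds the full dataset, so one multi-get suffices.
        return random.choice(candidates).get_multi(keys)


def populate(server, rows):
    """Fill a new server from (key, value) rows, then mark it readable."""
    for key, value in rows:
        # "add" never clobbers a value already written by live traffic.
        server.add(key, value)
    server.set(POPULATED_FLAG, True)
```

In this sketch a freshly started server receives live writes immediately, but `read_many` skips it until `populate` has finished and set the flag. Because `populate` uses `add`, a value written by live traffic during the backfill is kept over the older row from the database.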
> One of the key features of this setup was that every server had the full
> dataset-- this meant that we could build a page that needed data for, say,
> 500 users and load it with almost no more latency than needed to get the
> data for one user because of how well memcached handles multi-gets.
>
> We don't use this setup anymore because we moved to using TokyoTyrant as
> our persistent cache layer, but I will say that it worked pretty much
> flawlessly for about two years.  There was no way that our database would
> have been able to handle the necessary read load, but these servers
> performed exceedingly well-- easily handling over 30,000 gets per second.
>
> Anyway, I think that building something similar might do a much better job
> of performing the task you're attempting.  The key thing to recognize is
> that memcached is built to do a specific task and it's _GREAT_ at it, so you
> should use it for what it does best. Let me know if any of this doesn't make
> sense to you or if you have any further questions.
>
> --
> awl
>



-- 
Martin Grotzke
http://www.javakaffee.de/blog/
