I recently tried to use the Erlang PB client under moderate load (100 qps) and found it failed pretty hard because I was creating a new connection for each query. While I'm not new to Riak, I have only used it under fairly light load up until now, so I went searching for a solution to the problem. The common answer was: use a connection pool. This wasn't covered on the wiki, so I checked this list and found a few connection pool implementations. Most guaranteed nothing about simultaneous client usage, and most didn't compile. So I went about writing my own and doing a few experiments. Below are my findings. Please comment, check out the code, and feel free to tell me where/if I went wrong. :)

Executive Summary
=================

Under moderate load, creating a new connection per request works fine, provided that you call riakc_pb_socket:stop/1 when you are done with it. I also created a client pooling application that seems to work but is not in production yet.

Testing
=======

I created a test which queries a local Riak DB (ets backend) at a rate of 500 qps for 10 seconds. Each query does some random work (80% chance of a put/update, 20% chance of a delete) on a finite key-space. After each job is completed, the calling process sleeps for 100 ms before finishing. Riak was restarted after each test.

I tested three scenarios. For each query:

1) call riakc_pb_socket:start_link/2 without a subsequent call to riakc_pb_socket:stop/1
2) call riakc_pb_socket:start_link/2 with a subsequent call to riakc_pb_socket:stop/1
3) use riakpool (my pooling application)

Riakpool
========

Riakpool maintains a queue of connection pids as the state in a gen_server. Pids are checked in and out by calling clients so as to prevent simultaneous use. If no connections exist in the queue, a new one is created. All connections are supervised by a simple_one_for_one supervisor. Pids are explicitly checked for liveness before being checked out. You can find the project here:
https://github.com/dweldon/riakpool

Results
=======

1) Fails almost immediately. I assume this is because the maximum number of open file handles is reached.
2) Works without error.
3) Works without error. At the end of the test there were roughly 50 open connections, which is what we expect from Little's law (500 qps x ~0.1 s held per job = ~50 connections in use).

Conclusions
===========

Bryan's statement here:
http://lists.basho.com/pipermail/riak-users_lists.basho.com/2010-February/000497.html
seems to be incorrect, although it was probably made prior to the PB client.
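
For anyone who wants to try scenario 2, here is a minimal sketch of the per-request pattern. It is not the test code itself: the module name per_request and the helpers with_connection/3 and put_value/5 are made up for illustration, and it assumes the riakc client is on the code path. The only riakc calls used are riakc_pb_socket:start_link/2, riakc_pb_socket:put/2, riakc_pb_socket:stop/1 and riakc_obj:new/3.

```erlang
-module(per_request).
-export([with_connection/3, put_value/5]).

%% Hypothetical helper: open a PB connection, run Fun(Pid), and always
%% stop the connection afterwards, even if Fun throws or exits.
with_connection(Host, Port, Fun) ->
    {ok, Pid} = riakc_pb_socket:start_link(Host, Port),
    try
        Fun(Pid)
    after
        %% Without this call the socket stays open (scenario 1).
        riakc_pb_socket:stop(Pid)
    end.

%% Hypothetical example: store one value over a short-lived connection.
put_value(Host, Port, Bucket, Key, Value) ->
    with_connection(Host, Port, fun(Pid) ->
        riakc_pb_socket:put(Pid, riakc_obj:new(Bucket, Key, Value))
    end).
```

Something like per_request:put_value("127.0.0.1", 8087, <<"bucket">>, <<"key">>, <<"value">>) would then open a connection, do the put, and close the socket again. The host and port are assumptions about a local node (8087 being, as far as I know, the default PB port). The try ... after is there so that a failed request does not leak a connection.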

Apart from potential vector clock bloat as a result of changing client ids, as mentioned here:
http://lists.basho.com/pipermail/riak-users_lists.basho.com/2010-April/000843.html
I'm unsure why pooling is really even necessary (at least at the tested level of load). I think Basho should have some official position on this on the wiki.

Comments and suggestions are very welcome!

Dave
