I toyed with a pmap in Python a while back to try to speed up multiple HTTP requests to our web services layer at work. You may want to try that with gevent.
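To give a rough idea of what I mean by pmap: apply a function to every item concurrently and return the results in input order. Below is a minimal sketch, using the standard library's thread pool rather than gevent so that it runs without extra dependencies. The names `pmap` and `fake_get` and the `max_workers` default are made up for illustration, and this is not the code from my repository.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def pmap(func, items, max_workers=20):
    """Apply func to each item concurrently; results keep input order."""
    items = list(items)
    if not items:
        return []
    workers = min(max_workers, len(items))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order the inputs were given,
        # regardless of which call finishes first.
        return list(pool.map(func, items))


def fake_get(key):
    """Hypothetical stand-in for one blocking HTTP GET (about 50 ms)."""
    time.sleep(0.05)
    return key, "value-for-" + key


if __name__ == "__main__":
    keys = ["k%d" % i for i in range(20)]
    start = time.time()
    results = pmap(fake_get, keys)
    print(len(results), "results in %.2fs" % (time.time() - start))
```

With gevent the shape should be much the same, with a greenlet pool in place of the thread pool.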
Here's the code I wrote, which is probably not production ready.

https://github.com/ericmoritz/pmap

On Aug 9, 2012 4:46 AM, "Parnell Springmeyer" <[email protected]> wrote:

> Jeremy,
>
> I was looking for something similar and first built an extra handler onto
> an internal erlang cowboy API server that used maelstrom (my own worker
> pool OTP application).
>
> It was used to make a simple POST with a string of the {bucket, key} pairs
> and the server would concurrently GET and combine the results and send it
> back. This was very fast (thousands of keys GET in ms).
>
> Since that seemed gross, I then decided (based on some input from someone
> else on the list) to try using a simple Map/Reduce phase that did not use
> javascript but the erlang functions (since those are going to be really
> fast and take advantage of Erlang's concurrency better than the javascript
> VM's).
>
> In python, you can do this to run that type of M/R phase without knowing
> any Erlang code:
>
> import riak
>
> client = riak.RiakClient()
>
> # Add your KNOWN bucket and key pairs (you can do this in a loop)
> query = client.add(bucket, key)
> query.add(bucket, key)
> query.add(bucket, key)
> # etc… (as many as you like)
>
> # Now tell the map and reduce phases to use the Erlang module
> # "riak_kv_mapreduce" and its functions "map_object_value" and
> # "reduce_set_union". The phases are chained onto the query that
> # holds the inputs added above.
> results = query.map(["riak_kv_mapreduce", "map_object_value"]) \
>     .reduce(["riak_kv_mapreduce", "reduce_set_union"]) \
>     .run()
>
> The above returns results faster for me than the brokered multi-get
> approach I used (I guarantee my brokered multi-get is faster than anything
> you can do with python + gevent; if that's the case, the M/R phase is
> definitely the route you want to go).
>
> So IMHO, it is very fast as long as you know the buckets and keys you want
> to get.
>
> On Aug 9, 2012, at 12:11 AM, Jeremy Dunck wrote:
>
> > I'm new to riak and need multi-get (that is, getting the value and/or
> > existence of keys in a single network-trip latency).
> >
> > I was wondering what the latency of the map-reduce approach is?
> >
> > http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-February/003229.html
> >
> > Alternatively, has anyone tried scaling concurrent gets (perhaps with
> > evented io) to do many concurrent requests and combining results on
> > the client?
> >
> > I am toying with a python+gevent multiget function. If the stance is
> > still that a multiget operation doesn't belong in core, I'm a little
> > surprised that there doesn't seem to at least be a nice client-lib API
> > func to do it. It sure seems useful...
> >
> > In my use-case, the immediate need is to know whether a db insert
> > needs to be done. We're handling too many keys to want to store in
> > memory (so no redis, etc), and we don't want to go to the db more than
> > we need to, so it seems riak would be good here. But we're getting
> > 1000s of potential insert keys and want to whittle down all those to a
> > relative few db inserts.
> >
> > So I was thinking riak key-per-id, and insert to the db iff the riak
> > key doesn't exist, then add the riak key. We'll get some race
> > conditions on the insert, but that's OK in our case.
> >
> > We do need low latency on the riak check, though, hence either
> > multiplexing w/ eventing or map-reduce (if that latency is actually
> > good).
> >
> > Am I doing it wrong?
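For the use case in the quoted message above (insert into the db only if the riak key doesn't exist, then add the riak key), the flow could look roughly like this. It is a sketch only: `exists` and `mark` are hypothetical callables standing in for a Riak key lookup and key write, `db_insert` stands in for the database write, and the demo uses in-memory stand-ins rather than a real client. As the quoted message already notes, check-then-insert can race.

```python
from concurrent.futures import ThreadPoolExecutor


def insert_new_ids(ids, exists, db_insert, mark, max_workers=20):
    """Insert only the ids not yet seen, then record them as seen.

    exists(id) -> bool, db_insert(id) and mark(id) are hypothetical hooks
    for the key lookup, the database write and the key write.
    """
    ids = list(dict.fromkeys(ids))  # drop duplicates, keep order
    if not ids:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(ids))) as pool:
        # Run the existence checks concurrently; flags line up with ids.
        flags = list(pool.map(exists, ids))
    new_ids = [i for i, seen in zip(ids, flags) if not seen]
    for i in new_ids:
        db_insert(i)  # may race with another writer between check and insert
        mark(i)
    return new_ids


if __name__ == "__main__":
    seen = {"a", "b"}   # in-memory stand-in for the existing keys
    db = []             # in-memory stand-in for the database
    print(insert_new_ids(["a", "c", "b", "d", "c"],
                         seen.__contains__, db.append, seen.add))
```

Only the existence checks run concurrently here; the inserts stay sequential so that each id is marked right after it is written.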
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
