So far we have listed up to 3.5M keys for the same purpose using a
2i search over Protocol Buffers, and it seems to be fast enough.
Maybe it is fast because it streams and compresses the key list directly
into the Protocol Buffers I/O stream without leaving a big footprint in
memory? I don't know the answer to that question, though.
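The low memory footprint Guido speculates about can be sketched in Python. This is a client-agnostic illustration, not the Riak client's actual code: the chunk format `{"keys": [...]}` mirrors what Riak's streamed 2i responses carry, and `stream_keys` is a hypothetical helper. The point is that a streaming consumer yields keys chunk by chunk, so the full multi-million-key list never sits in memory at once:

```python
import json

def stream_keys(chunks):
    """Yield keys one chunk at a time instead of materializing the
    full result list. Each chunk is a small JSON object such as
    {"keys": [...]}, loosely mirroring the chunked bodies a streamed
    2i response delivers. (Hypothetical helper, for illustration.)"""
    for chunk in chunks:
        for key in json.loads(chunk).get("keys", []):
            yield key

# Simulated streamed response: three small chunks instead of one
# giant payload. A real client would read these off the wire.
chunks = ['{"keys": ["k1", "k2"]}', '{"keys": ["k3"]}', '{"keys": []}']
total = sum(1 for _ in stream_keys(chunks))
```

Because `stream_keys` is a generator, the caller can count, filter, or forward keys without ever holding more than one chunk's worth of them.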
2i listing has never failed for us, whereas a MapReduce identity job
over a 2i (for counting a few million keys by 2i) has had roughly a 50%
chance of failing, depending on key size/count, at least in our experience.
We use the Riak Java client, so that's another consideration: if you are
using a different programming language, you would need to check whether
your client runs 2i queries over PB.
Hope that helps,
Guido.
On 04/03/13 13:18, Pavel Kirienko wrote:
Hi everyone,
Is there any way to request a large number of keys through 2i
streaming? Say there is an index with 10M entries and I want to extract
1M of them. Obviously a block request (i.e. all data packed into a
single response) is not the best idea, since it requires a good amount of
memory on both the client and the server.
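As a sketch of the alternative Pavel is asking for: pull the result set in fixed-size pages so only one page is resident at a time. Everything here is illustrative — `INDEX`, `fetch_page`, and the page size are invented stand-ins (Riak later gained comparable built-in 2i pagination via `max_results` and a continuation token in 1.4, which postdates this thread):

```python
# Stand-in for a large 2i result set; a real query would hit the server.
INDEX = [f"key{i:07d}" for i in range(10_000)]

def fetch_page(continuation, max_results):
    """Hypothetical paged 2i call: return one page of keys plus a
    continuation marker for the next call, or None when exhausted."""
    start = continuation or 0
    page = INDEX[start:start + max_results]
    end = start + max_results
    return page, (end if end < len(INDEX) else None)

def iter_keys(page_size=1000):
    """Walk the whole index one page at a time; memory use is bounded
    by page_size rather than by the total number of matching keys."""
    continuation = None
    while True:
        page, continuation = fetch_page(continuation, page_size)
        for key in page:
            yield key  # caller processes keys incrementally
        if continuation is None:
            return

count = sum(1 for _ in iter_keys())
```

The same loop shape works whether the pages come from client-side pagination or a server-side streaming response; the design choice is simply that the consumer never asks for the whole result in one block.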
One could suggest feeding the 2i output into a MapReduce job with
streaming output, but that approach is not so hot either: it is really
slow (our 3-node cluster stumbles on 100k keys for minutes), and
sometimes it simply doesn't work (streaming may stop occasionally before
all the data has been kicked out). Not to mention that on 1M keys the
MapReduce job never starts at all.
Is it possible to perform 2i queries for a large number of keys, or
should I use another storage system for indexing instead (like Redis, maybe)?
Thanks in advance.
Pavel.
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com