Great validation. Thanks, Mark.

Jim
On 10/24/11 7:51 AM, "Mark Steele" <[email protected]> wrote:

>It was a pretty simple benchmark test using a custom-built protocol
>buffer client, so I wouldn't put too much faith in it.
>
>As far as performance goes, my client was able to retrieve keys at a rate
>of about 120 thousand keys per second from the key-listing operation. The
>key-listing performance was constant; my testing went from 1 million keys
>stored to 60 million with very little variation in throughput.
>
>I guess the mantra is: test, test, test with your app. YMMV.
>
>Cheers,
>
>Mark
>
>
>On Monday 24 October 2011 07:37:47 Jim Adler wrote:
>> Yes, I'm using 1.0.1 with LevelDB. I moved to it from Bitcask in the
>> hopes of better performance.
>>
>> Good to hear about your 60M-key use case. Can you share any key-access
>> performance numbers?
>>
>> Jim
>>
>> On Oct 24, 2011, at 7:23 AM, Mark Steele <[email protected]>
>> wrote:
>>
>> > Just curious, Kyle: are you using the 1.0 series?
>> >
>> > I've done some informal testing on a 3-node 1.0 cluster, and key
>> > listing was working just peachy on 60 million keys using Bitcask as
>> > the backend.
>> >
>> > Cheers,
>> >
>> > Mark
>> >
>> > On Sunday 23 October 2011 12:26:35 Aphyr wrote:
>> >> On 10/23/2011 12:11 PM, Jim Adler wrote:
>> >>> I will be loosening the key-filter criterion after I get the basics
>> >>> working, which I thought would be a simple equality check. 8M keys
>> >>> isn't really a large data set, is it? I thought that keys were
>> >>> stored in memory and key filters just operated on those in-memory
>> >>> keys, not the data.
>> >>>
>> >>> Jim
>> >>
>> >> That's about where we started seeing timeouts in list-keys. Around 25
>> >> million keys, list-keys started to take down the cluster (6 nodes,
>> >> 1024 partitions). You may not encounter these problems, but were I in
>> >> your position and planning to grow, I would prepare to stop using key
>> >> filters, bucket listing, and key listing early.
>> >>
>> >> Our current strategy is to store the keys in Redis, and synchronize
>> >> them with post-commit hooks and a process that reads over Bitcask.
>> >> With ionice class 3, it's fairly low-impact.
>> >> https://github.com/aphyr/bitcask-ruby may be useful.
>> >>
>> >> --Kyle
>> >>
>> >> # Simplified code, extracted from our bitcask scanner:
>> >> def run
>> >>   # Lower our CPU and I/O priority so the scan stays low-impact.
>> >>   `renice 10 #{Process.pid}`
>> >>   `ionice -c 3 -p #{Process.pid}`
>> >>
>> >>   begin
>> >>     bitcasks_dir = '/var/lib/riak/bitcask'
>> >>     # Each numeric subdirectory holds one partition's bitcask.
>> >>     dirs = Dir.entries(bitcasks_dir).select do |dir|
>> >>       dir =~ /^\d+$/
>> >>     end.map do |dir|
>> >>       File.join(bitcasks_dir, dir)
>> >>     end
>> >>
>> >>     dirs.each do |dir|
>> >>       scan dir
>> >>       GC.start
>> >>     end
>> >>     log.info "Completed run"
>> >>   rescue => e
>> >>     log.error "#{e}\n#{e.backtrace.join "\n"}"
>> >>     sleep 10
>> >>   end
>> >> end
>> >>
>> >> def scan(dir)
>> >>   log.info "Loading #{dir}"
>> >>   b = Bitcask.new dir
>> >>   b.load
>> >>
>> >>   log.info "Updating #{dir}"
>> >>   b.keydir.each do |key, e|
>> >>     bucket, key = BERT.decode(key).map { |x|
>> >>       Rack::Utils.unescape x
>> >>     }
>> >>     # Handle determines what to do with this particular bucket/key
>> >>     # combo; e.g. insert into redis.
>> >>     handle bucket, key, e
>> >>   end
>> >> end
>> >>
>> >> _______________________________________________
>> >> riak-users mailing list
>> >> [email protected]
>> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
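The `handle` callback in Kyle's scanner is left unspecified. A minimal sketch of what it might do, with an in-memory Hash of Sets standing in for the Redis set the thread describes (`INDEX` and the method signature here are hypothetical, not part of the original code):

```ruby
require 'set'

# In-memory stand-in for Redis: one set of keys per bucket.
# A real version would call something like redis.sadd(bucket, key) instead.
INDEX = Hash.new { |h, bucket| h[bucket] = Set.new }

# Record that `key` exists in `bucket`. The keydir entry `e` (timestamps,
# file offsets) is ignored in this sketch.
def handle(bucket, key, e = nil)
  INDEX[bucket] << key
end

handle 'users', 'alice'
handle 'users', 'bob'
handle 'posts', '42'
```

Keeping the per-bucket keys in a set makes the sync idempotent, so re-scanning a bitcask directory just re-adds the same members.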
