On 10/23/2011 12:11 PM, Jim Adler wrote:
> I will be loosening the key filter criterion after I get the basics
> working, which I thought would be a simple equality check. 8M keys
> isn't really a large data set, is it? I thought that keys were stored
> in memory and key filters just operated on those memory keys and not
> data.
>
> Jim

That's about where we started seeing timeouts in list-keys. At around 25
million keys, list-keys started to take down the cluster (6 nodes, 1024
partitions). You may not encounter these problems, but were I in your
position and planning to grow, I would prepare early to stop using key
filters, bucket listing, and key listing.
Our current strategy is to store the keys in Redis, and synchronize them
with post-commit hooks and a process that reads over bitcask. Run under
ionice -c 3 (the idle I/O class), it's fairly low-impact.
https://github.com/aphyr/bitcask-ruby may be useful.
--Kyle
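# Simplified code, extracted from our bitcask scanner.
# The requires are assumed from the gems used below; `log` is a
# Logger-like helper defined elsewhere.
require 'bitcask'
require 'bert'
require 'rack/utils'

def run
  # Lower CPU and I/O priority so the scan stays out of Riak's way.
  `renice 10 #{Process.pid}`
  `ionice -c 3 -p #{Process.pid}`

  begin
    bitcasks_dir = '/var/lib/riak/bitcask'
    # Only scan the directories with all-numeric names.
    dirs = Dir.entries(bitcasks_dir).select do |dir|
      dir =~ /^\d+$/
    end.map do |dir|
      File.join(bitcasks_dir, dir)
    end

    dirs.each do |dir|
      scan dir
      GC.start
    end
    log.info "Completed run"
  rescue => e
    log.error "#{e}\n#{e.backtrace.join("\n")}"
    sleep 10
  end
end
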
def scan(dir)
  log.info "Loading #{dir}"
  b = Bitcask.new dir
  b.load

  log.info "Updating #{dir}"
  b.keydir.each do |raw_key, e|
    # Each raw key decodes to a bucket/key pair with URL-escaped parts.
    bucket, key = BERT.decode(raw_key).map { |x|
      Rack::Utils.unescape x
    }
    # Handle determines what to do with this particular bucket/key
    # combo; e.g. insert into redis.
    handle bucket, key, e
  end
end
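
For illustration, here is a minimal sketch of what a handle along those lines might look like. It is not our production code. The one-set-per-bucket layout, the "keys:<bucket>" naming, and the KeyIndex and FakeRedis classes are all assumptions made for the example. FakeRedis is an in-memory stand-in so the sketch runs without a server; it mimics only the SADD, SISMEMBER, and SCARD set commands, which a real Redis client exposes under the same method names.

```ruby
require 'set'

# In-memory stand-in for a Redis client (hypothetical, for this sketch
# only). It implements just the three set commands used below.
class FakeRedis
  def initialize
    @sets = Hash.new { |h, k| h[k] = Set.new }
  end

  def sadd(name, member)
    @sets[name].add(member)
  end

  def sismember(name, member)
    @sets[name].include?(member)
  end

  def scard(name)
    @sets[name].size
  end
end

# Hypothetical key index: one set of keys per bucket.
class KeyIndex
  def initialize(store)
    @store = store
  end

  # Same shape as the `handle bucket, key, e` call in the scanner.
  # The keydir entry is ignored in this sketch.
  def handle(bucket, key, _entry = nil)
    @store.sadd(set_name(bucket), key)
  end

  # Equality check against the index instead of a Riak key filter.
  def key?(bucket, key)
    @store.sismember(set_name(bucket), key)
  end

  def count(bucket)
    @store.scard(set_name(bucket))
  end

  private

  # Assumed naming scheme for the per-bucket sets.
  def set_name(bucket)
    "keys:#{bucket}"
  end
end

index = KeyIndex.new(FakeRedis.new)
index.handle 'users', 'jim'
index.handle 'users', 'kyle'
index.handle 'users', 'kyle' # re-scanning the same key is harmless
```

Because adding to a set is idempotent, the scanner and the post-commit hooks can both write the same key without coordinating, and an equality lookup becomes a single set-membership test rather than a list-keys pass over the cluster.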