On 10/23/2011 12:11 PM, Jim Adler wrote:
I will be loosening the key filter criterion after I get the basics
working, which I thought would be a simple equality check. 8M keys
isn't really a large data set, is it? I thought that keys were stored
in memory and key filters just operated on those memory keys and not
data.

Jim

That's about where we started seeing timeouts in list-keys. At around 25
million keys, list-keys started taking down the cluster (6 nodes, 1024
partitions). You may not encounter these problems, but were I in your
position and planning to grow... I would prepare to stop using key
filters, bucket listing, and key listing early.

Our current strategy is to store the keys in Redis, synchronizing them
with post-commit hooks and a process that reads over the bitcask files
directly. At ionice class 3 (idle), it's fairly low-impact.
https://github.com/aphyr/bitcask-ruby may be useful.
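Once the keys are mirrored into Redis, "list keys in a bucket" becomes a cheap set read rather than a cluster-wide list-keys. A minimal sketch of that read/write side (the `KeyIndex` class, the `keys:<bucket>` set-naming scheme, and the in-memory stand-in for Redis are illustrative assumptions, not our actual code; with the redis gem, a `Redis.new` connection responds to the same sadd/srem/smembers calls and would drop in directly):

```ruby
require 'set'

# Stand-in for a Redis connection, supporting the three set commands the
# key index needs. In production this would be a redis-gem client, which
# exposes sadd/srem/smembers with the same shape.
class FakeRedis
  def initialize
    @sets = Hash.new { |h, k| h[k] = Set.new }
  end

  def sadd(set, member)
    @sets[set].add member
  end

  def srem(set, member)
    @sets[set].delete member
  end

  def smembers(set)
    @sets[set].to_a
  end
end

# Mirrors Riak keys into one Redis set per bucket, so enumerating a
# bucket is a single SMEMBERS instead of a cluster-wide key listing.
class KeyIndex
  def initialize(redis)
    @redis = redis
  end

  # Called from the write path (e.g. a post-commit hook relay).
  def put(bucket, key)
    @redis.sadd "keys:#{bucket}", key
  end

  def delete(bucket, key)
    @redis.srem "keys:#{bucket}", key
  end

  def keys(bucket)
    @redis.smembers "keys:#{bucket}"
  end
end
```

Usage: `index = KeyIndex.new(FakeRedis.new); index.put 'users', 'jim'; index.keys 'users'`. The bitcask scanner below backfills the same sets for keys written before the hooks were in place.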

--Kyle

  # Simplified code, extracted from our bitcask scanner:
  def run
    # Drop CPU and IO priority so the scan stays out of Riak's way.
    `renice 10 #{Process.pid}`
    `ionice -c 3 -p #{Process.pid}`

    begin
      bitcasks_dir = '/var/lib/riak/bitcask'
      # Each partition's bitcask lives in a directory named by its index.
      dirs = Dir.entries(bitcasks_dir).select do |dir|
        dir =~ /^\d+$/
      end.map do |dir|
        File.join(bitcasks_dir, dir)
      end

      dirs.each do |dir|
        scan dir
        GC.start
      end
      log.info "Completed run"
    rescue => e
      log.error "#{e}\n#{e.backtrace.join "\n"}"
      sleep 10
    end
  end

  def scan(dir)
    log.info "Loading #{dir}"
    b = Bitcask.new dir
    b.load

    log.info "Updating #{dir}"
    # Bitcask keys are BERT-encoded {bucket, key} tuples; decode and
    # unescape both halves.
    b.keydir.each do |key, e|
      bucket, key = BERT.decode(key).map do |x|
        Rack::Utils.unescape x
      end
      # Handle determines what to do with this particular bucket/key
      # combo; e.g. insert into redis.
      handle bucket, key, e
    end
  end
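The `handle` call is where each bucket/key pair leaves the scanner; in our case it lands in Redis. A hedged, self-contained sketch of that shape (the global `$store` standing in for a Redis client and the `keys:<bucket>` set naming are assumptions for illustration, not our production code):

```ruby
require 'set'

# Illustrative stand-in for the scanner's Redis client; a redis-gem
# connection responding to sadd would replace it.
$store = Hash.new { |h, k| h[k] = Set.new }

# Called once per keydir entry. The entry `e` carries bitcask metadata
# about where the value lives on disk; maintaining the key set only
# needs bucket and key.
def handle(bucket, key, e)
  $store["keys:#{bucket}"].add key
end
```

Since Redis set adds are idempotent, re-running the scanner over the same partitions is safe and just reconverges the key sets.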

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
