My very simple benchmarking script is attached. I have run it on a 1.2GHz
Thinkpad X30 laptop running Ubuntu Jaunty, stock Erlang 12B5, stock
ruby-1.8.7, and CouchDB 0.10.0a787397. Both the benchmark client and CouchDB
are running on the same machine.
A quick look at the results:
*** Testing view performance as function of doc padding size
--- doc padded with 1 bytes
Insert 3000 simple documents... 1.396 secs
Retrieve documents... 2.092 secs
Full view... 3.252 secs
View with ?limit=1... 2.877 secs
--- doc padded with 1000 bytes
Insert 3000 simple documents... 3.232 secs
Retrieve documents... 3.475 secs
Full view... 8.497 secs
View with ?limit=1... 8.161 secs
--- doc padded with 2000 bytes
Insert 3000 simple documents... 4.985 secs
Retrieve documents... 4.777 secs
Full view... 10.412 secs
View with ?limit=1... 9.963 secs
In this test each document emits only an integer key and a null value. The
padding exists in the source document but not in the emitted key/value, so
the test attempts to measure only the overhead of JSON-encoding the docs and
sending them to the view server; the emits themselves are identical in every run.
The difference in time for the full view versus ?limit=1 is very small,
showing that the JSON-encoding and HTTP-transfer overhead of returning the
rows is small. The ?limit=1 figure is therefore essentially the time to index
the documents, without sending the results back to the client.
Adding 1000 then 2000 bytes to each document linearly increases the time to
insert (~1.8s per 1000*3000 bytes) and the time to retrieve (~1.4s).
Strangely there is a jump in the indexing time: 5.3s for the first
additional 1000*3000 bytes, but only 1.8s for the next. But even taking this
lower figure, this suggests a transfer rate of only 1.7MB/s between Erlang
and the Javascript view server, including JSON serialisation and
deserialisation overhead of course. I wonder which end is the bottleneck?
Comparing different view servers and an integrated Erlang view server would
be very interesting.
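For reference, here is the arithmetic behind those deltas and the 1.7MB/s
figure; this just recomputes from the timings quoted above and touches
nothing in CouchDB:

```ruby
# Back-of-envelope check of the insert delta and transfer-rate estimate.
docs        = 3000
extra_bytes = 1000                                # padding added per document
payload_mb  = docs * extra_bytes / 1_000_000.0    # 3.0 MB of extra data

insert_delta = 3.232 - 1.396    # extra insert time for the 1000-byte pad
index_delta  = 9.963 - 8.161    # extra ?limit=1 (indexing) time, 1000->2000 bytes

puts "extra insert time: %.2f s per 3 MB" % insert_delta
puts "indexing rate:     %.1f MB/s" % (payload_mb / index_delta)  # ~1.7 MB/s
```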
*** Performance as function of K/V pairs emitted (simple keys)
Insert 3000 simple documents... 1.450 secs
1 K/V pairs per doc... 2.899 secs
2 K/V pairs per doc... 4.136 secs
3 K/V pairs per doc... 5.324 secs
4 K/V pairs per doc... 6.459 secs
5 K/V pairs per doc... 7.693 secs
10 K/V pairs per doc... 13.522 secs
*** Performance as function of K/V pairs emitted (compound keys)
Insert 3000 compound documents... 1.486 secs
1 K/V pairs per doc... 3.575 secs
2 K/V pairs per doc... 5.308 secs
3 K/V pairs per doc... 6.926 secs
4 K/V pairs per doc... 8.706 secs
5 K/V pairs per doc... 10.471 secs
10 K/V pairs per doc... 19.049 secs
This test attempts to measure the speed of emitting K/V pairs and inserting
them into the view. The views are all queried with ?limit=1 so that only the
indexing time is measured.
In the first run, simple integer keys are emitted. In the second, keys of
the form [x,y,z] are emitted, mixing strings, null and integers. I then vary
the number of emits per document in the design document. In both cases null
values are emitted.
Indexing time grows linearly. It takes an additional 1.18s for each 3000 simple keys,
and an additional 1.72s for each 3000 compound keys. This suggests a peak
emit and insert rate of 2540 simple keys or 1740 compound keys per second.
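Those rates fall out of the tables above by taking the slope between the
1-pair and 10-pair runs; a quick check of the arithmetic:

```ruby
# Cost per extra 3000 emitted keys, taken as the slope between the
# 1-pair and 10-pair runs; invert to get a peak emit+insert rate.
simple_step   = (13.522 - 2.899) / 9   # ~1.18 s per extra 3000 simple keys
compound_step = (19.049 - 3.575) / 9   # ~1.72 s per extra 3000 compound keys

puts "simple:   ~#{(3000 / simple_step).round(-1)} keys/sec"    # ~2540
puts "compound: ~#{(3000 / compound_step).round(-1)} keys/sec"  # ~1740
```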
*** Testing view performance as function of number of views
Insert 3000 simple documents... 1.391 secs
Temp view... 3.521 secs
Temp view... 0.553 secs
Temp view... 0.629 secs
1 real view... 3.267 secs
2 identical real views... 3.248 secs
3 identical real views... 3.961 secs
4 identical real views,limit=1... 3.077 secs
2 different real views... 5.162 secs
3 different real views... 6.872 secs
second view in same ddoc... 0.375 secs
third view in same ddoc... 0.372 secs
1 real + 1 dummy views... 3.799 secs
1 real + 2 dummy views... 4.027 secs
1 real + 3 dummy views... 4.418 secs
dummy view in same ddoc... 0.024 secs
Here we can see CouchDB is quite clever with view handling. Firstly, if you
submit exactly the same request repeatedly to a temp view, the old view is
reused. Then if you create a design document where multiple views have
exactly the same map code, only a single index is created.
However, views that differ even slightly are indexed separately. Each
additional distinct real view (one which emits K/V pairs) adds about 1.8 secs
to the overall time. I didn't add limit=1 except where noted, but since only
one view is downloaded, the 1.8 secs is the overhead of building the
additional index.
Dummy views are "function(doc){}". Each one adds a further 0.4 secs to the
overall time: the cost of the Javascript engine passing all the docs to an
additional view function which does nothing.
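Both per-view overheads are derived from the "number of views" timings
above; the arithmetic:

```ruby
# Per-view overheads from the raw timings in the "number of views" run.
real_overhead  = (6.872 - 3.267) / 2   # 1 -> 3 different real views
dummy_overhead = (4.418 - 3.267) / 3   # 0 -> 3 dummy views added

puts "per extra real view:  ~#{real_overhead.round(1)} s"   # ~1.8 s
puts "per extra dummy view: ~#{dummy_overhead.round(1)} s"  # ~0.4 s
```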
*** Testing view performance with reduce functions (simple keys)
Insert 3000 simple documents... 1.392 secs
no reduce... 2.897 secs
null reduce... 3.606 secs
counter reduce... 3.659 secs
min/max reduce... 3.951 secs
banded reduce... 3.861 secs
*** Testing view performance with reduce functions (compound keys)
Insert 3000 compound documents... 1.565 secs
no reduce... 4.883 secs
null reduce... 5.831 secs
counter reduce... 5.881 secs
min/max reduce... 6.578 secs
banded reduce... 6.290 secs
Reduce functions seem to behave well: the overhead of a reduce is
significantly less than that of the map, adding only 0.7-0.9s to reduce 3000
documents, versus 2.9-4.9s to map them. This is interesting. It may reflect:
(a) the overhead of inserting the results into the Btree (one per document
in the case of map, but only one per group of documents in the case of
reduce);
and/or
(b) the fact that the K/V pairs are sent in batches to the reduce function,
for a single context switch, versus sending documents individually to the
map function (the "map_doc" call in main.js). It may be worth experimenting
with a "map_docs" call to send groups of documents for mapping.
In real life, the mapped items are likely to be smaller than the original
docs. However in this test the original docs themselves are also very small,
so I doubt there is much difference in this regard.
So to look at the storage and transfer overhead, I added a dummy value to
each emit (the value is a 32-byte string instead of null).
*** Testing view performance with reduce (compound key, emit pad value)
Insert 3000 compound documents... 1.497 secs
no reduce... 5.150 secs
null reduce... 6.327 secs
counter reduce... 6.407 secs
min/max reduce... 7.309 secs
banded reduce... 6.914 secs
The mapping time has gone up by ~0.3s, and the reduce time by ~0.6s. I would
need to send a larger chunk or iterate more times to make this an accurate
measurement. But this seems reasonable: for map the larger data is
transferred once - from JS engine to Couch - and then inserted into Btree.
For reduce the larger data is transferred twice - from JS engine to Couch
(map output) and from Couch to JS (reduce input) where in this case it is
discarded. This suggests that the system is reasonably well balanced in
terms of encoding/decoding in both directions.
OK, well I'm not sure how much this adds to the debate :-) But perhaps
someone can read more into this than me.
Regards,
Brian.
require 'rubygems'
require 'restclient'
require 'json'
STDOUT.sync = true
DB="http://127.0.0.1:5984/test"
TMPFILE="/tmp/time.out"
class TimeTest
DOCCOUNT = 1000
def time(msg,&blk)
printf "%-60s", msg+"..."
t1 = Time.now.to_f
yield
t2 = Time.now.to_f
printf "%8.3f secs\n", t2-t1
end
def reset
RestClient.delete DB rescue nil
RestClient.put DB, {}.to_json
end
def time_views(ddoc, *views)
res = RestClient.put("#{DB}/_design/test", ddoc.to_json)
rev = JSON.parse(res)['rev']
views.each do |msg, view, exp|
time(msg) do
system "curl -s '#{DB}/_design/test/_view/#{view}' > #{TMPFILE}"
res = nil
case exp
when Regexp
res = File.read(TMPFILE)
puts "Unexpected result: #{res}" unless exp =~ res
else
File.open(TMPFILE) { |f| res = f.gets }
puts "Unexpected result: #{res}" unless /"total_rows":#{exp},/ =~ res
end
end
end
RestClient.delete("#{DB}/_design/test?rev=#{rev}")
end
def time_view(exp, mapred, msg='View')
time_views({'views'=>{'test'=>mapred}}, [msg,'test',exp])
end
def load_simple(nbytes=1)
reset
docs = []
(1..DOCCOUNT).each do |i|
docs << {"foo" => i*10, "bar"=>"a"*nbytes}
docs << {"foo" => i*10+10_000, "bar"=>"a"*nbytes}
docs << {"foo" => i*10+20_000, "bar"=>"a"*nbytes}
end
time("Insert #{docs.size} simple documents") do
RestClient.post "#{DB}/_bulk_docs", {'docs'=>docs}.to_json
end
docs
end
# This version loads some compound keys
def load_compound(nbytes=1)
reset
docs = []
(1..DOCCOUNT).each do |i|
docs << {"foo" => [nil, "x", i*10], "bar"=>"a"*nbytes}
docs << {"foo" => ["y", "z", i*10], "bar"=>"a"*nbytes}
docs << {"foo" => ["z", nil, i*10], "bar"=>"a"*nbytes}
end
time("Insert #{docs.size} compound documents") do
RestClient.post "#{DB}/_bulk_docs", {'docs'=>docs}.to_json
end
docs
end
# Test performance as a function of JSON document size.
# Note that the padding is not emitted, so K/V pairs are the same.
# Any slowdown must be due to serialisation/deserialisation and passing
# the docs between processes, not in the Btree building.
def test_docpad(model, nbytes)
puts "--- doc padded with #{nbytes} bytes"
docs = send("load_#{model}", nbytes)
time("Retrieve documents") do
system "curl -s '#{DB}/_all_docs?include_docs=true' >/dev/null"
end
view = {
'map' => <<-MAP,
function(doc) {
if (doc.foo) {
emit(doc.foo,null);
}
}
MAP
}
ddoc = {'views'=>{'test'=>view}}
time_views(ddoc, ["Full view","test",docs.size])
view['map'] << ";"  # alter the map source so the view signature changes and the index is rebuilt
time_views(ddoc, ["View with ?limit=1","test?limit=1",docs.size])
end
# Testing as function of number of views. Each of the additional views
# is just a dummy view which emits nothing.
def test_numviews(model)
docs = send("load_#{model}")
view = {
'map' => <<-MAP,
function(doc) {
if (doc.foo) {
emit(doc.foo,null);
}
}
MAP
}
3.times do
time("Temp view") do
res = RestClient.post "#{DB}/_temp_view", view.to_json
puts "Wot? #{res.lines.first}" unless res.lines.first =~
/"total_rows":#{docs.size},/
end
end
# Note: CouchDB appears to be clever if identical map code appears
ddoc = {'views'=>{'test'=>view}}
time_views(ddoc, ['1 real view','test',docs.size])
ddoc['views']['test1'] = view
time_views(ddoc, ['2 identical real views','test',docs.size])
ddoc['views']['test2'] = view
time_views(ddoc, ['3 identical real views','test',docs.size])
ddoc['views']['test3'] = view
time_views(ddoc, ['4 identical real views,limit=1','test?limit=1',docs.size])
ddoc = {'views'=>{'test'=>view}}
ddoc['views']['test1'] = {
'map' => <<-MAP,
function(doc) {
if (doc.foo) {
emit(doc.foo+1,null);
}
}
MAP
}
time_views(ddoc, ['2 different real views','test',docs.size])
ddoc['views']['test2'] = {
'map' => <<-MAP,
function(doc) {
if (doc.foo) {
emit(doc.foo+2,null);
}
}
MAP
}
time_views(ddoc, ['3 different real views','test',docs.size],
[' second view in same ddoc','test1',docs.size],
[' third view in same ddoc','test2',docs.size])
ddoc = {'views'=>{'test'=>view}}
ddoc['views']['dummy1'] = {
'map' => "function(doc) { }"
}
time_views(ddoc, ['1 real + 1 dummy views','test',docs.size])
ddoc['views']['dummy2'] = {
'map' => "function(doc) { 0; }"
}
time_views(ddoc, ['1 real + 2 dummy views','test',docs.size])
ddoc['views']['dummy3'] = {
'map' => "function(doc) { 1; }"
}
time_views(ddoc, ['1 real + 3 dummy views','test',docs.size],
[' dummy view in same ddoc','dummy1',0])
end
# Testing as a function of K/V pairs emitted
def test_kvpairs(model)
docs = send("load_#{model}")
([*1..5]+[10]).each do |nkv|
map = "function(doc) { if (doc.foo) {\n"
nkv.times do |i|
map << "emit(doc.foo+#{i},null);\n"
end
map << "} }"
time_views({'views'=>{'test'=>{'map'=>map}}},
["#{nkv} K/V pairs per doc","test?limit=1",docs.size*nkv])
end
end
def test_reduce(model, view=nil)
docs = send("load_#{model}")
view ||= {
'map' => <<-MAP,
function(doc) {
if (doc.foo) {
emit(doc.foo,null);
}
}
MAP
}
ddoc = {'views'=>{'test'=>view}}
time_views(ddoc, ['no reduce','test?limit=1',docs.size])
view['reduce'] = REDUCE_NULL
time_views(ddoc, ['null reduce','test',/"value":null\b/])
view['reduce'] = REDUCE_COUNT
time_views(ddoc, ['counter reduce','test',/"value":3000\b/])
view['reduce'] = REDUCE_MIN_MAX
time_views(ddoc, ['min/max reduce','test',/"count":3000\b/])
view['reduce'] = REDUCE_BAND
time_views(ddoc, ['banded reduce','test',/"high":/])
end
def run
puts "\n*** Testing view performance as function of doc padding size"
test_docpad(:simple, 1)
test_docpad(:simple, 1_000)
test_docpad(:simple, 2_000)
puts "\n*** Performance as function of K/V pairs emitted (simple keys)"
test_kvpairs(:simple)
puts "\n*** Performance as function of K/V pairs emitted (compound keys)"
test_kvpairs(:compound)
puts "\n*** Testing view performance as function of number of views"
test_numviews(:simple)
puts "\n*** Testing view performance with reduce functions (simple keys)"
test_reduce(:simple)
puts "\n*** Testing view performance with reduce functions (compound keys)"
test_reduce(:compound)
puts "\n*** Testing view performance with reduce (compound key, emit pad
value)"
test_reduce(:compound, "map" => <<-MAP
function(doc) {
if (doc.foo) {
emit(doc.foo,"abcdefghijklmnopqrstuvwxyz012345");
}
}
MAP
)
end
REDUCE_NULL = <<-REDUCE
function(ks,vs,co) { return null; }
REDUCE
REDUCE_COUNT = <<-REDUCE
function(ks, vs, co) {
if (co) {
return sum(vs);
} else {
return vs.length;
}
}
REDUCE
REDUCE_MIN_MAX = <<-REDUCE
function(ks, vs, co) {
if (co) {
var res = vs.shift();
for (var k in vs) {
var sub = vs[k];
res.count += sub.count;
if (res.min > sub.min) { res.min = sub.min; }
if (res.max < sub.max) { res.max = sub.max; }
}
return res;
} else {
var min, max;
for (var k in ks) {
var key = ks[k][0];
if (!min || min > key) { min = key };
if (!max || max < key) { max = key };
}
return {
count: ks.length,
min: min,
max: max
}
}
}
REDUCE
REDUCE_BAND = <<-REDUCE
function(ks, vs, co) {
if (co) {
var result = vs.shift();
for (var i in vs) {
for (var j in vs[i]) {
result[j] = (result[j] || 0) + vs[i][j];
}
}
return result;
} else {
var result = {};
for (var i in ks) {
var key = ks[i];
var band;
if (key[0] < 1000) {
band = "low";
}
else if (key[0] < 10000) {
band = "mid";
}
else {
band = "high";
}
result[band] = (result[band] || 0) + 1;
}
return result;
}
}
REDUCE
end
TimeTest.new.run