On Thu, 2011-04-07 at 08:03 -0500, Hyrum K Wright wrote:

> This is great work, though as you admit the results are a bit
> incomprehensible at this stage.  Would it make sense to somehow

Hmm, I think stsp was a bit early to post; I'm still in the
sanity-checking phase ;)

E.g.:

[[[
Timings for 5x5_1.6
    N   min     max     avg    operation  (unit is seconds)
    6 330.544 561.406 425.169  TOTAL RUN
]]]

-->  a variation of 330s to 561s across 6 identical runs?
(later note: I just found that the random seeding might be in the wrong
place. I recently refactored a few things, and it should probably go
inside the N loop now.)

I have no idea about other loads on our VM. Maybe we could use the
user/sys output of 'time -p' instead of the real-world time
measurements the script takes right now, although my intention was to
stick to actual wall-clock time. Of course that works best on an
otherwise idle system. danielsh, or anyone, do you have any idea at
what times of day (or so) the VM host is most likely to be idle for
~2 hours, so we can schedule the tests then?
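
If we did switch, it could be as simple as sampling os.times() around
the subprocess call -- a rough, untested sketch, not what the script
does today:

[[[
import os
from subprocess import call

def run_cmd_cpu_timed(cmd):
  # os.times() -> (utime, stime, child_utime, child_stime, elapsed);
  # the child fields accumulate over our svn subprocesses, so deltas
  # exclude whatever else the VM host is doing
  before = os.times()
  call(cmd)
  after = os.times()
  return (after[2] - before[2],   # user seconds, like 'time -p' user
          after[3] - before[3])   # system seconds, like 'time -p' sys
]]]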


> combine the scripts to prevent the bifurcation of benchmark suites?

For the curious, I attached the current state of my benchmark scripts,
but I'm still editing.

> Do you have the script of commands, and the accompanying repo?

ATM my scripts are in an elego-private repo; they were initially going
to be just a quick test to get real timing results for an article about
1.7.

Actually, I think they have matured enough to be committed to tools/dev.
If Mark does the same, we could have a collective look at them and
decide. I haven't actually seen Mark's results or scripts yet. I posted
older versions of mine a few months ago...
So where to put them?

tools/dev/benchmark/A  <--- mark's benches
tools/dev/benchmark/B  <--- my benches
???

I'm reluctant to label the dir with my name...
It'd be nice to find two descriptive words to distinguish between the
benchmarks, "by type".


In reply to CMike, an outline of what my test does:

I have a little Python magic that I use to call svn subcommands,
running the cmdline svn client from py. Each call is timed, and the
duration is added to the set of samples for that subcommand.
So it has no notion of whether it ran 'svn delete one/file.c' or
'svn delete huge/tree/*'; it simply adds another sample. That's almost
madness.
But it goes on to run the same identical sequence a number of times,
currently 6. The values are averaged, and the min/max are retained to
give some sense of the real-world meaning of the average.

These 6 identical runs are done once for 1.6.x and once for trunk. They
are identical because I use a pseudo random number generator (to get
"organic" modifications) that is seeded with 0 for each run :P

Finally, the averages/mins/maxes for 1.6.x are compared to those of
trunk, showing a factor and a difference in seconds. (A factor of 10
would be negligible if the actual time added was only 2 microseconds...)

Because of this "madness" I mentioned above (variation in single
calls), the *averages* are what's really interesting to look at. Even
though an average for 1.6 may be made up of (60 + 0.06)/2 while trunk's
is made up of (90 + 0.01)/2, it still shows the worst case pretty
visibly (a long call taking even longer). You can stare at the min/max
differences to make up your mind as to whether the averages are actual
timings in seconds or unitless fantasy ratings. Yet I am pretty
confident that they compare the two clients well enough.

In stsp's post, every single run was printed out. The really
interesting places to look, though, are where it says "Results for"...
and "COMPARE"...

[[[
COMPARE 5x5_1.6 to 5x5_trunk
  1.23|+0.45  means factor=1.23, difference in seconds = 0.45
  factor < 1 or difference < 0 means '5x5_trunk' is faster than '5x5_1.6'
      min              max              avg         operation
  1.151|+49.7471   0.863|-76.9170   1.036|+15.1526  TOTAL RUN
  1.322|+0.0026    0.631|-1.4086    0.938|-0.0019   add
  1.864|+0.0113    0.413|-11.9194   0.513|-3.5088   checkout
  0.625|-2.3169    0.533|-24.8236   0.769|-4.0490   commit
  0.899|-0.0030    0.129|-0.7784    0.122|-0.3825   copy
 44.382|+100.3599  16.604|+122.9722  24.856|+109.4408  delete
  1.245|+1.6928    0.943|-1.2191    0.768|-3.3842   merge
  1.258|+0.0023    0.263|-1.8166    1.189|+0.0028   mkdir
  1.256|+0.0022    0.187|-2.3871    0.910|-0.0016   prop mod
  1.237|+0.0021    0.153|-0.4198    0.480|-0.0174   propdel
  1.267|+0.0020    0.267|-0.2795    1.544|+0.0050   proplist
  1.267|+0.0023    0.091|-5.4157    1.018|+0.0003   propset
  1.148|+0.0013    2.347|+0.0133    1.254|+0.0024   resolve
  2.764|+0.5885    0.878|-0.3298    1.565|+0.5413   resolved
  1.410|+0.0733    3.061|+1.9744    2.038|+0.4877   status
  0.602|-3.3563    0.422|-8.9009    0.470|-6.4426   switch
  1.622|+0.0934    0.302|-21.6808   0.421|-3.9173   update
]]]

An interpretation could be: 'svn delete' currently sucks for some
reason (24 times slower???), as does 'svn status' (twice as slow, but
then again it only added half a second to a run that was >400 seconds
long, on average). The rest looks quite encouraging. IIRC, when I ran
this a few months back I saw higher slowness factors than these (and,
notably, a much saner 'svn delete' factor, so there must be some newly
introduced speed bug in trunk's 'svn delete' -- or in my script).
It's interesting that the *variation* in how long a run takes became
somewhat smaller using trunk. So trunk's timings seem to be slightly
more homogeneous than 1.6.x's (looking at the TOTAL min/max).

Now you may note that there are several types of these, i.e. 5x5, 1x100
and 100x1. 5x5 means a WC that is 5 directory levels deep, each level
holding 5 subdirs and 5 files. 1x100 is a flat directory with 100 dir
and file entries. 100x1 is a cascade of single subdirs that is 100 dir
levels deep, also with one file on each subdir level.
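
To put numbers on these shapes -- judging from create_tree() in the
attached script, and assuming 5x5 maps to levels=5, spread=5 and so on
-- the node counts work out like this:

[[[
def count_nodes(levels, spread):
  # create_tree() makes <spread> files per dir, plus <spread> subdirs
  # that recurse while levels > 1
  files, dirs = spread, 0
  if levels > 1:
    sub_files, sub_dirs = count_nodes(levels - 1, spread)
    files += spread * sub_files
    dirs += spread * (1 + sub_dirs)
  return files, dirs

count_nodes(5, 5)    # (3905, 780): the 5x5 tree
count_nodes(1, 100)  # (100, 0): the flat tree (per the code, all files)
count_nodes(100, 1)  # (100, 99): the 100-deep cascade
]]]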

The 100x1 case is actually where the 1.6 client sucks very badly --
it's visible in the timings, but memory-wise it's even worse. Watching
`top` while it runs reveals a memory footprint heading towards gigabyte
scale. Consequently, this is where trunk really really ROCKS, because
it does the same job with a few MB of RAM, in a fraction of the time.
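
If we ever want numbers instead of eyeballing `top`, even a crude poll
of ps(1) would do -- again just a sketch, not part of the script:

[[[
import time
from subprocess import Popen, PIPE

def peak_rss_kb(pid, interval=0.5):
  # sample the resident set size of a running process until it exits
  peak = 0
  while True:
    out = Popen(['ps', '-o', 'rss=', '-p', str(pid)],
                stdout=PIPE).communicate()[0]
    if not out.strip():
      return peak  # ps prints nothing once the process is gone
    peak = max(peak, int(out))
    time.sleep(interval)
]]]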

I'm playing with the thought of making graphical charts of these
numbers, but that might be overkill :)
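
If I do, it would probably be matplotlib along these lines -- a sketch
that assumes a Timings instance as pickled by the script:

[[[
import matplotlib.pyplot as plt

def chart(timings, path='benchmark.png'):
  # one horizontal bar per operation, showing the average in seconds
  names = sorted(timings.timings.keys())
  avgs = [sum(timings.timings[n]) / len(timings.timings[n])
          for n in names]
  plt.barh(range(len(names)), avgs)
  plt.yticks(range(len(names)), names)
  plt.xlabel('avg seconds')
  plt.savefig(path)
]]]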

Any ideas/opinions about adding to tools/dev/ would be appreciated.

Finally, the interesting part that shows what the script tests:

[[[
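  # set up: create a fresh repository and check out a working copy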
  run_cmd(['svnadmin', 'create', repos])
  svn('checkout', file_url, wc)

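  # populate /trunk: create the tree, commit it, then add properties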
  trunk = j(wc, 'trunk')
  create_tree(trunk, levels, spread)
  add(trunk)
  st(wc)
  ci(wc)
  up(wc)
  propadd_tree(trunk, 0.5)
  ci(wc)
  up(wc)
  st(wc)

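  # branch /trunk to /branch with a direct repository-side copy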
  trunk_url = file_url + '/trunk'
  branch_url = file_url + '/branch'

  svn('copy', '-mm', trunk_url, branch_url)
  st(wc)

  up(wc)
  st(wc)

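  # second working copy on trunk, with "organic" random modifications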
  svn('checkout', trunk_url, wc2)
  st(wc2)
  modify_tree(wc2, 0.5)
  st(wc2)
  ci(wc2)
  up(wc2)
  up(wc)

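  # switch the second working copy to the branch and modify it there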
  svn('switch', branch_url, wc2)
  modify_tree(wc2, 0.5)
  st(wc2)
  ci(wc2)
  up(wc2)
  up(wc)

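  # meanwhile, modify trunk in the first working copy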
  modify_tree(trunk, 0.5)
  st(wc)
  ci(wc)
  up(wc2)
  up(wc)

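  # sync-merge trunk into the branch WC, then resolve the conflicts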
  svn('merge', '--accept=postpone', trunk_url, wc2)
  st(wc2)
  svn('resolve', '--accept=mine-conflict', wc2)
  st(wc2)
  svn('resolved', '-R', wc2)
  st(wc2)
  ci(wc2)
  up(wc2)
  up(wc)

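  # reintegrate the branch back into trunk, again resolving conflicts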
  svn('merge', '--accept=postpone', '--reintegrate', branch_url, trunk)
  st(wc)
  svn('resolve', '--accept=mine-conflict', wc)
  st(wc)
  svn('resolved', '-R', wc)
  st(wc)
  ci(wc)
  up(wc2)
  up(wc)

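  # clean up: delete the branch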
  svn('delete', j(wc, 'branch'))
  ci(wc)
  up(wc2)
  up(wc)
]]]

Note that I still have a few plans for tweaks I'm not discussing because
this mail is long enough.

~Neels
Attachment: benchmark.py

#!/usr/bin/env python

"""
usage: benchmark.py run <run_file> <levels> <spread> [N]
       benchmark.py show <run_file> [<run_file2> ...]
       benchmark.py compare <run_file1> <run_file2>
       benchmark.py combine <run_file1> <run_file2> [...] <dest_file>

Test data is written to <run_file>.
If a <run_file> exists, data is added to it.
<levels> is the number of directory levels to create.
<spread> is the number of child trees spreading off each dir level.
If <N> is provided, the run is repeated N times.
'combine' merges the timings of several run files into <dest_file>.
"""

import os, sys, time
import tempfile

from datetime import datetime, timedelta
from subprocess import Popen, PIPE, call
import random
import shutil

import cPickle

VERBOSE = False

DEFAULT_TIMINGS_PATH = './benchmark_py_last_run.py-pickle'

timings = None

def run_cmd(cmd, stdin=None, shell=False):

  if shell:
    printable_cmd = 'CMD: ' + cmd
  else:
    printable_cmd = 'CMD: ' + ' '.join(cmd)
  if VERBOSE:
    print printable_cmd

  if stdin:
    stdin_arg = PIPE
  else:
    stdin_arg = None

  p = Popen(cmd, stdin=stdin_arg, stdout=PIPE, stderr=PIPE, shell=shell)
  stdout,stderr = p.communicate(input=stdin)

  if VERBOSE:
    if (stdout):
      print "STDOUT: [[[\n%s]]]" % ''.join(stdout)
  if (stderr):
    print "STDERR: [[[\n%s]]]" % ''.join(stderr)

  return stdout,stderr

def timedelta_to_seconds(td):
  return ( float(td.seconds)
           + float(td.microseconds) / (10**6)
           + td.days * 24 * 60 * 60 )


class Timings:
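  """Collects per-operation timing samples.

  tic()/toc() bracket one svn call; pickled instances of this class are
  what the 'show', 'compare' and 'combine' subcommands read back in."""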

  def __init__(self, *ignore_svn_cmds):
    self.timings = {}
    self.current_name = None
    self.tic_at = None
    self.ignore = ignore_svn_cmds
    self.name = None

  def tic(self, name):
    if name in self.ignore:
      return
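    # starting a new timer implicitly stops the previous one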
    self.toc()
    self.current_name = name
    self.tic_at = datetime.now()

  def toc(self):
    if self.current_name and self.tic_at:
      toc_at = datetime.now()
      self.submit_timing(self.current_name, 
                         timedelta_to_seconds(toc_at - self.tic_at))
    self.current_name = None
    self.tic_at = None

  def submit_timing(self, name, seconds):
    times = self.timings.get(name)
    if not times:
      times = []
      self.timings[name] = times
    times.append(seconds)

  def summary(self):
    s = []
    if self.name:
      s.append('Timings for %s' % self.name)
    s.append('    N   min     max     avg    operation  (unit is seconds)')

    names = sorted(self.timings.keys())

    for name in names:
      timings = self.timings.get(name)
      if not name or not timings: continue

      s.append('%5d %7.3f %7.3f %7.3f  %s' % (
                 len(timings),
                 min(timings),
                 max(timings),
                 sum(timings) / len(timings),
                 name))
    return '\n'.join(s)


  def compare_to(self, other):
    selfname = self.name
    if not selfname:
      selfname = 'unnamed'
    othername = other.name
    if not othername:
      othername = 'the other'

    s = ['COMPARE %s to %s' % (othername, selfname),
         '  1.23|+0.45  means factor=1.23, difference in seconds = 0.45',
         '  factor < 1 or difference < 0 means \'%s\' is faster than \'%s\''
           % (selfname, othername)]

    s.append('      min              max              avg         operation')

    def do_div(a, b):
      if b:
        return float(a) / float(b)
      else:
        return 0.0

    def do_diff(a, b):
      return float(a) - float(b)

    def min_max_avg(ttimings):
      return ( min(ttimings),
               max(ttimings),
               sum(ttimings) / len(ttimings) )

    names = sorted(self.timings.keys())

    for name in names:
      timings = self.timings.get(name)
      other_timings = other.timings.get(name)
      if not other_timings:
        continue

      min_me, max_me, avg_me = min_max_avg(timings)
      min_other, max_other, avg_other = min_max_avg(other_timings)

      s.append('%-16s %-16s %-16s  %s' % (
                 '%7.3f|%+7.4f' % (
                     do_div(min_me, min_other),
                     do_diff(min_me, min_other)
                   ),

                 '%7.3f|%+7.4f' % (
                     do_div(max_me, max_other),
                     do_diff(max_me, max_other)
                   ),

                 '%7.3f|%+7.4f' % (
                     do_div(avg_me, avg_other),
                     do_diff(avg_me, avg_other)
                   ),

                 name))
    return '\n'.join(s)


  def add(self, other):
    for name, other_times in other.timings.items():
      my_times = self.timings.get(name)
      if not my_times:
        my_times = []
        self.timings[name] = my_times
      my_times.extend(other_times)




j = os.path.join

_create_count = 0

def next_name(prefix):
  global _create_count
  _create_count += 1
  return '_'.join((prefix, str(_create_count)))

def create_tree(in_dir, levels, spread=5):
  try:
    os.mkdir(in_dir)
  except OSError:
    # the dir may already exist
    pass

  for i in range(spread):
    # files
    fn = j(in_dir, next_name('file'))
    f = open(fn, 'w')
    f.write('This is %s\n' % fn)
    f.close()

    # dirs
    if (levels > 1):
      dn = j(in_dir, next_name('dir'))
      create_tree(dn, levels - 1, spread)


def svn(*args):
  global timings
  name = args[0]
  cmd = ['svn']
  cmd.extend(args)
  if VERBOSE:
    print 'svn cmd: ' + ' '.join(cmd)
 
  # svn() never feeds anything to the client's stdin
  stdin = None
  stdin_arg = None

  timings.tic(name)
  try:
    p = Popen(cmd, stdin=stdin_arg, stdout=PIPE, stderr=PIPE, shell=False)
    stdout,stderr = p.communicate(input=stdin)
  finally:
    timings.toc()

  if VERBOSE:
    if (stdout):
      print "STDOUT: [[[\n%s]]]" % ''.join(stdout)
    if (stderr):
      print "STDERR: [[[\n%s]]]" % ''.join(stderr)

  return stdout,stderr


def add(*args):
  return svn('add', *args)

def ci(*args):
  return svn('commit', '-mm', *args)

def up(*args):
  return svn('update', *args)

def st(*args):
  return svn('status', *args)

_chars = [chr(x) for x in range(ord('a'), ord('z') +1)]

def randstr(length=8):
  return ''.join([random.choice(_chars) for i in range(length)])

def _copy(path):
  dest = next_name(path + '_copied')
  svn('copy', path, dest)

def _move(path):
  dest = path + '_moved'
  svn('move', path, dest)

def _propmod(path):
  so, se = svn('proplist', path)
  propnames = [line.strip() for line in so.strip().split('\n')[1:]]

  # modify?
  if len(propnames):
    svn('ps', propnames[len(propnames) / 2], randstr(), path)

  # del?
  if len(propnames) > 1:
    svn('propdel', propnames[len(propnames) / 2], path)


def _propadd(path):
  # set a new one.
  svn('propset', randstr(), randstr(), path)


def _mod(path):
  if os.path.isdir(path):
    return _propmod(path)

  f = open(path, 'a')
  f.write('\n%s\n' % randstr())
  f.close()

def _add(path):
  if os.path.isfile(path):
    return _mod(path)

  if random.choice((True, False)):
    # create a dir
    svn('mkdir', j(path, next_name('new_dir')))
  else:
    # create a file
    new_path = j(path, next_name('new_file'))
    f = open(new_path, 'w')
    f.write(randstr())
    f.close()
    svn('add', new_path)

def _del(path):
  svn('delete', path)

_mod_funcs = (_mod, _add, _propmod, _propadd, )  # disabled: _copy, _move, _del
  
def modify_tree(in_dir, fraction):
  child_names = os.listdir(in_dir)
  for child_name in child_names:
    if child_name[0] == '.':
      continue
    if random.random() < fraction:
      path = j(in_dir, child_name)
      random.choice(_mod_funcs)(path)

  for child_name in child_names:
    if child_name[0] == '.': continue
    path = j(in_dir, child_name)
    if os.path.isdir(path):
      modify_tree(path, fraction)
  
def propadd_tree(in_dir, fraction):
  for child_name in os.listdir(in_dir):
    if child_name[0] == '.': continue
    path = j(in_dir, child_name)
    if random.random() < fraction:
      _propadd(path)
    if os.path.isdir(path):
      propadd_tree(path, fraction)


def run(levels, spread, N):
  global timings

  for i in range(N):
    # seed inside the N loop so that every single run applies the
    # identical pseudo-random sequence of "organic" modifications
    random.seed(0)

    base = tempfile.mkdtemp()
    try:
      repos = j(base, 'repos')
      wc = j(base, 'wc')
      wc2 = j(base, 'wc2')

      file_url = 'file://%s' % repos

      so, se = run_cmd(['which', 'svn'])
      if not so:
        print "Can't find svn."
        exit(1)

      print '\nRunning svn benchmark in', base
      print 'dir levels: %s; new files and dirs per leaf: %s; run %d of %d' %(
            levels, spread, i + 1, N)

      so, se = svn('--version')
      print ', '.join( so.split('\n')[:2] )
      started = datetime.now()

      try:
        run_cmd(['svnadmin', 'create', repos])
        svn('checkout', file_url, wc)

        trunk = j(wc, 'trunk')
        create_tree(trunk, levels, spread)
        add(trunk)
        st(wc)
        ci(wc)
        up(wc)
        propadd_tree(trunk, 0.5)
        ci(wc)
        up(wc)
        st(wc)

        trunk_url = file_url + '/trunk'
        branch_url = file_url + '/branch'

        svn('copy', '-mm', trunk_url, branch_url)
        st(wc)

        up(wc)
        st(wc)

        svn('checkout', trunk_url, wc2)
        st(wc2)
        modify_tree(wc2, 0.5)
        st(wc2)
        ci(wc2)
        up(wc2)
        up(wc)

        svn('switch', branch_url, wc2)
        modify_tree(wc2, 0.5)
        st(wc2)
        ci(wc2)
        up(wc2)
        up(wc)

        modify_tree(trunk, 0.5)
        st(wc)
        ci(wc)
        up(wc2)
        up(wc)

        svn('merge', '--accept=postpone', trunk_url, wc2)
        st(wc2)
        svn('resolve', '--accept=mine-conflict', wc2)
        st(wc2)
        svn('resolved', '-R', wc2)
        st(wc2)
        ci(wc2)
        up(wc2)
        up(wc)

        svn('merge', '--accept=postpone', '--reintegrate', branch_url, trunk)
        st(wc)
        svn('resolve', '--accept=mine-conflict', wc)
        st(wc)
        svn('resolved', '-R', wc)
        st(wc)
        ci(wc)
        up(wc2)
        up(wc)

        svn('delete', j(wc, 'branch'))
        ci(wc)
        up(wc2)
        up(wc)


      finally:
        stopped = datetime.now()
        print '\nDone with svn benchmark in', (stopped - started)
        timings.submit_timing('TOTAL RUN', timedelta_to_seconds(stopped - started))

        # rename ps to prop mod
        if timings.timings.get('ps'):
          has = timings.timings.get('prop mod')
          if not has:
            has = []
            timings.timings['prop mod'] = has
          has.extend( timings.timings['ps'] )
          del timings.timings['ps']

        print timings.summary()
    finally:
      shutil.rmtree(base)


def read_from_file(file_path):
  f = open(file_path, 'rb')
  try:
    instance = cPickle.load(f)
    instance.name = os.path.basename(file_path)
  finally:
    f.close()
  return instance


def write_to_file(file_path, instance):
  f = open(file_path, 'wb')
  cPickle.dump(instance, f)
  f.close()

def usage():
  print __doc__

if __name__ == '__main__':
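  # subcommand names may be abbreviated to any prefix (even just 'c');
  # the first match in this if/elif chain wins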
  if len(sys.argv) > 1 and 'compare'.startswith(sys.argv[1]):
    if len(sys.argv) < 4:
      usage()
      exit(1)
    
    p1,p2 = sys.argv[2:4]

    t1 = read_from_file(p1)
    t2 = read_from_file(p2)

    print t1.summary()
    print '---'
    print t2.summary()
    print '---'
    print t2.compare_to(t1)

  elif len(sys.argv) > 1 and 'combine'.startswith(sys.argv[1]):
    if len(sys.argv) < 5:
      usage()
      exit(1)
    
    dest = sys.argv[-1]
    paths = sys.argv[2:-1]

    total = Timings('--version')

    for path in paths:
      t = read_from_file(path)
      total.add(t)

    print total.summary()
    write_to_file(dest, total)

  elif len(sys.argv) > 1 and 'run'.startswith(sys.argv[1]):
    try:
      timings_path = sys.argv[2]
      levels = int(sys.argv[3])
      spread = int(sys.argv[4])

      if len(sys.argv) > 5:
        N = int(sys.argv[5])
      else:
        N = 1
    except:
      usage()
      raise

      
    print '\n\nHi, going to run a Subversion benchmark series of %d runs...' % N

    if os.path.isfile(timings_path):
      print 'Going to add results to existing file', timings_path
      timings = read_from_file(timings_path)
    else:
      print 'Going to write results to new file', timings_path
      timings = Timings('--version')

    run(levels, spread, N)

    write_to_file(timings_path, timings)


  elif len(sys.argv) > 1 and 'show'.startswith(sys.argv[1]):
    if len(sys.argv) < 3:
      usage()
      exit(1)
      
    for timings_path in sys.argv[2:]:
      timings = read_from_file(timings_path)
      print '---\n%s' % timings_path
      print timings.summary()

  else: usage()

Attachment: run
Description: application/shellscript
