Re: [viff-patches] [PATCH 4 of 7] Added benchmark suite class

Martin Geisler Mon, 01 Dec 2008 12:27:48 -0800

Thomas Pelle Jakobsen <[EMAIL PROTECTED]> writes:

> # HG changeset patch
> # User Thomas Pelle Jakobsen <[EMAIL PROTECTED]>
> # Date 1226015502 -3600
> # Node ID 75e5113f27777649c2001b1221c9717d8d375423
> # Parent  0985564470de2bb2c5247effd30dc40f74048f17
> Added benchmark suite class.
>
> diff -r 0985564470de -r 75e5113f2777 apps/benchmark/suite.py
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/apps/benchmark/suite.py Fri Nov 07 00:51:42 2008 +0100
> @@ -0,0 +1,279 @@
> +# -*- coding: utf-8 -*-


Are there any non-ASCII characters here that require an explicit
encoding?

> +# Copyright 2007, 2008 VIFF Development Team.
> +#
> +# This file is part of VIFF, the Virtual Ideal Functionality Framework.
> +#
> +# VIFF is free software: you can redistribute it and/or modify it
> +# under the terms of the GNU Lesser General Public License (LGPL) as
> +# published by the Free Software Foundation, either version 3 of the
> +# License, or (at your option) any later version.
> +#
> +# VIFF is distributed in the hope that it will be useful, but WITHOUT
> +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +# or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General
> +# Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with VIFF. If not, see <http://www.gnu.org/licenses/>.
> +""" 
> +    Description:
> +        Main goal is to make it easy to write and run VIFF benchmarks while
> +        at the same time not limiting the kind of benchmarks that could be
> +        made. Also, the solution should scale as the number of benchmarks
> +        increase. Finally, the benchmark data should be collected in a central
> +        database rather than as a bunch of text files.
> +        
> +        A benchmark is created by subclassing either Benchmark or
> +        VIFFBenchmark. By subclassing VIFFBenchmark, you need only specify the
> +        runtime that is to be used, the protocol that is to be benchmarked, and
> +        the code that does the measure. See the examples.
> +
> +
> +    Features:
> +        When setting up a suite, one can specify which revision to benchmark.
> +        Uses ssh and thus benefits from key agents like ssh-agent.
> +        Non-viff benchmarks can be done by subclassing Benchmark while VIFF
> +            benchmarks can easily be benchmarked by subclassing VIFFBenchmark.
> +        Data is automatically stored in central database.
> +        Benchmark results are stored for each individual run. This enables all
> +            kind of statistics to be applied at a later time, since no info is
> +            lost. E.g. storing only mean values would prevent calculating
> +            conficence intervals and other statistical stuff.
> +        Dynamic approach: All parameter names are available except a few
> +            pre-defined, e.g. work_dir, player_id, ... (TODO: List them here.)
> +        
> + 
> +    Issues:
> +        Looks like the first run takes longer time than the rest. Is that ok?
> +        Results for all runs for a benchmark is kept in memory at once.
> +            Problem?
> +        Currently master only works on *nix since we use select.
> +        Benchmark slaves works only on *nix now due to small issues with pipes,
> +            etc.
> +        It is assumed that all benchmark hosts have access to the same file
> +            system and that the dir supplied to the suite is on this shared
> +            file system.
> +        Can we run multiple benchmark slaves on one host?
> +        Currently, default parameters are always used, e.g. use_ssl = no.
> +        Support for string, float and boolean results. Currently only integers
> +            (up to Mysql.BIGNUM) are supported.
> +        We assume that python setup.py install --home=$HOME/opt overwrites old
> +            installation nicely.
> +        Log err and out nicely to various files and/or use python logging.
> +        Write unit tests!
> +        All random seeds should come in as parameters.
> +        Currently, the runs are done in sequence. Maybe also support parallel
> +            runs; this is ok if benchmark doesn't measure bandwith or time, and
> +            it is (presumably) faster.
> +        The table TimedResults cannot be used yet.
> +        The benchmark suite must be run in the viff/benchmark dir due to scp
> +            copy of the benchmark classes, ok?
> +        Better documentation.
> +        Security. Currently, password to the mysql db is supplied on the
> +            command line both in example.py and when executing benchmark.py on
> +            each host. This is not secure as everyone having access to the
> +            benchmark computers will be able to read off the password using
> +            e.g. ps -eLf | grep benchmark.py.
> +        Can't get the popen objects to finish when ssh finishes, so I've done a
> +            hack by letting each benchmark write COMPLETED WITH RETURNCODE x as
> +            the last thing. Not pretty, but it works (at least now..)
> +        Non-VIFF benchmarks are always run on the same host that starts the
> +            suite. It probably would be nice to be able to specify which host
> +            to use for the benchmark.
> +        VIFFBenchmark reports back to master on the fly. Would also be nice
> +            if non-VIFF benchmarks did that.

Wow, you've made a big system... I cannot help but think that it does
half of what buildbot already does.

> +def parse(args):
> +    """ Creates a dict form a list of strings a la k1=val1. """
> +    res = {}
> +    for a in args[1:]:

Why not the first item? I have a hunch that you use this for command
line parsing -- would the optparse module not be better for this? At
least add some doctests that show what this function really does.
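
A doctest along these lines would do. This is only a sketch: the
function body is copied from the patch, while the docstring wording and
the example arguments ("benchmark.py", "n=3", ...) are made up for
illustration, and the guess that the skipped first item is the program
name is just that, a guess.

```python
def parse(args):
    """Create a dict from a list of strings of the form key=value.

    The first item is skipped (assumed here to be the program name,
    as in sys.argv). Values that look like integers become ints:

    >>> parse(["benchmark.py", "n=3", "name=mul"]) == {"n": 3, "name": "mul"}
    True

    The split happens at the last '=', so a key may contain '=':

    >>> parse(["benchmark.py", "a=b=7"])
    {'a=b': 7}
    """
    res = {}
    for a in args[1:]:
        s = a.rsplit("=", 1)
        try:
            res[s[0]] = int(s[1])
        except ValueError:
            res[s[0]] = s[1]
    return res


if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

Running the file directly then checks the examples, and the docstring
documents the skipped first item at the same time.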

> +        # Note: This allows attribute names to contain '='
> +        s = a.rsplit("=",1)
> +        # TODO: Info about attribute types should be included. Here, we
> +        #       simply treat an attribute as int if possible and otherwise
> +        #       as a string.
> +        try:
> +            res[s[0]] = int(s[1])
> +        except ValueError:
> +            res[s[0]] = s[1]
> +    return res
> +
> +
> +class Suite:
> +    
> +    # TODO: Change revision to hg_revision.
> +    
> +    """   
> +    hosts: A list of (hostname, port). This is the list of all available hosts
> +        that can be used for the benchmark and the port numbers that should be
> +        used on each host. As hostname you can either supply a real hostname,
> +        e.g. camel17.daimi.au.dk, or an integer which then refers to the
> +        host_id in the benchmark database. This is useful if one host like
> +        camel17.daimi.au.dk is used with several configurations and thus should
> +        be treated as multiple "hosts". If multiple such host configurations
> +        exists in the database and only the string hostname is given, an
> +        exception is thrown.
> +                   
> +        Note that this should be the complete list of hosts that can be used
> +        for the benchmark. Some of the benchmarks will only use a subset of the
> +        hosts listed.
> +           
> +        Note also that the hosts are not nescessarily given protocol ids in the
> +        same order as this list.
> +           
> +    user: The username that should be used to log into the benchmark hosts.
> +        TODO: Defaults to the username of the user executing this script.
> +    
> +    work_dir: A directory on a shared filesystem that all the benchmark slaves
> +        have access to. Here, viff will be checked out and temporary files will
> +        possibly be written. TODO: Defaults to same directory as on the master
> +        host (e.g. where this script is executed).
> +    
> +    database: The benchmark database. Note that the credentials of this
> +        database should be set up to provide write access to some of the
> +        database tables.
> +    
> +    revision: The VIFF revision which should be benchmarked. Defaults to tip.
> +    
> +    hg_repository: The repository from where the revision is checked out. This
> +        could be the main repository http://hg.viff.dk/viff, but you can also
> +        use your own clone such as ssh://[EMAIL PROTECTED]/viff-benchmark. If
> +        no hg_repository is supplied it is assumed that a hg repository clone
> +        already exists in work_dir/viff and the needed revision is simply
> +        checked out from this clone. TODO: Add support for pull.
> +                   
> +    viff_dir: Where VIFF should be installed. When the appropriate VIFF
> +        revision has been checked out, the suite executes
> +              
> +            python setup.py install --home=viff_dir

VIFF does not really need to be installed to be used. If you put the
root of a VIFF checkout in the PYTHONPATH, then it just works. That
might be easier and simpler than installing it.
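
As a rough sketch of the PYTHONPATH approach for a process started
locally: the helper name env_with_checkout and its arguments are
hypothetical and not part of the patch, and a benchmark started over ssh
would need the variable set on the remote side instead.

```python
import os


def env_with_checkout(checkout_root, base_env=None):
    """Return a copy of the environment with checkout_root first on PYTHONPATH.

    Hypothetical helper: checkout_root is the root of a VIFF checkout.
    """
    env = dict(os.environ if base_env is None else base_env)
    old = env.get("PYTHONPATH")
    if old:
        env["PYTHONPATH"] = checkout_root + os.pathsep + old
    else:
        env["PYTHONPATH"] = checkout_root
    return env
```

The returned dict can be passed as the env argument of subprocess.Popen,
so nothing has to be installed and the caller's own environment is left
untouched.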

In any case, it would be cool if the directories were created with the
tempfile module:

  http://docs.python.org/library/tempfile.html

That would make it unnecessary for the user to specify them and easier
to clean up since they can just be nuked afterwards.
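
Roughly like this, as a sketch only: the function names and the
"viff-benchmark-" prefix are invented for illustration, and shared_root
stands for whatever directory on the shared file system the hosts can
all see.

```python
import os
import shutil
import tempfile


def make_work_dir(shared_root=None):
    """Create a fresh, uniquely named directory and return its path.

    shared_root is a hypothetical parameter: pass a directory on the
    shared file system so that every benchmark host can see the result.
    With None, the system default temporary directory is used.
    """
    return tempfile.mkdtemp(prefix="viff-benchmark-", dir=shared_root)


def remove_work_dir(path):
    """Delete the directory and everything below it."""
    shutil.rmtree(path, ignore_errors=True)
```

Since the suite only ever removes a directory it created itself, this
also takes some of the scare out of the rm -rf in setup().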

> +        on one of the available hosts. Make sure that your PATH and PYTHONPATH
> +        are set up correspondingly, e.g. as described in "Installing from 
> +        Source" at http://viff.dk/doc/install.html. One example could be to use
> +                  
> +            viff_dir = $HOME/opt
> +              
> +        and to include these in your .bashrc file:
> +              
> +            export PYTHONPATH=$PYTHONPATH:$HOME/opt/lib/python
> +            export PATH=$PATH:$HOME/opt/bin

Please don't mention bash here -- not everybody uses it. It is bad
enough that we do it in the installation guide :-)

> +    """
> +    def __init__(self, database, hosts, user, work_dir, viff_dir,
> +                 revision=None,
> +                 hg_repository=None):
> +        self.user = user
> +        self.viff_dir = viff_dir
> +        self.revision = revision
> +        self.benchmarks = {}
> +        self.host_name = {}
> +        self.host_port = {}
> +        self.database = database
> +        self.hg_repository = hg_repository
> +        self.suite_id = self.database.create_suite(revision)
> +        self.work_dir = work_dir
> +        for hostname, port in hosts:
> +            if type(hostname) is str:
> +                host_id = database.get_host_id(hostname)
> +                self.host_name[host_id] = hostname
> +                self.host_port[host_id] = port
> +            else:
> +                self.host_name[hostname] = database.get_host_name(hostname)
> +                self.host_port[hostname] = port
> +    
> +    def setup(self):
> +        print "Setting up Suite"
> +        somehost = self.host_name.values()[0]
> +        
> +        # If user supplied a hg_repository, then check out viff from there.
> +        # TODO: Take care using rm -rf in a script like this!!!
> +        if self.hg_repository:
> +            exec_on_host(self.user, somehost,
> +                         ["rm -rf %s/viff; cd %s; hg clone %s viff" % 
> +                          (self.work_dir, self.work_dir, self.hg_repository)])
> +        
> +        # If user supplied revision, check it out. Otherwise, check out the tip.
> +        if self.revision:
> +            rev = "--rev %s" % self.revision
> +        else:
> +            rev = ""
> +        exec_on_host(self.user, somehost,
> +                     ["cd %s/viff; hg update --clean %s" %
> +                      (self.work_dir, rev)])
> +
> +        # Build VIFF.
> +        exec_on_host(self.user, somehost,
> +                     ["cd %s/viff; python setup.py install --home=%s" % 
> +                      (self.work_dir, self.viff_dir)])
> +        
> +    def teardown(self):
> +        print "Tearing down Suite"
> +        # TODO: Remove local checkout but not the hg clone?
> +    
> +    def add_benchmark(self, benchmark):
> +        """ Note that if the benchmark has already database parameters, e.g.
> +        db_host, db_user, db_password, db_port, db_name, these are used to
> +        report the result. If they are not set, the same database parameters
> +        are used as those given when creating the Suite."""
> +        
> +        # TODO: Hack -> Benchmarks name is derived from class name.
> +        benchmark_name = str(benchmark.__class__).split('.')[-1]

Hmm, well. Is this related to the scp-stuff I think I've seen somewhere
else? If so, then since everybody shares the same filesystem, there
might be an easier and nicer way... I don't know.

> +        # initializing the Suite. Normally,it should be enough to use a
> +        # 'benchmark' user in the database that  has only write access to the
> +        # Result and TimedResult tables.
> +        benchmark.attr['benchmark_id'] = benchmark_id
> +        if not 'db_host' in benchmark.attr.keys():

I think I said this previously, but 'x in d' == 'x in d.keys()'.
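
A tiny illustration, with a plain dict standing in for benchmark.attr
and a placeholder default that is not taken from the patch:

```python
attr = {"benchmark_id": 7}  # stand-in for benchmark.attr

# Membership is tested against the keys, so both give the same answer.
assert ("db_host" in attr) == ("db_host" in attr.keys())

# The condition from the patch can therefore be written as:
if "db_host" not in attr:
    attr["db_host"] = "localhost"  # placeholder default, not from the patch
```

The shorter form also avoids building a list of keys on every check.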

-- 
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.
