Hello community, here is the log from the commit of package booth for openSUSE:Factory checked in at 2014-02-18 14:43:51 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Comparing /work/SRC/openSUSE:Factory/booth (Old) and /work/SRC/openSUSE:Factory/.booth.new (New) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "booth" Changes: -------- --- /work/SRC/openSUSE:Factory/booth/booth.changes 2014-02-01 08:43:13.000000000 +0100 +++ /work/SRC/openSUSE:Factory/.booth.new/booth.changes 2014-02-18 14:43:53.000000000 +0100 @@ -1,0 +2,21 @@ +Mon Feb 17 13:12:42 CET 2014 - [email protected] + +- Enhancements to the unit-tests - more data when they fail. +- A few other fixes +- Version 0.1.5 + +------------------------------------------------------------------- +Wed Feb 12 13:31:21 CET 2014 - [email protected] + +- Tim Serong reported problems, I'd guess because of compiler + optimizations. Converted a macro to a (less clever) function. +- Some other fixes for catchup and ticket re-acquiration. + +------------------------------------------------------------------- +Mon Feb 10 15:25:00 CET 2014 - [email protected] + +- New sub-package that includes all the test scripts and data. + Doesn't work yet; but available to make testing (the packaging + problem) easier. + +------------------------------------------------------------------- ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ booth.spec ++++++ --- /var/tmp/diff_new_pack.Pl0bTQ/_old 2014-02-18 14:43:54.000000000 +0100 +++ /var/tmp/diff_new_pack.Pl0bTQ/_new 2014-02-18 14:43:54.000000000 +0100 @@ -31,11 +31,14 @@ %define pkg_group Productivity/Clustering/HA %endif +%global booth_share %{_prefix}/share/booth +%global test_path %{booth_share}/tests + Name: booth Summary: The Booth Cluster Ticket Manager License: GPL-2.0+ Group: Productivity/Clustering/HA -Version: 0.1.3 +Version: 0.1.5 Release: 0 Source: booth.tar.bz2 Source1: %name-rpmlintrc @@ -85,6 +88,15 @@ gzip < docs/boothd.8 > %{buildroot}/%{_mandir}/man8/booth.8.gz ln %{buildroot}/%{_mandir}/man8/booth.8.gz %{buildroot}/%{_mandir}/man8/boothd.8.gz +#install test-parts + +mkdir -p %{buildroot}/%{test_path} +cp -a unit-tests/ script/unit-test.py README-testing test conf %{buildroot}/%{test_path}/ + +mkdir -p %{buildroot}/%{test_path}/src/ +ln -s %{_sbindir}/boothd %{buildroot}/%{test_path}/src/ +rm -f %{buildroot}/%{test_path}/test/*.pyc + %check %if 0%{?run_build_tests} echo "%%run_build_tests set to %run_build_tests; including tests" @@ -119,4 +131,20 @@ %config %{_sysconfdir}/booth/booth.conf.example %doc README COPYING +%package test +Summary: Test scripts for Booth +Group: Productivity/Clustering/HA +Requires: booth +Requires: python + +%description test +This package contains automated tests for Booth, +the Cluster Ticket Manager for Pacemaker. + +%files test +%defattr(-,root,root) + +%doc README-testing +%{booth_share} + %changelog ++++++ booth.tar.bz2 ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/README-testing new/booth/README-testing --- old/booth/README-testing 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/README-testing 2014-02-17 13:11:17.000000000 +0100 @@ -1,11 +1,14 @@ +There's a booth-test RPM available that contains two types of tests. + +It installs the necessary files into `/usr/share/booth/tests`. === Simple tests (commandline, config file) -Run +Run (as non-root) - # make check-TESTS + # python test/runtests.py to run the tests written in python. @@ -19,7 +22,12 @@ # python script/unit-test.py src/boothd unit-tests/ -This must (currently?) be run as a non-root user. +Or, if using the 'booth-test' RPM, + + # python unit-test.py src/boothd unit-tests/ + +This must (currently?) be run as a non-root user; another optional argument is +the test to start from, eg. '003'. Basically, boothd is started with the config file `unit-tests/booth.conf`, and @@ -72,4 +80,25 @@ * The "fake" `crm_ticket` script gets the current test via `UNIT_TEST`; test scripts can pass additional information via `UNIT_TEST_AUX`. + +==== Tips and Hints + +There's another special header: `gdb__N__`. These lines are sent to GDB after +injecting a message, but before waiting for an outgoing line. Values that +contain `§` are sent as multiple lines to GDB. + +This means that a stanza like + +-------------------- +gdb0: + watch booth_conf->ticket[0].owner § commands § bt § c § end +-------------------- + +will cause a watchpoint to be set, and when it is triggered a backtrace (`bt`) +is written to the log file. + +This makes it easy to ask for additional data or check for a call-chain when +hitting bugs that can be reproduced via such a unit-test. + + # vim: set ft=asciidoc : diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/configure new/booth/configure --- old/booth/configure 2014-01-24 16:07:33.000000000 +0100 +++ new/booth/configure 2014-02-17 12:46:01.000000000 +0100 @@ -1,6 +1,6 @@ #! /bin/sh # Guess values for system-dependent variables and create Makefiles. -# Generated by GNU Autoconf 2.69 for booth 0.1.3. +# Generated by GNU Autoconf 2.69 for booth 0.1.4. # # Report bugs to <[email protected]>. # @@ -580,8 +580,8 @@ # Identity of this package. PACKAGE_NAME='booth' PACKAGE_TARNAME='booth' -PACKAGE_VERSION='0.1.3' -PACKAGE_STRING='booth 0.1.3' +PACKAGE_VERSION='0.1.4' +PACKAGE_STRING='booth 0.1.4' PACKAGE_BUGREPORT='[email protected]' PACKAGE_URL='' @@ -1316,7 +1316,7 @@ # Omit some internal or obsolete options to make the list less imposing. # This message is too long to be a string in the A/UX 3.1 sh. cat <<_ACEOF -\`configure' configures booth 0.1.3 to adapt to many kinds of systems. +\`configure' configures booth 0.1.4 to adapt to many kinds of systems. Usage: $0 [OPTION]... [VAR=VALUE]... @@ -1386,7 +1386,7 @@ if test -n "$ac_init_help"; then case $ac_init_help in - short | recursive ) echo "Configuration of booth 0.1.3:";; + short | recursive ) echo "Configuration of booth 0.1.4:";; esac cat <<\_ACEOF @@ -1495,7 +1495,7 @@ test -n "$ac_init_help" && exit $ac_status if $ac_init_version; then cat <<\_ACEOF -booth configure 0.1.3 +booth configure 0.1.4 generated by GNU Autoconf 2.69 Copyright (C) 2012 Free Software Foundation, Inc. @@ -2094,7 +2094,7 @@ This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. -It was created by booth $as_me 0.1.3, which was +It was created by booth $as_me 0.1.4, which was generated by GNU Autoconf 2.69. Invocation command line was $ $0 $@ @@ -2958,7 +2958,7 @@ # Define the identity of the package. PACKAGE='booth' - VERSION='0.1.3' + VERSION='0.1.4' cat >>confdefs.h <<_ACEOF @@ -7745,7 +7745,7 @@ # report actual input values of CONFIG_FILES etc. instead of their # values after options handling. ac_log=" -This file was extended by booth $as_me 0.1.3, which was +This file was extended by booth $as_me 0.1.4, which was generated by GNU Autoconf 2.69. Invocation command line was CONFIG_FILES = $CONFIG_FILES @@ -7815,7 +7815,7 @@ cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`" ac_cs_version="\\ -booth config.status 0.1.3 +booth config.status 0.1.4 configured by $0, generated by GNU Autoconf 2.69, with options \\"\$ac_cs_config\\" diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/configure.ac new/booth/configure.ac --- old/booth/configure.ac 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/configure.ac 2014-02-17 13:11:17.000000000 +0100 @@ -4,7 +4,7 @@ # bootstrap / init AC_PREREQ([2.61]) -AC_INIT([booth], [0.1.3], [[email protected]]) +AC_INIT([booth], [0.1.5], [[email protected]]) AM_INIT_AUTOMAKE([-Wno-portability]) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/docs/boothd.8.html new/booth/docs/boothd.8.html --- old/booth/docs/boothd.8.html 2014-01-24 16:15:59.000000000 +0100 +++ new/booth/docs/boothd.8.html 2014-02-17 13:12:37.000000000 +0100 @@ -1,5 +1,5 @@ <?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>boothd</title><link rel="stylesheet" type="text/css" href="docbook-xsl.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.78.1" /></head><body><div xml:lang="en" class="refentry" lang="en"><a id="idm140238218814272"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>boothd — The Booth Cluster Ticket Manager.</p></div><div class="refsynopsisdiv"><a id="_synopsis"></a><h2>Synopsis</h2><p><span class="strong"><strong>boothd</strong></span> <span class="emphasis"><em>daemon</em></span> [<span class="emphasis"><em>-D</em></span>] [-c <span class="emphasis"><em>config</em></span>]</p><p><span class="strong"><strong>booth</strong></span> [<span class="emphasis"><em>client</em></span>] {<span class="emphasis"><em>list</em></span>} [-S <span class="emphasis"><em>site</em></span>] [<span class="emphasis"><em>-D</em></span>] [-c <span class="emphasis"><em>config</em></span>]</p><p><span class="strong"><strong>booth</strong></span> [<span class="emphasis"><em>client</em></span>] {<span class="emphasis"><em>grant</em></span>|<span class="emphasis"><em>revoke</em></span>} [-S <span class="emphasis"><em>site</em></span>] [<span class="emphasis"><em>-D</em></span>] [-t] <span class="emphasis"><em>ticket</em></span> [-c <span class="emphasis"><em>config</em></span>]</p><p><span class="strong"><strong>booth</strong></span> <span class="emphasis"><em>status</em></span> [<span class="emphasis"><em>-D</em></span>] [-c <span class="emphasis"><em>config</em></span>]</p></div><div class="refsect1"><a id="_description"></a><h2>DESCRIPTION</h2><p>Booth manages tickets which authorizes one of the cluster sites located in +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>boothd</title><link rel="stylesheet" type="text/css" href="docbook-xsl.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.78.1" /></head><body><div xml:lang="en" class="refentry" lang="en"><a id="idm139789799956336"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>boothd — The Booth Cluster Ticket Manager.</p></div><div class="refsynopsisdiv"><a id="_synopsis"></a><h2>Synopsis</h2><p><span class="strong"><strong>boothd</strong></span> <span class="emphasis"><em>daemon</em></span> [<span class="emphasis"><em>-D</em></span>] [-c <span class="emphasis"><em>config</em></span>]</p><p><span class="strong"><strong>booth</strong></span> [<span class="emphasis"><em>client</em></span>] {<span class="emphasis"><em>list</em></span>} [-S <span class="emphasis"><em>site</em></span>] [<span class="emphasis"><em>-D</em></span>] [-c <span class="emphasis"><em>config</em></span>]</p><p><span class="strong"><strong>booth</strong></span> [<span class="emphasis"><em>client</em></span>] {<span class="emphasis"><em>grant</em></span>|<span class="emphasis"><em>revoke</em></span>} [-S <span class="emphasis"><em>site</em></span>] [<span class="emphasis"><em>-D</em></span>] [-t] <span class="emphasis"><em>ticket</em></span> [-c <span class="emphasis"><em>config</em></span>]</p><p><span class="strong"><strong>booth</strong></span> <span class="emphasis"><em>status</em></span> [<span class="emphasis"><em>-D</em></span>] [-c <span class="emphasis"><em>config</em></span>]</p></div><div class="refsect1"><a id="_description"></a><h2>DESCRIPTION</h2><p>Booth manages tickets which authorizes one of the cluster sites located in geographically dispersed distances to run certain resources. It is designed to be an add-on to Pacemaker, which extends Pacemaker to support geographically distributed clustering.</p><p>It is based on the PAXOS protocol, see eg. diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/script/unit-test.py new/booth/script/unit-test.py --- old/booth/script/unit-test.py 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/script/unit-test.py 2014-02-17 13:11:17.000000000 +0100 @@ -3,14 +3,16 @@ # see http://stackoverflow.com/questions/728891/correct-way-to-define-python-source-code-encoding import os, sys, time, signal, tempfile, socket, posix, time -import re, shutil, pexpect, logging +import re, shutil, pexpect, logging, pprint import random, copy, glob, traceback # Don't make that much sense - function/line is write(). # Would have to use traceback.extract_stack() manually. # %(funcName)10.10s:%(lineno)3d %(levelname)8s -default_log_format = '%(asctime)s: %(message)s' +# The second ":" is to get correct syntax highlightning, +# eg. messages with ERROR etc. are red in vim. +default_log_format = '%(asctime)s: : %(message)s' default_log_datefmt = '%b %d %H:%M:%S' @@ -59,14 +61,19 @@ this_port = None this_site = "127.0.0.1" this_site_id = None + running = False gdb = None booth = None prompt = "CUSTOM-GDB-PROMPT-%d-%d" % (os.getpid(), time.time()) dont_log_expect = 0 + current_nr = None udp_sock = None + + # http://stackoverflow.com/questions/384076/how-can-i-color-python-logging-output + BLACK, RED, GREEN, YELLOW, BLUE, MAGENTA, CYAN, WHITE = range(8) # }}} @@ -88,7 +95,10 @@ def read_test_input(self, file, state=None, m = dict()): fo = open(file, "r") state = None + line_nr = 0 for line in fo.readlines(): + line_nr += 1 + # comment? if re.match(r"^\s*#", line): continue @@ -105,6 +115,7 @@ m[state] = dict_plus() if res.group(2): m[state].aux["comment"] = res.group(2) + m[state].aux["line"] = line_nr continue assert(state) @@ -138,6 +149,16 @@ return this_test_log + def running_on_console(self): + return sys.stdout.isatty() + + + def colored_string(self, stg, color): + if self.running_on_console(): + return "\033[%dm%s\033[0m" % (30+color, stg) + return stg + + # We want shorthand in descriptions, ie. "state" # instead of "booth_conf->ticket[0].state". def translate_shorthand(self, name, context): @@ -219,13 +240,14 @@ self.this_port = int(self.query_value("booth_conf->port")) # do a self-test - self.check_value("local->site_id", self.this_site_id); + assert(self.check_value("local->site_id", self.this_site_id)) # Now we're set up. self.send_cmd("break ticket_cron") - self.send_cmd("break booth_udp_send") - self.send_cmd("break booth_udp_broadcast") + self.send_cmd("break booth_udp_send if to == &(booth_conf->site[1])") self.send_cmd("break recvfrom") + + self.running = False # }}} @@ -277,29 +299,40 @@ if val == "1": return True # for easier (test) debugging we'll show the _real_ value, too. + want = self._query_value(value) + # Order is important, so that next query works!! has = self._query_value(which) - logging.error("«%s»: expected «%s», got «%s»." % (which, value, has)) - sys.exit(1) + # for informational purposes + self._query_value('state_to_string($$)') + logging.error("«%s»: got «%s», expected «%s». ERROR." % (which, has, want)) + return False # Send data to GDB, to inject them into the binary. # Handles different data types def set_val(self, name, value, numeric_conv=None): logging.debug("setting value «%s» to «%s» (num_conv %s)" %(name, value, numeric_conv)) + res = None # string value? if re.match(r'^"', value): - self.send_cmd("print strcpy(" + name + ", " + value + ")") + res = self.send_cmd("print strcpy(" + name + ", " + value + ")") # numeric elif numeric_conv: - self.send_cmd("set variable " + name + " = " + numeric_conv + "(" + value + ")") + res = self.send_cmd("set variable " + name + " = " + numeric_conv + "(" + value + ")") else: - self.send_cmd("set variable " + name + " = " + value) + res = self.send_cmd("set variable " + name + " = " + value) + for r in [r"There is no member named", + r"Structure has no component named", + r"No symbol .* in current context", ]: + assert(not re.search(r, res, re.MULTILINE)) logging.debug("set_val %s done" % name) # }}} GDB communication # there has to be some event waiting, so that boothd stops again. def continue_debuggee(self, timeout=30): - res = self.send_cmd("continue", timeout) + res = None + if not self.running: + res = self.send_cmd("continue", timeout) self.drain_booth_log() return res @@ -307,6 +340,10 @@ # {{{ High-level functions. # Generally, GDB is attached to BOOTHD, and has it stopped. def set_state(self, kv): + if not kv: + return + + self.current_nr = kv.aux.get("line") #os.system("strace -f -tt -s 2000 -e write -p" + str(self.gdb.pid) + " &") for n, v in kv.iteritems(): self.set_val( self.translate_shorthand(n, "ticket"), v) @@ -314,24 +351,29 @@ def user_debug(self, txt): - print self.gdb.buffer - print "\n\nProblem detected (%s), entering interactive mode.\n\n" % txt - # can't use send_cmd, doesn't reply with expected prompt anymore. - self.gdb.interact() - #while True: - # sys.stdout.write("GDB> ") - # sys.stdout.flush() - # x = sys.stdin.readline() - # if not x: - # break - # self.send_cmd(x) - self.gdb.sendline("set prompt GDB> \n") - self.gdb.setecho(True) + logging.error("Problem detected: %s", txt) + logging.info(self.gdb.buffer) + if not sys.stdin.isatty(): + logging.error("Not a terminal, stopping.") + else: + print "\n\nEntering interactive mode.\n\n" + self.gdb.sendline("set prompt GDB> \n") + self.gdb.setecho(True) + # can't use send_cmd, doesn't reply with expected prompt anymore. + self.gdb.interact() + #while True: + # sys.stdout.write("GDB> ") + # sys.stdout.flush() + # x = sys.stdin.readline() + # if not x: + # break + # self.send_cmd(x) self.stop_processes() - sys.exit(0) + sys.exit(1) - def wait_for_function(self, fn): + def wait_for_function(self, fn, timeout=20): + until = time.time() + timeout while True: stopped_at = self.continue_debuggee(timeout=3) if not stopped_at: @@ -340,6 +382,8 @@ self.user_debug("Segfault") if re.search(r"^Breakpoint \d+, (0x\w+ in )?%s " % fn, stopped_at, re.MULTILINE): break + if time.time() > until: + self.user_debug("Didn't stop in function %s" % fn) logging.info("Now in %s" % fn) # We break, change the data, and return the correct size. @@ -363,12 +407,15 @@ def wait_outgoing(self, msg): self.wait_for_function("booth_udp_send") + ok = True for (n, v) in msg.iteritems(): if re.search(r"\.", n): - self.check_value( self.translate_shorthand(n, "inject"), v) + ok = self.check_value( self.translate_shorthand(n, "inject"), v) and ok else: - self.check_value( self.translate_shorthand(n, "ticket"), v) + ok = self.check_value( self.translate_shorthand(n, "ticket"), v) and ok + if not ok: + sys.exit(1) logging.info("out gone") #stopped_at = self.sync() @@ -376,78 +423,123 @@ return dict(base.items() + overlay.items()) - def loop(self, data): + def loop(self, fn, data): matches = map(lambda k: re.match(r"^(outgoing|message)(\d+)$", k), data.iterkeys()) valid_matches = filter(None, matches) nums = map(lambda m: int(m.group(2)), valid_matches) loop_max = max(nums) for counter in range(0, loop_max+1): # incl. last message - logging.info("Part " + str(counter)) kmsg = 'message%d' % counter msg = data.get(kmsg) - if msg: - comment = msg.aux.get("comment") or "" - logging.info("sending " + kmsg + " " + comment) - self.send_message(self.merge_dicts(data["message"], msg)) + + ktkt = 'ticket%d' % counter + tkt = data.get(ktkt) + kout = 'outgoing%d' % counter out = data.get(kout) + + kgdb = 'gdb%d' % counter + gdb = data.get(kgdb) + + + if not any([msg, out, tkt]): + continue + + logging.info("Part %d" % counter) + if tkt: + self.current_nr = tkt.aux.get("line") + comment = tkt.aux.get("comment", "") + logging.info("ticket change %s (%s:%d) %s" % (ktkt, fn, self.current_nr, comment)) + self.set_state(tkt) + if msg: + self.current_nr = msg.aux.get("line") + comment = msg.aux.get("comment", "") + logging.info("sending %s (%s:%d) %s" % (kmsg, fn, self.current_nr, comment)) + self.send_message(self.merge_dicts(data["message"], msg)) + if gdb: + for (k, v) in gdb.iteritems(): + self.send_cmd(k + " " + v.replace("§", "\n")) + if data.has_key(kgdb) and len(gdb) == 0: + self.user_debug("manual override") if out: - comment = out.aux.get("comment") or "" - logging.info("waiting for " + kout + " " + comment) + self.current_nr = out.aux.get("line") + comment = out.aux.get("comment", "") + logging.info("waiting for %s (%s:%d) %s" % (kout, fn, self.current_nr, comment)) self.wait_outgoing(out) logging.info("loop ends") def let_booth_go_a_bit(self): self.drain_booth_log() + logging.debug("running: %d" % self.running) - self.gdb.sendline("continue") + if not self.running: + self.gdb.sendline("continue") time.sleep(1) + self.drain_booth_log() # stop it posix.kill(self.booth.pid, signal.SIGINT) - self.drain_booth_log() + posix.kill(self.gdb.pid, signal.SIGINT) + self.running = False + self.sync(2000) def do_finally(self, data): if not data: return + self.current_nr = data.aux.get("line") # Allow debuggee to reach a stable state self.let_booth_go_a_bit() + ok = True for (n, v) in data.iteritems(): - self.check_value( self.translate_shorthand(n, "ticket"), v) + ok = self.check_value( self.translate_shorthand(n, "ticket"), v) and ok + if not ok: + sys.exit(1) - def run(self): + def run(self, start_from="000"): os.chdir(self.test_base) # TODO: sorted, random order tests = filter( (lambda f: re.match(r"^\d\d\d_.*\.txt$", f)), glob.glob("*")) tests.sort() for f in tests: + if f < start_from: + continue log = None + logfn = UT._filename(f) + if self.running_on_console(): + sys.stdout.write("\n") try: - log = self.setup_log(filename = UT._filename(f)) + log = self.setup_log(filename = logfn) log.setLevel(logging.DEBUG) - logging.warn("running test %s" % f) + logging.error(self.colored_string("Starting test '%s'" % f, self.BLUE) + ", logfile " + logfn) self.start_processes(f) test = self.read_test_input(f, m=copy.deepcopy(self.defaults)) - self.set_state(test["ticket"]) - self.loop(test) + logging.debug("data: %s" % pprint.pformat(test, width = 200)) + + self.set_state(test.get("ticket")) + self.loop(f, test) self.do_finally(test.get("finally")) - logging.warn("test %s ends" % f) + + self.current_nr = None + logging.warn(self.colored_string("Finished test '%s' - OK" % f, self.GREEN)) except: - logging.error("Broke in %s: %s" % (f, sys.exc_info())) + logging.error(self.colored_string("Broke in %s:%d %s" % (f, self.current_nr, sys.exc_info()), self.RED)) for frame in traceback.format_tb(sys.exc_traceback): logging.info(" - %s " % frame.rstrip()) finally: self.stop_processes() if log: log.close() - return + logging.getLogger("").removeHandler(log) + if self.running_on_console(): + sys.stdout.write("\n") + return # }}} @@ -475,6 +567,9 @@ datefmt = default_log_datefmt) + # make sure no old processes are active anymore + os.system("killall boothd > /dev/null 2> /dev/null") + overview_log = ut.setup_log( filename = UT._filename('seq') ) overview_log.setLevel(logging.WARN) @@ -489,6 +584,9 @@ #sys.settrace(traceit) - ret = ut.run() + starting = "0" + if len(sys.argv) > 3: + starting = sys.argv[3] + ret = ut.run(starting) sys.exit(ret) # }}} diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/script/wireshark-dissector.lua new/booth/script/wireshark-dissector.lua --- old/booth/script/wireshark-dissector.lua 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/script/wireshark-dissector.lua 2014-02-17 13:11:17.000000000 +0100 @@ -17,36 +17,40 @@ else local hdr = tree:add(booth_proto, buffer(0, 24), "Booth header") - local cmd = buffer(16, 4) + local cmd = buffer(28, 4) local tcmd = T32(hdr, cmd, 0, "Cmd %08x, \"" .. cmd:string() .. "\""); - local from = buffer(8, 4) + local from = buffer(20, 4) local tfrom = T32(hdr, from, 0, "From %08x"); if bit.band(from:uint(), 0x80000000) > 0 then tfrom:add_expert_info(PI_PROTOCOL, PI_WARN, "Highest bit set") end - local len = buffer(12, 4) + local len = buffer(24, 4) local tlen = T32(hdr, len, 0, "Length %8d"); if len:uint() > 1000 then tlen:add_expert_info(PI_PROTOCOL, PI_WARN, "Length too big?") end - T32(hdr, buffer, 20, "Result %08x"); - T32(hdr, buffer, 0, "Magic %08x"); - T32(hdr, buffer, 4, "Version %08x"); + T32(hdr, buffer, 32, "Result %08x"); + T32(hdr, buffer, 12, "Magic %08x"); + T32(hdr, buffer, 16, "Version %08x"); + T32(hdr, buffer, 0, "IV %08x"); + T32(hdr, buffer, 4, "Auth1 %08x"); + T32(hdr, buffer, 8, "Auth2 %08x"); - if (endbuf > 24) then - local tick = tree:add(booth_proto, buffer(24, endbuf-24), "Booth data") - local name = buffer(24, 64) + + if (endbuf > 36) then + local tick = tree:add(booth_proto, buffer(36, endbuf-36), "Booth data") + local name = buffer(36, 64) tick:add(name, "Ticket name: ", name:string()) - T32(tick, buffer, 24+64 + 0, "Owner: %08x") - T32(tick, buffer, 24+64 + 4, "Ballot: %08x") - T32(tick, buffer, 24+64 + 8, "Prev. Ballot: %08x") - T32(tick, buffer, 24+64 + 12, "Expiry: %8d") + T32(tick, buffer, 36+64 + 0, "Owner: %08x") + T32(tick, buffer, 36+64 + 4, "Ballot: %08x") + T32(tick, buffer, 36+64 + 8, "Prev. Ballot: %08x") + T32(tick, buffer, 36+64 + 12, "Expiry: %8d") end pinfo.cols.info = "Booth, cmd " .. cmd:string() diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/booth.h new/booth/src/booth.h --- old/booth/src/booth.h 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/booth.h 2014-02-17 13:11:17.000000000 +0100 @@ -66,6 +66,12 @@ struct boothc_header { + /** Authentication data; not used now. */ + uint32_t iv; + uint32_t auth1; + uint32_t auth2; + + /** BOOTHC_MAGIC */ uint32_t magic; /** BOOTHC_VERSION */ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/config.c new/booth/src/config.c --- old/booth/src/config.c 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/config.c 2014-02-17 13:11:17.000000000 +0100 @@ -218,6 +218,7 @@ strcpy(tk->name, name); tk->timeout = def->timeout; tk->expiry = def->expiry; + tk->retries = def->retries; memcpy(tk->weight, def->weight, sizeof(tk->weight)); tk->state = ST_INIT; @@ -320,6 +321,7 @@ defaults.expiry = DEFAULT_TICKET_EXPIRY; defaults.timeout = DEFAULT_TICKET_TIMEOUT; defaults.retries = DEFAULT_RETRIES; + defaults.acquire_after = 0; error = ""; @@ -507,7 +509,7 @@ if (strcmp(key, "acquire-after") == 0) { defaults.acquire_after = strtol(val, &s, 0); - if (*s || s == val || defaults.acquire_after<1) { + if (*s || s == val || defaults.acquire_after<0) { error = "Expected plain integer value >=1 for acquire-after"; goto err; } diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/config.h new/booth/src/config.h --- old/booth/src/config.h 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/config.h 2014-02-17 13:11:17.000000000 +0100 @@ -148,7 +148,4 @@ const char *type_to_string(int type); -#define STATE_STRING(s_) ({ union { cmd_request_t s; char c[5]; } d; d.s = htonl(s_); d.c[4] = 0; d.c; }) - - #endif /* _CONFIG_H */ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/inline-fn.h new/booth/src/inline-fn.h --- old/booth/src/inline-fn.h 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/inline-fn.h 2014-02-17 13:11:17.000000000 +0100 @@ -62,6 +62,9 @@ h->magic = htonl(BOOTHC_MAGIC); h->version = htonl(BOOTHC_VERSION); h->from = htonl(local->site_id); + h->iv = htonl(0); + h->auth1 = htonl(0); + h->auth2 = htonl(0); } static inline void init_header(struct boothc_header *h, int cmd, @@ -113,14 +116,18 @@ static inline void disown_ticket(struct ticket_config *tk) { + /* ONLY the "current state" is changed; + * current paxos rounds should not be affected. + * tk->proposed_owner = NULL; + */ tk->owner = NULL; - tk->proposed_owner = NULL; time(&tk->expires); } static inline void disown_if_expired(struct ticket_config *tk) { - if (time(NULL) >= tk->expires || !tk->proposed_owner) + if (time(NULL) >= tk->expires || + (!tk->proposed_owner && !tk->owner)) disown_ticket(tk); } @@ -159,7 +166,14 @@ * current ballot * * This should be possible by using the same datatype and relying - * on the under/overflow semantics. */ + * on the under/overflow semantics. + * + * + * Having 30 bits available, and assuming an expire time of + * one minute and a (high) ballot step of 64 == 2^6 (because + * of weights), we get 2^24 minutes of range - which is ~750 + * years. "Should be enough for everybody." + */ static inline int ballot_is_higher_than(uint32_t b_high, uint32_t b_low) { uint32_t diff; @@ -260,7 +274,7 @@ return 0; time(&now); - return when >= now; + return when <= now; } diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/main.c new/booth/src/main.c --- old/booth/src/main.c 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/main.c 2014-02-17 13:11:17.000000000 +0100 @@ -254,6 +254,9 @@ } + /* For CMD_GRANT and CMD_REVOKE: + * Don't close connection immediately, but send + * result a second later? */ switch (ntohl(msg.header.cmd)) { case CMD_LIST: ticket_answer_list(fd, &msg); @@ -1208,14 +1211,10 @@ cl_log_set_facility(HA_LOG_FACILITY); cl_inherit_logging_environment(0); - rv = limit_this_process(); - if (rv) - return rv; - - log_info("BOOTH %s daemon is starting, node id is %08X.", + log_info("BOOTH %s daemon is starting, node id is 0x%08X (%d).", type_to_string(local->type), - local->site_id); + local->site_id, local->site_id); signal(SIGUSR1, (__sighandler_t)tickets_log_info); @@ -1227,6 +1226,11 @@ local->addr_string, booth_conf->port); + rv = limit_this_process(); + if (rv) + return rv; + + rv = loop(lock_fd); out: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/pacemaker.c new/booth/src/pacemaker.c --- old/booth/src/pacemaker.c 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/pacemaker.c 2014-02-17 13:11:17.000000000 +0100 @@ -59,7 +59,7 @@ log_info("command: '%s' was executed", cmd); rv = system(cmd); if (rv != 0) - log_error("error: \"%s\" failed, rv %s", cmd, interpret_rv(rv)); + log_error("error: \"%s\" failed, %s", cmd, interpret_rv(rv)); } static void pcmk_revoke_ticket(struct ticket_config *tk) @@ -72,7 +72,7 @@ log_info("command: '%s' was executed", cmd); rv = system(cmd); if (rv != 0) - log_error("error: \"%s\" failed, rv %s", cmd, interpret_rv(rv)); + log_error("error: \"%s\" failed, %s", cmd, interpret_rv(rv)); } @@ -140,7 +140,7 @@ out: rv = pclose(p); - log_info("command \"%s\" returned rv %s, value %" PRIi64, cmd, interpret_rv(rv), v); + log_info("command \"%s\" returned %s, value %" PRIi64, cmd, interpret_rv(rv), v); return rv; } diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/paxos.c new/booth/src/paxos.c --- old/booth/src/paxos.c 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/paxos.c 2014-02-17 13:11:17.000000000 +0100 @@ -88,13 +88,21 @@ static int retries_exceeded(struct ticket_config *tk) { - tk->retry_number ++; - if (tk->retry_number > tk->retries) { + int ret; + + if (tk->retry_number >= tk->retries) { log_info("ABORT %s for ticket \"%s\" - " - "not enough answers after %d retries", + "not enough answers after %d retries (of %d)", tk->state == OP_PREPARING ? "prepare" : "propose", - tk->name, tk->retry_number); + tk->name, tk->retry_number, tk->retries); abort_proposal(tk); + + + /* Keep on trying to refresh. */ + if (owner_and_valid(tk)) + tk->state = ST_STABLE; + + ret = EBUSY; } else { /* We ask others for a change; retry to get * consensus. @@ -102,10 +110,15 @@ * query, give the peers time to answer. */ if (timeval_in_past(tk->proposal_switch)) { ticket_broadcast_proposed_state(tk, tk->state); - ticket_activate_timeout(tk); } + ret = 0; } - return 0; + + + disown_if_expired(tk); + ticket_activate_timeout(tk); + + return ret; } @@ -429,9 +442,20 @@ /* We cannot check whether the packet is from an expected proposer - * perhaps this is the _only_ message of the whole handshake? */ - if (ballot > tk->new_ballot && + if (ballot == tk->last_ack_ballot && + new_owner == tk->owner) { + /* Ignored - just acknowledging the commit. */ + log_info("COMMIT message from \"%s\" received.", from->addr_string); + } else if (ballot > tk->new_ballot && ntohl(msg->ticket.prev_ballot) == tk->last_ack_ballot) { + /* Received a COMMIT, but no prior packets; + * might happen because the network just got + * connected again. */ + log_info("COMMIT message from \"%s\" for new ballot %d received.", + from->addr_string, ballot); change_ticket_owner(tk, ballot, new_owner); + } else { + log_info("commit message from \"%s\" discarded.", from->addr_string); } /* Send ack? */ @@ -458,26 +482,26 @@ * (see paxos_start_round()), which gets received * (below) from the others ... */ switch (cmd) { - case OP_PREPARING: - return answer_PREP(tk, from, msg, ballot, new_owner_p); + case OP_PREPARING: + return answer_PREP(tk, from, msg, ballot, new_owner_p); - case OP_REJECTED: - return handle_REJ(tk, from, msg, ballot, new_owner_p); + case OP_REJECTED: + return handle_REJ(tk, from, msg, ballot, new_owner_p); - case OP_PROMISING: - return got_a_PROM(tk, from, msg, ballot, new_owner_p); + case OP_PROMISING: + return got_a_PROM(tk, from, msg, ballot, new_owner_p); - case OP_PROPOSING: - return answer_PROP(tk, from, msg, ballot, new_owner_p); + case OP_PROPOSING: + return answer_PROP(tk, from, msg, ballot, new_owner_p); - case OP_ACCEPTING: - return got_an_ACC(tk, from, msg, ballot, new_owner_p); + case OP_ACCEPTING: + return got_an_ACC(tk, from, msg, ballot, new_owner_p); - case OP_COMMITTED: - return answer_COMM(tk, from, msg, ballot, new_owner_p); + case OP_COMMITTED: + return answer_COMM(tk, from, msg, ballot, new_owner_p); - default: - log_error("unprocessed message, cmd %x", cmd); - return -EINVAL; + default: + log_error("unprocessed message, cmd %x", cmd); + return -EINVAL; } } diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/ticket.c new/booth/src/ticket.c --- old/booth/src/ticket.c 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/ticket.c 2014-02-17 13:11:17.000000000 +0100 @@ -290,14 +290,17 @@ if (!tk->owner) { log_info("client wants to revoke a free ticket \"%s\"", msg->ticket.id); + /* Return a different result code? */ rv = RLT_SUCCESS; goto reply; } rv = do_revoke_ticket(tk); + if (rv == 0) + rv = RLT_ASYNC; reply: - init_ticket_msg(msg, CMR_REVOKE, rv ?: RLT_ASYNC, tk); + init_ticket_msg(msg, CMR_REVOKE, rv, tk); return send_ticket_msg(fd, msg); } @@ -326,6 +329,14 @@ RLT_PROBABLY_SUCCESS : RLT_SUCCESS; init_ticket_msg(msg, CMR_CATCHUP, rv, tk); + + /* On catchup, don't tell about ongoing proposals; + * if we did, the other site might believe that the + * ballot numbers have already been used. + * Send the known ballot number, so that a PREPARE + * gets accepted. */ + msg->ticket.ballot = msg->ticket.prev_ballot; + return booth_udp_send(from, msg, sizeof(*msg)); } @@ -341,11 +352,14 @@ struct booth_site *new_owner) { int rv; + uint32_t prev_ballot; + time_t peer_expiry; log_info("got CATCHUP answer for \"%s\" from %s; says owner %s with ballot %d", tk->name, from->addr_string, ticket_owner_string(new_owner), ballot); + prev_ballot = ntohl(msg->ticket.prev_ballot); rv = ntohl(msg->header.result); if (rv != RLT_SUCCESS && @@ -369,7 +383,7 @@ disown_if_expired(tk); log_debug("catchup: peer ack 0x%" PRIx64 ", now state '%s'", tk->proposal_acknowledges, - STATE_STRING(tk->state)); + state_to_string(tk->state)); goto ex; } @@ -392,9 +406,11 @@ ballot, tk->new_ballot, tk->last_ack_ballot); accept: - tk->expires = ntohl(msg->ticket.expiry) + time(NULL); + peer_expiry = ntohl(msg->ticket.expiry) + time(NULL); + tk->expires = (tk->expires > peer_expiry) ? + tk->expires : peer_expiry; tk->new_ballot = ballot_max2(ballot, tk->new_ballot); - tk->last_ack_ballot = ballot_max2(ballot, tk->last_ack_ballot); + tk->last_ack_ballot = ballot_max2(prev_ballot, tk->last_ack_ballot); tk->owner = new_owner; tk->proposal_acknowledges = from->bitmask; @@ -425,6 +441,16 @@ goto ex; } + + if (ballot >= tk->last_ack_ballot && + local->type == SITE && + new_owner == tk->owner) { + /* We've got some information (local Pacemaker?), and a peer + * says same owner, with same or higher ballot number. */ + log_debug("catchup: peer agrees about owner."); + goto ex; + } + log_debug("catchup: unhandled situation!"); ex: @@ -436,6 +462,9 @@ time(&tk->expires); } + /* Allow further actions. */ + ticket_activate_timeout(tk); + return 0; } @@ -457,7 +486,7 @@ msg.ticket.owner = htonl(get_node_id(tk->proposed_owner)); log_debug("broadcasting '%s' for ticket \"%s\"", - STATE_STRING(state), tk->name); + state_to_string(state), tk->name); /* Switch state after one second, if the majority says ok. */ gettimeofday(&tk->proposal_switch, NULL); @@ -474,6 +503,35 @@ now = time(NULL); + + /* Has an owner, has an expiry date, and expiry date in the past? + * Losing the ticket must happen in _every_ state. */ + if (tk->expires && + tk->owner && + now > tk->expires) { + log_info("LOST ticket: \"%s\" no longer at %s", + tk->name, + ticket_owner_string(tk->owner)); + + /* Couldn't renew in time - ticket lost. */ + tk->owner = NULL; + disown_ticket(tk); + /* This gets us into ST_INIT again; we couldn't + * talk to a majority of sites, so we don't know + * whether somebody else has the ticket now. + * Keep asking until we know. */ + abort_proposal(tk); + + ticket_write(tk); + + ticket_activate_timeout(tk); + + /* May not try to re-acquire now, need to find out + * what others think. */ + return; + } + + switch(tk->state) { case ST_INIT: /* Unknown state, ask others. */ @@ -484,30 +542,6 @@ case OP_COMMITTED: case ST_STABLE: - /* Has an owner, has an expiry date, and expiry date in the past? */ - if (tk->expires && - tk->owner && - now > tk->expires) { - log_info("LOST ticket: \"%s\" no longer at %s", - tk->name, - ticket_owner_string(tk->owner)); - - /* Couldn't renew in time - ticket lost. */ - tk->owner = NULL; - disown_ticket(tk); - /* This gets us into ST_INIT again; we couldn't - * talk to a majority of sites, so we don't know - * whether somebody else has the ticket now. - * Keep asking until we know. */ - abort_proposal(tk); - - ticket_write(tk); - - /* May not try to re-acquire now, need to find out - * what others think. */ - break; - } - /* No matter whether the ticket just got lost by someone, * or whether is wasn't active anywhere - if automatic * acquiration is configured, try to get it active. @@ -521,6 +555,7 @@ !tk->proposed_owner && !tk->proposer && tk->expires && + tk->acquire_after && tk->expires + tk->acquire_after >= now && local->type == SITE) { log_info("ACQUIRE ticket \"%s\" after timeout", tk->name); @@ -607,7 +642,7 @@ "ballot %d (current %d) " "expires %-24.24s", tk->name, - STATE_STRING(tk->state), + state_to_string(tk->state), tk->proposal_acknowledges, booth_conf->site_bits, tk->last_ack_ballot, tk->new_ballot, @@ -692,11 +727,35 @@ ticket_next_cron_at(tk, tv); } else { - /* If there's some owner, check on her later on. - * If no owner - don't care. */ - if (tk->owner) + /* If there is (or should be) some owner, check on her later on. + * If no one is interested - don't care. */ + if ((tk->owner || tk->acquire_after) && + (local->type == SITE)) ticket_next_cron_in(tk, tk->expiry + tk->acquire_after); else ticket_next_cron_in(tk, 3600); } } + + + +/* Given a state (in host byte order), return a human-readable (char*). + * An array is used so that multiple states can be printed in a single printf(). */ +char *state_to_string(uint32_t state_ho) +{ + union mu { cmd_request_t s; char c[5]; }; + static union mu cache[6] = { { 0 } }, *cur; + static int current = 0; + + current ++; + if (current >= sizeof(cache)/sizeof(cache[0])) + current = 0; + + cur = cache + current; + + cur->s = htonl(state_ho); + /* Shouldn't be necessary, union array is initialized with zeroes, and + * these bytes never get written. */ + cur->c[4] = 0; + return cur->c; +} diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/src/ticket.h new/booth/src/ticket.h --- old/booth/src/ticket.h 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/src/ticket.h 2014-02-17 13:11:17.000000000 +0100 @@ -63,6 +63,7 @@ void process_tickets(void); void tickets_log_info(void); +char *state_to_string(uint32_t state_ho); static inline void ticket_next_cron_at(struct ticket_config *tk, struct timeval when) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/test/utils.py new/booth/test/utils.py --- old/booth/test/utils.py 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/test/utils.py 2014-02-17 13:11:17.000000000 +0100 @@ -1,6 +1,7 @@ #!/usr/bin/python import subprocess +import re def run_cmd(cmd): p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) @@ -11,4 +12,5 @@ (stdout, stderr, returncode) = run_cmd(['hostname', '-i']) if returncode != 0: raise RuntimeError, "Failed to run hostname -i:\n" + stderr - return stdout.replace('\n', '') + # in case multiple IP addresses are returned, use only the first. + return re.sub(r'\s.*', '', stdout) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/unit-tests/001_init-catchup.txt new/booth/unit-tests/001_init-catchup.txt --- old/booth/unit-tests/001_init-catchup.txt 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/unit-tests/001_init-catchup.txt 2014-02-17 13:11:17.000000000 +0100 @@ -31,7 +31,4 @@ finally: # should be overwritten last_ack_ballot 2052 -# should not be - a OP_PREPARING would fetch the new value - new_ballot 2012 -# too low-level -# proposal_acknowledges 5 + new_ballot 2062 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/unit-tests/002_bad_packets.txt new/booth/unit-tests/002_bad_packets.txt --- old/booth/unit-tests/002_bad_packets.txt 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/unit-tests/002_bad_packets.txt 2014-02-17 13:11:17.000000000 +0100 @@ -7,7 +7,7 @@ ticket: state ST_INIT last_ack_ballot 1 - new_ballot 2012 + new_ballot 12 # defaults @@ -15,17 +15,18 @@ header.cmd CMR_CATCHUP ticket.ballot 99 ticket.prev_ballot 98 + header.from booth_conf->site[2].site_id + header.result 0 -message0: "bad result code" - header.from booth_conf->site[2].site_id +message0: # bad result code header.result 243521741 outgoing0: state ST_INIT -message1: "bad sender" +message1: # bad sender header.from 71 outgoing1: @@ -33,36 +34,49 @@ -message2: "bad version" +message2: # bad version header.version 512 outgoing2: state ST_INIT -message3: "bad magic" +message3: # bad magic header.version 31 outgoing3: state ST_INIT -message4: "bad length" +message4: # bad length header.length 16 outgoing4: state ST_INIT -message5: "bad ticket ID" +message5: # bad ticket ID ticket.id "gibtsnich" outgoing5: state ST_INIT -message100: "should work" +## TODO: ballot number ranges, discarding invalid. +## see assert in ballot_is_higher_than() +##message6: # bad ballot number +## ticket.ballot -15 + (1 << 29) +## +##outgoing6: +## state ST_INIT + -outgoing100: +message100: # should work + ticket.ballot 199 + ticket.prev_ballot 198 + +# no outgoing message +finally: state ST_STABLE - ticket.ballot 99 + new_ballot 199 + last_ack_ballot 198 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/unit-tests/003_pacemaker.txt new/booth/unit-tests/003_pacemaker.txt --- old/booth/unit-tests/003_pacemaker.txt 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/unit-tests/003_pacemaker.txt 2014-02-17 13:11:17.000000000 +0100 @@ -1,7 +1,6 @@ # vim: ft=sh et : # -# This test is mostly concerned with ignoring invalid packets. -# We're expecting catchup queries. +# Checks whether Pacemaker gets correct command lines. ticket: @@ -10,15 +9,16 @@ new_ballot 50 -message0: "set state" +message0: header.cmd OP_COMMITTED header.from booth_conf->site[2].site_id header.result 0 ticket.ballot 99 - ticket.prev_ballot 98 +# must match last_ack_ballot above + ticket.prev_ballot 40 ticket.owner booth_conf->site[2].site_id - getenv("UNIT_TEST_AUX") "ballot 99 owner +" + #getenv("UNIT_TEST_AUX") "ballot 99 owner +" -outgoing0: +finally: last_ack_ballot 99 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/unit-tests/010_retries.txt new/booth/unit-tests/010_retries.txt --- old/booth/unit-tests/010_retries.txt 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/unit-tests/010_retries.txt 2014-02-17 13:11:17.000000000 +0100 @@ -12,34 +12,41 @@ timeout 1 # may keep ticket all the time expiry 3000 + # but shall start renewal now + expires time(0) + 1000 + outgoing0: - cmd OP_PREPARING + header.cmd OP_PREPARING outgoing1: - cmd OP_PREPARING + header.cmd OP_PREPARING outgoing2: - cmd OP_PREPARING + header.cmd OP_PREPARING outgoing3: - cmd OP_PREPARING + header.cmd OP_PREPARING message4: - cmd OP_PROMISING + header.cmd OP_PROMISING + header.from booth_conf->site[2].site_id + header.result 0 + ticket.prev_ballot 45 + ticket.ballot booth_conf->ticket[0].new_ballot outgoing5: - cmd OP_PROPOSING + header.cmd OP_PROPOSING outgoing6: - cmd OP_PROPOSING + header.cmd OP_PROPOSING outgoing7: - cmd OP_PROPOSING + header.cmd OP_PROPOSING outgoing8: - cmd OP_PROPOSING + header.cmd OP_PROPOSING outgoing9: - cmd OP_PROPOSING + header.cmd OP_PROPOSING outgoing10: - cmd OP_PROPOSING + header.cmd OP_PROPOSING # Now retry counter should trigger, and restart paxos outgoing11: - cmd OP_PREPARING + header.cmd OP_PREPARING diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/unit-tests/060_catchup_same_owner.txt new/booth/unit-tests/060_catchup_same_owner.txt --- old/booth/unit-tests/060_catchup_same_owner.txt 1970-01-01 01:00:00.000000000 +0100 +++ new/booth/unit-tests/060_catchup_same_owner.txt 2014-02-17 13:11:17.000000000 +0100 @@ -0,0 +1,37 @@ +# vim: ft=sh et : +# We've got the ticket; on catchup, peer agrees with us re. owner, but has a +# higher ballot number. +# We must not lose the ticket. + +ticket: + state ST_INIT + last_ack_ballot 100 + new_ballot 100 + owner local + expires time(0) + 35 + +gdb0: + watch booth_conf->ticket[0].owner § commands § bt § c § end + +# No message0 + +outgoing0: + header.cmd CMD_CATCHUP + header.result RLT_SUCCESS + ticket.ballot 100 + ticket.prev_ballot 100 + ticket.owner local->site_id + + +message1: # catchup, same owner + header.cmd CMR_CATCHUP + header.result RLT_SUCCESS + header.from booth_conf->site[2].site_id + ticket.expiry 5 + ticket.ballot 120 + ticket.prev_ballot 120 + ticket.owner local->site_id + +finally: + owner local + diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/unit-tests/100_abort-after-retries.txt new/booth/unit-tests/100_abort-after-retries.txt --- old/booth/unit-tests/100_abort-after-retries.txt 1970-01-01 01:00:00.000000000 +0100 +++ new/booth/unit-tests/100_abort-after-retries.txt 2014-02-17 13:11:17.000000000 +0100 @@ -0,0 +1,40 @@ +# vim: ft=sh et : +# +# Testing whether retries are aborted at some time. + + +ticket: + state ST_STABLE + last_ack_ballot 40 + new_ballot 50 + retries 6 + timeout 1 + owner local + expiry 3000 + # but renewing is necessary + expires time(0) + 100 + next_cron time(0) + 1 + + +outgoing0: + header.cmd OP_PREPARING +outgoing1: + header.cmd OP_PREPARING +outgoing2: + header.cmd OP_PREPARING +outgoing3: + header.cmd OP_PREPARING + +# Now give cause to abort. +ticket4: + expires time(0) - 2 + retry_number 10 + timeout 2 +outgoing4: + header.cmd CMD_CATCHUP + + +# ticket must be lost +finally: + owner 0 + state ST_INIT diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/booth/unit-tests/_defaults.txt new/booth/unit-tests/_defaults.txt --- old/booth/unit-tests/_defaults.txt 2014-01-24 16:15:52.000000000 +0100 +++ new/booth/unit-tests/_defaults.txt 2014-02-17 13:11:17.000000000 +0100 @@ -9,7 +9,7 @@ ## these would matter if testing via GDB had high latencies #expiry 60 #timeout 10 - #acquire_after 30 + acquire_after 0 -- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
