hey folks! I mentioned this to jskladan on IRC, but just for the permanent record, I'm working on optional crash report submission for openQA.
At first I had the workers clicking through the graphical report submission process, but that has several problems:

a) needles and keypresses and blah
b) workers don't actually know the job ID or URL, so can't include it in the bug report
c) requires inventing some kind of way to get a BZ username and password into the workers without it being logged (doable, but just unnecessary work, when libreport-plugin-bugzilla already has this set up)

So instead I'm doing it in report_job_results.py in openqa_fedora_tools. It actually builds off D310, Jan's improvement to upload the contents of /var/tmp after a crash. Given a job_id, we check if there's a var_tmp.tar.gz for that job, and if there is, we look for libreport 'problem directories' inside it. If we find any, we extract them from the tarball and run 'reporter-bugzilla -d (directory)' on them. That's really it in a nutshell, the rest is just error checks and glue and frills.

There's an attempt to include the web UI job URL in the bug report for new crash reports (though so far I've been testing with a problem directory that shows up as a dupe of an existing report, so I haven't tested this yet), and we capture the IDs of the bugs reported.

I also refactored the reporting functions a bit to avoid code duplication between calling report_job_results directly and using it from openqa_trigger, and made it possible to specify the openQA URL in a config file (so you can do result reporting from a system other than the openQA host itself - like, fr'instance, a Fedora system with libreport-plugin-bugzilla installed...)

To test it out you need a job in some openQA instance which has a var_tmp.tar.gz with a crash directory inside it: I've been testing with https://openqa.happyassassin.net/tests/2736 .
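As a rough illustration of the 'look for problem directories' step described above, here is a minimal standalone sketch. It assumes, as the patch does, that a problem directory is any directory containing a 'duphash' file, and that one which also contains 'reported_to' has already been reported. The function names and the archive layout in the demo are made up for illustration, and it is written for Python 3 rather than the Python 2 the patch targets.

```python
import io
import tarfile


def find_problem_dirs(archive):
    """Return names of unreported problem directories in an open tarfile.

    A directory is treated as a problem directory if it contains a
    'duphash' file; it is skipped if it also contains 'reported_to'.
    """
    # TarInfo.isfile is a method, so it has to be called.
    names = {member.name for member in archive.getmembers() if member.isfile()}
    probdirs = []
    for name in sorted(names):
        if name.endswith('/duphash'):
            dirname = name[:-len('/duphash')]
            if '{0}/reported_to'.format(dirname) not in names:
                probdirs.append(dirname)
    return probdirs


def make_sample_archive():
    """Build a small in-memory tarball (hypothetical layout, demo only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w:gz') as tar:
        for name in ('var/tmp/abrt/ccpp-1/duphash',
                     'var/tmp/abrt/ccpp-2/duphash',
                     'var/tmp/abrt/ccpp-2/reported_to',
                     'var/tmp/anaconda.log'):
            data = b'x'
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode='r:gz')


if __name__ == '__main__':
    print(find_problem_dirs(make_sample_archive()))
```

Only the first sample directory is returned, since the second one carries a 'reported_to' file. Each returned directory would then be extracted and handed to 'reporter-bugzilla -d'.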
You also need to put a valid BZ username and password in /etc/libreport/plugins/bugzilla.conf and, unless you're running on the openQA host itself (there *are* libreport packages for openSUSE in some OBS repository, but I haven't tried them), you'll want to create /etc/openqa_fedora.conf with this content:

[site]
url = https://openqa.happyassassin.net

(or whatever URL is appropriate). Then you can do this:

python report_job_results.py --crashes 2736

(or whatever the job ID is).

This probably still needs a bit more testing and polish before I submit it as a differential, but I wanted to give people a heads-up that I was working on it and explain the general design. My current patch (against 'develop' branch, to which I've merged the 'live' work now) is attached.

In case you're wondering what happens with duplicate reports: I tested and it seems like 'not a lot'. When calling reporter-bugzilla in this way, if the crash has already been reported, it will only generate BZ activity if the BZ account in question isn't already on the CC list: it will add it. But if the BZ account is already on the CC list, it doesn't change the bug at all, it doesn't add the extra comment saying 'another user encountered this issue'. I checked libreport and it actually only does that when some comment text has been provided, and we aren't providing one, so it gets skipped.

If we're still worried about noise on dupes it *is* possible to test if a bug is a dupe by checking the output of:

reporter-bugzilla -h $(cat duphash)

and completely skip the report submission step if it is, and I actually had that written, but took it out as it seemed unnecessary. Easy enough to put it back if we want to, though.

In the current version of the patch things are set up so that openqa_trigger current or openqa_trigger all or openqa_trigger compose --submit-results runs will try and report all crashes, but it's absolutely trivial to change that if we only want to report crashes via a separate invocation.
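For the config file mentioned above, this is a small sketch of how the [site] url lookup with a fallback to http://localhost could work. It uses Python 3's configparser (the patch uses the Python 2 ConfigParser module), and the helper name get_site_url and the temporary file in the demo are hypothetical.

```python
import configparser
import os
import tempfile

DEFAULT_URL = 'http://localhost'


def get_site_url(paths):
    """Return the [site] url option from the given config files.

    Files that do not exist are silently skipped by ConfigParser.read();
    if no file sets the option, fall back to DEFAULT_URL.
    """
    config = configparser.ConfigParser()
    config.read(paths)
    try:
        return config.get('site', 'url')
    except (configparser.NoSectionError, configparser.NoOptionError):
        return DEFAULT_URL


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmpdir:
        conf = os.path.join(tmpdir, 'openqa_fedora.conf')
        with open(conf, 'w') as conffile:
            conffile.write('[site]\nurl = https://openqa.happyassassin.net\n')
        print(get_site_url([conf]))
        print(get_site_url([os.path.join(tmpdir, 'missing.conf')]))
```

When several files are passed, values read from later files override earlier ones, so with the order used in the patch a copy in the home directory would take precedence over the one in /etc.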
Comments welcome!
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net
From aa20bc32c79004269c2e10fdaf0fc69ed3ae4d81 Mon Sep 17 00:00:00 2001
From: Adam Williamson <[email protected]>
Date: Wed, 18 Mar 2015 20:58:43 -0700
Subject: [PATCH] allow reporting of crashes

This provides report_job_results with the ability to file bugs
for crashes encountered during tests. It requires a pending
change to openqa_fedora which uploads the contents of /var/tmp
for failed tests; libreport 'problem directories' are located
here. report_crash() grabs the var_tmp.gz for the given job
(if it's available), extracts any problem directories found,
and runs reporter-bugzilla (with a slightly tweaked config file)
on them. It then finds the bug URL(s) and returns a list of them.
---
 tools/openqa_trigger/report_job_results.py | 186 +++++++++++++++++++++++++----
 1 file changed, 163 insertions(+), 23 deletions(-)

diff --git a/tools/openqa_trigger/report_job_results.py b/tools/openqa_trigger/report_job_results.py
index ef58fd7..d53c450 100644
--- a/tools/openqa_trigger/report_job_results.py
+++ b/tools/openqa_trigger/report_job_results.py
@@ -1,18 +1,123 @@
 import requests
 import argparse
+import ConfigParser
 import os
+import pprint
+import re
+import subprocess32 as subprocess
+import tarfile
+import tempfile
 import time
 import conf_test_suites
 
+# Allow openQA URL to be specified in a config file, for submitting
+# reports from a system other than the openQA host.
+CONFIG = ConfigParser.SafeConfigParser()
+CONFIG.read('{0}/openqa_fedora.conf'.format(path) for path in ('/etc', os.path.expanduser('~')))
+try:
+    SITEURL = CONFIG.get('site', 'url')
+except (ConfigParser.NoSectionError, ConfigParser.NoOptionError):
+    SITEURL = 'http://localhost'
 
-API_ROOT = "http://localhost/api/v1"
+API_ROOT = "{0}/api/v1".format(SITEURL)
 SLEEPTIME = 60
 
+def report_crash(job_id):
+    """
+    job_id ~ int (job id)
+    Returns ~ list of int - bug IDs, if new reports are successfully
+    submitted or dupes are found. List will be empty if no dupes are
+    found and report submission fails.
 
-def get_passed_testcases(job_ids):
+    Report each problem directory found in the var_tmp tarball for a
+    job to Bugzilla, via the command reporter-bugzilla.
+    """
+    # reporter-bugzilla always prints a line identifying the 'final' bug
+    # ID (after dupe detection etc) in this form. Captures the bug ID.
+    bug_regex = re.compile(r'Status.*show_bug\.cgi\?id=(\d{6,8})')
+    # openqa_fedora uploads this when a crash is detected.
+    tarurl = "{0}/tests/{1}/file/var_tmp.tar.gz".format(SITEURL, job_id)
+    probdirs = []
+    bugs = []
+    tmpdir = tempfile.mkdtemp()
+    # Check BZ username and password are configured.
+    try:
+        user = pw = ''
+        bzconf = open('/etc/libreport/plugins/bugzilla.conf', 'r')
+        for line in bzconf:
+            if line.startswith('Login'):
+                user = line.split('=')[-1].strip()
+            if line.startswith('Password'):
+                pw = line.split('=')[-1].strip()
+    except IOError:
+        pass
+    if not user or not pw:
+        print("Bugzilla user name and password must be set in "
+              "/etc/libreport/plugins/bugzilla.conf and readable by the "
+              "user running this command! Cannot report crashes!")
+        return bugs
+
+    try:
+        with open('{0}/var_tmp.gz'.format(tmpdir), 'w') as varfile:
+            varfile.write(requests.get(tarurl).content)
+        vartmp = tarfile.open('{0}/var_tmp.gz'.format(tmpdir))
+    except IOError:
+        # This job has no var_tmp archive. Abort.
+        return bugs
+    members = vartmp.getmembers()
+    for member in members:
+        if member.isfile and member.name.endswith('/duphash'):
+            # The directory this file is inside is a 'problem directory'.
+            dirname = member.name.replace('/duphash', '')
+            try:
+                # Skip any problem dir that has somehow already been reported.
+                vartmp.getmember('{0}/reported_to'.format(dirname))
+                continue
+            except KeyError:
+                probdirs.append(dirname)
+    if not probdirs:
+        # Job has a var_tmp archive, but we didn't find any problem dirs.
+        return bugs
+
+    # Extract all problem directories and their contents from the archive.
+    toget = (mem for mem in members if
+             any(mem.name.startswith(pd) for pd in probdirs))
+    vartmp.extractall(path=tmpdir, members=toget)
+
+    for probdir in probdirs:
+        path = '{0}/{1}'.format(tmpdir, probdir)
+        # Write the job's URL into the problem directory, so reporter-bugzilla
+        # can include it in the bug report later.
+        jobfile = open('{0}/openqajob'.format(path), 'w')
+        jobfile.write('{0}/tests/{1}'.format(SITEURL, job_id))
+        jobfile.close()
+        # This is our slightly tweaked bug format config file which marks
+        # the report as coming from openQA and includes the URL.
+        conf = '{0}/bugzilla_format.conf'.format(os.path.realpath(__file__))
+        args = ('reporter-bugzilla', '-d', path, '-F', conf)
+        try:
+            output = subprocess.check_output(args, stderr=subprocess.STDOUT).decode().splitlines()
+        except OSError:
+            # probably means the command isn't available.
+            print("reporter-bugzilla not installed? Cannot report crashes!")
+            return bugs
+        # Find the bug ID from the output. Should only ever be one, but
+        # let's handle more just in case.
+        for line in output:
+            match = bug_regex.search(line)
+            if match:
+                bugs.append(int(match.group(1)))
+
+    return bugs
+
+def _wait_for_jobs(job_ids):
     """
     job_ids ~ list of int (job ids)
-    Returns ~ list of str - names of passed testcases
+    Returns ~ dict, keys int job id, values string job state
+
+    Wait for all jobs to finish, then return a dict keyed on the job
+    IDs with the value for each job being the dict produced by parsing
+    the JSON state information provided by the API for that job.
     """
     running_jobs = dict([(job_id, "%s/jobs/%s" % (API_ROOT, job_id)) for job_id in job_ids])
     finished_jobs = {}
@@ -27,7 +132,18 @@ def get_passed_testcases(job_ids):
         if running_jobs:
             time.sleep(SLEEPTIME)
 
+    return finished_jobs
+
+def get_passed_testcases(job_ids):
+    """
+    job_ids ~ list of int (job ids)
+    Returns ~ list of str - names of passed testcases
+
+    Wait for all jobs to finish, then derive a dict providing information
+    on which Wikitcms test cases / 'test instances' we have passes for.
+    """
     passed_testcases = {} # key = VERSION_BUILD_ARCH
+    finished_jobs = _wait_for_jobs(job_ids)
     for job_id in job_ids:
         job = finished_jobs[job_id]
         if job['result'] =='passed':
@@ -39,6 +155,16 @@ def get_passed_testcases(job_ids):
         passed_testcases[key] = sorted(list(set(value)))
     return passed_testcases
 
+def get_failed_jobs(job_ids):
+    """
+    job_ids ~ list of int (job ids)
+    Returns ~ list of int - ids of only jobs which failed
+
+    Wait for all jobs to finish, then return a list of only the IDs of
+    jobs which failed.
+    """
+    finished_jobs = _wait_for_jobs(job_ids)
+    return [jid for jid in job_ids if finished_jobs[jid]['result'] == 'failed']
 
 def get_relval_commands(passed_testcases):
     relval_template = "relval report-auto"
@@ -60,32 +186,46 @@ def get_relval_commands(passed_testcases):
 
     return commands
 
+def report_crashes(job_ids):
+    """
+    job_ids ~ list of int (job ids)
 
-def report_results(job_ids):
-    commands = get_relval_commands(get_passed_testcases(job_ids))
-    print "Running relval commands:"
-    for command in commands:
-        print command
-        os.system(command)
+    For each job specified, try and report any crashes that happened to
+    Bugzilla.
+    """
+    bugs = []
+    for job_id in job_ids:
+        bugs.extend(report_crash(job_id))
+    if bugs:
+        print "Reported bugs:"
+        for bug in bugs:
+            print "https://bugzilla.redhat.com/show_bug.cgi?id={0}".format(bug)
+
+def report_passes(job_ids, printcases=False, report=True):
+    passed_testcases = get_passed_testcases(job_ids)
+    if printcases:
+        pprint.pprint(passed_testcases)
+    commands = get_relval_commands(passed_testcases)
+    if report:
+        print "Reporting test passes:"
+        for command in commands:
+            print command
+            os.system(command)
+    else:
+        print "\n\n### No reporting is done! ###\n\n"
+        pprint.pprint(commands)
 
+def report_results(job_ids):
+    report_passes(job_ids)
+    report_crashes(job_ids)
 
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="Evaluate per-testcase results from OpenQA job runs")
     parser.add_argument('jobs', type=int, nargs='+')
     parser.add_argument('--report', default=False, action='store_true')
+    parser.add_argument('--crashes', default=False, action='store_true')
 
     args = parser.parse_args()
-
-    passed_testcases = get_passed_testcases(args.jobs)
-    commands = get_relval_commands(passed_testcases)
-
-    import pprint
-    pprint.pprint(passed_testcases)
-    if not args.report:
-        print "\n\n### No reporting is done! ###\n\n"
-        pprint.pprint(commands)
-    else:
-        for command in commands:
-            print command
-            os.system(command)
-
+    report_passes(args.jobs, printcases=True, report=args.report)
+    if args.crashes:
+        report_crashes(args.jobs)
-- 
2.3.2
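The bug ID capture used in report_crash() can be tried in isolation. This sketch reuses the regular expression from the patch; the helper name extract_bug_ids and the sample lines are made up for the demo and are not real reporter-bugzilla output, so treat the exact line format as an assumption.

```python
import re

# Same pattern as in the patch: a 'Status' line containing a
# show_bug.cgi URL with a 6 to 8 digit bug ID.
BUG_REGEX = re.compile(r'Status.*show_bug\.cgi\?id=(\d{6,8})')


def extract_bug_ids(output_lines):
    """Return the bug IDs (as ints) found in the given output lines."""
    bugs = []
    for line in output_lines:
        match = BUG_REGEX.search(line)
        if match:
            bugs.append(int(match.group(1)))
    return bugs


# Hypothetical sample lines, for illustration only.
SAMPLE_OUTPUT = [
    'Checking for duplicates',
    'Status: NEW https://bugzilla.redhat.com/show_bug.cgi?id=1234567',
]

if __name__ == '__main__':
    print(extract_bug_ids(SAMPLE_OUTPUT))
```

Lines that mention a bug URL without the leading 'Status' text are ignored, which is what keeps intermediate dupe-detection output from being counted.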
_______________________________________________ qa-devel mailing list [email protected] https://admin.fedoraproject.org/mailman/listinfo/qa-devel
