Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16690 )

Change subject: IMPALA-9864: Produce a minidump when TestValidateMetrics fails
......................................................................

IMPALA-9864: Produce a minidump when TestValidateMetrics fails

After running end-to-end tests, run-tests.py runs verifiers to check that a set of metrics are zero. When this fails, it can indicate a hung query fragment or other resource leak (see IMPALA-9842 for example). To track this down, it is useful to have a minidump, so this adds a step to have every Impalad generate a minidump (by sending SIGUSR1) when we hit the timeout.

Also, the current error message dumps a bunch of unformatted JSON from our Web UI. This is hard to read and painful to cut/paste. This now dumps that JSON to files in a diagnostic directory under the logs directory. The JSON is formatted in a readable way. These files would be preserved along with the rest of the logs directory for automated runs.

The new error message looks like this:
E   AssertionError: Metric impala-server.num-queries-registered did not reach value 0 in 60s.
E   Dumping debug webpages in JSON format...
E   Dumped memz JSON to $IMPALA_HOME/logs/metric_timeout_diags_1604359071/json/memz.json
E   Dumped metrics JSON to $IMPALA_HOME/logs/metric_timeout_diags_1604359071/json/metrics.json
E   Dumped queries JSON to $IMPALA_HOME/logs/metric_timeout_diags_1604359071/json/queries.json
E   Dumped sessions JSON to $IMPALA_HOME/logs/metric_timeout_diags_1604359071/json/sessions.json
E   Dumped threadz JSON to $IMPALA_HOME/logs/metric_timeout_diags_1604359071/json/threadz.json
E   Dumped rpcz JSON to $IMPALA_HOME/logs/metric_timeout_diags_1604359071/json/rpcz.json
E   Dumping minidumps for 3 running impalads...
E   Dumped minidump for PID 2709
E   Dumped minidump for PID 2714
E   Dumped minidump for PID 2721

Testing:
 - Tried out the dump function on my developer machine
 - Verified the minidumps exist
 - Verified the JSON is readable

Change-Id: I16d26052d0664ee0b115e3611cd96047d8ada19d
---
M tests/common/impala_service.py
1 file changed, 62 insertions(+), 10 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/16690/1
--
To view, visit http://gerrit.cloudera.org:8080/16690
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I16d26052d0664ee0b115e3611cd96047d8ada19d
Gerrit-Change-Number: 16690
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <[email protected]>
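
For readers who want a concrete picture of the two diagnostic steps described in the commit message (writing formatted JSON into a timestamped directory under logs, then sending SIGUSR1 to each impalad), the following is a minimal sketch. It is not the code in tests/common/impala_service.py: the function names make_diag_dir, dump_debug_json and request_minidumps are hypothetical, and it assumes the debug pages have already been fetched and parsed into Python objects.

```python
import json
import os
import signal
import time


def make_diag_dir(logs_dir, now=None):
    """Build a path like <logs_dir>/metric_timeout_diags_<unix timestamp>.

    Hypothetical helper; the directory name mirrors the error message above.
    """
    ts = int(time.time() if now is None else now)
    return os.path.join(logs_dir, "metric_timeout_diags_{0}".format(ts))


def dump_debug_json(pages, diag_dir):
    """Write each page's JSON to <diag_dir>/json/<name>.json, pretty-printed.

    `pages` maps a page name (e.g. "memz") to already-parsed JSON data.
    Returns the list of file paths written, sorted by page name.
    """
    json_dir = os.path.join(diag_dir, "json")
    os.makedirs(json_dir, exist_ok=True)
    written = []
    for name, data in sorted(pages.items()):
        path = os.path.join(json_dir, name + ".json")
        with open(path, "w") as f:
            # indent and sort_keys make the output readable and diffable
            json.dump(data, f, indent=2, sort_keys=True)
        written.append(path)
    return written


def request_minidumps(pids, send_signal=os.kill):
    """Send SIGUSR1 to each PID and return the PIDs that were signalled.

    `send_signal` is injectable (an assumption of this sketch) so the
    function can be exercised without real processes.
    """
    signalled = []
    for pid in pids:
        try:
            send_signal(pid, signal.SIGUSR1)
            signalled.append(pid)
        except OSError:
            # The process may have exited already, or we lack permission.
            pass
    return signalled
```

A caller would build the directory once per timeout, dump the pages, and then pass the PIDs of the running impalads to request_minidumps. signal.SIGUSR1 is only available on Unix-like systems.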
