This is an automated email from the ASF dual-hosted git repository.
michaelsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
The following commit(s) were added to refs/heads/master by this push:
new 76847fb03 IMPALA-13291: Filter dmesg messages by date
76847fb03 is described below
commit 76847fb03d9cc92530f97517a4481993392f331a
Author: Andrew Sherman <[email protected]>
AuthorDate: Thu Aug 8 17:20:46 2024 -0700
IMPALA-13291: Filter dmesg messages by date
At the end of a test run, one of the things finalize.sh does is to look
for interesting messages in the output of dmesg. Recently it was
reporting false positives, because the dmesg output covers the whole
history since the last machine reboot.
Add an optional parameter to finalize.sh which gives the start time of
the test run in the format "2012-10-30 18:17:16". This parameter is
optional until all callers have been updated, some of which may be in
different git repositories.
Switch to using journalctl to fetch the dmesg output. This allows use of
the --since option to filter the messages starting at the given
timestamp. When this is used we should not see the false positives from
earlier test runs on the same machine.
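
As a rough illustration of the optional start time described above, the
sketch below builds the journalctl argument list with and without
--since. The helper name journalctl_dmesg_cmd is hypothetical and is not
part of finalize.sh; it only prints the arguments rather than running
journalctl, so it can be tried on machines without systemd.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in finalize.sh): print, one argument per line,
# the journalctl command that would be run for an optional start time.
journalctl_dmesg_cmd() {
  local start_time="${1:-}"
  local cmd=(journalctl --dmesg)
  if [ -n "${start_time}" ]; then
    # Only add --since when a start time was supplied, so callers that
    # have not been updated still get the unfiltered kernel log.
    cmd+=("--since=${start_time}")
  fi
  printf '%s\n' "${cmd[@]}"
}

# Capture the start time in the same format all-tests.sh uses.
START_TIME=$(date +"%Y-%m-%d %H:%M:%S")
journalctl_dmesg_cmd "${START_TIME}"
```

Calling the helper with no argument prints only the journalctl and
--dmesg arguments, which corresponds to the fallback path for callers
that do not yet pass a start time.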
Change-Id: I7ac9c16dfe1c60f04e117dd634609f03faa3c3dc
Reviewed-on: http://gerrit.cloudera.org:8080/21705
Reviewed-by: Michael Smith <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Joe McDonnell <[email protected]>
---
bin/jenkins/all-tests.sh | 5 ++++-
bin/jenkins/finalize.sh | 34 +++++++++++++++++++++++++++-------
2 files changed, 31 insertions(+), 8 deletions(-)
diff --git a/bin/jenkins/all-tests.sh b/bin/jenkins/all-tests.sh
index 41eef97ac..6624a7dae 100644
--- a/bin/jenkins/all-tests.sh
+++ b/bin/jenkins/all-tests.sh
@@ -22,6 +22,9 @@ set -euo pipefail
. $IMPALA_HOME/bin/report_build_error.sh
setup_report_build_error
+# Start time of run.
+START_TIME=$(date +"%Y-%m-%d %H:%M:%S")
+
cd "${IMPALA_HOME}"
export IMPALA_MAVEN_OPTIONS="-U"
@@ -90,5 +93,5 @@ fi
# Always shutdown minicluster at the end and run finalize.sh
testdata/bin/kill-all.sh
-bin/jenkins/finalize.sh
+bin/jenkins/finalize.sh "${START_TIME}"
exit $RET_CODE
diff --git a/bin/jenkins/finalize.sh b/bin/jenkins/finalize.sh
index a0b3380b6..8af4b0123 100755
--- a/bin/jenkins/finalize.sh
+++ b/bin/jenkins/finalize.sh
@@ -21,6 +21,13 @@
set -euo pipefail
trap 'echo Error in $0 at line $LINENO: $(cd "'$PWD'" && awk "NR == $LINENO" $0)' ERR
+START_TIME=""
+if [ "$#" -eq 1 ]
+then
+ # START_TIME is an optional parameter which gives the start time of the test run.
+ START_TIME="$1"
+fi
+
if test -v CMAKE_BUILD_TYPE && [[ "${CMAKE_BUILD_TYPE}" =~ 'UBSAN' ]] \
&& [ "${UBSAN_FAIL}" = "error" ] \
&& { grep -rI ": runtime error: " "${IMPALA_HOME}/logs" 2>&1 | sort | uniq \
@@ -32,17 +39,30 @@ fi
rm -rf "${IMPALA_HOME}"/logs_system
mkdir -p "${IMPALA_HOME}"/logs_system
-# Tolerate dmesg failures. dmesg can fail if there are insufficient permissions.
-if dmesg > "${IMPALA_HOME}"/logs_system/dmesg 2>/dev/null ||
- sudo dmesg > "${IMPALA_HOME}"/logs_system/dmesg; then
- # Check dmesg for OOMs and generate a symptom if present.
- if [[ $(grep "Out of memory" "${IMPALA_HOME}"/logs_system/dmesg) ]]; then
+# Check dmesg output for OOMs and generate a symptom if present.
+DID_JOURNALCTL=false
+if [ -n "${START_TIME}" ]
+then
+ # Restrict the dmesg output by the start time of the test run.
+ if sudo journalctl --dmesg --since="${START_TIME}" > \
+ "${IMPALA_HOME}"/logs_system/journalctl 2>/dev/null; then
+ DID_JOURNALCTL=true
+ fi
+else
+ if sudo journalctl --dmesg > \
+ "${IMPALA_HOME}"/logs_system/journalctl 2>/dev/null; then
+ DID_JOURNALCTL=true
+ fi
+fi
+
+if [[ "${DID_JOURNALCTL}" == "true" ]]; then
+ if [[ $(grep "Out of memory" "${IMPALA_HOME}"/logs_system/journalctl) ]]; then
"${IMPALA_HOME}"/bin/generate_junitxml.py --phase finalize --step dmesg \
- --stdout "${IMPALA_HOME}"/logs_system/dmesg --error "Process was OOM killed."
+ --stdout "${IMPALA_HOME}"/logs_system/journalctl --error "Process was OOM killed."
fi
else
- echo "Failed to run dmesg, not checking for OOMs"
+ echo "Failed to run journalctl, not checking for OOMs"
fi
# Check for any minidumps and symbolize and dump them.