This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 19678ae65c96b09999cbe728cb0e7e1ae2301fd2
Author: Joe McDonnell <[email protected]>
AuthorDate: Sat Oct 12 10:25:39 2024 -0700

    IMPALA-13431: Deflake TestLogging.test_excessive_cerr_ignore_pid
    
    On a couple UBSAN runs, test_excessive_cerr_ignore_pid sometimes
    fails find the message providing the next path in the last line
    of the ERROR log file. The logs aren't preserved, so we don't know
    the exact contents of the log file.
    
    This does two things:
    1. It changes the test to preserve the log file on failure by
       copying from the temporary directory to a directory that
       will last past the end of the test. This gives us data to
       work with if we see this again.
    2. A theory is that an extra line or two of logging could go to
       the file after it writes the message with the next path.
       This changes the test to check the last 3 lines of the log
       file for the message providing the next path.
    
    Testing:
     - Ran test with UBSAN
    
    Change-Id: I4745184e983ee5669822059289aab18caf0b72a9
    Reviewed-on: http://gerrit.cloudera.org:8080/21926
    Reviewed-by: Michael Smith <[email protected]>
    Tested-by: Impala Public Jenkins <[email protected]>
---
 tests/custom_cluster/test_breakpad.py | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/tests/custom_cluster/test_breakpad.py 
b/tests/custom_cluster/test_breakpad.py
index c8348828e..2ed6438d8 100644
--- a/tests/custom_cluster/test_breakpad.py
+++ b/tests/custom_cluster/test_breakpad.py
@@ -433,12 +433,30 @@ class TestLoggingBase(TestBreakpadBase):
           if i < len(paths) - 1:
             # check that we print the next_path in last line of this log file
             next_path = paths[i + 1]
-            with open(curr_path, 'rb') as f:
-              f.seek(-2, os.SEEK_END)
-              while f.read(1) != b'\n':
-                f.seek(-2, os.SEEK_CUR)
-              last_line = f.readline().decode()
-              assert next_path in last_line
+            with open(curr_path, 'r') as f:
+              lines = f.readlines()
+              # There have been some test failures where the next_path is not 
in the last
+              # line of the output. One theory is that some output could have 
gone to the
+              # file after the next_path has been written. This tolerates 
having the
+              # next path in the last 3 lines. It also preserves the logs if 
there is a
+              # failure.
+              found_next_path = False
+              NUM_LINES_TO_CHECK = 3
+              for i in range(max(NUM_LINES_TO_CHECK, len(lines))):
+                if next_path in lines[-i]:
+                  found_next_path = True
+                  break
+              if not found_next_path:
+                # These logs are in a temporary directory. To improve 
debuggability,
+                # copy the logs to a location that would be preserved.
+                preserved_log_dir = os.getenv("LOG_DIR", "/tmp/")
+                preserved_path = os.path.join(preserved_log_dir,
+                    os.path.basename(curr_path))
+                shutil.copyfile(curr_path, preserved_path)
+                msg = "Did not find {0} in the last {1} lines of {2}.".format(
+                    next_path, NUM_LINES_TO_CHECK, curr_path)
+                msg += " Preserved the log contents at 
{0}".format(preserved_path)
+                assert False, msg
         except OSError:
           # The daemon might delete the log in the middle of assertion.
           # In that case, do nothing and move on.

Reply via email to