This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


The following commit(s) were added to refs/heads/master by this push:
     new 2f7323960 IMPALA-12045: Strip ANSI escape sequences for JUnitXML
2f7323960 is described below

commit 2f73239607b3497c8ae11cdb2fdbc4571aeb88bd
Author: Joe McDonnell <[email protected]>
AuthorDate: Wed Apr 5 16:55:03 2023 -0700

    IMPALA-12045: Strip ANSI escape sequences for JUnitXML
    
    ANSI escape sequences do a variety of actions in the
    terminal like adding color to compilation warnings.
    generate_junitxml.py currently hits an error when trying
    to generate JUnitXML for compilation output that contains
    ANSI escape sequences.
    
    This changes generate_junitxml.py to strip ANSI
    escape sequences from the strings incorporated into
    JUnitXML (e.g. the error output of a compiler).
    The solution is based off the discussion at:
    https://stackoverflow.com/questions/14693701
    
    Testing:
     - A case where generate_junitxml.py was failing to
       generate JUnitXML now generates valid JUnitXML.
       The output still contains all the compiler warnings
       and information needed to diagnose the issue.
    
    Change-Id: I9654a6b13350cb9582ec908b8807b630636a1ed0
    Reviewed-on: http://gerrit.cloudera.org:8080/19708
    Reviewed-by: Michael Smith <[email protected]>
    Reviewed-by: Wenzhe Zhou <[email protected]>
    Tested-by: Impala Public Jenkins <[email protected]>
---
 lib/python/impala_py_lib/jenkins/generate_junitxml.py | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/lib/python/impala_py_lib/jenkins/generate_junitxml.py b/lib/python/impala_py_lib/jenkins/generate_junitxml.py
index 1f6ba6b60..09d6ab7b4 100755
--- a/lib/python/impala_py_lib/jenkins/generate_junitxml.py
+++ b/lib/python/impala_py_lib/jenkins/generate_junitxml.py
@@ -26,6 +26,7 @@ import argparse
 import codecs
 import errno
 import os
+import re
 import textwrap
 from xml.dom import minidom
 from xml.etree import ElementTree as ET
@@ -170,6 +171,22 @@ class JunitReport(object):
 
     return junit_log_file
 
+  @staticmethod
+  def remove_ansi_escape_sequences(string):
+    """
+    Remove ANSI escape sequences from this string.
+
+    ANSI escape sequences customize terminal output by adding colors, etc.
+    Compilers use them to add color to error messages. ANSI escape
+    sequences interfere with producing the JUnitXML (and do not add any
+    value for JUnitXML), so this function strips them.
+
+    See https://stackoverflow.com/questions/14693701 for more information
+    on this solution.
+    """
+    ansi_escape = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')
+    return ansi_escape.sub('', string)
+
   @staticmethod
   def get_xml_content(file_or_string=None):
     """
@@ -196,7 +213,7 @@ class JunitReport(object):
       # This is a string passed in on the command line. Make sure to return it as
       # a unicode string.
       content = unicode(file_or_string, encoding="UTF-8")
-    return content
+    return JunitReport.remove_ansi_escape_sequences(content)
 
   def __unicode__(self):
     """

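For illustration only, the behavior of the regular expression added in this commit can be sketched outside of generate_junitxml.py. The pattern below is copied from the diff; the function name strip_ansi, the sample compiler message, and the use of Python 3 are assumptions made for this sketch and are not part of the patch (the patched script itself still calls unicode(), so it targets Python 2).

```python
import re
from xml.etree import ElementTree as ET

# Pattern copied from the commit: ESC followed by either a single
# character in the ranges @-Z or \-_, or a CSI sequence such as "[31m".
ANSI_ESCAPE = re.compile(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])')


def strip_ansi(text):
    """Hypothetical stand-in for JunitReport.remove_ansi_escape_sequences."""
    return ANSI_ESCAPE.sub('', text)


# A made-up compiler diagnostic using bold (1), red (31), and reset (0).
colored = "\x1b[1m\x1b[31merror:\x1b[0m unused variable 'x'"
cleaned = strip_ansi(colored)
print(cleaned)

# With the escape sequences removed, the text can be embedded in an XML
# element, serialized, and parsed back without losing the message.
element = ET.Element("system-err")
element.text = cleaned
round_tripped = ET.fromstring(ET.tostring(element))
print(round_tripped.text == cleaned)
```

Only the escape sequences are dropped, so the diagnostic text itself survives intact, which matches the commit message's note that the output still contains the information needed to diagnose the issue.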