[jira] [Comment Edited] (IMPALA-12311) Extra newlines are produced when an end-to-end test is run with update_results

Fang-Yu Rao (Jira) Tue, 25 Jul 2023 18:27:04 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747223#comment-17747223
 ]


Fang-Yu Rao edited comment on IMPALA-12311 at 7/26/23 1:26 AM:
---------------------------------------------------------------

I have verified that for the subsections of ERRORS, TYPES, and LABELS, 
additional newlines in the end of a subsection are okay. However, any 
additional newline added to the subsection of RESULTS would fail the test case. 
According to what we have seen here, it should be safe to just not output a 
trailing newline in 
[join_section_lines()|https://github.com/apache/impala/blob/master/tests/util/test_file_parser.py#L294]

+*Subsection of ERRORS*+

For the subsection of ERRORS, we found that the expected error message is 
post-processed by the following at 
[https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L328-L333],
 which removes extra newlines. Note that for a line 'expected_error' containing 
only a newline, 'expected_error' evaluates to false.
{code:java}
  for expected_error in expected_errors:
    if not expected_error: continue
    if ROW_REGEX_PREFIX.match(expected_error):
      converted_expected_errors.append(expected_error)
    else:
      converted_expected_errors.append("'%s'" % expected_error)
{code}
On the other hand, the actual results (exec_result.log at 
[https://github.com/apache/impala/blob/master/tests/util/test_file_parser.py#L298])
 contain 2 newlines. Since we use the following to create the respective 
QueryTestResult in 
[verify_errors()|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L324],
 additional newlines are also removed. Note that a line 'l' evaluates to false 
if it's a newline.
{code:java}
  actual = QueryTestResult(["'%s'" % l for l in actual_errors if l], ['STRING'],
      ['DUMMY_LABEL'], order_matters=False)
{code}
That is, additional newlines are removed from both the expected error message 
and the actual error message. Hence, additional newlines added in the 
subsection of ERRORS are okay.

*+Subsection of TYPES+*

Moreover, additional newlines in the subsection of TYPES are okay since we 
remove additional newlines when constructing the expected line of types at 
[https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L406].
{code:java}
    expected_types = [c.strip().upper()
                      for c in remove_comments(section).rstrip('\n').split(',')]
{code}
*+Subsection of LABELS+*

Additional newlines in the subsection of LABELS are okay as well because we use 
the following to construct the expected line of labels at 
[https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L446].
{code:java}
      expected_labels = [c.strip().upper() for c in 
test_section['LABELS'].split(',')]
{code}


was (Author: fangyurao):
I have verified that for the subsections of ERRORS, TYPES, and LABELS, 
additional newlines in the end of a subsection are okay. However, any 
additional newline added to the subsection of RESULTS would fail the test case. 
According to what we have seen here, it should be safe to just not output a 
trailing newline in [join_section_lines()]

+*Subsection of ERRORS*+

For the subsection of ERRORS, we found that the expected error message is 
post-processed by the following at 
[https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L328-L333],
 which removes extra newlines. Note that for a line 'expected_error' containing 
only a newline, 'expected_error' evaluates to false.
{code:java}
  for expected_error in expected_errors:
    if not expected_error: continue
    if ROW_REGEX_PREFIX.match(expected_error):
      converted_expected_errors.append(expected_error)
    else:
      converted_expected_errors.append("'%s'" % expected_error)
{code}
On the other hand, the actual results (exec_result.log at 
[https://github.com/apache/impala/blob/master/tests/util/test_file_parser.py#L298])
 contain 2 newlines. Since we use the following to create the respective 
QueryTestResult in 
[verify_errors()|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L324],
 additional newlines are also removed. Note that a line 'l' evaluates to false 
if it's a newline.
{code:java}
  actual = QueryTestResult(["'%s'" % l for l in actual_errors if l], ['STRING'],
      ['DUMMY_LABEL'], order_matters=False)
{code}
That is, additional newlines are removed from both the expected error message 
and the actual error message. Hence, additional newlines added in the 
subsection of ERRORS are okay.

*+Subsection of TYPES+*

Moreover, additional newlines in the subsection of TYPES are okay since we 
remove additional newlines when constructing the expected line of types at 
[https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L406].
{code:java}
    expected_types = [c.strip().upper()
                      for c in remove_comments(section).rstrip('\n').split(',')]
{code}
*+Subsection of LABELS+*

Additional newlines in the subsection of LABELS are okay as well because we use 
the following to construct the expected line of labels at 
[https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L446].
{code:java}
      expected_labels = [c.strip().upper() for c in 
test_section['LABELS'].split(',')]
{code}

> Extra newlines are produced when an end-to-end test is run with 
> update_results 
> -------------------------------------------------------------------------------
>
>                 Key: IMPALA-12311
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12311
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 4.1.2
>            Reporter: Fang-Yu Rao
>            Assignee: Fang-Yu Rao
>            Priority: Minor
>              Labels: test-infra
>
> We found that extra newlines are produced in the updated golden file when the 
> actual results do not match the expected results specified in the original 
> golden file.
> Take 
> [TestDecimalExprs::test_exprs()|https://github.com/apache/impala/blob/master/tests/query_test/test_decimal_queries.py#L75]
>  for example, this test runs the test cases in 
> [decimal-exprs.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test].
> Suppose that we modify the expected error message at 
> [https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test#L107]
>  from "UDF WARNING: Decimal expression overflowed, returning NULL" to the 
> following (the original string with an additional "x").
> {noformat}
> UDF WARNING: Decimal expression overflowed, returning NULLx
> {noformat}
> Then we run this test using the following command with the command line 
> argument '--update_results'.
> {code:java}
> $IMPALA_HOME/bin/impala-py.test \
> --update_results \
> --junitxml=$IMPALA_EE_TEST_LOGS_DIR/results/test_decimal.xml \
> $IMPALA_HOME/tests/query_test/test_decimal_queries.py::TestDecimalExprs::test_exprs
> {code}
> In $IMPALA_HOME/logs/ee_tests/QueryTest_decimal-exprs.test, we will find that 
> the following subsection corresponding to the query. There are 3 additional 
> newlines in the subsection of 'ERRORS'.
> {noformat}
> ---- ERRORS
> UDF WARNING: Decimal expression overflowed, returning NULL
> ====
> {noformat}
> One of the newlines was produced in 
> [join_section_lines()|https://github.com/apache/impala/blob/master/tests/util/test_file_parser.py#L298].
>  This function is called when the actual results do not match the expected 
> results in the following 4 places.
>  # [test_section['ERRORS'] = 
> join_section_lines(actual_errors)|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L398].
>  # [test_section['TYPES'] = join_section_lines(\[', 
> '.join(actual_types)\])|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L429].
>  # [test_section['LABELS'] = join_section_lines(\[', 
> '.join(actual_labels)\])|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L451].
>  # [test_section[result_section] = 
> join_section_lines(actual.result_list)|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L489].
> Thus, we also have the same issue for subsections like TYPES, LABELS, and 
> RESULTS in such a scenario (actual results do not match expected ones). It 
> would be good if a user/developer does not have to manually remove those 
> extra newlines when trying to generate the golden files for new test files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Comment Edited] (IMPALA-12311) Extra newlines are produced when an end-to-end test is run with update_results

Reply via email to