[ 
https://issues.apache.org/jira/browse/IMPALA-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876824#comment-17876824
 ] 

ASF subversion and git services commented on IMPALA-13317:
----------------------------------------------------------

Commit b6ca6ffb9cd9f69f6c903c0416cecbc60446097c in impala's branch 
refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b6ca6ffb9 ]

IMPALA-13317: Enhance tpc_sort_key for wider name support

Currently, the tpc_sort_key function sorts the TPCH and TPCDS query
files when running the TPCH or TPCDS tests; its only caller today is
test_tuple_cache_tpc_queries. It is designed to handle filenames in
formats like "tpch-qx-y", "tpch-qx", or "tpch-qxX". However, it does
not support filenames in the format "tpch-qx-yY", and attempting to
sort such files raises an error.

This patch improves the robustness of the tpc_sort_key function
by adding more checks to prevent errors and extending support
for filenames in the "tpch-qxX-yY" format.

Tests:
Reran the tests with file names in the "tpch-qxX-yY" format; they passed.
There appear to be no existing tests for the test util functions, so I
ran the following unit tests against the function locally, and they passed:
test_cases = {
    'tpcds-q1': (1, 0, '', ''),
    'tpcds-q1X': (1, 0, 'X', ''),
    'tpcds-q1-2Y': (1, 2, '', 'Y'),
    'tpcds-q1X-2Y': (1, 2, 'X', 'Y'),
    'tpcds-q2-3': (2, 3, '', ''),
    'tpcds-q10': (10, 0, '', ''),
    'tpcds-q10-20': (10, 20, '', ''),
    'tpcds-q10a-20': (10, 20, 'a', ''),
    'tpcds-q10-20b': (10, 20, '', 'b'),
    'tpcds-q10a-20b': (10, 20, 'a', 'b'),
    'tpcds-q0': (0, 0, '', ''),
    'tpcds-': (0, 0, '', ''),
    'tpcds--': (0, 0, '', ''),
    'tpcds-xx-xx': (0, 0, '', ''),
    'tpcds-x1-x1': (0, 0, '', ''),
    'tpcds-x1-x': (0, 0, '', ''),
    'tpcds-x-x1': (0, 0, '', ''),
    'tpcds': (0, 0, '', ''),
}
for input_str, expected in test_cases.items():
    result = tpc_sort_key(input_str)
    assert result == expected
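
For readers without the patch at hand, the following is a minimal sketch of
a key function that yields the expected tuples listed above. It is not the
committed implementation: the regular expressions and the helper
_split_number_and_suffix are assumptions made for illustration, and the
real code in util/test_file_parser.py may be structured differently.

```python
import re

# Hypothetical patterns, chosen only to reproduce the expected tuples above.
_QUERY_PART = re.compile(r"q([0-9]+)([A-Za-z]*)")  # e.g. "q10a"
_SUB_PART = re.compile(r"([0-9]+)([A-Za-z]*)")     # e.g. "20b"


def _split_number_and_suffix(pattern, text):
    """Return (number, suffix) if text fully matches pattern, else (0, '')."""
    match = pattern.fullmatch(text)
    if match is None:
        return 0, ""
    return int(match.group(1)), match.group(2)


def tpc_sort_key(name):
    """Map a name such as 'tpcds-q10a-20b' to (10, 20, 'a', 'b').

    Missing or malformed parts fall back to 0 and '' instead of raising.
    """
    parts = name.split("-")
    x, x_suffix = 0, ""
    y, y_suffix = 0, ""
    if len(parts) > 1:
        x, x_suffix = _split_number_and_suffix(_QUERY_PART, parts[1])
    if len(parts) > 2:
        y, y_suffix = _split_number_and_suffix(_SUB_PART, parts[2])
    return x, y, x_suffix, y_suffix


names = ["tpcds-q10", "tpcds-q2", "tpcds-q1a", "tpcds-q1"]
print(sorted(names, key=tpc_sort_key))
# ['tpcds-q1', 'tpcds-q1a', 'tpcds-q2', 'tpcds-q10']
```

Placing the numeric fields ahead of the suffixes in the tuple keeps the
ordering numeric first (q2 before q10), with a suffixed variant such as
q1a sorting directly after its base query.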

Change-Id: Ib238ff09d5a2278c593f2759cf35f136b0ff1344
Reviewed-on: http://gerrit.cloudera.org:8080/21708
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Test_tuple_cache_tpc_queries failing
> ------------------------------------
>
>                 Key: IMPALA-13317
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13317
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: gaurav singh
>            Assignee: Yida Wu
>            Priority: Critical
>
> collection failure: ValueError: invalid literal for int() with base 10: '1a'
> Stack Trace:
> {noformat}
> query_test/test_tuple_cache_tpc_queries.py:60: in <module>
>     class TestTupleCacheTpcdsQuery(ImpalaTestSuite):
> query_test/test_tuple_cache_tpc_queries.py:74: in TestTupleCacheTpcdsQuery
>     @pytest.mark.parametrize("query", load_tpc_queries_name_sorted('tpcds'))
> util/test_file_parser.py:673: in load_tpc_queries_name_sorted
>     queries = sorted(queries, key=tpc_sort_key)
> util/test_file_parser.py:659: in tpc_sort_key
>     y = int(parts[2]) if len(parts) > 2 else 0
> E   ValueError: invalid literal for int() with base 10: '1a' {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
