[ https://issues.apache.org/jira/browse/IMPALA-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876824#comment-17876824 ]
ASF subversion and git services commented on IMPALA-13317:
----------------------------------------------------------
Commit b6ca6ffb9cd9f69f6c903c0416cecbc60446097c in impala's branch
refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b6ca6ffb9 ]
IMPALA-13317: Enhance tpc_sort_key for wider name support
Currently, the tpc_sort_key function sorts the TPCH or TPCDS query
files when running the TPCH or TPCDS tests, and it is only used by
test_tuple_cache_tpc_queries. It is designed to handle filenames in
formats like "tpch-qx-y", "tpch-qx", or "tpch-qxX". However, it does
not support filenames in the format "tpch-qx-yY", and attempting to
sort these files raises a ValueError.
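The failure can be reproduced in isolation. The sketch below is not the
original function; it only mirrors the single line shown in the stack
trace of the issue (y = int(parts[2]) if len(parts) > 2 else 0). The
helper name old_y_component and the filename "tpcds-q1-1a" are
assumptions made for illustration.

```python
def old_y_component(name):
    # Mirrors the line from the traceback: the third dash-separated part
    # is passed straight to int(), so any letter suffix breaks it.
    parts = name.split("-")
    return int(parts[2]) if len(parts) > 2 else 0


print(old_y_component("tpcds-q1-2"))  # purely numeric part parses fine

try:
    old_y_component("tpcds-q1-1a")  # hypothetical name with a letter suffix
except ValueError as err:
    print("ValueError:", err)
```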
This patch improves the robustness of the tpc_sort_key function
by adding more checks to prevent errors and extending support
for filenames in the "tpch-qxX-yY" format.
Tests:
Reran the tests with file names in the "tpch-qxX-yY" format, and they
passed. No tests seem to exist for the test util functions, so I ran
the following unit tests against the function locally, and they passed:
test_cases = {
    'tpcds-q1': (1, 0, '', ''),
    'tpcds-q1X': (1, 0, 'X', ''),
    'tpcds-q1-2Y': (1, 2, '', 'Y'),
    'tpcds-q1X-2Y': (1, 2, 'X', 'Y'),
    'tpcds-q2-3': (2, 3, '', ''),
    'tpcds-q10': (10, 0, '', ''),
    'tpcds-q10-20': (10, 20, '', ''),
    'tpcds-q10a-20': (10, 20, 'a', ''),
    'tpcds-q10-20b': (10, 20, '', 'b'),
    'tpcds-q10a-20b': (10, 20, 'a', 'b'),
    'tpcds-q0': (0, 0, '', ''),
    'tpcds-': (0, 0, '', ''),
    'tpcds--': (0, 0, '', ''),
    'tpcds-xx-xx': (0, 0, '', ''),
    'tpcds-x1-x1': (0, 0, '', ''),
    'tpcds-x1-x': (0, 0, '', ''),
    'tpcds-x-x1': (0, 0, '', ''),
    'tpcds': (0, 0, '', ''),
}
for input_str, expected in test_cases.items():
    result = tpc_sort_key(input_str)
    assert result == expected
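For readers without the patch at hand, here is a minimal sketch of a
sort key that satisfies the expected tuples listed above. It is not the
code from the commit: the name tpc_sort_key_sketch, the helper
_parse_part, and the two regular expressions are assumptions, and the
key layout (x, y, x_suffix, y_suffix) is inferred from the test cases
only. Parts that do not match the expected shape fall back to (0, '')
instead of raising.

```python
import re

# Assumed shapes: "q<digits><optional non-digit suffix>" for the query
# part and "<digits><optional non-digit suffix>" for the sub-query part.
_X_PATTERN = re.compile(r"q(\d+)(\D*)")
_Y_PATTERN = re.compile(r"(\d+)(\D*)")


def _parse_part(pattern, part):
    """Return (number, suffix) for a name part, or (0, '') if malformed."""
    match = pattern.fullmatch(part)
    if match is None:
        return 0, ""
    return int(match.group(1)), match.group(2)


def tpc_sort_key_sketch(name):
    """Build a (x, y, x_suffix, y_suffix) key from a 'tpcds-qxX-yY' name."""
    parts = name.split("-")
    x, x_suffix = _parse_part(_X_PATTERN, parts[1]) if len(parts) > 1 else (0, "")
    y, y_suffix = _parse_part(_Y_PATTERN, parts[2]) if len(parts) > 2 else (0, "")
    return (x, y, x_suffix, y_suffix)


names = ["tpcds-q10", "tpcds-q2", "tpcds-q1a", "tpcds-q1"]
print(sorted(names, key=tpc_sort_key_sketch))
# ['tpcds-q1', 'tpcds-q1a', 'tpcds-q2', 'tpcds-q10']
```

Putting the numeric components first keeps q2 ahead of q10, and the
suffixes only break ties between names that share the same numbers.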
Change-Id: Ib238ff09d5a2278c593f2759cf35f136b0ff1344
Reviewed-on: http://gerrit.cloudera.org:8080/21708
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Test_tuple_cache_tpc_queries failing
> ------------------------------------
>
> Key: IMPALA-13317
> URL: https://issues.apache.org/jira/browse/IMPALA-13317
> Project: IMPALA
> Issue Type: Bug
> Reporter: gaurav singh
> Assignee: Yida Wu
> Priority: Critical
>
> collection failure: ValueError: invalid literal for int() with base 10: '1a'
> Stack Trace:
> {noformat}
> query_test/test_tuple_cache_tpc_queries.py:60: in <module>
>     class TestTupleCacheTpcdsQuery(ImpalaTestSuite):
> query_test/test_tuple_cache_tpc_queries.py:74: in TestTupleCacheTpcdsQuery
>     @pytest.mark.parametrize("query", load_tpc_queries_name_sorted('tpcds'))
> util/test_file_parser.py:673: in load_tpc_queries_name_sorted
>     queries = sorted(queries, key=tpc_sort_key)
> util/test_file_parser.py:659: in tpc_sort_key
>     y = int(parts[2]) if len(parts) > 2 else 0
> E   ValueError: invalid literal for int() with base 10: '1a'
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)