[ https://issues.apache.org/jira/browse/IMPALA-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell updated IMPALA-9560:
----------------------------------
    Fix Version/s: Impala 3.4.0

> Changing version from 3.4.0-SNAPSHOT to 3.4.0-RELEASE breaks 
> TestStatsExtrapolation
> -----------------------------------------------------------------------------------
>
>                 Key: IMPALA-9560
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9560
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.4.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Critical
>              Labels: broken-build
>             Fix For: Impala 4.0, Impala 3.4.0
>
>
> While preparing the Impala 3.4 release, we changed the version on
> branch-3.4.0 from 3.4.0-SNAPSHOT to 3.4.0-RELEASE. After that change,
> metadata/test_stats_extrapolation.py::TestStatsExtrapolation::test_stats_extrapolation()
> fails with the following error:
> {noformat}
> metadata/test_stats_extrapolation.py:44: in test_stats_extrapolation
>     self.run_test_case('QueryTest/stats-extrapolation', vector, unique_database)
> common/impala_test_suite.py:690: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:246: in verify_query_result_is_subset
>     assert expected_literal_strings <= actual_literal_strings
> E   assert Items in expected results not found in actual results:
> E     '   tuple-ids=0 row-size=4B cardinality=17.91K'
> E     Items in actual results:
> E     '|  output exprs: id'
> E     ''
> E     '     table: rows=unavailable size=unavailable'
> E     '   stored statistics:'
> E     'Max Per-Host Resource Reservation: Memory=8.00KB Threads=2'
> E     '     columns: unavailable'
> E     '     partitions: 0/24 rows=unavailable'
> E     '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]'
> E     '   tuple-ids=0 row-size=4B cardinality=17.90K'
> E     '|'
> E     'Analyzed query: SELECT id FROM test_stats_extrapolation_5c6bdfd.alltypes'
> E     'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1'
> E     '   HDFS partitions=24/24 files=36 size=281.43KB'
> E     'test_stats_extrapolation_5c6bdfd.alltypes'
> E     'PLAN-ROOT SINK'
> E     '|  mem-estimate=0B mem-reservation=0B thread-reservation=0'
> E     '|  Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=2'
> E     '   in pipelines: 00(GETNEXT)'
> E     '   extrapolated-rows=unavailable max-scan-range-rows=unavailable'
> E     'Per-Host Resource Estimates: Memory=16MB'
> E     'WARNING: The following tables are missing relevant table and/or column statistics.'
> E     '   mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1'
> {noformat}
> The test expects a cardinality of 17.91K, but the plan reports a cardinality
> of 17.90K.
> The string 3.4.0-RELEASE is one character shorter than 3.4.0-SNAPSHOT. The
> version is embedded in Parquet files, so the Parquet file is slightly
> smaller than before. The test estimates cardinality from the size of the
> Parquet file, and the estimate apparently sits right on a rounding boundary,
> so this small size change is enough to shift the displayed value.
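
To illustrate how a size-based estimate can sit "on the edge", here is a minimal sketch. It is not Impala's planner code: the function names, the bytes-per-row figure, the byte counts, and the 36-byte shrink are all hypothetical values chosen only to show a tiny size change flipping a two-decimal "K" display.

```python
def estimate_rows(total_bytes: int, bytes_per_row: float) -> int:
    """Hypothetical estimator: derive a row count purely from file size."""
    return round(total_bytes / bytes_per_row)


def format_cardinality(rows: int) -> str:
    """Render a row count in thousands with two decimals, e.g. 17.91K."""
    return f"{rows / 1000:.2f}K"


# Assumed numbers for illustration only (not taken from the failing test).
BYTES_PER_ROW = 16.0
size_snapshot = 286_500            # data size with the longer version string
size_release = size_snapshot - 36  # assumed: slightly smaller files

before = format_cardinality(estimate_rows(size_snapshot, BYTES_PER_ROW))
after = format_cardinality(estimate_rows(size_release, BYTES_PER_ROW))
print(before, after)  # 17.91K 17.90K
```

In this sketch the underlying estimate moves by only two rows (17906 to 17904), yet the rounded display changes, which is the kind of boundary effect described above.
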
> This test should tolerate this difference.
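
One way the test could tolerate such a difference is to compare cardinalities numerically within a relative tolerance rather than as literal strings. The sketch below is an assumption about what that might look like, not the fix that was applied: `parse_cardinality` and `cardinalities_match` are hypothetical helpers, not part of Impala's test framework, and the K/M/B suffix handling is assumed.

```python
import math
import re

# Assumed suffix multipliers for abbreviated cardinalities in plan text.
_SUFFIXES = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
_CARDINALITY_RE = re.compile(r"cardinality=(\d+(?:\.\d+)?)([KMB]?)")


def parse_cardinality(plan_line: str) -> float:
    """Extract the numeric cardinality from one line of plan output."""
    match = _CARDINALITY_RE.search(plan_line)
    if match is None:
        raise ValueError(f"no cardinality found in: {plan_line!r}")
    value, suffix = match.groups()
    return float(value) * _SUFFIXES[suffix]


def cardinalities_match(expected_line: str, actual_line: str,
                        rel_tol: float = 0.01) -> bool:
    """Return True when the two cardinalities agree within rel_tol."""
    return math.isclose(parse_cardinality(expected_line),
                        parse_cardinality(actual_line),
                        rel_tol=rel_tol)


expected = "   tuple-ids=0 row-size=4B cardinality=17.91K"
actual = "   tuple-ids=0 row-size=4B cardinality=17.90K"
print(cardinalities_match(expected, actual))  # True
```

With a 1% relative tolerance, 17.91K and 17.90K compare as equal, while a genuinely different estimate would still fail the check.
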


