Quanlong Huang created IMPALA-11151:
---------------------------------------
Summary: PlannerTest.testResourceRequirements is flaky on parquet
alltypes table
Key: IMPALA-11151
URL: https://issues.apache.org/jira/browse/IMPALA-11151
Project: IMPALA
Issue Type: Bug
Reporter: Quanlong Huang
Saw the test fail on an unrelated patch:
[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15830/testReport/junit/org.apache.impala.planner/PlannerTest/testResourceRequirements/]
{code:java}
Section PLAN of query:
select string_col from functional_parquet.alltypes;
Actual does not match expected result:
Max Per-Host Resource Reservation: Memory=4.01MB Threads=2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Per-Host Resource Estimates: Memory=26MB
WARNING: The following tables are missing relevant table and/or column
statistics.
functional_parquet.alltypes
Analyzed query: SELECT string_col FROM functional_parquet.alltypes
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=26.00MB mem-reservation=4.01MB
thread-reservation=2
PLAN-ROOT SINK
| output exprs: string_col
| mem-estimate=10.00MB mem-reservation=4.00MB spill-buffer=2.00MB
thread-reservation=0
|
00:SCAN HDFS [functional_parquet.alltypes]
HDFS partitions=24/24 files=24 size=187.94KB
stored statistics:
table: rows=unavailable size=unavailable
partitions: 0/24 rows=unavailable
columns: unavailable
extrapolated-rows=disabled max-scan-range-rows=unavailable
mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1
tuple-ids=0 row-size=12B cardinality=unavailable
in pipelines: 00(GETNEXT)
Expected:
Max Per-Host Resource Reservation: Memory=4.02MB Threads=2
Per-Host Resource Estimates: Memory=26MB
WARNING: The following tables are missing relevant table and/or column
statistics.
functional_parquet.alltypes
Analyzed query: SELECT string_col FROM functional_parquet.alltypes
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
| Per-Host Resources: mem-estimate=26.00MB mem-reservation=4.02MB
thread-reservation=2
PLAN-ROOT SINK
| output exprs: string_col
| mem-estimate=10.00MB mem-reservation=4.00MB spill-buffer=2.00MB
thread-reservation=0
|
00:SCAN HDFS [functional_parquet.alltypes]
HDFS partitions=24/24 files=24 size=201.80KB
stored statistics:
table: rows=unavailable size=unavailable
partitions: 0/24 rows=unavailable
columns: unavailable
extrapolated-rows=disabled max-scan-range-rows=unavailable
mem-estimate=16.00MB mem-reservation=16.00KB thread-reservation=1
tuple-ids=0 row-size=12B cardinality=unavailable
in pipelines: 00(GETNEXT) {code}
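The difference is due to the file sizes. The actual run saw a total size of 187.94KB instead of the expected 201.80KB, the scan node's mem-reservation came out as 8.00KB instead of 16.00KB, and the per-host reservation therefore printed as 4.01MB instead of 4.02MB.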
{code:java}
$ diff actual.txt expected.txt
1c1
< Max Per-Host Resource Reservation: Memory=4.01MB Threads=2
---
> Max Per-Host Resource Reservation: Memory=4.02MB Threads=2
8c8
< | Per-Host Resources: mem-estimate=26.00MB mem-reservation=4.01MB thread-reservation=2
---
> | Per-Host Resources: mem-estimate=26.00MB mem-reservation=4.02MB thread-reservation=2
14c14
< HDFS partitions=24/24 files=24 size=187.94KB
---
> HDFS partitions=24/24 files=24 size=201.80KB
20c20
< mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1
---
> mem-estimate=16.00MB mem-reservation=16.00KB thread-reservation=1 {code}
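For reference, the arithmetic behind the 4.01MB vs 4.02MB mismatch can be sketched as below. This is only an illustration, not Impala code: the class and method names are hypothetical, and it assumes that the fragment's reservation is the sum of the PLAN-ROOT SINK reservation (4.00MB) and the scan reservation (8.00KB or 16.00KB), printed with two decimals as in the plan text.

```java
import java.util.Locale;

// Hypothetical helper, not part of Impala: reproduces the numbers shown in the plans.
final class ReservationSketch {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    // Assumption: fragment reservation = sink reservation + scan reservation.
    static long fragmentReservation(long sinkBytes, long scanBytes) {
        return sinkBytes + scanBytes;
    }

    // Formats a byte count as megabytes with two decimals, like the plan output.
    static String formatMb(long bytes) {
        return String.format(Locale.ROOT, "%.2fMB", (double) bytes / MB);
    }

    public static void main(String[] args) {
        long sink = 4 * MB;
        // Actual run: the scan reserved 8.00KB.
        System.out.println(formatMb(fragmentReservation(sink, 8 * KB)));
        // Expected plan: the scan reserved 16.00KB.
        System.out.println(formatMb(fragmentReservation(sink, 16 * KB)));
    }
}
```

Under these assumptions an 8KB change in the scan reservation is enough to move the rounded per-host value from 4.02MB to 4.01MB, so the expected plan appears to be sensitive to the exact sizes of the generated parquet files.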