Quanlong Huang created IMPALA-11151:
---------------------------------------

             Summary: PlannerTest.testResourceRequirements is flaky on parquet alltypes table
                 Key: IMPALA-11151
                 URL: https://issues.apache.org/jira/browse/IMPALA-11151
             Project: IMPALA
          Issue Type: Bug
            Reporter: Quanlong Huang


Saw the test fail in an unrelated patch:
[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15830/testReport/junit/org.apache.impala.planner/PlannerTest/testResourceRequirements/]
{code:java}
Section PLAN of query:
select string_col from functional_parquet.alltypes;

Actual does not match expected result:
Max Per-Host Resource Reservation: Memory=4.01MB Threads=2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Per-Host Resource Estimates: Memory=26MB
WARNING: The following tables are missing relevant table and/or column statistics.
functional_parquet.alltypes
Analyzed query: SELECT string_col FROM functional_parquet.alltypes

F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=26.00MB mem-reservation=4.01MB thread-reservation=2
PLAN-ROOT SINK
|  output exprs: string_col
|  mem-estimate=10.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
00:SCAN HDFS [functional_parquet.alltypes]
   HDFS partitions=24/24 files=24 size=187.94KB
   stored statistics:
     table: rows=unavailable size=unavailable
     partitions: 0/24 rows=unavailable
     columns: unavailable
   extrapolated-rows=disabled max-scan-range-rows=unavailable
   mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1
   tuple-ids=0 row-size=12B cardinality=unavailable
   in pipelines: 00(GETNEXT)

Expected:
Max Per-Host Resource Reservation: Memory=4.02MB Threads=2
Per-Host Resource Estimates: Memory=26MB
WARNING: The following tables are missing relevant table and/or column statistics.
functional_parquet.alltypes
Analyzed query: SELECT string_col FROM functional_parquet.alltypes

F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=26.00MB mem-reservation=4.02MB thread-reservation=2
PLAN-ROOT SINK
|  output exprs: string_col
|  mem-estimate=10.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
00:SCAN HDFS [functional_parquet.alltypes]
   HDFS partitions=24/24 files=24 size=201.80KB
   stored statistics:
     table: rows=unavailable size=unavailable
     partitions: 0/24 rows=unavailable
     columns: unavailable
   extrapolated-rows=disabled max-scan-range-rows=unavailable
   mem-estimate=16.00MB mem-reservation=16.00KB thread-reservation=1
   tuple-ids=0 row-size=12B cardinality=unavailable
   in pipelines: 00(GETNEXT) {code}
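The difference is due to the file sizes: the actual plan reports size=187.94KB where the expected plan has size=201.80KB, so the scan node's mem-reservation is 8.00KB instead of 16.00KB and the per-host total is 4.01MB instead of 4.02MB.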
{code:java}
$ diff actual.txt expected.txt 
1c1
< Max Per-Host Resource Reservation: Memory=4.01MB Threads=2
---
> Max Per-Host Resource Reservation: Memory=4.02MB Threads=2
8c8
< |  Per-Host Resources: mem-estimate=26.00MB mem-reservation=4.01MB thread-reservation=2
---
> |  Per-Host Resources: mem-estimate=26.00MB mem-reservation=4.02MB thread-reservation=2
14c14
<    HDFS partitions=24/24 files=24 size=187.94KB
---
>    HDFS partitions=24/24 files=24 size=201.80KB
20c20
<    mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1
---
>    mem-estimate=16.00MB mem-reservation=16.00KB thread-reservation=1 {code}

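The 4.01MB vs 4.02MB totals can be cross-checked from the per-node numbers in the plans. This is only a rough sketch, not Impala code: it assumes the fragment reservation is the sum of the PLAN-ROOT SINK and SCAN HDFS reservations, that units are binary (1KB = 1024B), and that the planner prints two decimal places. The helper names (parse_bytes, format_mb, total_reservation) are made up for illustration.

```python
import re

# Assumption: the planner output uses binary units.
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_bytes(text):
    """Convert a planner-style size such as '8.00KB' to a byte count."""
    match = re.fullmatch(r"([0-9.]+)(B|KB|MB|GB)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def format_mb(num_bytes):
    """Render a byte count in MB with two decimals, like the plan header."""
    return f"{num_bytes / UNITS['MB']:.2f}MB"


def total_reservation(sink, scan):
    """Assumed fragment total: sink reservation plus scan reservation."""
    return format_mb(parse_bytes(sink) + parse_bytes(scan))


# Values copied from the two plans in this report.
actual = total_reservation("4.00MB", "8.00KB")
expected = total_reservation("4.00MB", "16.00KB")
print(actual, expected)
```

Under these assumptions the 8KB change in the scan reservation is enough to flip the rounded total, which matches the first two hunks of the diff.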


--
This message was sent by Atlassian Jira
(v8.20.1#820001)
