[ 
https://issues.apache.org/jira/browse/DRILL-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527231#comment-14527231
 ] 

Victoria Markman commented on DRILL-2865:
-----------------------------------------

With the same settings as in the original description, but without compression, 
I was able to execute 41 CTAS statements before Drill ran out of memory.
This time we ran out of direct memory, which probably indicates a memory leak 
during query execution:

{code}
iteration: 41
1/1          create table temp_41 as select ss_sold_date_sk , ss_sold_time_sk , 
ss_item_sk , ss_customer_sk , ss_cdemo_sk, count(*) from store_sales group by 
ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk , ss_cdemo_sk ;
+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 1_0        | 124274                    |
| 1_7        | 125506                    |
| 1_13       | 124646                    |
| 1_16       | 124397                    |
| 1_19       | 124598                    |
| 1_11       | 124933                    |
| 1_15       | 124657                    |
| 1_18       | 125274                    |
| 1_2        | 125439                    |
| 1_5        | 124828                    |
| 1_6        | 124878                    |
| 1_14       | 125255                    |
| 1_4        | 124850                    |
| 1_8        | 125339                    |
| 1_9        | 125300                    |
| 1_22       | 124469                    |
| 1_17       | 125055                    |
| 1_12       | 125024                    |
| 1_3        | 125407                    |
| 1_10       | 125204                    |
| 1_1        | 124675                    |
| 1_21       | 124720                    |
| 1_20       | 125573                    |
+------------+---------------------------+
23 rows selected (6.393 seconds)
Closing: org.apache.drill.jdbc.DrillJdbc41Factory$DrillJdbc41Connection
sqlline version 1.1.6



iteration: 42
1/1          create table temp_42 as select ss_sold_date_sk , ss_sold_time_sk , 
ss_item_sk , ss_customer_sk , ss_cdemo_sk, count(*) from store_sales group by 
ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk , ss_cdemo_sk ;
Query failed: RESOURCE ERROR: One or more nodes ran out of memory while 
executing the query.

Fragment 1:18

[51278c6c-34f1-46e0-9440-baca2a4ac13f on atsqa4-133.qa.lab:31010]
Error: exception while executing query: Failure while executing query. 
(state=,code=0)
Aborting command set because "force" is false and command failed: "create table 
temp_42 as select ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , 
ss_customer_sk , ss_cdemo_sk, count(*) from store_sales group by 
ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk , ss_cdemo_sk ;"
Closing: org.apache.drill.jdbc.DrillJdbc41Factory$DrillJdbc41Connection
sqlline version 1.1.6

{code}

Attaching drill-2865-no-compression.log
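
For readers without the attachment, the statement sequence shown in the output above can be approximated with the sketch below. It is not the attached script: the function names are hypothetical, and starting the numbering at 1 is an assumption inferred from "iteration: 41" writing temp_41. The column list and statement text are copied from the output above. The printed statements would still need to be fed to sqlline one at a time, which is not shown here.

```python
# Sketch only: regenerates the CTAS statements seen in the sqlline output.
# GROUP_COLS and the statement shape come from the issue; the helper names
# ctas_statement / repro_statements are hypothetical.
GROUP_COLS = [
    "ss_sold_date_sk",
    "ss_sold_time_sk",
    "ss_item_sk",
    "ss_customer_sk",
    "ss_cdemo_sk",
]


def ctas_statement(iteration):
    """Build the CTAS statement for one iteration (writes table temp_<iteration>)."""
    cols = " , ".join(GROUP_COLS)
    return (
        f"create table temp_{iteration} as select {cols}, count(*) "
        f"from store_sales group by {cols} ;"
    )


def repro_statements(iterations):
    """Return the statements for iterations 1..iterations (1-based by assumption)."""
    return [ctas_statement(i) for i in range(1, iterations + 1)]


if __name__ == "__main__":
    # 42 is the iteration that failed in the run described above.
    for stmt in repro_statements(42):
        print(stmt)
```

Each statement writes to a distinct table name, so no table has to be dropped between iterations.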

> Drillbit runs out of memory on multiple consecutive CTAS
> --------------------------------------------------------
>
>                 Key: DRILL-2865
>                 URL: https://issues.apache.org/jira/browse/DRILL-2865
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Victoria Markman
>            Assignee: Steven Phillips
>             Fix For: 1.0.0
>
>         Attachments: drillbit.log, scritpts.tar
>
>
> Hardware configuration:
>         - single node
>         - 64GB RAM
> Drill configuration:
>         DRILL_MAX_DIRECT_MEMORY="8G"
>         DRILL_MAX_HEAP="4G"
>         `planner.enable_multiphase_agg` = false;
>         `store.parquet.block-size` = 134217728;
>         `planner.enable_mux_exchange` = false;
>         `exec.min_hash_table_size` = 67108864;
>         `planner.enable_hashagg` = true; 
>         `planner.width.max_per_node` = 23;
> Aggregation query on TPCDS scale factor 1: 
> {code}
>         select 
>                 ss_sold_date_sk , 
>                 ss_sold_time_sk , 
>                 ss_item_sk , 
>                 ss_customer_sk , 
>                 ss_cdemo_sk, 
>                 count(*) from store_sales
>         group by 
>                 ss_sold_date_sk , 
>                 ss_sold_time_sk , 
>                 ss_item_sk , 
>                 ss_customer_sk , 
>                 ss_cdemo_sk
> ;
> {code}
> 1. Executing CTAS with this query and store.format = 'parquet' consistently 
> fails on iteration #9 with this configuration.
> 2. Ran the query by itself: 47 iterations completed successfully.
> 3. Ran CTAS with this query and store.format = 'csv': 30 iterations did not 
> reproduce the problem.
> Attached:
>       - drillbit.log
>       - scripts.tar (contains script that reproduces OOM)



