[
https://issues.apache.org/jira/browse/DRILL-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527231#comment-14527231
]
Victoria Markman commented on DRILL-2865:
-----------------------------------------
With the same settings as in the original description, but without compression,
I was able to execute 41 CTAS statements before Drill ran out of memory.
This time we ran out of direct memory, which probably indicates a memory leak
during query execution:
{code}
iteration: 41
1/1 create table temp_41 as select ss_sold_date_sk , ss_sold_time_sk ,
ss_item_sk , ss_customer_sk , ss_cdemo_sk, count(*) from store_sales group by
ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk , ss_cdemo_sk ;
+------------+---------------------------+
| Fragment | Number of records written |
+------------+---------------------------+
| 1_0 | 124274 |
| 1_7 | 125506 |
| 1_13 | 124646 |
| 1_16 | 124397 |
| 1_19 | 124598 |
| 1_11 | 124933 |
| 1_15 | 124657 |
| 1_18 | 125274 |
| 1_2 | 125439 |
| 1_5 | 124828 |
| 1_6 | 124878 |
| 1_14 | 125255 |
| 1_4 | 124850 |
| 1_8 | 125339 |
| 1_9 | 125300 |
| 1_22 | 124469 |
| 1_17 | 125055 |
| 1_12 | 125024 |
| 1_3 | 125407 |
| 1_10 | 125204 |
| 1_1 | 124675 |
| 1_21 | 124720 |
| 1_20 | 125573 |
+------------+---------------------------+
23 rows selected (6.393 seconds)
Closing: org.apache.drill.jdbc.DrillJdbc41Factory$DrillJdbc41Connection
sqlline version 1.1.6
iteration: 42
1/1 create table temp_42 as select ss_sold_date_sk , ss_sold_time_sk ,
ss_item_sk , ss_customer_sk , ss_cdemo_sk, count(*) from store_sales group by
ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk , ss_cdemo_sk ;
Query failed: RESOURCE ERROR: One or more nodes ran out of memory while
executing the query.
Fragment 1:18
[51278c6c-34f1-46e0-9440-baca2a4ac13f on atsqa4-133.qa.lab:31010]
Error: exception while executing query: Failure while executing query.
(state=,code=0)
Aborting command set because "force" is false and command failed: "create table
temp_42 as select ss_sold_date_sk , ss_sold_time_sk , ss_item_sk ,
ss_customer_sk , ss_cdemo_sk, count(*) from store_sales group by
ss_sold_date_sk , ss_sold_time_sk , ss_item_sk , ss_customer_sk , ss_cdemo_sk ;"
Closing: org.apache.drill.jdbc.DrillJdbc41Factory$DrillJdbc41Connection
sqlline version 1.1.6
{code}
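The per-fragment counts above can be totaled to check that every successful iteration writes the same number of rows. This is a minimal sketch, assuming the sqlline output has been saved as text. The helper name `total_records_written` is hypothetical and not part of Drill or sqlline.

```python
import re
from typing import Tuple

# Matches result rows such as "| 1_18 | 125274 |" (fragment id, record count).
# The header row is skipped because "Fragment" does not match the id pattern.
ROW = re.compile(r"^\|\s*(\d+_\d+)\s*\|\s*(\d+)\s*\|$")

def total_records_written(sqlline_output: str) -> Tuple[int, int]:
    """Return (fragment_count, total_records) for one CTAS result table."""
    counts = {}
    for line in sqlline_output.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return len(counts), sum(counts.values())
```

Comparing the returned pair across iterations would show whether the output stays stable while memory use grows.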
Attaching drill-2865-no-compression.log
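For anyone without the attached scripts, the statement sequence seen in the log can be approximated as follows. This is only a sketch, assuming each iteration issues the same CTAS under a new table name starting at `temp_1`. The function `ctas_statements` is hypothetical and is not taken from the attachment.

```python
# Same aggregation as in the issue description, on one line.
QUERY = (
    "select ss_sold_date_sk, ss_sold_time_sk, ss_item_sk, ss_customer_sk, "
    "ss_cdemo_sk, count(*) from store_sales group by ss_sold_date_sk, "
    "ss_sold_time_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk"
)

def ctas_statements(iterations, prefix="temp_"):
    """Yield one CTAS statement per iteration: temp_1 ... temp_<iterations>."""
    for i in range(1, iterations + 1):
        yield f"create table {prefix}{i} as {QUERY};"
```

Each generated statement can then be run through sqlline in its own session, as in the log above, until the RESOURCE ERROR appears.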
> Drillbit runs out of memory on multiple consecutive CTAS
> --------------------------------------------------------
>
> Key: DRILL-2865
> URL: https://issues.apache.org/jira/browse/DRILL-2865
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 0.9.0
> Reporter: Victoria Markman
> Assignee: Steven Phillips
> Fix For: 1.0.0
>
> Attachments: drillbit.log, scritpts.tar
>
>
> Hardware configuration:
> - single node
> - 64GB RAM
> Drill configuration:
> DRILL_MAX_DIRECT_MEMORY="8G"
> DRILL_MAX_HEAP="4G"
> `planner.enable_multiphase_agg` = false;
> `store.parquet.block-size` = 134217728;
> `planner.enable_mux_exchange` = false;
> `exec.min_hash_table_size` = 67108864;
> `planner.enable_hashagg` = true;
> `planner.width.max_per_node` = 23;
> Aggregation query on TPCDS scale factor 1:
> {code}
> select
> ss_sold_date_sk ,
> ss_sold_time_sk ,
> ss_item_sk ,
> ss_customer_sk ,
> ss_cdemo_sk,
> count(*) from store_sales
> group by
> ss_sold_date_sk ,
> ss_sold_time_sk ,
> ss_item_sk ,
> ss_customer_sk ,
> ss_cdemo_sk
> ;
> {code}
> 1. CTAS with this query and store.format = 'parquet' consistently fails on
> iteration #9 with this configuration.
> 2. Running the query by itself: 47 iterations completed successfully.
> 3. CTAS with this query and store.format = 'csv': 30 iterations did not
> reproduce the problem.
> Attached:
> - drillbit.log
> - scripts.tar (contains script that reproduces OOM)