[ https://issues.apache.org/jira/browse/DRILL-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810162#comment-17810162 ]

ASF GitHub Bot commented on DRILL-8478:
---------------------------------------

shfshihuafeng commented on code in PR #2875:
URL: https://github.com/apache/drill/pull/2875#discussion_r1464211148


##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashPartition.java:
##########
@@ -157,11 +162,11 @@ public HashPartition(FragmentContext context, BufferAllocator allocator, Chained
        .build(logger);
    } catch (SchemaChangeException sce) {
      throw new IllegalStateException("Unexpected Schema Change while creating a hash table",sce);
-    }
-    this.hjHelper = semiJoin ? null : new HashJoinHelper(context, allocator);
-    tmpBatchesList = new ArrayList<>();
-    if (numPartitions > 1) {
-      allocateNewCurrentBatchAndHV();
+    } catch (OutOfMemoryException oom) {
+      close();

Review Comment:
   ### 1. fix idea
   The design is that when any operator fails, the entire operator stack is closed. But `partitions` is an array whose slots are initialized to null. If a HashPartition object is not created successfully, its constructor throws an exception, so every slot of `partitions` from the failing index onward stays null:
   
   ```
   for (int part = 0; part < numPartitions; part++) {
     partitions[part] = new HashPartition(context, allocator, baseHashTable,
         buildBatch, probeBatch, semiJoin, RECORDS_PER_BATCH, spillSet, part,
         spilledState.getCycle(), numPartitions);
   }
   ```
   
   For example, suppose the `partitions` array length is 32 (numPartitions = 32) and the constructor throws at index 10. Then `partitions[10]` through `partitions[31]` remain null. The object at index 10 failed during construction, but it had already allocated memory.
   
   When close() is called, that partition cannot be closed, because `partitions[10]` is null, so its memory leaks.
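   The leak described above can be illustrated with a minimal, self-contained sketch (the class and field names here are hypothetical stand-ins, not Drill's actual API): a constructor that allocates first and then throws leaves its slot in the array null, so a null-skipping cleanup loop can never release that allocation.
   
   ```java
   // Simplified model of the problem in fix idea 1; FakePartition stands in
   // for HashPartition and openAllocations stands in for the memory allocator.
   public class PartitionLeakDemo {
     static int openAllocations = 0;
   
     static class FakePartition implements AutoCloseable {
       FakePartition(boolean failAfterAllocate) {
         openAllocations++;                 // memory is reserved first
         if (failAfterAllocate) {
           // the constructor throws AFTER allocating, so the caller
           // never receives a reference it could close
           throw new IllegalStateException("simulated OutOfMemoryException");
         }
       }
       @Override public void close() { openAllocations--; }
     }
   
     public static void main(String[] args) {
       FakePartition[] partitions = new FakePartition[4];
       try {
         for (int part = 0; part < partitions.length; part++) {
           partitions[part] = new FakePartition(part == 2); // index 2 fails
         }
       } catch (IllegalStateException e) {
         // operator shutdown: close whatever is non-null
         for (FakePartition p : partitions) {
           if (p != null) {
             p.close();
           }
         }
       }
       // the allocation made at index 2 is unreachable and leaks
       System.out.println("leaked allocations = " + openAllocations);
     }
   }
   ```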
   
   
   ### 2. another fix idea
   
   Alternatively, the HashPartition constructor does not call close() and only rethrows. The caller then catches the exception instead of letting it propagate out of the loop, so the hash partition objects created so far stay in the array and can all be released when close() is called:
   ```
     } catch (OutOfMemoryException oom) {
       // do not call close() here; only rethrow
       throw UserException.memoryError(oom)
           .message("Failed to allocate hash partition.")
           .build(logger);
     }
   
   // AbstractHashBinaryRecordBatch#initializeBuild
     Exception caught = null;
     try {
       for (int part = 0; part < numPartitions; part++) {
         partitions[part] = new HashPartition(context, allocator, baseHashTable,
             buildBatch, probeBatch, semiJoin, RECORDS_PER_BATCH, spillSet, part,
             spilledState.getCycle(), numPartitions);
       }
     } catch (Exception e) {
       caught = e; // remember the failure; remaining slots stay null
     }
     if (caught != null) {
       throw UserException.memoryError(caught)
           .message("Failed to allocate hash partition.")
           .build(logger);
     }
   ```
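   A runnable, simplified sketch of this second idea (again with hypothetical stand-in names, not Drill's API): record the first failure instead of unwinding immediately, let the normal cleanup pass release every partition that was created, and only then rethrow.
   
   ```java
   // Simplified model of fix idea 2: catch at the caller, clean up, rethrow.
   public class CatchRecordRethrowDemo {
     static int openAllocations = 0;
   
     static class FakePartition implements AutoCloseable {
       FakePartition(boolean fail) {
         if (fail) {
           // in this sketch the failing constructor leaks nothing of its own
           throw new IllegalStateException("simulated OutOfMemoryException");
         }
         openAllocations++;
       }
       @Override public void close() { openAllocations--; }
     }
   
     public static void main(String[] args) {
       FakePartition[] partitions = new FakePartition[4];
       Exception caught = null;
       try {
         for (int part = 0; part < partitions.length; part++) {
           partitions[part] = new FakePartition(part == 2); // index 2 fails
         }
       } catch (Exception e) {
         caught = e; // record the failure instead of unwinding immediately
       }
       // normal close() path: releases every partition that was created
       for (FakePartition p : partitions) {
         if (p != null) {
           p.close();
         }
       }
       System.out.println("caught=" + (caught != null)
           + " leaked=" + openAllocations);
       // a real caller would now rethrow `caught`, e.g. as a UserException
     }
   }
   ```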
   





> HashPartition memory leak when  exception
> -----------------------------------------
>
>                 Key: DRILL-8478
>                 URL: https://issues.apache.org/jira/browse/DRILL-8478
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.21.1
>            Reporter: shihuafeng
>            Priority: Major
>             Fix For: 1.21.2
>
>         Attachments: 
> 0001-DRILL-8478.-HashPartition-memory-leak-when-it-alloca.patch
>
>
> *Describe the bug*
> HashPartition leaks memory when an allocation fails with OutOfMemoryException
> *To Reproduce*
> Steps to reproduce the behavior:
>  # prepare data for tpch 1s
>  # run 20 concurrent tpch sql8 queries
>  # set direct memory to 5g
>  # when an OutOfMemoryException occurred, stop all the SQL queries
>  # observe the memory leak
> *Expected behavior*
> (1) I set \{DRILL_MAX_DIRECT_MEMORY:-"5G"}
> (2) I ran sql8 (SQL detail in Additional context) with 20 concurrent queries
> (3) an OutOfMemoryException occurred when creating the HashPartition
> *Error detail, log output or screenshots*
> Unable to allocate buffer of size 262144 (rounded from 262140) due to memory 
> limit (41943040). Current allocation: 20447232
>  
> sql 
> {code:java}
> // code placeholder
> select o_year, sum(case when nation = 'CHINA' then volume else 0 end) / 
> sum(volume) as mkt_share from ( select extract(year from o_orderdate) as 
> o_year, l_extendedprice * 1.0 as volume, n2.n_name as nation from 
> hive.tpch1s.part, hive.tpch1s.supplier, hive.tpch1s.lineitem, 
> hive.tpch1s.orders, hive.tpch1s.customer, hive.tpch1s.nation n1, 
> hive.tpch1s.nation n2, hive.tpch1s.region where p_partkey = l_partkey and 
> s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey 
> and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name 
> = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date 
> '1995-01-01' and date '1996-12-31' and p_type = 'LARGE BRUSHED BRASS') as 
> all_nations group by o_year order by o_year
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
