[ 
https://issues.apache.org/jira/browse/IMPALA-6996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469237#comment-16469237
 ] 

Tim Armstrong commented on IMPALA-6996:
---------------------------------------

Just to add some context about why this is so slow, since I looked into it. We 
use this function from glog: 
https://github.com/google/glog/blob/d8cb47f77d1c31779f3ff890e1a5748483778d6a/src/utilities.cc#L119

{code}
static void DumpStackTrace(int skip_count, DebugWriter *writerfn, void *arg) {
  // Print stack trace
  void* stack[32];
  int depth = GetStackTrace(stack, ARRAYSIZE(stack), skip_count+1);
  for (int i = 0; i < depth; i++) {
#if defined(HAVE_SYMBOLIZE)
    if (FLAGS_symbolize_stacktrace) {
      DumpPCAndSymbol(writerfn, arg, stack[i], "    ");
    } else {
      DumpPC(writerfn, arg, stack[i], "    ");
    }
#else
    DumpPC(writerfn, arg, stack[i], "    ");
#endif
  }
}
{code}

The expensive part is the symbolisation, which we can disable with the 
symbolize_stacktrace command-line flag (FLAGS_symbolize_stacktrace in the 
code above). For every stack frame, the symbolisation function does the 
following (based on 
https://github.com/google/glog/blob/d8cb47f77d1c31779f3ff890e1a5748483778d6a/src/symbolize.cc#L726):
# Does a linear search through /proc/self/maps for the address
# Opens the object file containing the address
# Reads the ELF header and resolves the address

> PartitionedAggregationNode::Close() should not dump stack trace
> ---------------------------------------------------------------
>
>                 Key: IMPALA-6996
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6996
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.12.0
>            Reporter: Zoram Thanga
>            Priority: Critical
>
> A CTAS query with MEM_LIMIT=6GB hit the limit while in an AggregationNode. 
> Upon processing the failure, the fragment instance that hit the MEM_LIMIT 
> tries to allocate more memory, and repeatedly dumps the following error 
> message and stack trace to Impalad log file, once per row.
> {noformat}
> I0508 01:32:10.611563 40614 status.cc:55] Memory limit exceeded: FunctionContextImpl::AllocateLocal's allocations exceeded memory limits.
> Exprs could not allocate 3.00 B without exceeding limit.
> Error occurred on backend <redacted>.com:22000 by fragment b429287bdabe184:9e7b5e960000002c
> Memory left in process limit: 384.47 GB
> Memory left in query limit: -6947009.00 B
> Query(b429287bdabe184:9e7b5e9600000000): memory limit exceeded. Limit=6.00 GB Total=6.01 GB Peak=6.01 GB
>   Fragment b429287bdabe184:9e7b5e960000002c: Total=2.76 GB Peak=2.76 GB
>     AGGREGATION_NODE (id=3): Total=2.75 GB Peak=2.75 GB
>       Exprs: Total=751.94 MB Peak=751.94 MB
>     EXCHANGE_NODE (id=2): Total=0 Peak=0
>     DataStreamRecvr: Total=7.42 MB Peak=7.42 MB
>     CodeGen: Total=5.57 KB Peak=745.50 KB
>   Block Manager: Limit=4.80 GB Total=4.39 GB Peak=4.40 GB
>   Fragment b429287bdabe184:9e7b5e960000000c: Total=3.25 GB Peak=3.25 GB
>     AGGREGATION_NODE (id=1): Total=3.19 GB Peak=3.19 GB
>       Exprs: Total=816.06 MB Peak=816.06 MB
>     HDFS_SCAN_NODE (id=0): Total=59.96 MB Peak=200.09 MB
>     CodeGen: Total=5.96 KB Peak=876.00 KB
>     @           0x83c78a  impala::Status::Status()
>     @           0x83c98e  impala::Status::MemLimitExceeded()
>     @           0xa24344  impala::MemTracker::MemLimitExceeded()
>     @           0xa35ccd  impala::RuntimeState::SetMemLimitExceeded()
>     @           0xb6bd6d  impala::FunctionContextImpl::CheckAllocResult()
>     @           0xb6ae78  impala::FunctionContextImpl::AllocateLocal()
>     @           0xb6b10f  impala_udf::StringVal::StringVal()
>     @           0xb6b16a  impala_udf::StringVal::CopyFrom()
>     @           0x8a2641  impala::AggregateFunctions::StringValGetValue()
>     @           0x8a2661  impala::AggregateFunctions::StringValSerializeOrFinalize()
>     @           0xd411c5  impala::AggFnEvaluator::SerializeOrFinalize()
>     @           0xce6569  impala::PartitionedAggregationNode::CleanupHashTbl()
>     @           0xce693b  impala::PartitionedAggregationNode::Partition::Close()
>     @           0xce8152  impala::PartitionedAggregationNode::ClosePartitions()
>     @           0xcf163e  impala::PartitionedAggregationNode::Close()
>     @           0xa7ae35  impala::FragmentInstanceState::Close()
>     @           0xa7ebfb  impala::FragmentInstanceState::Exec()
>     @           0xa6aaf6  impala::QueryState::ExecFInstance()
>     @           0xbef9d9  impala::Thread::SuperviseThread()
>     @           0xbf0394  boost::detail::thread_data<>::run()
>     @           0xe588aa  (unknown)
>     @     0x7f8a54c32e25  start_thread
>     @     0x7f8a5496034d  __clone
> {noformat}
> Here's the profile summary:
> {noformat}
> Operator       #Hosts  Avg Time  Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
> -----------------------------------------------------------------------------------------------------------------------
> 03:AGGREGATE       32   0.000ns   0.000ns        0       1.06B    2.75 GB      165.23 GB  FINALIZE
> 02:EXCHANGE        32   1s246ms   7s429ms  412.85M       1.06B          0              0  HASH(<redacted>)
> 01:AGGREGATE       32  22s539ms  27s631ms  422.16M       1.06B    3.09 GB      165.23 GB  STREAMING
> 00:SCAN HDFS       32   1s221ms   2s093ms  810.12M       1.06B  222.22 MB      616.00 MB  <redacted>
>     Errors: Memory limit exceeded: Error occurred on backend <redacted>.com:22000 by fragment b429287bdabe184:9e7b5e960000002c
> Memory left in process limit: 386.76 GB
> Memory left in query limit: -19080.00 B
> {noformat}
> And the query string:
> {noformat}
> create TABLE <redacted>
>         STORED AS PARQUET
>         TBLPROPERTIES('parquet.compression'='SNAPPY')
> AS
> SELECT          <redacted>,
>                                 <redacted>, 
>                                 <redacted>,
>                                 MAX(<redacted>) AS <redacted>,
>                                 MAX(<redacted>) AS <redacted>,
>                                 MAX(<redacted>) AS <redacted>,
>                                 MAX(<redacted>) AS  <redacted>               
> FROM         <redacted>
>         GROUP BY                <redacted>,
>                                 <redacted>,
>                                 <redacted>
>     Coordinator: <redacted>.com:22000
>     Query Options (non default): MEM_LIMIT=6442450944,REQUEST_POOL=<redacted>,SYNC_DDL=1,MT_DOP=0
>     DDL Type: CREATE_TABLE_AS_SELECT
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
