[ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646754#comment-17646754
 ] 

Alexey Serbin commented on KUDU-3400:
-------------------------------------

[~laiyingchun],

The presence of the following stacks makes me suspect that the actual cause 
of this OOM issue was KUDU-3406.  It would be great if you could 
double-check that the {{pprof}} run that produced the memory profile used 
the proper binary and devtoolset environment.

Thanks!

{noformat}
Thread 136 (Thread 0x7fabb0a39700 (LWP 94563)):
#0  0x0000000001fdd588 in kudu::cfile::CFileIterator::PrepareBatch(unsigned 
long*) ()
#1  0x0000000000c1229c in 
kudu::tablet::CFileSet::Iterator::PrepareColumn(kudu::ColumnMaterializationContext*)
 ()
#2  0x0000000000c12663 in 
kudu::tablet::CFileSet::Iterator::MaterializeColumn(kudu::ColumnMaterializationContext*)
 ()
#3  0x000000000218b31a in 
kudu::MaterializingIterator::MaterializeBlock(kudu::RowBlock*) ()
#4  0x000000000218b442 in 
kudu::MaterializingIterator::NextBlock(kudu::RowBlock*) ()
#5  0x0000000000ba1c2e in ?? ()
#6  0x0000000000ba1129 in ?? ()
#7  0x0000000000ba1991 in ?? ()
#8  0x0000000000ba4adb in 
kudu::tablet::ReupdateMissedDeltas(kudu::fs::IOContext const*, 
kudu::tablet::CompactionInput*, kudu::tablet::HistoryGcOpts const&, 
kudu::tablet::MvccSnapshot const&, kudu::tablet::MvccSnapshot const&, 
std::vector<std::shared_ptr<kudu::tablet::RowSet>, 
std::allocator<std::shared_ptr<kudu::tablet::RowSet> > > const&) ()
#9  0x0000000000b6529c in 
kudu::tablet::Tablet::DoMergeCompactionOrFlush(kudu::tablet::RowSetsInCompaction
 const&, long) ()
#10 0x0000000000b67712 in kudu::tablet::Tablet::Compact(int) ()
#11 0x0000000000b81403 in kudu::tablet::CompactRowSetsOp::Perform() ()
#12 0x0000000002243248 in 
kudu::MaintenanceManager::LaunchOp(kudu::MaintenanceOp*) ()
#13 0x000000000229e19e in kudu::ThreadPool::DispatchThread() ()
#14 0x000000000229707f in kudu::Thread::SuperviseThread(void*) ()
#15 0x00007fabbcb9dea5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007fabbae72b0d in clone () from /lib64/libc.so.6
{noformat}

{noformat}
Thread 134 (Thread 0x7fabafa37700 (LWP 94565)):
#0  0x0000000000ba61fc in kudu::Status kudu::CopyCell<kudu::ColumnBlockCell, 
kudu::ColumnBlockCell, kudu::Arena>(kudu::ColumnBlockCell const&, 
kudu::ColumnBlockCell*, kudu::Arena*) ()
#1  0x0000000000ba2894 in 
kudu::tablet::FlushCompactionInput(kudu::tablet::CompactionInput*, 
kudu::tablet::MvccSnapshot const&, kudu::tablet::HistoryGcOpts const&, 
kudu::tablet::RollingDiskRowSetWriter*) ()
#2  0x0000000000b6467a in 
kudu::tablet::Tablet::DoMergeCompactionOrFlush(kudu::tablet::RowSetsInCompaction
 const&, long) ()
#3  0x0000000000b67712 in kudu::tablet::Tablet::Compact(int) ()
#4  0x0000000000b81403 in kudu::tablet::CompactRowSetsOp::Perform() ()
#5  0x0000000002243248 in 
kudu::MaintenanceManager::LaunchOp(kudu::MaintenanceOp*) ()
#6  0x000000000229e19e in kudu::ThreadPool::DispatchThread() ()
#7  0x000000000229707f in kudu::Thread::SuperviseThread(void*) ()
#8  0x00007fabbcb9dea5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fabbae72b0d in clone () from /lib64/libc.so.6
{noformat}

> CompilationManager::RequestRowProjector consumed too much memory
> ----------------------------------------------------------------
>
>                 Key: KUDU-3400
>                 URL: https://issues.apache.org/jira/browse/KUDU-3400
>             Project: Kudu
>          Issue Type: Bug
>          Components: codegen
>    Affects Versions: 1.12.0
>            Reporter: Yingchun Lai
>            Priority: Major
>         Attachments: data02heap.svg, heapprofile.svg, pstack.txt
>
>
> In one of our clusters, we found that CompilationManager::RequestRowProjector 
> unexpectedly consumed too much memory. Some details about this 
> cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then perform flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found that 
> CompilationManager::RequestRowProjector accounted for most of the memory, 
> allocated when the Schema was copied. The source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> CompilationTask objects are constructed.
> The heap profile says that Schema consumed about 50GB of memory, which really 
> shocked me. Even though the Schema is large, how could it consume 50GB of 
> memory? I forgot to `pstack` the process when it happened. Maybe there were 
> hundreds of thousands of CompilationManager::RequestRowProjector calls at 
> that time, but according to the code logic, it should not hang there for a 
> long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>                   "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  
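
The copy cost described in the quoted issue above can be illustrated with a 
minimal, self-contained sketch. Everything in it is hypothetical: WideSchema, 
TaskByValue, TaskShared and the two helper functions are stand-ins invented for 
illustration and are not Kudu classes. It only assumes what the quoted 
constructor shows, namely that each task stores both schemas by value, and it 
contrasts that with tasks that share immutable schemas through shared_ptr. It 
is not a description of how Kudu addressed the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a wide table schema; NOT kudu::Schema.
struct WideSchema {
  std::vector<std::string> column_names;

  explicit WideSchema(std::size_t num_columns) {
    column_names.reserve(num_columns);
    for (std::size_t i = 0; i < num_columns; ++i) {
      column_names.push_back("column_" + std::to_string(i));
    }
  }

  // Count deep copies so the cost of each pattern is observable.
  WideSchema(const WideSchema& other) : column_names(other.column_names) {
    ++copies;
  }
  WideSchema& operator=(const WideSchema&) = delete;

  static std::size_t copies;
};
std::size_t WideSchema::copies = 0;

// Mirrors the shape of the quoted constructor: both schemas stored by value.
class TaskByValue {
 public:
  TaskByValue(const WideSchema& base, const WideSchema& proj)
      : base_(base), proj_(proj) {}
  const WideSchema& base() const { return base_; }
  const WideSchema& proj() const { return proj_; }

 private:
  WideSchema base_;
  WideSchema proj_;
};

// Hypothetical alternative: every task points at the same immutable schemas.
class TaskShared {
 public:
  TaskShared(std::shared_ptr<const WideSchema> base,
             std::shared_ptr<const WideSchema> proj)
      : base_(std::move(base)), proj_(std::move(proj)) {
    assert(base_ && proj_);
  }
  const WideSchema& base() const { return *base_; }
  const WideSchema& proj() const { return *proj_; }

 private:
  std::shared_ptr<const WideSchema> base_;
  std::shared_ptr<const WideSchema> proj_;
};

// Queues num_tasks by-value tasks and returns how many schema copies were made.
std::size_t CopiesWhenQueuedByValue(std::size_t num_tasks,
                                    std::size_t num_columns) {
  WideSchema base(num_columns);
  WideSchema proj(num_columns);
  WideSchema::copies = 0;
  std::vector<std::shared_ptr<TaskByValue>> queue;
  for (std::size_t i = 0; i < num_tasks; ++i) {
    queue.push_back(std::make_shared<TaskByValue>(base, proj));
  }
  return WideSchema::copies;
}

// Same queue depth, but the tasks share two schema objects.
std::size_t CopiesWhenQueuedShared(std::size_t num_tasks,
                                   std::size_t num_columns) {
  std::shared_ptr<const WideSchema> base =
      std::make_shared<WideSchema>(num_columns);
  std::shared_ptr<const WideSchema> proj =
      std::make_shared<WideSchema>(num_columns);
  WideSchema::copies = 0;
  std::vector<std::shared_ptr<TaskShared>> queue;
  for (std::size_t i = 0; i < num_tasks; ++i) {
    queue.push_back(std::make_shared<TaskShared>(base, proj));
  }
  return WideSchema::copies;
}
```

In this sketch, the by-value pattern makes two schema copies per queued task, 
so the memory held grows with the number of pending tasks times the schema 
size, while the shared pattern makes none regardless of queue depth. Whether a 
backlog of pending tasks is what actually happened in the reported cluster is 
not established by the sketch.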



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
