[
https://issues.apache.org/jira/browse/IMPALA-11105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zoltán Borók-Nagy updated IMPALA-11105:
---------------------------------------
Description:
Currently PhjBuilder::Close() looks like the following:
{noformat}
void PhjBuilder::Close(RuntimeState* state) {
  if (closed_) return;
  CloseAndDeletePartitions(nullptr);
  if (ht_ctx_ != nullptr) {
    ht_ctx_->StatsCountersAdd(ht_stats_profile_.get());
    ht_ctx_->Close(state);
    ht_ctx_.reset();
  }
  ...
{noformat}
When 'ht_ctx_' is not null, Close() invokes
ht_ctx_->StatsCountersAdd(ht_stats_profile_.get());
However, Prepare() creates 'ht_ctx_' first and creates 'ht_stats_profile_'
only after a couple of operations that might fail.
If one of those operations fails, 'ht_ctx_' is set but 'ht_stats_profile_' is
still null, so the later call to Close() passes a null profile to
StatsCountersAdd() and crashes with a SEGFAULT.
The stack trace would look like the following:
{noformat}
#0 impala::HashTableCtx::StatsCountersAdd (this=0x3ea4ddc80, profile=0x0) at hash-table.cc:229
#1 0x0000000001545c4c in impala::PhjBuilder::Close (this=0xbf6523c0, state=0x2eb586c00) at partitioned-hash-join-builder.cc:395
#2 0x00000000015e4056 in impala::JoinBuilder::CloseFromProbe (this=0xbf6523c0, join_node_state=join_node_state@entry=0x2eb586c00) at join-builder.cc:64
#3 0x0000000001550fdb in impala::PartitionedHashJoinNode::Close (this=0x214654480, state=0x2eb586c00) at partitioned-hash-join-node.cc:306
#4 0x00000000014ab5e1 in impala::ExecNode::Close (this=0xc34db5680, state=0x2eb586c00) at exec-node.cc:313
#5 0x00000000014ab5e1 in impala::ExecNode::Close (this=0x1980bdcd80, state=0x2eb586c00) at exec-node.cc:313
...
#16 0x00000000014ab5e1 in impala::ExecNode::Close (this=0x6faf3e000, state=0x2eb586c00) at exec-node.cc:313
#17 0x00000000014ab5e1 in impala::ExecNode::Close (this=0x86bc5e000, state=0x2eb586c00) at exec-node.cc:313
#18 0x0000000001198ebe in impala::FragmentInstanceState::Close (this=this@entry=0x12183fb180) at fragment-instance-state.cc:411
#19 0x000000000119ba47 in impala::FragmentInstanceState::Exec (this=this@entry=0x12183fb180) at fragment-instance-state.cc:104
{noformat}
The solution could be to initialize the counters first, like we do in
grouping-aggregators.cc.
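The ordering problem and the proposed reordering can be sketched in isolation.
This is only a minimal model under stated assumptions, not the actual Impala
code: 'StatsProfile', 'HashCtx', 'Builder', 'InitOrder', 'fail_midway' and
'CloseWouldDereferenceNull()' are hypothetical names made up for illustration,
and the real Prepare() does considerably more work between the two allocations.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for illustration; these are not the Impala classes.
struct StatsProfile {
  long num_probes = 0;
};

class HashCtx {
 public:
  // Dereferences 'profile' unconditionally, mirroring frame #0 of the trace.
  void StatsCountersAdd(StatsProfile* profile) {
    assert(profile != nullptr);  // A null profile here models the reported crash.
    profile->num_probes += num_probes_;
  }
  void Close() {}

 private:
  long num_probes_ = 7;  // Arbitrary value so the effect is observable.
};

// Selects which member Prepare() creates before the fallible step.
enum class InitOrder { kCtxFirst, kProfileFirst };

class Builder {
 public:
  // Returns false if the simulated fallible step fails.
  bool Prepare(InitOrder order, bool fail_midway) {
    if (order == InitOrder::kProfileFirst) {
      // Proposed ordering: the profile exists before anything can fail.
      stats_profile_ = std::make_unique<StatsProfile>();
    }
    ctx_ = std::make_unique<HashCtx>();
    if (fail_midway) return false;  // Models "a couple operations which might fail".
    if (order == InitOrder::kCtxFirst) {
      // Reported ordering: the profile is created only after the fallible step.
      stats_profile_ = std::make_unique<StatsProfile>();
    }
    return true;
  }

  // True when Close() would pass a null profile to StatsCountersAdd().
  bool CloseWouldDereferenceNull() const {
    return !closed_ && ctx_ != nullptr && stats_profile_ == nullptr;
  }

  // Same shape as the Close() snippet in the description.
  void Close() {
    if (closed_) return;
    if (ctx_ != nullptr) {
      ctx_->StatsCountersAdd(stats_profile_.get());
      ctx_->Close();
      ctx_.reset();
    }
    closed_ = true;
  }

  bool closed() const { return closed_; }
  long recorded_probes() const {
    return stats_profile_ ? stats_profile_->num_probes : 0;
  }

 private:
  std::unique_ptr<HashCtx> ctx_;
  std::unique_ptr<StatsProfile> stats_profile_;
  bool closed_ = false;
};
```

In this model, a failed Prepare() with InitOrder::kCtxFirst leaves the builder
in the state where Close() would hit the null profile, while
InitOrder::kProfileFirst makes Close() safe after the same failure. Adding a
null check in Close() would be another option, but creating the profile first
keeps Close() unchanged.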
was:
Currently PhjBuilder::Close() looks like the following:
{noformat}
void PhjBuilder::Close(RuntimeState* state) {
  if (closed_) return;
  CloseAndDeletePartitions(nullptr);
  if (ht_ctx_ != nullptr) {
    ht_ctx_->StatsCountersAdd(ht_stats_profile_.get());
    ht_ctx_->Close(state);
    ht_ctx_.reset();
  }
  ...
{noformat}
When 'ht_ctx_' is not null, we invoke
ht_ctx_->StatsCountersAdd(ht_stats_profile_.get());
But in Prepare() we create 'ht_ctx_' first, then after a couple operations
which might fail we create 'ht_stats_profile_'.
This means if an operation fails in Prepare(), between the creation of
'ht_ctx_' and 'ht_stats_profile_', then later we'll get a SEGFAULT in Close().
The solution could be to initialize the counters first, like we do in
grouping-aggregators.cc.
> Impala crashes in PhjBuilder::Close() when Prepare() fails
> ----------------------------------------------------------
>
> Key: IMPALA-11105
> URL: https://issues.apache.org/jira/browse/IMPALA-11105
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Zoltán Borók-Nagy
> Assignee: Zoltán Borók-Nagy
> Priority: Major