[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-14 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666270#comment-15666270
 ] 

Todd Lipcon commented on KUDU-1745:
---

Figured out the crash scenario; a patch is available here: 
https://gerrit.cloudera.org/#/c/5089/

> Kudu causes Impala to crash under stress
> 
>
> Key: KUDU-1745
> URL: https://issues.apache.org/jira/browse/KUDU-1745
> Project: Kudu
> Issue Type: Bug
> Reporter: Taras Bobrovytsky
> Assignee: Todd Lipcon
> Priority: Critical
> Attachments: hs_err_pid7761.log, hs_err_pid9275.log, stacks.out
>
>
> There were over 200 queries running, about half of which were selects and the 
> rest were upsert and delete queries.
> There was a crash after a few minutes with the following stack trace:
> {code}
> Stack: [0x7f1629c93000,0x7f162a694000],  sp=0x7f162a6922b0,  free space=10236k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C  [libstdc++.so.6+0xc5018]  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)+0x8
> C  [libkudu_client.so.0+0x6f662]  
> _ZNSt6vectorIPN4kudu6client9KuduErrorESaIS3_EE19_M_emplace_back_auxIIS3_EEEvDpOT_+0x1dd2
> C  [libkudu_client.so.0+0x4c872]  _init+0xd022
> C  [libkudu_client.so.0+0x4c9d6]  _init+0xd186
> C  [libkudu_client.so.0+0x70775]  
> _ZNSt6vectorIPN4kudu6client9KuduErrorESaIS3_EE19_M_emplace_back_auxIIS3_EEEvDpOT_+0x2ee5
> C  [libkudu_client.so.0+0x754d9]  
> _ZNSt6vectorIPN4kudu6client9KuduErrorESaIS3_EE19_M_emplace_back_auxIIS3_EEEvDpOT_+0x7c49
> C  [libkudu_client.so.0+0xc9be0]  
> kudu::client::KuduUpsert::~KuduUpsert()+0x2ae30
> C  [libkudu_client.so.0+0xc9eb2]  
> kudu::client::KuduUpsert::~KuduUpsert()+0x2b102
> C  [libkudu_client.so.0+0xcc73d]  
> _ZNSt6vectorIN4kudu5SliceESaIS1_EE19_M_emplace_back_auxIIS1_EEEvDpOT_+0x123d
> C  [libkudu_client.so.0+0xbe405]  
> kudu::client::KuduUpsert::~KuduUpsert()+0x1f655
> C  [libkudu_client.so.0+0xcdc0c]  
> _ZNSt6vectorIN4kudu5SliceESaIS1_EE19_M_emplace_back_auxIIS1_EEEvDpOT_+0x270c
> C  [libkudu_client.so.0+0x25ec1b]  
> _ZNSt8_Rb_treeISsSt4pairIKSsSsESt10_Select1stIS2_ESt4lessISsESaIS2_EE22_M_emplace_hint_uniqueIJRKSt21piecewise_construct_tSt5tupleIJRS1_EESD_IJESt17_Rb_tree_iteratorIS2_ESt23_Rb_tree_const_iteratorIS2_EDpOT_+0x311b
> C  [libkudu_client.so.0+0x262324]  
> _ZNSt8_Rb_treeISsSt4pairIKSsSsESt10_Select1stIS2_ESt4lessISsESaIS2_EE22_M_emplace_hint_uniqueIJRKSt21piecewise_construct_tSt5tupleIJRS1_EESD_IJESt17_Rb_tree_iteratorIS2_ESt23_Rb_tree_const_iteratorIS2_EDpOT_+0x6824
> T_+0x558a
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-14 Thread Matthew Jacobs (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666107#comment-15666107
 ] 

Matthew Jacobs commented on KUDU-1745:
--

Thanks! This is a single master. Let me know if you'd like access to the 
cluster.



[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-14 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666101#comment-15666101
 ] 

Todd Lipcon commented on KUDU-1745:
---

Thanks, this stack makes a lot more sense. Just to confirm, is the stress test 
using multi-master or just a single Kudu master? We'll prioritize this one.



[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-14 Thread Matthew Jacobs (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665902#comment-15665902
 ] 

Matthew Jacobs commented on KUDU-1745:
--

Wow... finally got the right stack:

{code}
Stack: [0x7f9961c1f000,0x7f996262],  sp=0x7f996261d9e0,  free 
space=10234k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libstdc++.so.6+0xc5018]  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)+0x8
C  [libkudu_client.so.0+0xf4dc2]  
kudu::client::internal::RemoteTablet::MarkReplicaFailed(kudu::client::internal::RemoteTabletServer*,
 kudu::Status const&)+0x1d0
C  [libkudu_client.so.0+0xf6fcf]  
kudu::client::internal::MetaCacheServerPicker::MarkServerFailed(kudu::client::internal::RemoteTabletServer*,
 kudu::Status const&)+0x2f
C  [libkudu_client.so.0+0xb5a53]  
kudu::rpc::RetriableRpc::RetryIfNeeded(kudu::rpc::RetriableRpcStatus 
const&, kudu::client::internal::RemoteTabletServer*)+0x207
C  [libkudu_client.so.0+0xb3ac0]  
kudu::rpc::RetriableRpc::ReplicaFoundCb(kudu::Status const&, 
kudu::client::internal::RemoteTabletServer*)+0x70
C  [libkudu_client.so.0+0xb97bc]  kudu::internal::RunnableAdapter::*)(kudu::Status 
const&, 
kudu::client::internal::RemoteTabletServer*)>::Run(kudu::rpc::RetriableRpc*, kudu::Status 
const&, kudu::client::internal::RemoteTabletServer* const&)+0x94
C  [libkudu_client.so.0+0xb8d85]  kudu::internal::InvokeHelper, void 
()(kudu::rpc::RetriableRpc*, kudu::Status 
const&, kudu::client::internal::RemoteTabletServer* 
const&)>::MakeItSo(kudu::internal::RunnableAdapter::*)(kudu::Status 
const&, kudu::client::internal::RemoteTabletServer*)>, 
kudu::rpc::RetriableRpc*, kudu::Status 
const&, kudu::client::internal::RemoteTabletServer* const&)+0x6d
C  [libkudu_client.so.0+0xb7d5b]  kudu::internal::Invoker<1, 
kudu::internal::BindState, void 
()(kudu::rpc::RetriableRpc*, kudu::Status 
const&, kudu::client::internal::RemoteTabletServer*), void 
()(kudu::internal::UnretainedWrapper >)>, void 
()(kudu::rpc::RetriableRpc*, kudu::Status 
const&, 
kudu::client::internal::RemoteTabletServer*)>::Run(kudu::internal::BindStateBase*,
 kudu::Status const&, kudu::client::internal::RemoteTabletServer* const&)+0x7d
C  [libkudu_client.so.0+0xfed1d]  kudu::Callback::Run(kudu::Status const&, 
kudu::client::internal::RemoteTabletServer* const&) const+0x5f
C  [libkudu_client.so.0+0xf70df]  
kudu::client::internal::MetaCacheServerPicker::LookUpTabletCb(kudu::Callback const&, 
kudu::MonoTime const&, kudu::Status const&)+0x7d
C  [libkudu_client.so.0+0x1074b1]  kudu::internal::RunnableAdapter const&, 
kudu::MonoTime const&, kudu::Status 
const&)>::Run(kudu::client::internal::MetaCacheServerPicker*, 
kudu::Callback const&, kudu::MonoTime const&, 
kudu::Status const&)+0xa9
C  [libkudu_client.so.0+0x1057a9]  kudu::internal::InvokeHelper, void 
()(kudu::client::internal::MetaCacheServerPicker*, kudu::Callback const&, 
kudu::MonoTime const&, kudu::Status 
const&)>::MakeItSo(kudu::internal::RunnableAdapter const&, 
kudu::MonoTime const&, kudu::Status const&)>, 
kudu::client::internal::MetaCacheServerPicker*, kudu::Callback const&, 
kudu::MonoTime const&, kudu::Status const&)+0x85
C  [libkudu_client.so.0+0x1035a3]  kudu::internal::Invoker<3, 
kudu::internal::BindState, void 
()(kudu::client::internal::MetaCacheServerPicker*, kudu::Callback const&, 
kudu::MonoTime const&, kudu::Status const&), void 
()(kudu::internal::UnretainedWrapper,
 kudu::Callback, kudu::MonoTime)>, void 
()(kudu::client::internal::MetaCacheServerPicker*, kudu::Callback const&, 
kudu::MonoTime const&, 

[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-14 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664426#comment-15664426
 ] 

Todd Lipcon commented on KUDU-1745:
---

I'm not following the OOM theory -- it looks like there was 5GB in buffer cache 
at the time of the crash, so the kernel would have purged some of that before 
refusing a malloc. I also think an exception would propagate back and cause 
std::terminate() to be called, but I don't see that in the stack anywhere.
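
The std::terminate() argument above can be illustrated with a small sketch. It is POSIX-only and assumes a Linux-like environment; AllocateErrorMessage, RunCallback and SignalFromEscapedBadAlloc are hypothetical names, not Kudu client functions. An exception that escapes a noexcept function (used here only to make the outcome deterministic) reaches std::terminate(), whose default handler calls std::abort(), so the process dies from SIGABRT with terminate/abort frames on its stack.

```cpp
#include <csignal>
#include <new>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for an allocation that fails inside a callback.
void AllocateErrorMessage() { throw std::bad_alloc(); }

// Hypothetical callback. Because it is noexcept, an exception escaping from
// it is guaranteed to end in std::terminate().
void RunCallback() noexcept { AllocateErrorMessage(); }

// Runs the callback in a forked child process. Returns the signal that
// killed the child, 0 if it exited normally, or -1 if fork/waitpid failed.
int SignalFromEscapedBadAlloc() {
  const pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    RunCallback();  // does not return: std::terminate() aborts the child
    _exit(0);       // unreachable, kept so the child can never fall through
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On such a system, SignalFromEscapedBadAlloc() is expected to return SIGABRT, which is the kind of evidence that is missing from the stacks in this issue.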

As for the stack itself, it still doesn't make any sense:
{code}
 0  libstdc++.so.6!std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) + 0x8
 1  libkudu_client.so.0!void std::vector<kudu::client::KuduError*, std::allocator<kudu::client::KuduError*> >::_M_emplace_back_aux<kudu::client::KuduError*>(kudu::client::KuduError*&&) + 0x1300
 2  libkudu_client.so.0!void std::vector<kudu::client::KuduError*, std::allocator<kudu::client::KuduError*> >::_M_emplace_back_aux<kudu::client::KuduError*>(kudu::client::KuduError*&&) + 0x1dd2
 3  libkudu_client.so.0!_init + 0xd022
 4  libkudu_client.so.0!_init + 0xd186
 5  libkudu_client.so.0!void std::vector<kudu::client::KuduError*, std::allocator<kudu::client::KuduError*> >::_M_emplace_back_aux<kudu::client::KuduError*>(kudu::client::KuduError*&&) + 0x2ee5
 6  libkudu_client.so.0!void std::vector<kudu::client::KuduError*, std::allocator<kudu::client::KuduError*> >::_M_emplace_back_aux<kudu::client::KuduError*>(kudu::client::KuduError*&&) + 0x7c49
 7  libkudu_client.so.0!kudu::client::KuduUpsert::~KuduUpsert() + 0x2ae30
 8  libkudu_client.so.0!kudu::client::KuduUpsert::~KuduUpsert() + 0x2b102
 9  libkudu_client.so.0!void std::vector<kudu::Slice, std::allocator<kudu::Slice> >::_M_emplace_back_aux<kudu::Slice>(kudu::Slice&&) + 0x123d
10  libkudu_client.so.0!kudu::client::KuduUpsert::~KuduUpsert() + 0x1f655
11  libkudu_client.so.0!void std::vector<kudu::Slice, std::allocator<kudu::Slice> >::_M_emplace_back_aux<kudu::Slice>(kudu::Slice&&) + 0x5cb6
12  libkudu_client.so.0!std::_Rb_tree_iterator std::_Rb_tree, 
std::less, std::allocator
13  libkudu_client.so.0!std::_Rb_tree_iterator std::_Rb_tree, 
std::less, std::allocator
14  libkudu_client.so.0!kudu::client::KuduUpsert::~KuduUpsert() + 0x1b34b
15  libkudu_client.so.0!void std::vector<char*, std::allocator<char*> >::_M_emplace_back_aux<char*>(char*&&) + 0x558a
{code}

Note that this shows std::vector::emplace_back calling a KuduUpsert 
destructor, which is definitely nonsense. The rest of the stack makes just as 
little sense (e.g. the KuduUpsert destructor claims to be calling 
vector::emplace_back, which itself calls the shared library's 
constructor _init?).

Do you think there is any possibility that the shared library got replaced in 
the middle of the run or something? If you can reproduce this, can you do so 
using a Kudu client that doesn't have debug symbols stripped? (I'm not sure if 
the toolchain builds strip debug symbols, but we do enable them by default, 
even in release builds.)


[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-11 Thread Matthew Jacobs (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658185#comment-15658185
 ] 

Matthew Jacobs commented on KUDU-1745:
--

FYI, I filed https://issues.cloudera.org/browse/IMPALA-4478 to track work for 
Impala to account for Kudu client memory in its own memory accounting 
mechanism. If we do that, we probably shouldn't run out of memory. We can 
easily account for the mutation buffer space. I think we may be in trouble now, 
though, since the memory required for errors is currently unbounded. That's 
just one thing that comes to mind; we may need to think about other sources of 
potentially large amounts of memory that we need to bound and/or account for.
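
One way to picture a bound on error memory is a collector that stops storing messages once a byte cap is reached. The sketch below is only an illustration under that assumption: BoundedErrorBuffer and its methods are hypothetical names that do not correspond to the Kudu client API, and only the message bytes are counted (container overhead is ignored).

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical bounded collector for error messages.
class BoundedErrorBuffer {
 public:
  explicit BoundedErrorBuffer(std::size_t max_bytes) : max_bytes_(max_bytes) {}

  // Stores the message if it fits in the remaining budget; otherwise drops
  // it and remembers that at least one error was lost.
  void Add(std::string message) {
    // used_bytes_ never exceeds max_bytes_, so the subtraction cannot wrap.
    if (message.size() > max_bytes_ - used_bytes_) {
      overflowed_ = true;
      return;
    }
    used_bytes_ += message.size();
    errors_.push_back(std::move(message));
  }

  std::size_t size() const { return errors_.size(); }
  std::size_t used_bytes() const { return used_bytes_; }
  bool overflowed() const { return overflowed_; }

 private:
  const std::size_t max_bytes_;
  std::size_t used_bytes_ = 0;
  bool overflowed_ = false;
  std::deque<std::string> errors_;
};
```

With a cap of 10 bytes, adding a 5-byte message succeeds, a following 6-byte message is dropped and sets the overflow flag, and another 5-byte message still fits. A caller could report the overflow flag alongside the retained errors.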



[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-11 Thread Matthew Jacobs (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658099#comment-15658099
 ] 

Matthew Jacobs commented on KUDU-1745:
--

Though it's very suspicious that 4 nodes crashed at the same time with stacks 
that look the same. If it were just an OOM, I'd expect to have seen crashes 
with stacks elsewhere too.



[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-11 Thread Matthew Jacobs (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657991#comment-15657991
 ] 

Matthew Jacobs commented on KUDU-1745:
--

Hm, yes that makes sense. That raises another concern: I need to make sure I 
can bound the memory allocated by the Kudu client, or at least know how much it 
is using/going to use. We should fail queries when we discover there isn't 
enough memory available.
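
A minimal sketch of the "fail the query instead of running out of memory" idea, assuming a single-threaded caller. MemBudget is a hypothetical class written for illustration; it is not Impala's memory accounting mechanism and not part of the Kudu client.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical byte budget shared by everything a query allocates.
class MemBudget {
 public:
  explicit MemBudget(std::int64_t limit_bytes) : limit_(limit_bytes) {}

  // Reserves `bytes` if the budget allows it. Returns false otherwise, so
  // the caller can fail the query before any allocation is attempted.
  bool TryReserve(std::int64_t bytes) {
    assert(bytes >= 0);
    if (bytes > limit_ - reserved_) return false;
    reserved_ += bytes;
    return true;
  }

  // Returns a previous reservation to the budget.
  void Release(std::int64_t bytes) {
    assert(bytes >= 0 && bytes <= reserved_);
    reserved_ -= bytes;
  }

  std::int64_t reserved() const { return reserved_; }

 private:
  const std::int64_t limit_;
  std::int64_t reserved_ = 0;
};
```

A caller would reserve the expected client buffer size up front and abort the query when TryReserve returns false. A real implementation would also need to be thread-safe, which this sketch is not.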



[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-10 Thread Alexey Serbin (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656315#comment-15656315
 ] 

Alexey Serbin commented on KUDU-1745:
-

That was probably an out-of-memory case -- if I'm not mistaken, the attached log 
shows
{noformat}
MemTotal:       15238768 kB
MemFree:          247332 kB
Buffers:          211372 kB
Cached:          4779972 kB
SwapCached:            0 kB
Active:         12002096 kB
Inactive:        2406436 kB
Active(anon):    9417892 kB
Inactive(anon):      140 kB
Active(file):    2584204 kB
Inactive(file):  2406296 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
{noformat}

If so, then most likely std::bad_alloc was thrown and the C++ runtime then 
called the standard handler for unhandled exceptions.
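
The figures in the log excerpt can be tallied mechanically. The sketch below parses /proc/meminfo-style lines and sums MemFree, Buffers and Cached. ParseMeminfo and ApproxReclaimableKb are illustrative names, and treating the page cache as reclaimable is an assumption about kernel behavior, not something the log itself proves.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Parses "Name:   <value> kB" lines into a name -> kB map. Lines without a
// colon or without a numeric value are skipped.
std::map<std::string, std::uint64_t> ParseMeminfo(const std::string& text) {
  std::map<std::string, std::uint64_t> fields;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) continue;
    std::istringstream rest(line.substr(colon + 1));
    std::uint64_t kb = 0;
    if (rest >> kb) fields[line.substr(0, colon)] = kb;
  }
  return fields;
}

// Free memory plus buffer and page cache, in kB. Shared memory and dirty
// pages are ignored, so this is an upper-bound style estimate.
// Throws std::out_of_range if one of the three fields is missing.
std::uint64_t ApproxReclaimableKb(
    const std::map<std::string, std::uint64_t>& m) {
  return m.at("MemFree") + m.at("Buffers") + m.at("Cached");
}
```

For the values above this gives 247332 + 211372 + 4779972 = 5238676 kB, of which 4991344 kB is Buffers plus Cached.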



[jira] [Commented] (KUDU-1745) Kudu causes Impala to crash under stress

2016-11-10 Thread Matthew Jacobs (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655933#comment-15655933
 ] 

Matthew Jacobs commented on KUDU-1745:
--

I found a core on kudu-stress-4.vpc.cloudera.com, though unfortunately I can't 
get debugging symbols. Unless something is broken with the core, I don't see 
Impala calling the client, so maybe a Kudu client thread (it does async work on 
its own threads, right)?

{code}
(gdb) bt
#0  0x0030e2832625 in raise () from /lib64/libc.so.6
#1  0x0030e2833e05 in abort () from /lib64/libc.so.6
#2  0x7f1cdfdd5a55 in os::abort(bool) () from 
/usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
#3  0x7f1cdff55f87 in VMError::report_and_die() () from 
/usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
#4  0x7f1cdfdda96f in JVM_handle_linux_signal () from 
/usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
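#5  <signal handler called>
#6  0x7f1cded39018 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libstdc++.so.6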
#7  0x7f1cdeff5b90 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#8  0x7f1cdeff6662 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#9  0x7f1cdefd3872 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#10 0x7f1cdefd39d6 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#11 0x7f1cdeff7775 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#12 0x7f1cdeffc4d9 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#13 0x7f1cdf050be0 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#14 0x7f1cdf050eb2 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#15 0x7f1cdf05373d in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#16 0x7f1cdf045405 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#17 0x7f1cdf0581b6 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#18 0x7f1cdf1e5c1b in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#19 0x7f1cdf1e9324 in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#20 0x7f1cdf0410fb in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#21 0x7f1cdf10fcea in ?? () from 
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.235/lib/impala/lib/libkudu_client.so.0
#22 0x0030e2c079d1 in start_thread () from /lib64/libpthread.so.0
#23 0x0030e28e88fd in clone () from /lib64/libc.so.6
{code}

I also have a minidump that I decoded (attached stacks.out), but it more or 
less shows the same thing.

Given that the output from the hs_err JVM dump shows this is in KuduError and 
KuduUpsert's destructor, I wonder if the operations could be mismanaged (i.e. 
not deleted properly) when living in the KuduError. It's possible that Impala 
isn't doing something correctly with the KuduUpserts on another thread, but I 
think our interaction with them is minimal.
