[ https://issues.apache.org/jira/browse/KUDU-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144407#comment-16144407 ]

Mike Percy commented on KUDU-2113:
----------------------------------

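I caught this in gdb. The crash is a null pointer dereference at ksck.cc:888: replica.ts is null when VerifyTablet calls replica.ts->uuid().

This is what I'm seeing: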

{code}
(gdb) bt
#0  0x0000000001439e93 in kudu::tools::Ksck::VerifyTablet (this=0x6307220, tablet=std::shared_ptr (count 3, weak 0) 0x68a1d00, table_num_replicas=3) at ../../src/kudu/tools/ksck.cc:888
#1  0x0000000001436c95 in kudu::tools::Ksck::VerifyTable (this=0x6307220, table=std::shared_ptr (count 1, weak 0) 0x62ab320, ts=0x7ffdf3b09460) at ../../src/kudu/tools/ksck.cc:600
#2  0x0000000001433d9d in kudu::tools::Ksck::CheckTablesConsistency (this=0x6307220) at ../../src/kudu/tools/ksck.cc:232
#3  0x0000000001311e96 in kudu::tools::(anonymous namespace)::RunKsck (context=...) at ../../src/kudu/tools/tool_action_cluster.cc:95
#4  0x00000000013168ad in std::_Function_handler<kudu::Status (kudu::tools::RunnerContext const&), kudu::Status (*)(kudu::tools::RunnerContext const&)>::_M_invoke(std::_Any_data const&, kudu::tools::RunnerContext const&) (__functor=..., __args#0=...) at /usr/include/c++/5/functional:1857
#5  0x0000000001939f16 in std::function<kudu::Status (kudu::tools::RunnerContext const&)>::operator()(kudu::tools::RunnerContext const&) const (this=0x635ad68, __args#0=...) at /usr/include/c++/5/functional:2267
#6  0x00000000019367f9 in kudu::tools::Action::Run (this=0x635ad00, chain=std::vector of length 2, capacity 2 = {...}, required_args=std::unordered_map with 1 elements = {...}, variadic_args=std::vector of length 0, capacity 0) at ../../src/kudu/tools/tool_action.cc:257
#7  0x0000000001377a73 in kudu::tools::DispatchCommand (chain=std::vector of length 2, capacity 2 = {...}, action=0x635ad00, remaining_args=std::deque with 1 elements = {...}) at ../../src/kudu/tools/tool_main.cc:129
#8  0x00000000013785e3 in kudu::tools::RunTool (argc=4, argv=0x7ffdf3b0a058, show_help=false) at ../../src/kudu/tools/tool_main.cc:201
#9  0x0000000001378d38 in main (argc=4, argv=0x7ffdf3b0a058) at ../../src/kudu/tools/tool_main.cc:261
(gdb) list
883         vector<string> indexes{""};
884         vector<string> committed{"Yes"};
885
886         // Fill out the columns with info from the replicas.
887         for (const auto& replica : replica_infos) {
888           char label = FindOrDie(peer_uuid_mapping, replica.ts->uuid());
889           sources.emplace_back(1, label);
890           if (!replica.consensus_state) {
891             voters.emplace_back("[config not available]");
892             terms.emplace_back("");
(gdb) p peer_uuid_mapping
$1 = std::map with 3 elements = {
  ["32b570014d3a488bbd3cc44ac560aa96"] = 65 'A',
  ["c0182dfc628c4763b0e629d964228d47"] = 66 'B',
  ["fbae0a6394944d078b2a9af85af9d461"] = 67 'C'
}
(gdb) p replica.ts
$2 = (kudu::tools::KsckTabletServer *) 0x0
{code}
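
The gdb session above shows the loop at ksck.cc:887-888 dereferencing a null replica.ts. A minimal sketch of one way to guard that lookup follows. It is not the actual Kudu fix: TabletServer, ReplicaInfo, LabelReplicas, and the "[unknown]" placeholder are hypothetical stand-ins for the real ksck types, and a plain map lookup replaces FindOrDie so the sketch is self-contained.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for kudu::tools::KsckTabletServer.
struct TabletServer {
  std::string uuid;
};

// Hypothetical stand-in for the per-replica info iterated in VerifyTablet.
struct ReplicaInfo {
  // May be null, as in the gdb session (replica.ts == 0x0).
  const TabletServer* ts;
};

// Builds one label per replica. A replica with no tablet server, or with a
// UUID missing from the mapping, gets a placeholder instead of crashing.
std::vector<std::string> LabelReplicas(
    const std::vector<ReplicaInfo>& replica_infos,
    const std::map<std::string, char>& peer_uuid_mapping) {
  std::vector<std::string> sources;
  for (const auto& replica : replica_infos) {
    if (replica.ts == nullptr) {
      // Assumed handling: report the replica as unknown rather than
      // dereferencing the null pointer.
      sources.emplace_back("[unknown]");
      continue;
    }
    auto it = peer_uuid_mapping.find(replica.ts->uuid);
    if (it == peer_uuid_mapping.end()) {
      sources.emplace_back("[unknown]");
      continue;
    }
    // Same construction as the original loop: a one-character string.
    sources.emplace_back(1, it->second);
  }
  return sources;
}
```

Whether ksck should skip such replicas, print a placeholder, or make sure replica.ts is never null in the first place is a design decision for the real fix; the sketch only illustrates the missing check.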

> SEGV running ksck while cluster was starting up
> -----------------------------------------------
>
>                 Key: KUDU-2113
>                 URL: https://issues.apache.org/jira/browse/KUDU-2113
>             Project: Kudu
>          Issue Type: Bug
>          Components: ksck
>    Affects Versions: 1.5.0
>            Reporter: Todd Lipcon
>            Assignee: Will Berkeley
>            Priority: Critical
>
> I just ran ksck against a cluster while it was in the process of starting up 
> and got a SEGV. By the time I hooked up gdb to a debug build it was no longer 
> SEGVing so I don't have a lot of info, but I did catch one stack trace and 
> saw the crash is in VerifyTablet.



