[ https://issues.apache.org/jira/browse/KUDU-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144407#comment-16144407 ]
Mike Percy commented on KUDU-2113:
----------------------------------
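I caught this in gdb. The crash is at ksck.cc:888 in VerifyTablet, where
replica.ts is null when replica.ts->uuid() is called. This is what I'm seeing: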
{code}
(gdb) bt
#0 0x0000000001439e93 in kudu::tools::Ksck::VerifyTablet (this=0x6307220,
tablet=std::shared_ptr (count 3, weak 0) 0x68a1d00, table_num_replicas=3) at
../../src/kudu/tools/ksck.cc:888
#1 0x0000000001436c95 in kudu::tools::Ksck::VerifyTable (this=0x6307220,
table=std::shared_ptr (count 1, weak 0) 0x62ab320, ts=0x7ffdf3b09460) at
../../src/kudu/tools/ksck.cc:600
#2 0x0000000001433d9d in kudu::tools::Ksck::CheckTablesConsistency
(this=0x6307220) at ../../src/kudu/tools/ksck.cc:232
#3 0x0000000001311e96 in kudu::tools::(anonymous namespace)::RunKsck
(context=...) at ../../src/kudu/tools/tool_action_cluster.cc:95
#4 0x00000000013168ad in std::_Function_handler<kudu::Status
(kudu::tools::RunnerContext const&), kudu::Status
(*)(kudu::tools::RunnerContext const&)>::_M_invoke(std::_Any_data const&,
kudu::tools::RunnerContext const&) (__functor=..., __args#0=...) at
/usr/include/c++/5/functional:1857
#5 0x0000000001939f16 in std::function<kudu::Status
(kudu::tools::RunnerContext const&)>::operator()(kudu::tools::RunnerContext
const&) const (this=0x635ad68, __args#0=...) at
/usr/include/c++/5/functional:2267
#6 0x00000000019367f9 in kudu::tools::Action::Run (this=0x635ad00,
chain=std::vector of length 2, capacity 2 = {...},
required_args=std::unordered_map with 1 elements = {...},
variadic_args=std::vector of length 0, capacity 0) at
../../src/kudu/tools/tool_action.cc:257
#7 0x0000000001377a73 in kudu::tools::DispatchCommand (chain=std::vector of
length 2, capacity 2 = {...}, action=0x635ad00, remaining_args=std::deque with
1 elements = {...}) at ../../src/kudu/tools/tool_main.cc:129
#8 0x00000000013785e3 in kudu::tools::RunTool (argc=4, argv=0x7ffdf3b0a058,
show_help=false) at ../../src/kudu/tools/tool_main.cc:201
#9 0x0000000001378d38 in main (argc=4, argv=0x7ffdf3b0a058) at
../../src/kudu/tools/tool_main.cc:261
(gdb) list
883 vector<string> indexes{""};
884 vector<string> committed{"Yes"};
885
886 // Fill out the columns with info from the replicas.
887 for (const auto& replica : replica_infos) {
888 char label = FindOrDie(peer_uuid_mapping, replica.ts->uuid());
889 sources.emplace_back(1, label);
890 if (!replica.consensus_state) {
891 voters.emplace_back("[config not available]");
892 terms.emplace_back("");
(gdb) p peer_uuid_mapping
$1 = std::map with 3 elements = {
["32b570014d3a488bbd3cc44ac560aa96"] = 65 'A',
["c0182dfc628c4763b0e629d964228d47"] = 66 'B',
["fbae0a6394944d078b2a9af85af9d461"] = 67 'C'
}
(gdb) p replica.ts
$2 = (kudu::tools::KsckTabletServer *) 0x0
{code}
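To make the failure mode concrete: frame #0 calls replica.ts->uuid() while replica.ts is 0x0, so the loop crashes before FindOrDie ever consults peer_uuid_mapping. Below is a minimal sketch of one way to guard that loop. It is not Kudu's actual code or the fix for this issue; TabletServer, ReplicaInfo, LabelReplicas, and the "?" placeholder label are hypothetical stand-ins made up for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for KsckTabletServer; only the uuid is modeled.
struct TabletServer {
  std::string uuid;
};

// Hypothetical stand-in for the per-replica info iterated in VerifyTablet.
struct ReplicaInfo {
  const TabletServer* ts;  // May be null, as in the gdb session above.
};

// Returns one label per replica. A replica with no tablet server, or whose
// uuid is missing from the mapping, gets the placeholder "?" instead of
// dereferencing a null pointer or aborting.
std::vector<std::string> LabelReplicas(
    const std::vector<ReplicaInfo>& replicas,
    const std::map<std::string, char>& peer_uuid_mapping) {
  std::vector<std::string> labels;
  for (const auto& replica : replicas) {
    if (replica.ts == nullptr) {
      // Assumption: the tablet server is simply not known yet.
      labels.emplace_back("?");
      continue;
    }
    auto it = peer_uuid_mapping.find(replica.ts->uuid);
    labels.emplace_back(it == peer_uuid_mapping.end()
                            ? std::string("?")
                            : std::string(1, it->second));
  }
  return labels;
}
```

Whether a null ts should be skipped, reported as an error, or prevented earlier when the replica list is built is a design decision this sketch does not try to settle.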
> SEGV running ksck while cluster was starting up
> -----------------------------------------------
>
> Key: KUDU-2113
> URL: https://issues.apache.org/jira/browse/KUDU-2113
> Project: Kudu
> Issue Type: Bug
> Components: ksck
> Affects Versions: 1.5.0
> Reporter: Todd Lipcon
> Assignee: Will Berkeley
> Priority: Critical
>
> I just ran ksck against a cluster while it was in the process of starting up
> and got a SEGV. By the time I hooked up gdb to a debug build it was no longer
> SEGVing so I don't have a lot of info, but I did catch one stack trace and
> saw the crash is in VerifyTablet.