[
https://issues.apache.org/jira/browse/KUDU-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mahesh Reddy resolved KUDU-3532.
--------------------------------
Resolution: Fixed
> Unable to place replicas using range aware logic with multiple locations
> ------------------------------------------------------------------------
>
> Key: KUDU-3532
> URL: https://issues.apache.org/jira/browse/KUDU-3532
> Project: Kudu
> Issue Type: Bug
> Components: master
> Affects Versions: 1.17.0
> Reporter: Mahesh Reddy
> Assignee: Mahesh Reddy
> Priority: Major
> Fix For: 1.18.0
>
>
> When multiple locations exist, a std::length_error can be thrown when
> ReservoirSample() is called within PlacementPolicy::SelectReplica().
> Look at this file for reference:
> https://github.com/apache/kudu/blob/master/src/kudu/master/placement_policy.cc
> The code assumes an incorrect relation between two sets: the tablet servers
> to choose from and the tablet servers not to choose from. This error
> manifests itself as an implicit conversion from unsigned long to int. If
> "choices_size" is negative, the implicit conversion to int will make the
> value larger than the maximum size allowed when reserving a vector, and an
> error will be thrown within ReservoirSample().
> Below is a stack trace from a master crash due to this bug:
> SIGABRT (@0x1da00007b60) received by PID 31584 (TID 0x7fdf9644f700) from PID
> 31584; stack trace: ***
> @ 0xe48496 google::(anonymous namespace)::FailureSignalHandler()
> @ 0x7fdfb9a90630 (unknown)
> @ 0x7fdfb7c95387 __GI_raise
> @ 0x7fdfb7c96a78 __GI_abort
> @ 0x7fdfb85a5a95 __gnu_cxx::__verbose_terminate_handler()
> @ 0x7fdfb85a3a06 (unknown)
> @ 0x7fdfb85a3a33 std::terminate()
> @ 0x7fdfb85a3c53 __cxa_throw
> @ 0x7fdfb85f8a67 std::__throw_length_error()
> @ 0xe01fcf kudu::ReservoirSample<>()
> @ 0xdfce0f kudu::master::PlacementPolicy::SelectReplica()
> @ 0xdff386 kudu::master::PlacementPolicy::PlaceExtraTabletReplica()
> @ 0xd873bf kudu::master::AsyncAddReplicaTask::SendRequest()
> @ 0xd7912c kudu::master::RetryingTSRpcTask::Run()
> @ 0xda5412 kudu::master::CatalogManager::ProcessTabletReport()
> @ 0xdf7018 kudu::master::MasterServiceImpl::TSHeartbeat()
> @ 0x2fea455 kudu::rpc::GeneratedServiceIf::Handle()
> @ 0x2feb44a kudu::rpc::ServicePool::RunThread()
> @ 0x31d2e1e kudu::Thread::SuperviseThread()
> @ 0x7fdfb9a88ea5 start_thread
> @ 0x7fdfb7d5db0d __clone
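The failure mode described in the quoted report can be illustrated in
isolation. The sketch below is not the Kudu code: the function names
(UncheckedChoices, CheckedChoices, ReserveThrowsLengthError) and the exact
types are assumptions made for illustration. It shows one way a size that
"should" be negative can end up as a huge unsigned value, and that asking a
std::vector to reserve that many elements throws std::length_error. An
uncaught exception of that kind ends in std::terminate() and abort(), which is
consistent with the frames in the stack trace.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the size computation: the number of candidates
// left after removing the excluded servers. With unsigned operands, a result
// that would be negative wraps around to a very large value instead.
std::size_t UncheckedChoices(std::size_t candidates, std::size_t excluded) {
  return candidates - excluded;  // wraps when excluded > candidates
}

// Guarded variant (also hypothetical): clamp to zero instead of wrapping.
std::size_t CheckedChoices(std::size_t candidates, std::size_t excluded) {
  return candidates > excluded ? candidates - excluded : 0;
}

// Returns true if reserving n elements throws std::length_error, which
// std::vector::reserve does when n exceeds max_size().
bool ReserveThrowsLengthError(std::size_t n) {
  std::vector<int> v;
  try {
    v.reserve(n);
  } catch (const std::length_error&) {
    return true;
  }
  return false;
}
```

With 2 candidates and 3 excluded servers, UncheckedChoices() wraps and the
reserve throws, while CheckedChoices() yields 0 and the reserve succeeds. The
clamp is only one possible guard; the actual fix in Kudu may differ.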
--
This message was sent by Atlassian Jira
(v8.20.10#820010)