Alexey Serbin created KUDU-3254:
-----------------------------------
Summary: Crash in Kudu C++ client when working with stale scan
tokens containing tablet location info
Key: KUDU-3254
URL: https://issues.apache.org/jira/browse/KUDU-3254
Project: Kudu
Issue Type: Bug
Components: client
Affects Versions: 1.13.0
Reporter: Alexey Serbin
Fix For: 1.14.0
With KUDU-1802 implemented, the meta-cache in the Kudu C++ client might crash
when given a scan token that carries tablet location information, in scenarios
like the one below:
# Scan tokens were generated for a table with multiple ranges (e.g., with two
ranges: [-100, 0), [0, 100)).
# The first range was dropped (e.g., range [-100, 0) is dropped).
# A client was fed the set of tokens generated at step 1 to read from the table
(now with one stale token corresponding to the dropped range).
# The same client instance was used to write into the table.
# The same client instance was fed the original set of tokens once more to read
from the table again.
The client would crash at step 5 of the sequence above.
The stack trace on crash might look like this (captured on macOS):
{noformat}
* frame #0: 0x00007fff7035833a libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff70414e60 libsystem_pthread.dylib`pthread_kill + 430
frame #2: 0x00007fff702df808 libsystem_c.dylib`abort + 120
frame #3: 0x000000010ca1a259 libglog.0.dylib`google::logging_fail() at
logging.cc:1474:3
frame #4: 0x000000010ca19121
libglog.0.dylib`google::LogMessage::SendToLog() [inlined]
google::LogMessage::Fail() at logging.cc:1488:3
frame #5: 0x000000010ca1911b
libglog.0.dylib`google::LogMessage::SendToLog() at logging.cc:1442
frame #6: 0x000000010ca19815
libglog.0.dylib`google::LogMessage::Flush() at logging.cc:1311:5
frame #7: 0x000000010ca1d76f
libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2023:5
frame #8: 0x000000010ca1a5f9
libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at
logging.cc:2022:37
frame #9: 0x0000000103e365e3
libkudu_client.dylib`std::__1::map<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >,
kudu::client::internal::MetaCacheEntry,
std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > >,
std::__1::allocator<std::__1::pair<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const,
kudu::client::internal::MetaCacheEntry> > >::mapped_type&
FindOrDie<std::__1::map<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> >,
kudu::client::internal::MetaCacheEntry,
std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char> > >,
std::__1::allocator<std::__1::pair<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char> > const,
kudu::client::internal::MetaCacheEntry> > > >() at map-util.h:109:3
frame #10: 0x0000000103e34cbb
libkudu_client.dylib`kudu::client::internal::MetaCache::ProcessGetTableLocationsResponse()
at meta_cache.cc:943:23
frame #11: 0x0000000103e86166
libkudu_client.dylib`kudu::client::KuduScanToken::Data::PBIntoScanner() at
scan_token-internal.cc:192:35
frame #12: 0x0000000103e88051
libkudu_client.dylib`kudu::client::KuduScanToken::Data::DeserializeIntoScanner()
at scan_token-internal.cc:111:10
frame #13: 0x0000000103d55d3c
libkudu_client.dylib`kudu::client::KuduScanToken::DeserializeIntoScanner() at
client.cc:1879:10
{noformat}
The issue is fixed in Kudu 1.14 with [this
changelist|https://github.com/apache/kudu/commit/2a558768f8aa00068e72ccd1327081f07ba46b03].