[ https://issues.apache.org/jira/browse/KUDU-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297628#comment-17297628 ]
ASF subversion and git services commented on KUDU-3254:
-------------------------------------------------------
Commit 074e7e9a08d3aa392df068e23c425326161cf38b in kudu's branch
refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=074e7e9 ]
KUDU-3254 fix bug in meta-cache exposed by KUDU-1802
This patch fixes an issue resulting in a SIGABRT crash in the Kudu client
when working with stale scan tokens which contain information about
tablet locations for a table (see KUDU-1802) whose range partition
was dropped. The patch also adds a test scenario reproducing the crash;
the scenario now passes and can catch future regressions.
This patch is a follow-up to d23ee5d38ddc4317f431dd65df0c825c00cc968a.
Before the change in src/kudu/client/meta_cache.cc was back-ported from
Kudu 1.14 as part of this fix, the scenario crashed with SIGABRT,
producing a stack trace similar to the following (captured on macOS):
* frame #0: 0x00007fff7035833a libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x00007fff70414e60 libsystem_pthread.dylib`pthread_kill + 430
  frame #2: 0x00007fff702df808 libsystem_c.dylib`abort + 120
  frame #3: 0x000000010ca1a259 libglog.0.dylib`google::logging_fail() at logging.cc:1474:3
  frame #4: 0x000000010ca19121 libglog.0.dylib`google::LogMessage::SendToLog() [inlined] google::LogMessage::Fail() at logging.cc:1488:3
  frame #5: 0x000000010ca1911b libglog.0.dylib`google::LogMessage::SendToLog() at logging.cc:1442
  frame #6: 0x000000010ca19815 libglog.0.dylib`google::LogMessage::Flush() at logging.cc:1311:5
  frame #7: 0x000000010ca1d76f libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2023:5
  frame #8: 0x000000010ca1a5f9 libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2022:37
  frame #9: 0x0000000103e365e3 libkudu_client.dylib`std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > >::mapped_type& FindOrDie<std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > > >() at map-util.h:109:3
  frame #10: 0x0000000103e34cbb libkudu_client.dylib`kudu::client::internal::MetaCache::ProcessGetTableLocationsResponse() at meta_cache.cc:943:23
  frame #11: 0x0000000103e86166 libkudu_client.dylib`kudu::client::KuduScanToken::Data::PBIntoScanner() at scan_token-internal.cc:192:35
  frame #12: 0x0000000103e88051 libkudu_client.dylib`kudu::client::KuduScanToken::Data::DeserializeIntoScanner() at scan_token-internal.cc:111:10
  frame #13: 0x0000000103d55d3c libkudu_client.dylib`kudu::client::KuduScanToken::DeserializeIntoScanner() at client.cc:1879:10
Change-Id: I5b8370290c13b1e496f461ed5bc2e0193bdf4b19
Reviewed-on: http://gerrit.cloudera.org:8080/17152
Tested-by: Alexey Serbin
Reviewed-by: Andrew Wong
(cherry picked from commit 7c8dca60d15b560017ef7e726a379788727502ba)
Conflicts:
src/kudu/client/meta_cache.cc
src/kudu/client/scan_token-test.cc
Reviewed-on: http://gerrit.cloudera.org:8080/17158
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke
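
Frames #9 and #10 of the trace show FindOrDie() being called from
MetaCache::ProcessGetTableLocationsResponse() and ending in glog's fatal
handler, which is what a lookup of an absent map key looks like. The sketch
below only illustrates that general failure mode and a tolerant alternative;
it is not the code of the actual fix. CacheEntry, Cache, FindOrDieSketch,
FindOrNullSketch and LookupTablet are hypothetical names made up for this
example, and the keys are placeholder strings, not real partition keys.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for a cached tablet-location entry. This is NOT
// Kudu's kudu::client::internal::MetaCacheEntry.
struct CacheEntry {
  std::string tablet_id;
};

// Hypothetical cache keyed by a string, like the std::map in frame #9.
using Cache = std::map<std::string, CacheEntry>;

// Aborting lookup: terminates the process when 'key' is absent. This mimics
// the behavior visible in the trace (fatal log message, then abort()).
const CacheEntry& FindOrDieSketch(const Cache& cache, const std::string& key) {
  auto it = cache.find(key);
  if (it == cache.end()) {
    std::cerr << "key not found: " << key << std::endl;
    std::abort();
  }
  return it->second;
}

// Tolerant lookup: returns nullptr when 'key' is absent, so the caller
// decides how to handle a stale or missing entry.
const CacheEntry* FindOrNullSketch(const Cache& cache, const std::string& key) {
  auto it = cache.find(key);
  return it == cache.end() ? nullptr : &it->second;
}

// Example caller that treats a missing entry as a recoverable condition.
// Returns false and leaves 'tablet_id' untouched when 'key' is absent.
bool LookupTablet(const Cache& cache, const std::string& key,
                  std::string* tablet_id) {
  const CacheEntry* entry = FindOrNullSketch(cache, key);
  if (entry == nullptr) {
    return false;
  }
  *tablet_id = entry->tablet_id;
  return true;
}
```

With a cache that holds an entry only for the remaining range, calling
LookupTablet() with the key of the dropped range returns false, whereas
FindOrDieSketch() with the same key would abort the process.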
> Crash in Kudu C++ client when working with stale scan tokens containing
> tablet location info
> --------------------------------------------------------------------------------------------
>
> Key: KUDU-3254
> URL: https://issues.apache.org/jira/browse/KUDU-3254
> Project: Kudu
> Issue Type: Bug
> Components: client
> Affects Versions: 1.13.0
> Reporter: Alexey Serbin
> Assignee: Alexey Serbin
> Priority: Major
> Fix For: 1.14.0
>
>
> With KUDU-1802 implemented, the meta-cache in the Kudu C++ client might crash
> when using a scan token that carries tablet location information, in
> scenarios like the one below:
> # Scan tokens were generated for a table with multiple ranges (e.g., with two
> ranges: [-100, 0), [0, 100)).
> # The first range was dropped (e.g., range [-100, 0) is dropped).
> # A client was fed the set of tokens generated at step 1 to read from the table
> (now with one stale token corresponding to the dropped range).
> # The same client instance was used to write into the table.
> # The same client instance fed the original set of tokens once more to read
> from the table again.
> The client would crash at step 5 of the sequence above.
> The stack trace on crash might look like this (captured on macOS):
> {noformat}
> * frame #0: 0x00007fff7035833a libsystem_kernel.dylib`__pthread_kill + 10
>   frame #1: 0x00007fff70414e60 libsystem_pthread.dylib`pthread_kill + 430
>   frame #2: 0x00007fff702df808 libsystem_c.dylib`abort + 120
>   frame #3: 0x000000010ca1a259 libglog.0.dylib`google::logging_fail() at logging.cc:1474:3
>   frame #4: 0x000000010ca19121 libglog.0.dylib`google::LogMessage::SendToLog() [inlined] google::LogMessage::Fail() at logging.cc:1488:3
>   frame #5: 0x000000010ca1911b libglog.0.dylib`google::LogMessage::SendToLog() at logging.cc:1442
>   frame #6: 0x000000010ca19815 libglog.0.dylib`google::LogMessage::Flush() at logging.cc:1311:5
>   frame #7: 0x000000010ca1d76f libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2023:5
>   frame #8: 0x000000010ca1a5f9 libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2022:37
>   frame #9: 0x0000000103e365e3 libkudu_client.dylib`std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > >::mapped_type& FindOrDie<std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > > >() at map-util.h:109:3
>   frame #10: 0x0000000103e34cbb libkudu_client.dylib`kudu::client::internal::MetaCache::ProcessGetTableLocationsResponse() at meta_cache.cc:943:23
>   frame #11: 0x0000000103e86166 libkudu_client.dylib`kudu::client::KuduScanToken::Data::PBIntoScanner() at scan_token-internal.cc:192:35
>   frame #12: 0x0000000103e88051 libkudu_client.dylib`kudu::client::KuduScanToken::Data::DeserializeIntoScanner() at scan_token-internal.cc:111:10
>   frame #13: 0x0000000103d55d3c libkudu_client.dylib`kudu::client::KuduScanToken::DeserializeIntoScanner() at client.cc:1879:10
> {noformat}
> The issue is fixed in Kudu 1.14 with [this
> changelist|https://github.com/apache/kudu/commit/2a558768f8aa00068e72ccd1327081f07ba46b03].