[
https://issues.apache.org/jira/browse/IMPALA-13202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863987#comment-17863987
]
Quanlong Huang commented on IMPALA-13202:
-----------------------------------------
Debugging in gdb, I can verify that libkudu_client.so is using its own methods
and flags. There are two variables named FLAGS_rpc_max_message_size:
{code:cpp}
(gdb) info variables FLAGS_rpc_max_message_size$
All variables matching regular expression "FLAGS_rpc_max_message_size$":
File /home/quanlong/workspace/Impala/be/src/kudu/rpc/transfer.cc:
48: google::int64 fLI64::FLAGS_rpc_max_message_size;
File /mnt/source/kudu/kudu-e742f86f6d/src/kudu/rpc/transfer.cc:
46: google::int64 fLI64::FLAGS_rpc_max_message_size;{code}
The second one comes from libkudu_client.so. The Kudu version currently used in
the Impala master branch is e742f86f6d (corresponding to the kudu-1.17 release).
Here is where the flag is used:
{code:cpp}
102 Status InboundTransfer::ReceiveBuffer(Socket* socket, faststring* extra_4) {
...
130 if (PREDICT_FALSE(total_length_ > FLAGS_rpc_max_message_size)) {
131 return Status::NetworkError(Substitute(
132 "RPC frame had a length of $0, but we only support messages up to $1 bytes "
133 "long.", total_length_, FLAGS_rpc_max_message_size));
134 }{code}
[https://github.com/apache/kudu/blob/e742f86f6d8e687dd02d9891f33e068477163016/src/kudu/rpc/transfer.cc#L130]
Add a breakpoint in that source file where the code uses this flag.
{noformat}
(gdb) b /mnt/source/kudu/kudu-e742f86f6d/src/kudu/rpc/transfer.cc:130{noformat}
Continue in gdb and run the query in Impala. When the breakpoint is hit:
{code:cpp}
Thread 276 "rpc reactor-250" hit Breakpoint 1,
kudu::rpc::InboundTransfer::ReceiveBuffer (this=0xd03cfc0, socket=0x14b4ed20,
extra_4=0x7fc72c74e8e0) at
/mnt/source/kudu/kudu-e742f86f6d/src/kudu/rpc/transfer.cc:130
130 if (PREDICT_FALSE(total_length_ > FLAGS_rpc_max_message_size)) {
(gdb) x/i $pc
=> 0x7fc7dd68bf49 <kudu::rpc::InboundTransfer::ReceiveBuffer(kudu::Socket*,
kudu::faststring*)+745>: cmp %r9,%rdx
(gdb) p $rdx
$1 = 53477464
(gdb) p $r9
$2 = 52428800{code}
The assembly code is comparing two registers. Their values match what we see in
the error message. 52428800 is the unmodified default value of
FLAGS_rpc_max_message_size.
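As a quick sanity check of the two register values, here is a minimal Python sketch. It assumes the default limit is expressed as 50 MiB (50 * 1024 * 1024); the variable names are illustrative only and do not come from Kudu or Impala.

```python
# Values read from the registers in the gdb session above.
frame_length = 53477464      # $rdx: total length of the inbound RPC frame
max_message_size = 52428800  # $r9: limit seen by libkudu_client.so

# Assumption: the default limit is 50 MiB.
default_limit = 50 * 1024 * 1024

# How far the frame exceeds the limit, in bytes.
excess = frame_length - max_message_size

print(default_limit == max_message_size)  # True
print(excess)                             # 1048664
```

The frame exceeds the limit by 1048664 bytes, i.e. 1 MiB plus 88 bytes, so it is only slightly over the unmodified default.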
Looking at the assembly, register r9 is loaded from memory address
0x7fc7ddd631d8, which is the hidden variable FLAGS_rpc_max_message_size:
{code:java}
lea 0x6d729e(%rip),%rdx # 0x7fc7ddd631d8
<_ZN5fLI6426FLAGS_rpc_max_message_sizeE>
mov (%rax),%ecx
mov (%rdx),%r9
bswap %ecx
lea 0x4(%rcx),%edi
mov %edi,%edx
mov %edi,0x38(%rbx)
cmp %r9,%rdx {code}
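The instruction sequence above can be read as: load a 32-bit value from the buffer, byte-swap it from network order, add 4, and compare the result with the flag. The following Python sketch mirrors that reading. It is an interpretation of the disassembly, not a copy of the Kudu source; the function names and the 4-byte prefix constant are assumptions made for illustration.

```python
import struct

# Assumption: the "+4" in `lea 0x4(%rcx),%edi` is the size of the length prefix.
LENGTH_PREFIX_BYTES = 4


def total_frame_length(header: bytes) -> int:
    """Decode a 4-byte big-endian length and add the prefix size."""
    # struct ">I" does what `mov (%rax),%ecx` + `bswap %ecx` do on x86-64.
    (payload_length,) = struct.unpack(">I", header[:LENGTH_PREFIX_BYTES])
    return payload_length + LENGTH_PREFIX_BYTES


def exceeds_limit(header: bytes, max_message_size: int) -> bool:
    """Mirror of the `total_length_ > FLAGS_rpc_max_message_size` check."""
    return total_frame_length(header) > max_message_size


# Build a header whose decoded total matches the value seen in $rdx.
header = struct.pack(">I", 53477464 - LENGTH_PREFIX_BYTES)

print(total_frame_length(header))          # 53477464
print(exceeds_limit(header, 52428800))     # True: rejected with the 50MB default
print(exceeds_limit(header, 2147483647))   # False: accepted with the 2GB value
```

With the limit that impalad sets (2147483647) the same frame would pass the check, which is consistent with the library reading a different copy of the flag.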
Printing the variable by name shows the global one used in impalad, but printing
the value at the address used by libkudu_client.so shows 52428800:
{noformat}
(gdb) p FLAGS_rpc_max_message_size
$25 = 2147483647
(gdb) p *((int64_t*)0x7fc7ddd631d8)
$26 = 52428800{noformat}
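The runtime addresses can be cross-checked against the readelf symbol values for libkudu_client.so given in the issue description quoted below. This is a hedged sketch: it assumes a runtime address equals the library load base plus the symbol value, and that the readelf output and the gdb session refer to the same build of the library.

```python
# Runtime addresses from the gdb session.
runtime_flag_addr = 0x7FC7DDD631D8  # target of the lea instruction
runtime_pc = 0x7FC7DD68BF49         # ReceiveBuffer+745 at the breakpoint

# Symbol values reported by readelf for libkudu_client.so.
flag_symbol_value = 0x8D61D8        # _ZN5fLI6426FLAGS_rpc_max_message_sizeE
func_symbol_value = 0x1FEC60        # kudu::rpc::InboundTransfer::ReceiveBuffer

# Assumption: runtime address = load base + symbol value.
load_base = runtime_flag_addr - flag_symbol_value

# Independent check: the same base should reproduce the breakpoint PC.
expected_pc = load_base + func_symbol_value + 745

print(hex(load_base))              # 0x7fc7dd48d000
print(load_base % 4096 == 0)       # True: page-aligned, as a load base should be
print(expected_pc == runtime_pc)   # True
```

Both data points yield the same page-aligned base, which supports the conclusion that the address read by the comparison is the library's own copy of the flag rather than the one in impalad.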
> KRPC flags used by libkudu_client.so can't be configured
> --------------------------------------------------------
>
> Key: IMPALA-13202
> URL: https://issues.apache.org/jira/browse/IMPALA-13202
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Quanlong Huang
> Priority: Critical
> Attachments: data.parquet
>
>
> Impala integrates with KRPC by porting the KRPC code into the Impala code
> base. Flags and methods of KRPC are defined as GLOBAL in the impalad
> executable. libkudu_client.so is also compiled from the same KRPC code and has
> duplicate flags and methods defined as HIDDEN.
> To be specific, both the impalad executable and libkudu_client.so have the
> symbol for kudu::rpc::InboundTransfer::ReceiveBuffer()
> {noformat}
> $ readelf -s --wide be/build/latest/service/impalad | grep ReceiveBuffer
> 11118: 00000000022f5c88 1936 FUNC GLOBAL DEFAULT 13
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE
> 81380: 00000000022f5c88 1936 FUNC GLOBAL DEFAULT 13
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE
> $ readelf -s --wide
> toolchain/toolchain-packages-gcc10.4.0/kudu-e742f86f6d/debug/lib/libkudu_client.so
> | grep ReceiveBuffer
> 1601: 0000000000086e4a 108 FUNC LOCAL DEFAULT 12
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE.cold
> 11905: 00000000001fec60 2076 FUNC LOCAL HIDDEN 12
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE
> $ c++filt
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE
> kudu::rpc::InboundTransfer::ReceiveBuffer(kudu::Socket*, kudu::faststring*)
> {noformat}
> KRPC flags like rpc_max_message_size are also defined in both the impalad
> executable and libkudu_client.so:
> {noformat}
> $ readelf -s --wide be/build/latest/service/impalad | grep
> FLAGS_rpc_max_message_size
> 14380: 0000000006006738 8 OBJECT GLOBAL DEFAULT 30
> _ZN5fLI6426FLAGS_rpc_max_message_sizeE
> 80396: 0000000006006741 1 OBJECT GLOBAL DEFAULT 30
> _ZN3fLB44FLAGS_rpc_max_message_size_enable_validationE
> 81399: 0000000006006741 1 OBJECT GLOBAL DEFAULT 30
> _ZN3fLB44FLAGS_rpc_max_message_size_enable_validationE
> 117873: 0000000006006738 8 OBJECT GLOBAL DEFAULT 30
> _ZN5fLI6426FLAGS_rpc_max_message_sizeE
> $ readelf -s --wide
> toolchain/toolchain-packages-gcc10.4.0/kudu-e742f86f6d/debug/lib/libkudu_client.so
> | grep FLAGS_rpc_max_message_size
> 11882: 00000000008d61e1 1 OBJECT LOCAL HIDDEN 27
> _ZN3fLB44FLAGS_rpc_max_message_size_enable_validationE
> 11906: 00000000008d61d8 8 OBJECT LOCAL DEFAULT 27
> _ZN5fLI6426FLAGS_rpc_max_message_sizeE
> $ c++filt _ZN5fLI6426FLAGS_rpc_max_message_sizeE
> fLI64::FLAGS_rpc_max_message_size {noformat}
> libkudu_client.so uses its own methods and flags. The flags are HIDDEN, so
> they can't be modified by Impala code. For example, IMPALA-4874 bumps
> FLAGS_rpc_max_message_size to 2GB in RpcMgr::Init(), but the HIDDEN variable
> FLAGS_rpc_max_message_size used in libkudu_client.so still has the default
> value of 50MB (52428800). We've seen error messages like this in the master
> branch:
> {code:java}
> I0708 10:23:31.784974 2943 meta_cache.cc:294]
> c243bda4702a5ab9:0ba93d2400000001] tablet 0c8f3446538449ee9d3df5056afe775e:
> replica e0e1db54dab74f208e37ea1b975595e5 (127.0.0.1:31202) has failed:
> Network error: TS failed: RPC frame had a length of 53477464, but we only
> support messages up to 52428800 bytes long.{code}
> CC [~joemcdonnell] [~wzhou] [~aserbin]