[
https://issues.apache.org/jira/browse/IMPALA-13202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865069#comment-17865069
]
Quanlong Huang commented on IMPALA-13202:
-----------------------------------------
[~araina]
{quote}what is the purpose of setting rpc_max_message_size to 2GB here? Are you
trying to change the rpc max message size limit on kudu server or client?
{quote}
I set it because the code in RpcMgr::Init() doesn't affect the flag used in
libkudu_client.so, so I hoped that setting it explicitly would reach every copy
of the flag. Unfortunately, that doesn't work for libkudu_client.so either.
[https://github.com/apache/impala/blob/c53987480726b114e0c3537c71297df2834a4962/be/src/rpc/rpc-mgr.cc#L141-L145]
Impala is a client of Kudu, so this is an attempt to change the flag on the
Kudu client side.
For 2, I forgot to mention that I also added two more flags to the tserver
config file. I've just edited my previous comment to include them:
{code:java}
--rpc_max_message_size=200000000
--tablet_transaction_memory_limit_mb=200{code}
So the limit is hit on the client side.
For 3, the difference between impalad and the kudu CLI is that Impala uses
libkudu_client.so directly and is also compiled from the same KRPC code. The
same flag is defined in the Impala codebase:
[https://github.com/apache/impala/blob/c53987480726b114e0c3537c71297df2834a4962/be/src/kudu/rpc/transfer.cc#L48-L50]
When we set the flag explicitly, or set it in Impala code, only the GLOBAL
variable of the flag is updated. The HIDDEN variable of the flag defined in
libkudu_client.so is not affected and keeps using the default value (52428800).
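As a toy model of the paragraph above (not Impala or Kudu code), the two flag copies can be pictured as two independent variables that happen to share a name. The class and function names are hypothetical; only the default of 52428800 and the 2GB target value come from this thread.

```python
# Toy model: each binary carries its own copy of the flag variable.
DEFAULT_RPC_MAX_MESSAGE_SIZE = 50 * 1024 * 1024  # 52428800


class FlagCopy:
    """One copy of FLAGS_rpc_max_message_size living in one binary."""

    def __init__(self, owner):
        self.owner = owner
        self.value = DEFAULT_RPC_MAX_MESSAGE_SIZE


# Hypothetical stand-ins for the two symbols with the same name.
impalad_flag = FlagCopy("impalad executable (GLOBAL)")
client_flag = FlagCopy("libkudu_client.so (HIDDEN)")


def set_flag_from_impala(value):
    """Model a flag write from Impala: only the GLOBAL copy is reachable."""
    impalad_flag.value = value


set_flag_from_impala(2**31 - 1)  # 2147483647, i.e. the 2GB setting

for flag in (impalad_flag, client_flag):
    print(f"{flag.owner}: {flag.value}")
```

In this model the write lands on the impalad copy only, so the client copy still reports the default.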
In gdb, I can see that the thread reporting the error is launched by the KRPC
code in libkudu_client.so, since the source files come from the kudu repo (i.e.
/mnt/source/kudu/kudu-e742f86f6d):
{noformat}
Thread 297 "rpc reactor-164" hit Breakpoint 1,
kudu::rpc::InboundTransfer::ReceiveBuffer (this=0x14aaca40, socket=0x160d6b50,
extra_4=0x7fe88e0c68e0) at
/mnt/source/kudu/kudu-e742f86f6d/src/kudu/rpc/transfer.cc:130
130 if (PREDICT_FALSE(total_length_ > FLAGS_rpc_max_message_size)) {
(gdb) bt
#0 kudu::rpc::InboundTransfer::ReceiveBuffer (this=0x14aaca40,
socket=0x160d6b50, extra_4=0x7fe88e0c68e0) at
/mnt/source/kudu/kudu-e742f86f6d/src/kudu/rpc/transfer.cc:130
#1 0x00007fe93f9eef95 in kudu::rpc::Connection::ReadHandler (this=0x10496700,
watcher=..., revents=<optimized out>) at
/mnt/build/gcc-10.4.0/include/c++/10.4.0/bits/unique_ptr.h:421
#2 0x00007fe93fcf0fbb in ev_invoke_pending (loop=0x153ec880) at
/mnt/source/kudu/kudu-e742f86f6d/thirdparty/src/libev-4.20/ev.c:3155
#3 0x00007fe93f9cacf8 in kudu::rpc::ReactorThread::InvokePendingCb
(loop=0x153ec880) at
/mnt/source/kudu/kudu-e742f86f6d/src/kudu/rpc/reactor.cc:202
#4 0x00007fe93fcf43b7 in ev_run (flags=0, loop=0x153ec880) at
/mnt/source/kudu/kudu-e742f86f6d/thirdparty/src/libev-4.20/ev.c:3555
#5 ev_run (loop=0x153ec880, flags=0) at
/mnt/source/kudu/kudu-e742f86f6d/thirdparty/src/libev-4.20/ev.c:3402
#6 0x00007fe93f9cbc09 in ev::loop_ref::run (flags=0, this=0x14604ae0) at
/mnt/source/kudu/kudu-e742f86f6d/thirdparty/installed/uninstrumented/include/ev++.h:211
#7 kudu::rpc::ReactorThread::RunThread (this=0x14604ad8) at
/mnt/source/kudu/kudu-e742f86f6d/src/kudu/rpc/reactor.cc:503
#8 0x00007fe93fb6189c in std::function<void ()>::operator()() const
(this=0x13c6a058) at
/mnt/build/gcc-10.4.0/include/c++/10.4.0/bits/std_function.h:617
#9 kudu::Thread::SuperviseThread (arg=0x13c6a000) at
/mnt/source/kudu/kudu-e742f86f6d/src/kudu/util/thread.cc:691
#10 0x00007fe9413956db in start_thread (arg=0x7fe88e0c7700) at
pthread_create.c:463
#11 0x00007fe93e0f961f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95 {noformat}
Printing these two variables gives values that look as expected:
{noformat}
(gdb) p total_length_
$2 = 53477463
(gdb) p FLAGS_rpc_max_message_size
$3 = 2147483647{noformat}
However, examining the actual assembly and the register values shows that the
flag value used in the comparison is actually 52428800:
{noformat}
(gdb) x/i $pc
=> 0x7fe93fa04f49 <kudu::rpc::InboundTransfer::ReceiveBuffer(kudu::Socket*,
kudu::faststring*)+745>: cmp %r9,%rdx
(gdb) p $r9
$4 = 52428800
(gdb) p $rdx
$5 = 53477463{noformat}
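The mismatch can be replayed with the numbers above. This is a minimal sketch of the comparison at transfer.cc:130, written in Python for brevity; the function name is made up for illustration.

```python
def exceeds_limit(total_length, max_message_size):
    """Mirror of the check `total_length_ > FLAGS_rpc_max_message_size`."""
    return total_length > max_message_size


TOTAL_LENGTH = 53477463         # from `p total_length_` and `p $rdx`
GLOBAL_FLAG_VALUE = 2147483647  # from `p FLAGS_rpc_max_message_size`
REGISTER_VALUE = 52428800       # from `p $r9`, the value actually compared

print(exceeds_limit(TOTAL_LENGTH, GLOBAL_FLAG_VALUE))  # False
print(exceeds_limit(TOTAL_LENGTH, REGISTER_VALUE))     # True
print(TOTAL_LENGTH - REGISTER_VALUE)                   # 1048663
```

With the value gdb prints for the symbol, the frame would pass. With the value held in the register, it is rejected for being 1048663 bytes over the limit.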
The root cause is that both the Impala and Kudu code define the same KRPC
flags.
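To spot such duplicates, the readelf rows quoted in the issue description can be classified mechanically. The sketch below is a rough heuristic, not an Impala tool: the helper names are hypothetical, it expects unwrapped `readelf -s --wide` rows, and it treats a symbol as reachable from other binaries only if it is defined, bound GLOBAL or WEAK, and has DEFAULT or PROTECTED visibility.

```python
EXPORTED_BINDINGS = {"GLOBAL", "WEAK"}
EXPORTED_VISIBILITY = {"DEFAULT", "PROTECTED"}


def parse_symbol_line(line):
    """Parse one unwrapped `readelf -s --wide` row into a dict, or None."""
    fields = line.split()
    # Expected columns: Num: Value Size Type Bind Vis Ndx Name
    if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
        return None
    return {
        "type": fields[3],
        "bind": fields[4],
        "vis": fields[5],
        "ndx": fields[6],
        "name": fields[7],
    }


def is_exported(symbol):
    """Heuristic: defined, GLOBAL/WEAK binding, DEFAULT/PROTECTED visibility."""
    return (
        symbol["ndx"] != "UND"
        and symbol["bind"] in EXPORTED_BINDINGS
        and symbol["vis"] in EXPORTED_VISIBILITY
    )


# Rows copied from the issue description, with the line wrapping removed.
IMPALAD_ROW = (
    "14380: 0000000006006738 8 OBJECT GLOBAL DEFAULT 30 "
    "_ZN5fLI6426FLAGS_rpc_max_message_sizeE"
)
CLIENT_ROW = (
    "11906: 00000000008d61d8 8 OBJECT LOCAL DEFAULT 27 "
    "_ZN5fLI6426FLAGS_rpc_max_message_sizeE"
)

for label, row in (("impalad", IMPALAD_ROW), ("libkudu_client.so", CLIENT_ROW)):
    symbol = parse_symbol_line(row)
    print(label, symbol["name"], "exported" if is_exported(symbol) else "private")
```

Both rows carry the same mangled name, but only the impalad copy is classified as exported, which matches the behaviour described above.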
> KRPC flags used by libkudu_client.so can't be configured
> --------------------------------------------------------
>
> Key: IMPALA-13202
> URL: https://issues.apache.org/jira/browse/IMPALA-13202
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Quanlong Huang
> Priority: Critical
> Attachments: data.parquet
>
>
> Impala integrates with KRPC by porting the KRPC code into the Impala code
> base. KRPC flags and methods are defined as GLOBAL in the impalad
> executable. libkudu_client.so is also compiled from the same KRPC code and has
> duplicate flags and methods defined as HIDDEN.
> To be specific, both the impalad executable and libkudu_client.so have the
> symbol for kudu::rpc::InboundTransfer::ReceiveBuffer():
> {noformat}
> $ readelf -s --wide be/build/latest/service/impalad | grep ReceiveBuffer
> 11118: 00000000022f5c88 1936 FUNC GLOBAL DEFAULT 13
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE
> 81380: 00000000022f5c88 1936 FUNC GLOBAL DEFAULT 13
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE
> $ readelf -s --wide
> toolchain/toolchain-packages-gcc10.4.0/kudu-e742f86f6d/debug/lib/libkudu_client.so
> | grep ReceiveBuffer
> 1601: 0000000000086e4a 108 FUNC LOCAL DEFAULT 12
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE.cold
> 11905: 00000000001fec60 2076 FUNC LOCAL HIDDEN 12
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE
> $ c++filt
> _ZN4kudu3rpc15InboundTransfer13ReceiveBufferEPNS_6SocketEPNS_10faststringE
> kudu::rpc::InboundTransfer::ReceiveBuffer(kudu::Socket*, kudu::faststring*)
> {noformat}
> KRPC flags like rpc_max_message_size are also defined in both the impalad
> executable and libkudu_client.so:
> {noformat}
> $ readelf -s --wide be/build/latest/service/impalad | grep
> FLAGS_rpc_max_message_size
> 14380: 0000000006006738 8 OBJECT GLOBAL DEFAULT 30
> _ZN5fLI6426FLAGS_rpc_max_message_sizeE
> 80396: 0000000006006741 1 OBJECT GLOBAL DEFAULT 30
> _ZN3fLB44FLAGS_rpc_max_message_size_enable_validationE
> 81399: 0000000006006741 1 OBJECT GLOBAL DEFAULT 30
> _ZN3fLB44FLAGS_rpc_max_message_size_enable_validationE
> 117873: 0000000006006738 8 OBJECT GLOBAL DEFAULT 30
> _ZN5fLI6426FLAGS_rpc_max_message_sizeE
> $ readelf -s --wide
> toolchain/toolchain-packages-gcc10.4.0/kudu-e742f86f6d/debug/lib/libkudu_client.so
> | grep FLAGS_rpc_max_message_size
> 11882: 00000000008d61e1 1 OBJECT LOCAL HIDDEN 27
> _ZN3fLB44FLAGS_rpc_max_message_size_enable_validationE
> 11906: 00000000008d61d8 8 OBJECT LOCAL DEFAULT 27
> _ZN5fLI6426FLAGS_rpc_max_message_sizeE
> $ c++filt _ZN5fLI6426FLAGS_rpc_max_message_sizeE
> fLI64::FLAGS_rpc_max_message_size {noformat}
> libkudu_client.so uses its own methods and flags. The flags are HIDDEN, so
> Impala code can't modify them. E.g. IMPALA-4874 bumps
> FLAGS_rpc_max_message_size to 2GB in RpcMgr::Init(), but the HIDDEN variable
> FLAGS_rpc_max_message_size used in libkudu_client.so still has the default
> value of 50MB (52428800). We've seen error messages like this in the master
> branch:
> {code:java}
> I0708 10:23:31.784974 2943 meta_cache.cc:294]
> c243bda4702a5ab9:0ba93d2400000001] tablet 0c8f3446538449ee9d3df5056afe775e:
> replica e0e1db54dab74f208e37ea1b975595e5 (127.0.0.1:31202) has failed:
> Network error: TS failed: RPC frame had a length of 53477464, but we only
> support messages up to 52428800 bytes long.{code}
> CC [~joemcdonnell] [~wzhou] [~aserbin]