Alexey Serbin has submitted this change and it was merged. (
http://gerrit.cloudera.org:8080/20270 )
Change subject: KUDU-3461 [client] Avoid impala crash by returning error if
invalid tablet id found
......................................................................
KUDU-3461 [client] Avoid impala crash by returning error if invalid tablet id
found
Kudu C++ clients each maintain their own metacache. So, if one client is
issuing insert ops on a partition and another client issues a DDL op on the
same partition that warrants cache invalidation, the first client's cache
won't get invalidated. In some workflows, this could lead to infinite
recursion in the C++ client code, which can eventually crash the Impala
daemon due to stack overflow.
The same situation can happen with a mix of C++ and Java clients, as long as
there is at least one C++ client involved in the workflow.
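The staleness described above can be illustrated with a small sketch. This is
not Kudu client code: FakeCluster, FakeClient, and the "tablet-N" ids are
hypothetical names invented for the illustration, and the sketch assumes that
a DDL op refreshes only the cache of the client that issued it.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the cluster: maps a range partition to the id of
// the tablet currently backing it.
struct FakeCluster {
  std::map<std::string, std::string> live_tablets;
  int next_id = 1;

  // Dropping and re-adding a partition produces a tablet with a fresh id.
  void RecreatePartition(const std::string& partition) {
    assert(!partition.empty());
    live_tablets[partition] = "tablet-" + std::to_string(next_id++);
  }
};

// Hypothetical client holding its own private metacache.
class FakeClient {
 public:
  explicit FakeClient(FakeCluster* cluster) : cluster_(cluster) {}

  // Returns the cached tablet id, asking the cluster only on a cache miss.
  std::string LookupTablet(const std::string& partition) {
    auto it = cache_.find(partition);
    if (it != cache_.end()) return it->second;
    const std::string id = cluster_->live_tablets.at(partition);
    cache_[partition] = id;
    return id;
  }

  // DDL issued through this client only refreshes this client's own cache
  // (an assumption of this sketch); other clients are not notified.
  void RecreatePartition(const std::string& partition) {
    cluster_->RecreatePartition(partition);
    cache_.erase(partition);
  }

 private:
  FakeCluster* cluster_;
  std::map<std::string, std::string> cache_;
};
```

With two FakeClient instances sharing one FakeCluster, a lookup through the
first client followed by RecreatePartition() through the second leaves the
first client still returning the old tablet id, which no longer exists on the
cluster.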
The short-term fix is to avoid the crash by detecting the invalid tablet id
condition and returning an error from the Kudu C++ client to the Impala daemon.
The following steps reproduce the issue from impala-shell:
+++
1. drop table if exists impala_crash;
2. create table if not exists impala_crash \
( dt string, col string, primary key(dt) ) \
partition by range(dt) ( partition values <= '00000000' ) \
stored as kudu;
3. alter table impala_crash drop if exists range partition value='20230301';
4. alter table impala_crash add if not exists range partition value='20230301';
5. insert into impala_crash values ('20230301','abc');
6. alter table impala_crash drop if exists range partition value='20230301';
7. alter table impala_crash add if not exists range partition value='20230301';
8. insert into impala_crash values ('20230301','abc');
+++
The last statement, i.e. #8, causes impalad (connected to impala-shell) to
crash. With this change, the last statement fails with a
"Status::InvalidArgument()" error instead.
This change also includes a unit test covering both scenarios:
1. Reproduce the infinite recursion case without the fix; expect a crash.
2. Reproduce the infinite recursion case with the fix; expect a
"Status::InvalidArgument()" error instead of a crash due to stack overflow.
Inserting the row again after the last step should succeed as expected,
since the stale cache entry for the tablet has been cleared by then.
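The error-then-retry behavior described above can be sketched roughly as
follows. This is a simplified illustration, not the code in meta_cache.cc:
SketchStatus, ServerView, MetaCacheSketch, and ResolveTabletForWrite are
hypothetical names, and comparing against a server-side map stands in for the
server rejecting an unknown tablet id.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified status type; the real client returns kudu::Status.
struct SketchStatus {
  enum Code { kOk, kInvalidArgument };
  Code code;
  std::string message;
  bool ok() const { return code == kOk; }
};

// Hypothetical server view: partition -> id of the live tablet.
using ServerView = std::map<std::string, std::string>;

// Hypothetical per-client metacache: partition -> cached tablet id.
using MetaCacheSketch = std::map<std::string, std::string>;

// Resolves `partition` to a tablet id for a write. A stale cache entry is
// dropped and reported as an error; the function never looks up again from
// inside itself, so the call depth stays bounded.
SketchStatus ResolveTabletForWrite(const ServerView& server,
                                   MetaCacheSketch* cache,
                                   const std::string& partition,
                                   std::string* tablet_id) {
  assert(cache != nullptr && tablet_id != nullptr);
  auto server_it = server.find(partition);
  if (server_it == server.end()) {
    return {SketchStatus::kInvalidArgument, "no such partition: " + partition};
  }
  auto cached = cache->find(partition);
  if (cached == cache->end()) {
    // Cache miss: fetch the current location and remember it.
    (*cache)[partition] = server_it->second;
    *tablet_id = server_it->second;
    return {SketchStatus::kOk, ""};
  }
  if (cached->second != server_it->second) {
    // Stale entry: clear it and hand the error back to the caller.
    const std::string stale_id = cached->second;
    cache->erase(cached);
    return {SketchStatus::kInvalidArgument, "invalid tablet id: " + stale_id};
  }
  *tablet_id = cached->second;
  return {SketchStatus::kOk, ""};
}
```

In this sketch, the first call against a stale cache returns the
invalid-argument status and clears the entry; a second call by the caller then
misses the cache, picks up the current tablet id, and succeeds.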
Change-Id: Ia09cf6fb1b1d10f1ad13a62b5c863bcd1e3ab26a
Reviewed-on: http://gerrit.cloudera.org:8080/20270
Reviewed-by: Alexey Serbin <[email protected]>
Tested-by: Kudu Jenkins
---
M src/kudu/client/batcher.cc
M src/kudu/client/client-test.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/meta_cache.h
4 files changed, 233 insertions(+), 22 deletions(-)
Approvals:
Alexey Serbin: Looks good to me, approved
Kudu Jenkins: Verified
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia09cf6fb1b1d10f1ad13a62b5c863bcd1e3ab26a
Gerrit-Change-Number: 20270
Gerrit-PatchSet: 15
Gerrit-Owner: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <[email protected]>