[
https://issues.apache.org/jira/browse/KUDU-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin resolved KUDU-2376.
---------------------------------
Fix Version/s: 1.18.0
Resolution: Duplicate
> SIGSEGV while adding and dropping the same range partition and concurrently
> writing
> -----------------------------------------------------------------------------------
>
> Key: KUDU-2376
> URL: https://issues.apache.org/jira/browse/KUDU-2376
> Project: Kudu
> Issue Type: Bug
> Affects Versions: 1.7.0
> Reporter: William Berkeley
> Priority: Major
> Fix For: 1.18.0
>
> Attachments: alter_table-test.patch
>
>
> While adding a test to https://gerrit.cloudera.org/#/c/9393/, I ran into the
> problem that writing while doing a replace tablet operation caused the client
> to segfault. After inspecting the client code, it looked like the same
> problem could occur if the same range partition was added and dropped with
> concurrent writes.
> Attached is a patch that adds a test to alter_table-test that reliably
> reproduces the segmentation fault.
> I don't totally understand what's happening, but here's what I think I have
> figured out:
> Suppose the range partition P=[0, 100) is dropped and re-added in a single
> alter. This causes the tablet X for hash bucket 0 and range partition P to be
> dropped, and a new one Y created for the same partition. There is a batch
> pending to X which the client attempts to send to each of the replicas of X
> in turn. Once the replicas are exhausted, the client attempts to find a new
> leader with MetaCacheServerPicker::PickLeader, which triggers a master lookup
> to get the latest consensus info for X (#5 in the big comment in PickLeader).
> This calls LookupTabletByKey, which attempts a fast-path lookup. Assuming
> other metadata operations have already cached a tablet for Y, the tablet for
> X will have been removed from the by-table-and-by-key map, and the fast-path
> lookup will return an entry for Y. The client code can't tell the
> difference, because the code paths only compare partition boundaries, which
> match for X and Y. The master lookup therefore never happens, and the client
> ends up in a pretty tight loop repeating the above process until the
> segfault.
> I'm not sure exactly what the segmentation fault is. I looked at it a bit in
> gdb and the segfault was a few calls deep into STL maps in release mode and
> inside a refcount increment in debug mode. I'll try to attach some gdb output
> showing that later.
> The problem is also hinted at in a TODO in PickLeader:
> {noformat}
> // TODO: When we support tablet splits, we should let the lookup shift
> // the write to another tablet (i.e. if it's since been split).
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)