[ https://issues.apache.org/jira/browse/KUDU-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412895#comment-16412895 ]

Will Berkeley commented on KUDU-2376:
-------------------------------------

Forgot to mention something very important: the client doing the writing needs 
to be different from the one that does the alter.
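
For anyone trying this outside the attached test, here's a rough sketch of the 
setup using the C++ client. The master address, table name, and column name are 
placeholders, and the writes and the alter are interleaved here only for 
brevity; the real test drives them from separate threads. The important part is 
that writer_client and alter_client are built separately, so it's the writer's 
meta cache that goes stale.

{noformat}
#include <memory>

#include "kudu/client/client.h"
#include "kudu/client/stubs.h"
#include "kudu/common/partial_row.h"

using kudu::KuduPartialRow;
using kudu::client::KuduClient;
using kudu::client::KuduClientBuilder;
using kudu::client::KuduInsert;
using kudu::client::KuduSchema;
using kudu::client::KuduSession;
using kudu::client::KuduTable;
using kudu::client::KuduTableAlterer;
using kudu::client::sp::shared_ptr;

int main() {
  // Two independent clients: one only writes, one only alters.
  shared_ptr<KuduClient> writer_client;
  shared_ptr<KuduClient> alter_client;
  KUDU_CHECK_OK(KuduClientBuilder()
                    .add_master_server_addr("master:7051")  // placeholder
                    .Build(&writer_client));
  KUDU_CHECK_OK(KuduClientBuilder()
                    .add_master_server_addr("master:7051")
                    .Build(&alter_client));

  shared_ptr<KuduTable> table;
  KUDU_CHECK_OK(writer_client->OpenTable("test_table", &table));  // placeholder
  KuduSchema schema = table->schema();

  shared_ptr<KuduSession> session = writer_client->NewSession();
  KUDU_CHECK_OK(session->SetFlushMode(KuduSession::AUTO_FLUSH_BACKGROUND));

  for (int iter = 0; iter < 100; iter++) {
    // Writer: insert rows that land in the range [0, 100), so pending
    // batches target the old tablet.
    for (int32_t key = 0; key < 100; key++) {
      KuduInsert* insert = table->NewInsert();
      KUDU_CHECK_OK(insert->mutable_row()->SetInt32("key", key));
      KUDU_CHECK_OK(session->Apply(insert));
    }

    // Alterer: through its own client, drop and re-add the same range
    // partition [0, 100) in a single alter, replacing the tablets.
    std::unique_ptr<KuduTableAlterer> alterer(
        alter_client->NewTableAlterer("test_table"));
    KuduPartialRow* drop_lower = schema.NewRow();
    KuduPartialRow* drop_upper = schema.NewRow();
    KuduPartialRow* add_lower = schema.NewRow();
    KuduPartialRow* add_upper = schema.NewRow();
    KUDU_CHECK_OK(drop_lower->SetInt32("key", 0));
    KUDU_CHECK_OK(drop_upper->SetInt32("key", 100));
    KUDU_CHECK_OK(add_lower->SetInt32("key", 0));
    KUDU_CHECK_OK(add_upper->SetInt32("key", 100));
    alterer->DropRangePartition(drop_lower, drop_upper);
    alterer->AddRangePartition(add_lower, add_upper);
    KUDU_CHECK_OK(alterer->Alter());

    KUDU_CHECK_OK(session->Flush());
  }
  return 0;
}
{noformat}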

> SIGSEGV while adding and dropping the same range partition and concurrently 
> writing
> -----------------------------------------------------------------------------------
>
>                 Key: KUDU-2376
>                 URL: https://issues.apache.org/jira/browse/KUDU-2376
>             Project: Kudu
>          Issue Type: Bug
>    Affects Versions: 1.7.0
>            Reporter: Will Berkeley
>            Priority: Major
>         Attachments: alter_table-test.patch
>
>
> While adding a test to https://gerrit.cloudera.org/#/c/9393/, I ran into the 
> problem that writing while doing a replace tablet operation caused the client 
> to segfault. After inspecting the client code, it looked like the same 
> problem could occur if the same range partition was added and dropped with 
> concurrent writes.
> Attached is a patch that adds a test to alter_table-test that reliably 
> reproduces the segmentation fault.
> I don't totally understand what's happening, but here's what I think I have 
> figured out:
> Suppose the range partition P=[0, 100) is dropped and re-added in a single 
> alter. This causes the tablet X for hash bucket 0 and range partition P to be 
> dropped, and a new one Y created for the same partition. There is a batch 
> pending to X which the client attempts to send to each of the replicas of X 
> in turn. Once the replicas are exhausted, the client attempts to find a new 
> leader with MetaCacheServerPicker::PickLeader, which triggers a master lookup 
> to get the latest consensus info for X (#5 in the big comment in PickLeader). 
> This calls LookupTabletByKey, which attempts a fast path lookup. Assuming 
> other metadata operations have already cached a tablet for Y, the tablet for 
> X will have been removed from the by-table-and-by-key map, and the fast 
> lookup will return an entry for Y. The client code doesn't know the 
> difference because the code paths just look at partition boundaries, which 
> match for X and Y. The slow lookup to the master therefore never happens, and 
> the client ends up in a tight loop repeating the above process until it 
> segfaults.
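> To make the cache-level confusion concrete, here's a sketch with made-up 
> names (the real types live in the client's meta_cache code) of why the retry 
> path can't tell X and Y apart:
> {noformat}
> // Hypothetical stand-ins for the client's cache entries; names are made up.
> #include <map>
> #include <memory>
> #include <string>
> 
> struct CachedTablet {
>   std::string tablet_id;            // "X" before the alter, "Y" after
>   std::string partition_start_key;  // encoded partition bounds
>   std::string partition_end_key;
> };
> 
> // Fast-path lookup: return the cached entry whose partition covers 'key'.
> // After the alter, Y has replaced X in the by-key map, so a lookup made
> // while retrying a batch still bound to X silently returns Y's entry.
> std::shared_ptr<CachedTablet> FastLookup(
>     const std::map<std::string, std::shared_ptr<CachedTablet>>& by_start_key,
>     const std::string& key) {
>   auto it = by_start_key.upper_bound(key);
>   if (it == by_start_key.begin()) return nullptr;
>   --it;
>   const std::shared_ptr<CachedTablet>& t = it->second;
>   if (t->partition_end_key.empty() || key < t->partition_end_key) return t;
>   return nullptr;
> }
> 
> // The retry path compares only partition bounds, never tablet ids, so X and
> // Y look identical and the master is never asked to point the pending batch
> // at the new tablet.
> bool LooksLikeTheSameTablet(const CachedTablet& batch_target,
>                             const CachedTablet& looked_up) {
>   return batch_target.partition_start_key == looked_up.partition_start_key &&
>          batch_target.partition_end_key == looked_up.partition_end_key;
> }
> {noformat}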
> I'm not sure exactly where the segmentation fault comes from. I looked at it a bit in 
> gdb and the segfault was a few calls deep into STL maps in release mode and 
> inside a refcount increment in debug mode. I'll try to attach some gdb output 
> showing that later.
> The problem is also hinted at in a TODO in PickLeader:
> {noformat}
> // TODO: When we support tablet splits, we should let the lookup shift
> // the write to another tablet (i.e. if it's since been split).
> {noformat}


