Alexey Serbin created KUDU-2100:
-----------------------------------
Summary: Verify Java client's behavior for tserver and master
fail-over scenario
Key: KUDU-2100
URL: https://issues.apache.org/jira/browse/KUDU-2100
Project: Kudu
Issue Type: Test
Reporter: Alexey Serbin
This is to introduce a scenario where both the leader tserver and leader master
'unexpectedly crash' during the run. The idea is to verify that the client
automatically updates its metacache even if the leader master changes and
manages to send the data to the destination server eventually.
Mike suggested the following test scenario:
# Have a configuration with 3 master servers, 6 tablet servers, and a table
consisting of 1 tablet with replication factor of 3. Let's assume the tablet's
replicas are hosted by tablet servers TS1, TS2, and TS3.
# Start the Kudu cluster.
# Run the client to insert at least one row into the table.
# Stop the client's activity, but keep the client object alive to keep it ready
for the next steps.
# 3 times: permanently kill the tablet server hosting the tablet's leader
replica, so the tablet eventually migrates to and is hosted by tablet servers
TS4, TS5, and TS6.
# Kill the leader master (after the configuration change is committed).
# Run the pre-warmed client to insert some data into the table again. Doing
so, the client should refresh its metadata from the new leader master and be
able to send the data to the right destination.
# Count the number of rows in the table to make sure it matches the expectation.
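The sequence above can be walked through with a toy in-memory model. This is
only a sketch to make the expected end state concrete: ToyCluster, ToyClient,
and all of their methods are hypothetical names invented for illustration, not
part of the Kudu Java client or its test utilities, and leader election is
reduced to "the first entry in the list is the leader".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of the scenario. None of these types are Kudu APIs.
final class FailoverScenarioSketch {

    /** Hypothetical cluster: 3 masters, 6 tablet servers, 1 tablet with 3 replicas. */
    static final class ToyCluster {
        final List<String> masters = new ArrayList<>(Arrays.asList("M1", "M2", "M3"));
        final List<String> replicas = new ArrayList<>(Arrays.asList("TS1", "TS2", "TS3"));
        final Deque<String> spares = new ArrayDeque<>(Arrays.asList("TS4", "TS5", "TS6"));
        int rows = 0;

        // Assumption: the first entry of each list plays the leader role.
        String leaderMaster() { return masters.get(0); }
        String tabletLeader() { return replicas.get(0); }

        // Permanently kill the tablet leader; a spare server picks up the replica.
        void killTabletLeader() {
            replicas.remove(0);
            replicas.add(spares.removeFirst());
        }

        // Kill the leader master; the next master in the list takes over.
        void killLeaderMaster() { masters.remove(0); }

        // A write is accepted only when addressed to the current tablet leader.
        boolean write(String tserver) {
            if (!tserver.equals(tabletLeader())) {
                return false;
            }
            rows++;
            return true;
        }
    }

    /** Hypothetical client that caches the leader master and the tablet leader. */
    static final class ToyClient {
        final ToyCluster cluster;
        String cachedMaster;
        String cachedTabletLeader;
        int refreshes = 0;

        ToyClient(ToyCluster cluster) {
            this.cluster = cluster;
            refresh(); // initial lookup warms the cache
        }

        void refresh() {
            cachedMaster = cluster.leaderMaster();
            cachedTabletLeader = cluster.tabletLeader();
            refreshes++;
        }

        // On a rejected write, drop the stale location and look it up again.
        void insert() {
            while (!cluster.write(cachedTabletLeader)) {
                refresh();
            }
        }
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster();
        ToyClient client = new ToyClient(cluster);
        client.insert();                       // insert at least one row
        for (int i = 0; i < 3; i++) {
            cluster.killTabletLeader();        // tablet migrates to TS4, TS5, TS6
        }
        cluster.killLeaderMaster();            // leader master fails over
        client.insert();                       // pre-warmed client writes again
        client.insert();
        System.out.println("rows=" + cluster.rows + " replicas=" + cluster.replicas
            + " master=" + client.cachedMaster + " refreshes=" + client.refreshes);
    }
}
```

In the real test the kills and the row count would go through the actual
cluster and client, and the interesting part is that the refresh has to be
served by the newly elected leader master rather than by a simple list lookup.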
There was a discussion on when to kill the leader master: before or after
moving the tablet to the new set of tablet servers. It seems the latter case
(the sequence suggested above) allows covering a situation where no master
server recognizes itself as the leader. The client should retry in that case as
well and eventually receive the tablet location info from the established
leader master. If possible, the former case should be covered by the test as
well.
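The "no master recognizes itself as a leader" window boils down to a bounded
retry loop around the location lookup. The sketch below is a generic
illustration of that behavior under stated assumptions, not the Kudu client's
actual retry logic: retryUntilPresent, its parameters, and the 1-second
backoff cap are all hypothetical, and the election is simulated by a lookup
that returns nothing for the first two calls.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Generic retry-with-backoff sketch. Not the Kudu client's implementation.
final class LeaderLookupRetrySketch {

    /**
     * Calls lookup until it yields a value or maxAttempts is exhausted,
     * doubling the sleep between attempts (capped at 1000 ms, an arbitrary
     * choice for this sketch).
     */
    static <T> T retryUntilPresent(Supplier<Optional<T>> lookup,
                                   int maxAttempts,
                                   long initialBackoffMs) {
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = lookup.get();
            if (result.isPresent()) {
                return result.get();
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for a leader", e);
                }
                backoffMs = Math.min(backoffMs * 2, 1000L);
            }
        }
        throw new IllegalStateException("no leader master after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Simulated election: the first two lookups find no leader master.
        int[] calls = {0};
        Supplier<Optional<String>> lookup =
            () -> calls[0]++ < 2 ? Optional.<String>empty() : Optional.of("M2");
        String leader = retryUntilPresent(lookup, 5, 1L);
        System.out.println(leader + " found after " + calls[0] + " lookups");
    }
}
```

A test for the scenario would then assert that the insert succeeds within the
client's operation timeout, rather than failing on the first lookup that
finds no leader.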