[kudu-CR] KUDU-2233 Add a test case for compactions in the past
David Ribeiro Alves has removed a vote on this change. Change subject: KUDU-2233 Add a test case for compactions in the past .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/8885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c Gerrit-Change-Number: 8885 Gerrit-PatchSet: 12 Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2233 Add a test case for compactions in the past
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/8885 ) Change subject: KUDU-2233 Add a test case for compactions in the past .. Patch Set 12: Verified+1 Overriding, flake -- To view, visit http://gerrit.cloudera.org:8080/8885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c Gerrit-Change-Number: 8885 Gerrit-PatchSet: 12 Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 09 Feb 2018 23:29:57 + Gerrit-HasComments: No
[kudu-CR] KUDU-2233 Add a test case for compactions in the past
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8885 ) Change subject: KUDU-2233 Add a test case for compactions in the past .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc File src/kudu/integration-tests/fuzz-itest.cc: http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc@287 PS7, Line 287: it's nit: its http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc@299 PS7, Line 299: DCHECK(last_op); I think CHECK is probably fine here since it's not a perf issue and would make a crash easier to follow http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc@302 PS7, Line 302: while (attempts < max_attempts) { can you use ASSERT_EVENTUALLY for this loop? http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc@1083 PS7, Line 1083: TEST_F(FuzzTest, DISABLED_TestNoCompactionsInThePast) { can you add the JIRA number to a comment here? -- To view, visit http://gerrit.cloudera.org:8080/8885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c Gerrit-Change-Number: 8885 Gerrit-PatchSet: 7 Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 09 Jan 2018 20:36:06 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2233 Add a test case for compactions in the past
Hello Tidy Bot, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8885 to look at the new patch set (#7). Change subject: KUDU-2233 Add a test case for compactions in the past .. KUDU-2233 Add a test case for compactions in the past In some cases we're performing compactions before the clock has been updated at all, making the compaction snapshot an empty one. This makes it so the compaction runs on a snapshot that discards all redo deltas and applies all undos. When a compaction with such a snapshot finds two duplicate rows (a ghost and live row) it can't distinguish the two because the redo DELETE of the ghost row has been discarded, causing the crash reported in KUDU-2233. This is a small subset of a broader set of problems that happen when compactions are performed under such snapshots, the most obvious one being that updates are discarded silently. This state is hard to reach as it requires a restart of a tablet server with a considerable amount of data (i.e. not empty) with a single wal segment (previous segments having been gc'd in the previous incarnation), with a single op in it. In particular this single op needs to be a NO_OP or a CHANGE_CONFIG op, as these are the only kinds of operations whose timestamps aren't used to update the clock. In order to get to this and similar states in fuzz-itest this patch adds three new types of operations: - Roll log - Rolls the log if possible. - GC log - Garbage collects log segments, if possible. - Run election - Runs an (emulated) leader election increasing the term and adding a NO_OP to the log. These operations are presently not part of the operations considered when generating random test sequences as those might trigger the bug. This patch adds a narrow, exact repro of the issue that is described in KUDU-2233 in the form of an additional test that is presently disabled. The repro is narrow because it attempts to mimic the crash from the original ticket exactly, instead of trying to catch the broader set of problems. Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c --- M src/kudu/consensus/log.cc M src/kudu/integration-tests/fuzz-itest.cc 2 files changed, 93 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/7 -- To view, visit http://gerrit.cloudera.org:8080/8885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c Gerrit-Change-Number: 8885 Gerrit-PatchSet: 7 Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2233 Add a test case for compactions in the past
Hello Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8885 to look at the new patch set (#6). Change subject: KUDU-2233 Add a test case for compactions in the past .. KUDU-2233 Add a test case for compactions in the past In some cases we're performing compactions before the clock has been updated at all, making the compaction snapshot an empty one. This makes it so the compaction runs on a snapshot that discards all redo deltas and applies all undos. When a compaction with such a snapshot finds two duplicate rows (a ghost and live row) it can't distinguish the two because the redo DELETE of the ghost row has been discarded, causing the crash reported in KUDU-2233. This is a small subset of a broader set of problems that happen when compactions are performed under such snapshots, the most obvious one being that updates are discarded silently. This state is hard to reach as it requires a restart of a tablet server with a considerable amount of data (i.e. not empty) with a single wal segment (previous segments having been gc'd in the previous incarnation), with a single op in it. In particular this single op needs to be a NO_OP or a CHANGE_CONFIG op, as these are the only kinds of operations whose timestamps aren't used to update the clock. In order to get to this and similar states in fuzz-itest this patch adds three new types of operations: - Roll log - Rolls the log if possible. - GC log - Garbage collects log segments, if possible. - Run election - Runs an (emulated) leader election increasing the term and adding a NO_OP to the log. These operations are presently not part of the operations considered when generating random test sequences as those might trigger the bug. This patch adds a narrow, exact repro of the issue that is described in KUDU-2233 in the form of an additional test that is presently disabled. The repro is narrow because it attempts to mimic the crash from the original ticket exactly, instead of trying to catch the broader set of problems. Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c --- M src/kudu/consensus/log.cc M src/kudu/integration-tests/fuzz-itest.cc 2 files changed, 93 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/6 -- To view, visit http://gerrit.cloudera.org:8080/8885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c Gerrit-Change-Number: 8885 Gerrit-PatchSet: 6 Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2233 Add a test case for compactions in the past
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8885 to look at the new patch set (#4). Change subject: KUDU-2233 Add a test case for compactions in the past .. KUDU-2233 Add a test case for compactions in the past In some cases we're performing compactions before the clock has been updated at all, making the compaction snapshot an empty one. This makes it so the compaction runs on a snapshot that discards all redo deltas and applies all undos. When a compaction with such a snapshot finds two duplicate rows (a ghost and live row) it can't distinguish the two because the redo DELETE of the ghost row has been discarded, causing the crash reported in KUDU-2233. This is a small subset of a broader set of problems that happen when compactions are performed under such snapshots, the most obvious one being that updates are discarded silently. This state is hard to reach as it requires a restart of a tablet server with a considerable amount of data (i.e. not empty) with a single wal segment (previous segments having been gc'd in the previous incarnation), with a single op in it. In particular this single op needs to be a NO_OP or a CHANGE_CONFIG op, as these are the only kinds of operations whose timestamps aren't used to update the clock. In order to get to this and similar states in fuzz-itest this patch adds three new types of operations: - Roll log - Rolls the log if possible. - GC log - Garbage collects log segments, if possible. - Run election - Runs an (emulated) leader election increasing the term and adding a NO_OP to the log. These operations are presently not part of the operations considered when generating random test sequences as those might trigger the bug. This patch adds a narrow, exact repro of the issue that is described in KUDU-2233 in the form of an additional test that is presently disabled. The repro is narrow because it attempts to mimic the crash from the original ticket exactly, instead of trying to catch the broader set of problems. Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c --- M src/kudu/consensus/log.cc M src/kudu/integration-tests/fuzz-itest.cc 2 files changed, 89 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/4 -- To view, visit http://gerrit.cloudera.org:8080/8885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c Gerrit-Change-Number: 8885 Gerrit-PatchSet: 4 Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-2233 Add a test case for compactions in the past
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8885 to look at the new patch set (#3). Change subject: KUDU-2233 Add a test case for compactions in the past .. KUDU-2233 Add a test case for compactions in the past In some cases we're performing compactions before the clock has been updated at all, making the compaction snapshot an empty one. This makes it so the compaction runs on a snapshot that discards all redo deltas and applies all undos. When a compaction with such a snapshot finds two duplicate rows (a ghost and live row) it can't distinguish the two because the redo DELETE of the ghost row has been discarded, causing the crash reported in KUDU-2233. This is a small subset of a broader set of problems that happen when compactions are performed under such snapshots, the most obvious one being that updates are discarded silently. This state is hard to reach as it requires a restart of a tablet server with a considerable amount of data (i.e. not empty) with a single wal segment (previous segments having been gc'd in the previous incarnation), with a single op in it. In particular this single op needs to be a NO_OP or a CHANGE_CONFIG op, as these are the only kinds of operations whose timestamps aren't used to update the clock. This patch adds a narrow, exact repro of the issue that is described in KUDU-2233. The repro is narrow because it attempts to mimic the crash from the original ticket exactly, instead of trying to catch the broader set of problems. Follow up patches will add broader checks and a fix. Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c --- M src/kudu/integration-tests/fuzz-itest.cc 1 file changed, 19 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/3 -- To view, visit http://gerrit.cloudera.org:8080/8885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c Gerrit-Change-Number: 8885 Gerrit-PatchSet: 3 Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-2233 Add a test case for compactions in the past
David Ribeiro Alves has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8885 Change subject: KUDU-2233 Add a test case for compactions in the past .. KUDU-2233 Add a test case for compactions in the past This patch adds a "hard" repro of the issue that is described in KUDU-2233. The repro is "hard" as the reason why the bug happens is hard to track from this patch alone, as the bug manifests as particular problem merging duplicate rows and not as the more general problem of having compactions happen in the past. Follow up patches will add the debugging niceties that make the problem more apparent. Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c --- M src/kudu/integration-tests/fuzz-itest.cc 1 file changed, 19 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/1 -- To view, visit http://gerrit.cloudera.org:8080/8885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c Gerrit-Change-Number: 8885 Gerrit-PatchSet: 1 Gerrit-Owner: David Ribeiro Alves