[kudu-CR] KUDU-2233 Add a test case for compactions in the past

2018-02-09 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has removed a vote on this change.

Change subject: KUDU-2233 Add a test case for compactions in the past
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/8885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
Gerrit-Change-Number: 8885
Gerrit-PatchSet: 12
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2233 Add a test case for compactions in the past

2018-02-09 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8885 )

Change subject: KUDU-2233 Add a test case for compactions in the past
..


Patch Set 12: Verified+1

Overriding, flake


--
To view, visit http://gerrit.cloudera.org:8080/8885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
Gerrit-Change-Number: 8885
Gerrit-PatchSet: 12
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 09 Feb 2018 23:29:57 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2233 Add a test case for compactions in the past

2018-01-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8885 )

Change subject: KUDU-2233 Add a test case for compactions in the past
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc
File src/kudu/integration-tests/fuzz-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc@287
PS7, Line 287:  it's
nit: its


http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc@299
PS7, Line 299: DCHECK(last_op);
I think CHECK is probably fine here since it's not a perf issue and would make 
a crash easier to follow


http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc@302
PS7, Line 302: while (attempts < max_attempts) {
can you use ASSERT_EVENTUALLY for this loop?


http://gerrit.cloudera.org:8080/#/c/8885/7/src/kudu/integration-tests/fuzz-itest.cc@1083
PS7, Line 1083: TEST_F(FuzzTest, DISABLED_TestNoCompactionsInThePast) {
can you add the JIRA number to a comment here?



--
To view, visit http://gerrit.cloudera.org:8080/8885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
Gerrit-Change-Number: 8885
Gerrit-PatchSet: 7
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 09 Jan 2018 20:36:06 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2233 Add a test case for compactions in the past

2018-01-09 Thread David Ribeiro Alves (Code Review)
Hello Tidy Bot, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8885

to look at the new patch set (#7).

Change subject: KUDU-2233 Add a test case for compactions in the past
..

KUDU-2233 Add a test case for compactions in the past

In some cases we're performing compactions before the clock
has been updated at all, making the compaction snapshot an empty one.
This makes it so the compaction runs on a snapshot that discards
all redo deltas and applies all undos.

When a compaction with such a snapshot finds two duplicate rows
(a ghost and live row) it can't distinguish the two because the redo
DELETE of the ghost row has been discarded, causing the crash
reported in KUDU-2233.

This is a small subset of a broader set of problems that happen
when compactions are performed under such snapshots, the most
obvious one being that updates are discarded silently.

This state is hard to reach as it requires a restart of a tablet
server with a considerable amount of data (i.e. not empty) with
a single wal segment (previous segments having been gc'd in the
previous incarnation), with a single op in it. In particular this
single op needs to be a NO_OP or a CHANGE_CONFIG op, as these are
the only kinds of operations whose timestamps aren't used to
update the clock.

In order to get to this and similar states in fuzz-itest this
patch adds three new types of operations:
- Roll log - Rolls the log if possible.
- GC log - Garbage collects log segments, if possible.
- Run election - Runs an (emulated) leader election increasing the
term and adding a NO_OP to the log.

These operations are presently not part of the operations
considered when generating random test sequences as those might
trigger the bug.

This patch adds a narrow, exact repro of the issue that is described
in KUDU-2233 in the form of an additional test that is presently
disabled. The repro is narrow because it attempts to mimic
the crash from the original ticket exactly, instead of trying to
catch the broader set of problems.

Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
---
M src/kudu/consensus/log.cc
M src/kudu/integration-tests/fuzz-itest.cc
2 files changed, 93 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/7
--
To view, visit http://gerrit.cloudera.org:8080/8885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
Gerrit-Change-Number: 8885
Gerrit-PatchSet: 7
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2233 Add a test case for compactions in the past

2018-01-08 Thread David Ribeiro Alves (Code Review)
Hello Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8885

to look at the new patch set (#6).

Change subject: KUDU-2233 Add a test case for compactions in the past
..

KUDU-2233 Add a test case for compactions in the past

In some cases we're performing compactions before the clock
has been updated at all, making the compaction snapshot an empty one.
This makes it so the compaction runs on a snapshot that discards
all redo deltas and applies all undos.

When a compaction with such a snapshot finds two duplicate rows
(a ghost and live row) it can't distinguish the two because the redo
DELETE of the ghost row has been discarded, causing the crash
reported in KUDU-2233.

This is a small subset of a broader set of problems that happen
when compactions are performed under such snapshots, the most
obvious one being that updates are discarded silently.

This state is hard to reach as it requires a restart of a tablet
server with a considerable amount of data (i.e. not empty) with
a single wal segment (previous segments having been gc'd in the
previous incarnation), with a single op in it. In particular this
single op needs to be a NO_OP or a CHANGE_CONFIG op, as these are
the only kinds of operations whose timestamps aren't used to
update the clock.

In order to get to this and similar states in fuzz-itest this
patch adds three new types of operations:
- Roll log - Rolls the log if possible.
- GC log - Garbage collects log segments, if possible.
- Run election - Runs an (emulated) leader election increasing the
term and adding a NO_OP to the log.

These operations are presently not part of the operations
considered when generating random test sequences as those might
trigger the bug.

This patch adds a narrow, exact repro of the issue that is described
in KUDU-2233 in the form of an additional test that is presently
disabled. The repro is narrow because it attempts to mimic
the crash from the original ticket exactly, instead of trying to
catch the broader set of problems.

Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
---
M src/kudu/consensus/log.cc
M src/kudu/integration-tests/fuzz-itest.cc
2 files changed, 93 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/6
--
To view, visit http://gerrit.cloudera.org:8080/8885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
Gerrit-Change-Number: 8885
Gerrit-PatchSet: 6
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2233 Add a test case for compactions in the past

2018-01-08 Thread David Ribeiro Alves (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8885

to look at the new patch set (#4).

Change subject: KUDU-2233 Add a test case for compactions in the past
..

KUDU-2233 Add a test case for compactions in the past

In some cases we're performing compactions before the clock
has been updated at all, making the compaction snapshot an empty one.
This makes it so the compaction runs on a snapshot that discards
all redo deltas and applies all undos.

When a compaction with such a snapshot finds two duplicate rows
(a ghost and live row) it can't distinguish the two because the redo
DELETE of the ghost row has been discarded, causing the crash
reported in KUDU-2233.

This is a small subset of a broader set of problems that happen
when compactions are performed under such snapshots, the most
obvious one being that updates are discarded silently.

This state is hard to reach as it requires a restart of a tablet
server with a considerable amount of data (i.e. not empty) with
a single wal segment (previous segments having been gc'd in the
previous incarnation), with a single op in it. In particular this
single op needs to be a NO_OP or a CHANGE_CONFIG op, as these are
the only kinds of operations whose timestamps aren't used to
update the clock.

In order to get to this and similar states in fuzz-itest this
patch adds three new types of operations:
- Roll log - Rolls the log if possible.
- GC log - Garbage collects log segments, if possible.
- Run election - Runs an (emulated) leader election increasing the
term and adding a NO_OP to the log.

These operations are presently not part of the operations
considered when generating random test sequences as those might
trigger the bug.

This patch adds a narrow, exact repro of the issue that is described
in KUDU-2233 in the form of an additional test that is presently
disabled. The repro is narrow because it attempts to mimic
the crash from the original ticket exactly, instead of trying to
catch the broader set of problems.

Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
---
M src/kudu/consensus/log.cc
M src/kudu/integration-tests/fuzz-itest.cc
2 files changed, 89 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/4
--
To view, visit http://gerrit.cloudera.org:8080/8885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
Gerrit-Change-Number: 8885
Gerrit-PatchSet: 4
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-2233 Add a test case for compactions in the past

2018-01-08 Thread David Ribeiro Alves (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8885

to look at the new patch set (#3).

Change subject: KUDU-2233 Add a test case for compactions in the past
..

KUDU-2233 Add a test case for compactions in the past

In some cases we're performing compactions before the clock
has been updated at all, making the compaction snapshot an empty one.
This makes it so the compaction runs on a snapshot that discards
all redo deltas and applies all undos.

When a compaction with such a snapshot finds two duplicate rows
(a ghost and live row) it can't distinguish the two because the redo
DELETE of the ghost row has been discarded, causing the crash
reported in KUDU-2233.

This is a small subset of a broader set of problems that happen
when compactions are performed under such snapshots, the most
obvious one being that updates are discarded silently.

This state is hard to reach as it requires a restart of a tablet
server with a considerable amount of data (i.e. not empty) with
a single wal segment (previous segments having been gc'd in the
previous incarnation), with a single op in it. In particular this
single op needs to be a NO_OP or a CHANGE_CONFIG op, as these are
the only kinds of operations whose timestamps aren't used to
update the clock.

This patch adds a narrow, exact repro of the issue that is described
in KUDU-2233. The repro is narrow because it attempts to mimic
the crash from the original ticket exactly, instead of trying to
catch the broader set of problems.

Follow up patches will add broader checks and a fix.

Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
---
M src/kudu/integration-tests/fuzz-itest.cc
1 file changed, 19 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/3
--
To view, visit http://gerrit.cloudera.org:8080/8885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
Gerrit-Change-Number: 8885
Gerrit-PatchSet: 3
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-2233 Add a test case for compactions in the past

2017-12-19 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8885


Change subject: KUDU-2233 Add a test case for compactions in the past
..

KUDU-2233 Add a test case for compactions in the past

This patch adds a "hard" repro of the issue that is described
in KUDU-2233. The repro is "hard" as the reason why the bug
happens is hard to track from this patch alone, as the bug manifests
as particular problem merging duplicate rows and not as the more general
problem of having compactions happen in the past.

Follow up patches will add the debugging niceties that make the
problem more apparent.

Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
---
M src/kudu/integration-tests/fuzz-itest.cc
1 file changed, 19 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/8885/1
--
To view, visit http://gerrit.cloudera.org:8080/8885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibf5685897ef7580dd743faac4d68690a25663e4c
Gerrit-Change-Number: 8885
Gerrit-PatchSet: 1
Gerrit-Owner: David Ribeiro Alves