[kudu-CR] Fix flakiness in tablet replacement-itest

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has submitted this change and it was merged.

Change subject: Fix flakiness in tablet_replacement-itest
..


Fix flakiness in tablet_replacement-itest

This test was flaky because we were trying to perform a config change
when it was possible that the leader had not yet committed an operation
in its term.

Without the fix, I looped it 100 times with --stress_cpu_threads=8 and
it failed twice due to this issue. With the fix, I looped it 400 times
with the stress threads and only saw one failure, which seemed like a
straight timeout due to the stress threads monopolizing the CPU.

Change-Id: I15b361179f5e20f848f5b346e202b9217a35b61d
Reviewed-on: http://gerrit.cloudera.org:8080/4342
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy 
---
M src/kudu/integration-tests/tablet_replacement-itest.cc
1 file changed, 3 insertions(+), 0 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4342
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I15b361179f5e20f848f5b346e202b9217a35b61d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] Fix flakiness in tablet replacement-itest

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: Fix flakiness in tablet_replacement-itest
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4342
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15b361179f5e20f848f5b346e202b9217a35b61d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] rpc: print nicer errors when mixing up HTTP and KRPC

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has submitted this change and it was merged.

Change subject: rpc: print nicer errors when mixing up HTTP and KRPC
..


rpc: print nicer errors when mixing up HTTP and KRPC

Users often accidentally try to navigate to a Kudu server's RPC port in
their browser, or try to send an RPC to the HTTP port. This patch
improves the error messages that are reported as follows:

* Clients trying to speak RPC to the HTTP port now see an error like:

F0908 16:08:15.823751 21988 kudu-admin.cc:168] Check failed: _s.ok() Bad
  status: IO error: Could not locate the leader master: Client connection
  negotiation failed: client connection to 127.0.0.1:8051: received
  invalid RPC message which appears to be an HTTP response. Verify that
  you have specified a valid RPC port and not an HTTP port.

instead of

F0908 16:09:42.489558 22016 kudu-admin.cc:168] Check failed: _s.ok() Bad
  status: IO error: Could not locate the leader master: Client connection
  negotiation failed: client connection to 127.0.0.1:8051: Received
  invalid message of size 1213486160 which exceeds the
  rpc_max_message_size of 52428800 bytes

* Servers which receive an HTTP request to their RPC port now report an
  error like:

0908 16:06:22.975401 (+68us) negotiation.cc:234] Negotiation
  complete: Invalid argument: Server connection negotiation failed: server
  connection from 127.0.0.1:56866: invalid negotation, appears to be an
  HTTP client on the RPC port

instead of:

0908 16:04:47.236063 (+88us) negotiation.cc:234] Negotiation
  complete: Invalid argument: Server connection negotiation failed: server
  connection from 127.0.0.1:56832: Connection must begin with magic
  number: hrpc

Change-Id: I77ce8c8903ddf18592ba634a90d7efc4083d572a
Reviewed-on: http://gerrit.cloudera.org:8080/4338
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy 
---
M src/kudu/rpc/blocking_ops.cc
M src/kudu/rpc/serialization.cc
2 files changed, 18 insertions(+), 2 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4338
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I77ce8c8903ddf18592ba634a90d7efc4083d572a
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] ksck: add a note about spurious checksum mismatches

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has submitted this change and it was merged.

Change subject: ksck: add a note about spurious checksum mismatches
..


ksck: add a note about spurious checksum mismatches

Safe-time isn't properly respected on the server side, which means that
running ksck in 'checksum' mode against an active table generates
spurious checksum mismatches. This just adds a note to that effect so
that users aren't confused.

Change-Id: I85babafb8d264be88dc71c9c878b53dae9c73901
Reviewed-on: http://gerrit.cloudera.org:8080/4341
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy 
---
M src/kudu/tools/ksck.cc
1 file changed, 5 insertions(+), 1 deletion(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I85babafb8d264be88dc71c9c878b53dae9c73901
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] ksck: add a note about spurious checksum mismatches

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: ksck: add a note about spurious checksum mismatches
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I85babafb8d264be88dc71c9c878b53dae9c73901
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] ksck: add a note about spurious checksum mismatches

2016-09-08 Thread Todd Lipcon (Code Review)
Hello Mike Percy,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/4341

to review the following change.

Change subject: ksck: add a note about spurious checksum mismatches
..

ksck: add a note about spurious checksum mismatches

Safe-time isn't properly respected on the server side, which means that
running ksck in 'checksum' mode against an active table generates
spurious checksum mismatches. This just adds a note to that effect so
that users aren't confused.

Change-Id: I85babafb8d264be88dc71c9c878b53dae9c73901
---
M src/kudu/tools/ksck.cc
1 file changed, 5 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/4341/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I85babafb8d264be88dc71c9c878b53dae9c73901
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 


[kudu-CR] Fix flakiness in tablet replacement-itest

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Fix flakiness in tablet_replacement-itest
..


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/3298/

-- 
To view, visit http://gerrit.cloudera.org:8080/4342
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15b361179f5e20f848f5b346e202b9217a35b61d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] rpc: print nicer errors when mixing up HTTP and KRPC

2016-09-08 Thread Todd Lipcon (Code Review)
Hello Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4338

to look at the new patch set (#2).

Change subject: rpc: print nicer errors when mixing up HTTP and KRPC
..

rpc: print nicer errors when mixing up HTTP and KRPC

Users often accidentally try to navigate to a Kudu server's RPC port in
their browser, or try to send an RPC to the HTTP port. This patch
improves the error messages that are reported as follows:

* Clients trying to speak RPC to the HTTP port now see an error like:

F0908 16:08:15.823751 21988 kudu-admin.cc:168] Check failed: _s.ok() Bad
  status: IO error: Could not locate the leader master: Client connection
  negotiation failed: client connection to 127.0.0.1:8051: received
  invalid RPC message which appears to be an HTTP response. Verify that
  you have specified a valid RPC port and not an HTTP port.

instead of

F0908 16:09:42.489558 22016 kudu-admin.cc:168] Check failed: _s.ok() Bad
  status: IO error: Could not locate the leader master: Client connection
  negotiation failed: client connection to 127.0.0.1:8051: Received
  invalid message of size 1213486160 which exceeds the
  rpc_max_message_size of 52428800 bytes

* Servers which receive an HTTP request to their RPC port now report an
  error like:

0908 16:06:22.975401 (+68us) negotiation.cc:234] Negotiation
  complete: Invalid argument: Server connection negotiation failed: server
  connection from 127.0.0.1:56866: invalid negotation, appears to be an
  HTTP client on the RPC port

instead of:

0908 16:04:47.236063 (+88us) negotiation.cc:234] Negotiation
  complete: Invalid argument: Server connection negotiation failed: server
  connection from 127.0.0.1:56832: Connection must begin with magic
  number: hrpc

Change-Id: I77ce8c8903ddf18592ba634a90d7efc4083d572a
---
M src/kudu/rpc/blocking_ops.cc
M src/kudu/rpc/serialization.cc
2 files changed, 18 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/38/4338/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4338
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77ce8c8903ddf18592ba634a90d7efc4083d572a
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] rpc: print nicer errors when mixing up HTTP and KRPC

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: rpc: print nicer errors when mixing up HTTP and KRPC
..


Patch Set 2:

Build Started http://104.196.14.100/job/kudu-gerrit/3296/

-- 
To view, visit http://gerrit.cloudera.org:8080/4338
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77ce8c8903ddf18592ba634a90d7efc4083d572a
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] ksck: add a note about spurious checksum mismatches

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: ksck: add a note about spurious checksum mismatches
..


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/3297/

-- 
To view, visit http://gerrit.cloudera.org:8080/4341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I85babafb8d264be88dc71c9c878b53dae9c73901
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] KUDU-236 (part 1). Implement tablet history GC

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-236 (part 1). Implement tablet history GC
..


Patch Set 20:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/3076/20/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

Line 599:   if (history_gc_opts.gc_enabled()) {
> style: do an early return here instead of indenting the whole function
Done


PS20, Line 699: history_gc_opts.gc_enabled() && 
> now that the 'disabled' GC opts actually sets the ancient history mark to 0
ah, sounds good


Line 765: history_gc_opts.IsAncientHistory(redo_mut->timestamp())) {
> same
Done


Line 799:   rowid_t input_row_offset = -1;
> hrm, your comment says 'removed' but still see this
removed now


Line 804: int num_rows = rows.size();
> why this extra variable here? why not just use rows.size() in the loop befo
Done


Line 954:   history_gc_opts.IsAncientHistory(mut->timestamp())) {
> and here
Done


http://gerrit.cloudera.org:8080/#/c/3076/20/src/kudu/tablet/compaction.h
File src/kudu/tablet/compaction.h:

PS20, Line 39: std::move(ahm
> not a big deal (no need to rev the patch for it) but I don't really like th
Done


PS20, Line 63:   enum GcHistoryEnabled {
 : GC_DISABLED = 0,
 : GC_ENABLED = 1,
 :   };
> now that this is a private detail, let's just use bool? not adding a lot si
Done


http://gerrit.cloudera.org:8080/#/c/3076/20/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

Line 1535:   history_gc_opts.IsAncientHistory(*snap_timestamp)) {
> same
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-236 (part 1). Implement tablet history GC

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-236 (part 1). Implement tablet history GC
..


Patch Set 21:

Build Started http://104.196.14.100/job/kudu-gerrit/3295/

-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-236 (part 1). Implement tablet history GC

2016-09-08 Thread Mike Percy (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3076

to look at the new patch set (#21).

Change subject: KUDU-236 (part 1). Implement tablet history GC
..

KUDU-236 (part 1). Implement tablet history GC

This patch defines the concept of an "ancient history mark" (AHM) and a
background job that removes history from disk that represents changes
occurring prior to the ancient history mark.

There is also a mechanism to reject requests for scans if they attempt
to open a snapshot scan at a timestamp prior to the AHM.

Included tests:

* Test for Redo GC via major delta compaction
* Test for history GC via tablet flush
* Test for Undo GC via merge compaction
* Test for entire-row GC via merge compaction
* Test for ghost rows in multiple RS reinsert case
* Test for reupdating missed deltas
* Test for partial rowset history GC using alternating rows
* Test for major delta compaction against a subset of columns
* Test for reinsert after delete that is prior to AHM
* Test for opening a scanner at TS < AHM and rejecting that scanner at
  the RPC level

Tests coming in a follow-up commit:

* Randomized test

Missing features coming in follow-up commits:

* Background task that schedules and performs undo GC
* GC history on delta flush

Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
---
M src/kudu/common/timestamp.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/tablet_history_gc-itest.cc
M src/kudu/server/clock.h
M src/kudu/server/hybrid_clock.cc
M src/kudu/server/hybrid_clock.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/compaction-test.cc
M src/kudu/tablet/compaction.cc
M src/kudu/tablet/compaction.h
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/mutation.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-base.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
A src/kudu/tablet/tablet_history_gc-test.cc
M src/kudu/tserver/tablet_service.cc
24 files changed, 1,038 insertions(+), 170 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/3076/21
-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-236 (part 1). Implement tablet history GC

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-236 (part 1). Implement tablet history GC
..


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 704:   *ancient_history_mark = 
HybridClock::AddPhysicalTimeToTimestamp(clock_->Now(), neg_hist_age);
> Duh, yeah. Still OK:
still think the math's wrong, because you're subtracting 2**31*100 from the 
hybrid timestamp rather than its physical portion. Here's proof in the form of 
a faiing unit test:


TEST_F(HybridClockTest, TestClockCanSubtractMaxSignedIntSeconds) {
  Timestamp now = clock_->Now();
  Timestamp before = HybridClock::AddPhysicalTimeToTimestamp(
  now,
  MonoDelta::FromSeconds(std::numeric_limits::min()));
  ASSERT_EQ(before.CompareTo(now), -1);
}


-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Add RegexpKuduOperationsProducer class

2016-09-08 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3883

to look at the new patch set (#4).

Change subject: Add RegexpKuduOperationsProducer class
..

Add RegexpKuduOperationsProducer class

This patch adds the RegexpKuduOperationsProducer class. This class
serializes Event objects to Kudu inserts or upserts by decoding
the body into a string, parsing the string using a regular
expression, and finally mapping match groups to columns by
matching the name of the match group to the name of the column.
Parsed values are naively coerced to the proper type.

This provides an easy-to-use but flexible way to ingest data with
varying schemas into Kudu from Flume.

Change-Id: Ibd64542095f0064a21635ec76c46d6a1be98b7ea
---
A 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/RegexpKuduOperationsProducer.java
A 
java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/RegexpKuduOperationsProducerTest.java
2 files changed, 471 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/3883/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3883
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibd64542095f0064a21635ec76c46d6a1be98b7ea
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] Add RegexpKuduOperationsProducer class

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Add RegexpKuduOperationsProducer class
..


Patch Set 4:

Build Started http://104.196.14.100/job/kudu-gerrit/3293/

-- 
To view, visit http://gerrit.cloudera.org:8080/3883
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd64542095f0064a21635ec76c46d6a1be98b7ea
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] KUDU-236 (part 1). Implement tablet history GC

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-236 (part 1). Implement tablet history GC
..


Patch Set 20:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/3076/20/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

Line 599:   if (history_gc_opts.gc_enabled()) {
style: do an early return here instead of indenting the whole function


PS20, Line 699: history_gc_opts.gc_enabled() && 
now that the 'disabled' GC opts actually sets the ancient history mark to 0, I 
think the 'enabled' check is a redundant branch. (ie IsAncientHistory already 
returns false if it's disabled)


Line 765: history_gc_opts.IsAncientHistory(redo_mut->timestamp())) {
same


Line 799:   rowid_t input_row_offset = -1;
hrm, your comment says 'removed' but still see this


Line 804: int num_rows = rows.size();
why this extra variable here? why not just use rows.size() in the loop before? 
with optimizations on i'm pretty sure the compiler's smart enough to hoist it


Line 954:   history_gc_opts.IsAncientHistory(mut->timestamp())) {
and here


http://gerrit.cloudera.org:8080/#/c/3076/20/src/kudu/tablet/compaction.h
File src/kudu/tablet/compaction.h:

PS20, Line 39: std::move(ahm
not a big deal (no need to rev the patch for it) but I don't really like the 
use of std::move() for objects that are just POD and don't have a special move 
constructor... it's clearly not a performance gain here and I don't think it 
makes the code clearer (rather it makes it seem like something fancy might be 
going on)

The GSG is kind of quiet on move, but the Chromium style guide seems to agree 
with the above: https://www.chromium.org/rvalue-references (see point 10)


PS20, Line 63:   enum GcHistoryEnabled {
 : GC_DISABLED = 0,
 : GC_ENABLED = 1,
 :   };
now that this is a private detail, let's just use bool? not adding a lot since 
it's such a small class and the 'enabled_' name of the bool makes it obvious


http://gerrit.cloudera.org:8080/#/c/3076/20/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

Line 1535:   history_gc_opts.IsAncientHistory(*snap_timestamp)) {
same


-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-236 (part 1). Implement tablet history GC

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-236 (part 1). Implement tablet history GC
..


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 704:   *ancient_history_mark = 
HybridClock::AddPhysicalTimeToTimestamp(clock_->Now(), neg_hist_age);
> It looks to me like HybridTime has enough bits that this isn't possible, un
shouldn't you be multiplying by 100 in your math instead of 1000, since 
it's storing microseconds in hybrid time and not millis?


-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Create base class for MiniCluster and ExternalMiniCluster

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Create base class for MiniCluster and ExternalMiniCluster
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3974/6/src/kudu/integration-tests/external_mini_cluster.h
File src/kudu/integration-tests/external_mini_cluster.h:

Line 143:   void Shutdown(ClusterNodes nodes) override;
> I feel your pain, but since this C++ is a pain about default arguments with
how about doing this by keeping 'Shutdown' as a final superclass method with no 
arguments which forwards to 'virtual void Shutdown(ClusterNodes nodes)' 
(required argument) so that you can still call Shutdown(), but don't run into 
the default argument fiasco?


-- 
To view, visit http://gerrit.cloudera.org:8080/3974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I62256168a9245c845e99f7253f07e69a225954bf
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1304 - [python] Enable listing of tablet servers

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1304 - [python] Enable listing of tablet servers
..


KUDU-1304 - [python] Enable listing of tablet servers

Kudu's python client currently doesn't expose the ListTabletServers
method. To date, this has presented issues with unit tests executing
before the tservers have started.  This patch exposes the method and
integrates it into the unit tests so that race conditions no longer
occur.  This patch also includes an additional test to confirm the
integrity of the data returned from the method.

Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
Reviewed-on: http://gerrit.cloudera.org:8080/4328
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
---
M python/kudu/client.pyx
M python/kudu/libkudu_client.pxd
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
4 files changed, 77 insertions(+), 0 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4328
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Jordan Birdsell 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Add RegexpKuduOperationsProducer class

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Add RegexpKuduOperationsProducer class
..


Patch Set 3:

Build Started http://104.196.14.100/job/kudu-gerrit/3292/

-- 
To view, visit http://gerrit.cloudera.org:8080/3883
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd64542095f0064a21635ec76c46d6a1be98b7ea
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] Add RegexpKuduOperationsProducer class

2016-09-08 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3883

to look at the new patch set (#3).

Change subject: Add RegexpKuduOperationsProducer class
..

Add RegexpKuduOperationsProducer class

This patch adds the RegexpKuduOperationsProducer class. This class
serializes Event objects to Kudu inserts or upserts by decoding
the body into a string, parsing the string using a regular
expression, and finally mapping match groups to columns by
matching the name of the match group to the name of the column.
Parsed values are naively coerced to the proper type.

This provides an easy-to-use but flexible way to ingest data with
varying schemas into Kudu from Flume.

Change-Id: Ibd64542095f0064a21635ec76c46d6a1be98b7ea
---
A 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/RegexpKuduOperationsProducer.java
A 
java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/RegexpKuduOperationsProducerTest.java
2 files changed, 473 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/3883/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3883
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibd64542095f0064a21635ec76c46d6a1be98b7ea
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-1135 (part 1): avoid flushing cmeta to disk twice when voting

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-1135 (part 1): avoid flushing cmeta to disk twice when 
voting
..


Patch Set 1: Code-Review+1

(1 comment)

lgtm

http://gerrit.cloudera.org:8080/#/c/4333/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 1362:   // our vote.
> nit: no need to wrap
I personally prefer wrapping comments at 80 chars unless there is a reason not 
to, and that's what I usually do. It's slightly easier to read. But 80 chars is 
often too short for code...


-- 
To view, visit http://gerrit.cloudera.org:8080/4333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iecc55bc9e96dcdc918ede1190b7cbac719f95506
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1500: Data race in RaftConsensusITest.TestCorruptReplicaMetadata

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-1500: Data race in 
RaftConsensusITest.TestCorruptReplicaMetadata
..


Patch Set 17:

Do you guys think we should pull this in for 1.0 even though we don't have an 
upgrade test yet? Maybe it's worth filing a JIRA and committing this, since I 
think Adar saw it in a test environment.

-- 
To view, visit http://gerrit.cloudera.org:8080/3823
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If57617e22b41296b8d4e8ad131220f1ebb235019
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dinesh Bhat 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] Create base class for MiniCluster and ExternalMiniCluster

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: Create base class for MiniCluster and ExternalMiniCluster
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3974/6/src/kudu/integration-tests/external_mini_cluster.h
File src/kudu/integration-tests/external_mini_cluster.h:

Line 143:   void Shutdown(ClusterNodes nodes) override;
> I feel like this is a case where a default argument (ALL) would be nice, si
I feel your pain, but since this C++ is a pain about default arguments with 
polymorphism. The default argument is associated with the static type of the 
pointer: 
http://stackoverflow.com/questions/5273144/c-polymorphism-and-default-argument

It's quite gross, so either we set the default = ALL in the superclass and base 
classes, and just make sure that nobody makes them inconsistent over time, or 
we live with having to pass an argument. I chose the latter since I thought the 
former was kind of nerve wracking.


http://gerrit.cloudera.org:8080/#/c/3974/6/src/kudu/integration-tests/mini_cluster_base.h
File src/kudu/integration-tests/mini_cluster_base.h:

PS6, Line 36: Base class for MiniCluster implementations
> I think we should also make interfaces for Master and TabletServer (and may
sounds alright, but I don't have a use case for it so let's defer


Line 39: class MiniClusterBase {
> I don't mean to bikeshed, but I think it would be great if this were just c
I would tend to agree but I don't want to make this change as part of this 
commit since this is pretty yak shavey.


Line 54: 
> I think you can add GetLeaderMasterIndex() too; it's common across both clu
would prefer to defer this to when we need it


Line 61:   virtual Status CreateClient(client::KuduClientBuilder* builder,
> style: I don't think using a pointer to denote 'optional' is the best. This
This is tough to do because KuduClientBuilder is a PIMPLed class and is not 
copyable or movable, and needs to be modified by the callee (can't be const). 
So I don't think I can pass it in via an optional, unless I make it a 
ptr-to-optional, which would be painful.


-- 
To view, visit http://gerrit.cloudera.org:8080/3974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I62256168a9245c845e99f7253f07e69a225954bf
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1304 - [python] Enable listing of tablet servers

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1304 - [python] Enable listing of tablet servers
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4328
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Jordan Birdsell 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1304 - [python] Enable listing of tablet servers

2016-09-08 Thread Jordan Birdsell (Code Review)
Jordan Birdsell has posted comments on this change.

Change subject: KUDU-1304 - [python] Enable listing of tablet servers
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4328/2/python/kudu/client.pyx
File python/kudu/client.pyx:

PS2, Line 368: ts = TabletServer()
 : result.append(ts._init(tservers[i])
> I assume this 'init' thing is because it's not possible to do a normal __in
Yep, annoying, seemed like the only other option was to deconstruct/reconstruct 
the class altogether but that seemed like a worse idea.


PS2, Line 499: Repres
> typo
Done


PS2, Line 500: TabletSe
> expand out the abbreviation here 'TabletServers'
Done


PS2, Line 507: _init
> does this show up as a method in the generated python class? If so, maybe i
Done


Line 510: 
> I think we need some __dealloc__ here which takes care of deleting _tserver
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4328
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Jordan Birdsell 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1304 - [python] Enable listing of tablet servers

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-1304 - [python] Enable listing of tablet servers
..


Patch Set 3:

Build Started http://104.196.14.100/job/kudu-gerrit/3291/

-- 
To view, visit http://gerrit.cloudera.org:8080/4328
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Jordan Birdsell 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1304 - [python] Enable listing of tablet servers

2016-09-08 Thread Jordan Birdsell (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4328

to look at the new patch set (#3).

Change subject: KUDU-1304 - [python] Enable listing of tablet servers
..

KUDU-1304 - [python] Enable listing of tablet servers

Kudu's python client currently doesn't expose the ListTabletServers
method. To date, this has presented issues with unit tests executing
before the tservers have started.  This patch exposes the method and
integrates it into the unit tests so that race conditions no longer
occur.  This patch also includes an additional test to confirm the
integrity of the data returned from the method.

Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
---
M python/kudu/client.pyx
M python/kudu/libkudu_client.pxd
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
4 files changed, 77 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/28/4328/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4328
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Jordan Birdsell 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-236 (part 1). Implement tablet history GC

2016-09-08 Thread Mike Percy (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3076

to look at the new patch set (#20).

Change subject: KUDU-236 (part 1). Implement tablet history GC
..

KUDU-236 (part 1). Implement tablet history GC

This patch defines the concept of an "ancient history mark" (AHM) and a
background job that removes history from disk that represents changes
occurring prior to the ancient history mark.

There is also a mechanism to reject requests for scans if they attempt
to open a snapshot scan at a timestamp prior to the AHM.

Included tests:

* Test for Redo GC via major delta compaction
* Test for history GC via tablet flush
* Test for Undo GC via merge compaction
* Test for entire-row GC via merge compaction
* Test for ghost rows in multiple RS reinsert case
* Test for reupdating missed deltas
* Test for partial rowset history GC using alternating rows
* Test for major delta compaction against a subset of columns
* Test for reinsert after delete that is prior to AHM
* Test for opening a scanner at TS < AHM and rejecting that scanner at
  the RPC level

Tests coming in a follow-up commit:

* Randomized test

Missing features coming in follow-up commits:

* Background task that schedules and performs undo GC
* GC history on delta flush

Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
---
M src/kudu/common/timestamp.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/tablet_history_gc-itest.cc
M src/kudu/server/clock.h
M src/kudu/server/hybrid_clock.cc
M src/kudu/server/hybrid_clock.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/compaction-test.cc
M src/kudu/tablet/compaction.cc
M src/kudu/tablet/compaction.h
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/mutation.h
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-base.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
A src/kudu/tablet/tablet_history_gc-test.cc
M src/kudu/tserver/tablet_service.cc
24 files changed, 1,047 insertions(+), 170 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/3076/20
-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Create base class for MiniCluster and ExternalMiniCluster

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Create base class for MiniCluster and ExternalMiniCluster
..


Patch Set 7:

Build Started http://104.196.14.100/job/kudu-gerrit/3289/

-- 
To view, visit http://gerrit.cloudera.org:8080/3974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I62256168a9245c845e99f7253f07e69a225954bf
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-236 (part 1). Implement tablet history GC

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-236 (part 1). Implement tablet history GC
..


Patch Set 19:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/compaction-test.cc
File src/kudu/tablet/compaction-test.cc:

Line 66:   return { HistoryGcOpts::GC_DISABLED, Timestamp::kInvalidTimestamp };
> is there special handling for kInvalidTimestamp? I think kInvalidTimestamp 
Replaced with Timestamp(0)


Line 217: 
> nit: no need for blank lines
Done


Line 387:   vector gced_input_rows;
> unused
Done


Line 585:   HistoryGcOpts history_gc_opts = { HistoryGcOpts::GC_DISABLED, 
Timestamp(0) };
> use NoHistoryGC()?
Done


Line 639:   HistoryGcOpts history_gc_opts = { HistoryGcOpts::GC_DISABLED, 
Timestamp(0) };
> same
Done


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

Line 638: DVLOG(2) << "Ancient history mark: " << 
history_gc_opts.ancient_history_mark.ToString()
> given this is a per-row log message, i think DVLOG(3) or DVLOG(4) is probab
Done


Line 647:   prev_undo->set_next(nullptr);
> hrm, the comment on line 629-631 is no longer correct, then. (we were const
Oh, good point. Hmm. I factored this into its own function and added a call to 
it at call sites that also call ApplyMutationsAndGenerateUndos() prior to 
passing the row into ApplyMutationsAndGenerateUndos(). Alternatively, we could 
change ApplyMutationsAndGenerateUndos() to take a non-const input row. I think 
the approach I took is probably better, though.


Line 800:   rowid_t input_row_offset = -1;
> is this variable necessary? seems like it's only used in the DVLOG below, b
Removed


Line 838:   DVLOG(2) << "Output Row: " << 
dst_row.schema()->DebugRow(dst_row) <<
> bump to a higher DVLOG level since this is a big message and every row
Done


PS19, Line 863:   DVLOG(2) << "Output Row: " << 
dst_row.schema()->DebugRow(dst_row)
  :<< "; RowId: " << index_in_current_drs;
> this vlog is redundant with the new one on 838-844
Bumped to DVLOG(5) since it shows the index in the current drs and we can't get 
that for GCed rows


PS19, Line 954: history_gc_opts.enabled == HistoryGcOpts::GC_ENABLED &&
  :   
mut->timestamp().ComesBefore(history_gc_opts.ancient_history_mark
> this condition shows up a lot. what about adding a method like bool History
added a method IsAncientHistory()


Line 956: DVLOG(2) << "Skipping GCed input row: " << 
schema->DebugRow(row.row)
> this log message isn't accurate since it could still be reinserted later. p
Done


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/compaction.h
File src/kudu/tablet/compaction.h:

PS19, Line 38:  GC_ENABLED,
 : GC_DISABLED
> reorder these and make GC_DISABLED have the value 0 so that if this is acci
Did the class thing.


Line 46:   // A timestamp prior to which no history will be preserved.
> add "Ignored if enabled != GC_ENABLED"
Done


Line 155: // 'is_history_truncated': Set to true if this row had a "reinsert 
after
> update (it's an int, not a bool)
Done


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/tablet-test.cc
File src/kudu/tablet/tablet-test.cc:

PS19, Line 564:   // TODO: Should we be reopening the tablet in a different way 
to persist the
  :   // clock / timestamps?
> does this need to be addressed here?
I don't think this needs to be addressed as part of this patch.


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 704:   *ancient_history_mark = 
HybridClock::AddPhysicalTimeToTimestamp(clock_->Now(), neg_hist_age);
> slightly worried about underflow here if the user says "I want to keep all 
It looks to me like HybridTime has enough bits that this isn't possible, unless 
their clock says it's 1970 or something. If that was the case, NTP would be 
messed up, and I don't think Kudu would start. HybridTime uses 52 bits for the 
physical time in microseconds and 12 bits for the logical time. Here's the 
napkin math:

  mpercy@mpercy-T460p:~/src/kudu$ perl -e 'print((time() * 1000) << 12, "\n");'
  603497342976
  mpercy@mpercy-T460p:~/src/kudu$ perl -e 'print(2**31 * 1000, "\n");'
 2147483648000
  mpercy@mpercy-T460p:~/src/kudu$ perl -e 'print(((time() * 1000) << 12) - 
(2**31 * 1000), "\n");'
  6032826847232000


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/tablet_history_gc-test.cc
File src/kudu/tablet/tablet_history_gc-test.cc:

PS19, Line 193: // Test that no UNDO DELETE is generated after deleting a row 
from the MRS and
  : // waiting for AHM to pass, then flushing the MRS.
  : T
> this test actually tests two separate cases. can you expand this comment to
Done


PS19, Line 243: represents the insert. In the UNDO
  : // representation, 

[kudu-CR] KUDU-1065: [java client] Flexible Partition Pruning

2016-09-08 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: KUDU-1065: [java client] Flexible Partition Pruning
..


KUDU-1065: [java client] Flexible Partition Pruning

This commit introduces an internal utility ByteVec class which is a
mashup of the C++ std::string / Rust Vec types. KeyEncoder has been
transitioned to use this type instead of ByteArrayOutputStream. The
partition pruning algorithm incrementally builds up partition keys from
predicates, and requires cloning the keys as they are being built in
order to multiply over hash partition buckets. ByteArrayOutputStream
doesn't have a clone method. ByteArrayOutputStream is also synchronized
internally, which is dumb. Thus begat ByteVec.

This version of partition pruning only looks at predicates when
determining which partitions to prune. Constraints in the primary key
bounds are not considered, unless the table is range partitioned over
the primary key columns and not hash partitioned (simple partitioning).
This limits the pruned partitions in some pretty rare cases, but the
workaround of explicitly setting the predicate is not too onerous.

Finally, this commit changes the default test flags to remove mini
cluster verbose logging, since it is extremely noisy.

Change-Id: Ib27b54841d87cf854175ab8cdfa8798b337d71f9
Reviewed-on: http://gerrit.cloudera.org:8080/4299
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo 
---
M 
java/kudu-client/src/main/java/org/apache/kudu/client/AbstractKuduScannerBuilder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduPredicate.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanner.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduTable.java
M java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
A java/kudu-client/src/main/java/org/apache/kudu/client/PartitionPruner.java
M java/kudu-client/src/main/java/org/apache/kudu/client/PartitionSchema.java
A java/kudu-client/src/main/java/org/apache/kudu/util/ByteVec.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKeyEncoding.java
A java/kudu-client/src/test/java/org/apache/kudu/client/TestPartitionPruner.java
A java/kudu-client/src/test/java/org/apache/kudu/util/TestByteVec.java
M java/kudu-client/src/test/resources/flags
M src/kudu/common/partition_pruner-test.cc
17 files changed, 1,722 insertions(+), 198 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib27b54841d87cf854175ab8cdfa8798b337d71f9
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1065: [java client] Flexible Partition Pruning

2016-09-08 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1065: [java client] Flexible Partition Pruning
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib27b54841d87cf854175ab8cdfa8798b337d71f9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1304 - [python] Enable listing of tablet servers

2016-09-08 Thread Jordan Birdsell (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4328

to look at the new patch set (#2).

Change subject: KUDU-1304 - [python] Enable listing of tablet servers
..

KUDU-1304 - [python] Enable listing of tablet servers

Kudu's python client currently doesn't expose the ListTabletServers
method. To date, this has presented issues with unit tests executing
before the tservers have started.  This patch exposes the method and
integrates it into the unit tests so that race conditions no longer
occur.  This patch also includes an additional test to confirm the
integrity of the data returned from the method.

Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
---
M python/kudu/client.pyx
M python/kudu/libkudu_client.pxd
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
4 files changed, 73 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/28/4328/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4328
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I51794168750b358d8c1b371ac2bd08b5c8f15395
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc: print nicer errors when mixing up HTTP and KRPC

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: rpc: print nicer errors when mixing up HTTP and KRPC
..


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/3287/

-- 
To view, visit http://gerrit.cloudera.org:8080/4338
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77ce8c8903ddf18592ba634a90d7efc4083d572a
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: No


[kudu-CR] rpc: print nicer errors when mixing up HTTP and KRPC

2016-09-08 Thread Todd Lipcon (Code Review)
Hello Mike Percy,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/4338

to review the following change.

Change subject: rpc: print nicer errors when mixing up HTTP and KRPC
..

rpc: print nicer errors when mixing up HTTP and KRPC

Users often accidentally try to navigate to a Kudu server's RPC port in
their browser, or try to send an RPC to the HTTP port. This patch
improves the error messages that are reported as follows:

* Clients trying to speak RPC to the HTTP port now see an error like:

F0908 16:08:15.823751 21988 kudu-admin.cc:168] Check failed: _s.ok() Bad
  status: IO error: Could not locate the leader master: Client connection
  negotiation failed: client connection to 127.0.0.1:8051: received
  invalid RPC message which appears to be an HTTP response. Verify that
  you have specified a valid RPC port and not an HTTP port.

instead of

F0908 16:09:42.489558 22016 kudu-admin.cc:168] Check failed: _s.ok() Bad
  status: IO error: Could not locate the leader master: Client connection
  negotiation failed: client connection to 127.0.0.1:8051: Received
  invalid message of size 1213486160 which exceeds the
  rpc_max_message_size of 52428800 bytes

* Servers which receive an HTTP request to their RPC port now report an
  error like:

0908 16:06:22.975401 (+68us) negotiation.cc:234] Negotiation
  complete: Invalid argument: Server connection negotiation failed: server
  connection from 127.0.0.1:56866: invalid negotation, appears to be an
  HTTP client on the RPC port

instead of:

0908 16:04:47.236063 (+88us) negotiation.cc:234] Negotiation
  complete: Invalid argument: Server connection negotiation failed: server
  connection from 127.0.0.1:56832: Connection must begin with magic
  number: hrpc

Change-Id: I77ce8c8903ddf18592ba634a90d7efc4083d572a
---
M src/kudu/rpc/blocking_ops.cc
M src/kudu/rpc/serialization.cc
2 files changed, 16 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/38/4338/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4338
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I77ce8c8903ddf18592ba634a90d7efc4083d572a
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 


[kudu-CR] Tie log retention to consensus watermarks

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Tie log retention to consensus watermarks
..


Patch Set 6:

Build Started http://104.196.14.100/job/kudu-gerrit/3286/

-- 
To view, visit http://gerrit.cloudera.org:8080/4177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icfc071270510f3dc3c65f88d615e93c6ffb26b12
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] Tie log retention to consensus watermarks

2016-09-08 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4177

to look at the new patch set (#6).

Change subject: Tie log retention to consensus watermarks
..

Tie log retention to consensus watermarks

This changes the calculation of log retention to consult consensus.
Consensus returns a struct which indicates the watermark necessary for
durability (the committed index) as well as the watermark necessary to
catch up other peers. The durability watermark is then further
constrained by the LogAnchorRegistry, as before, to ensure that no entry
corresponding to yet-unflushed data can be GCed.

This new struct containing the "for-durability" and "for-peers"
watermarks replaces the old single "must be retained" watermark that the
log GC code used before.

The new struct is passed down into the log, and we use the following
policy:

- we always maintain any logs necessary for durability (the minimum of
  those entries needed by consensus, and those entries needed by the
  tablet itself)
- beyond that, we try to retain logs to catch up lagging peers, however
  we never maintain more than --log_max_segments_to_retain (a new
  configuration)

I removed the old flag --log_min_seconds_to_retain, since its main
purpose was for dealing with lagging peers, and that is now handled by
directly consulting consensus.

The one tricky bit of the policy is that, even though the peer catch-up
figures into log retention, we do _not_ want it to impact the
calculation of flush priority. In other words, even if the user is OK
retaining 10GB of logs to catch up trailing peers, they probably still
want to flush more aggressively than that so they can avoid very long
startup times. So, the peer-based watermark is not used during the
mapping of log anchors to retention amounts.

Note that the above is only relevant once we have implemented KUDU-38:
we currently will replay all of the retained logs even though we are
aggressively flushing to keep the durability-related retention bounded.

In practice, even without KUDU-38, this patch shouldn't have a large
negative effect on restart times. In fact, in many cases it can
_improve_ startup times, because in most steady workloads we don't have
peers that are extremely far behind. Our log retention only increases in
those cases, and only on those tablets which have a lagging follower.
For other tablets, the new retention policies actually serve to reduce
the number of retained segments, so if there are no laggy peers, we'll
start up faster.

Manually tested for now as follows:
- started a three-node cluster (locally), set to roll logs at 1MB
  segments, but otherwise default
- started an insert workload against a single-tablet table
- I could see that the three servers were maintaining 2 WAL segments in
  their WAL directory.
- I kill -STOPped a random server while continuing to insert. I saw that
  the WALs in this tablet server's directory froze as is (obviously),
  and the other two kept rolling. However, because of this change, the
  other servers started retaining wals starting from the point where I
  had stopped the follower.
- If I let the insert workload continue, the live servers kept rolling
  up until they had 10 segments (default --log_max_segments_to_retain)
  at which point they dropped the oldest log.
- I verified that, during this period while the extra segments were
  retained, the servers continued to flush frequently so that their
  recovery time would be bounded.
- I also verified that, if I un-paused the follower before the others
  had evicted it, it was able to catch up, at which point the other
  servers GCed those extra logs they had been retaining.

The above scenario is also tested through modifications to
RaftConsensusITest.TestCatchupAfterOpsEvicted and various log-test test
cases.

Change-Id: Icfc071270510f3dc3c65f88d615e93c6ffb26b12
---
M src/kudu/consensus/consensus.h
M src/kudu/consensus/consensus.proto
M src/kudu/consensus/consensus_peers-test.cc
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log-test-base.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/log_reader.cc
M src/kudu/consensus/log_reader.h
M src/kudu/consensus/raft_consensus-test.cc
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/external_mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/external_mini_cluster_fs_inspector.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/tablet/tablet_peer-test.cc
M src/kudu/tablet/tablet_peer.cc
M src/kudu/tablet/tablet_peer.h
23 files changed, 328 insertions(+), 336 deletions(-)


  git pull 

[kudu-CR] Tie log retention to consensus watermarks

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Tie log retention to consensus watermarks
..


Patch Set 5:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4177/5//COMMIT_MSG
Commit Message:

Line 11: durability (the committed index) as well as the watermark necessary to
> not sure about this link between the commit index and durability, at least 
k, clarified message.


Line 18: - we always maintain any logs necessary for durability
> could you make this clearer? what do we do here now? calc the min between a
clarified


http://gerrit.cloudera.org:8080/#/c/4177/5/src/kudu/consensus/consensus.h
File src/kudu/consensus/consensus.h:

Line 266:   // retain.
> nit no need to wrap
Done


http://gerrit.cloudera.org:8080/#/c/4177/5/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 1879:   return log::RetentionIndexes(queue_->GetCommittedIndex(), // for 
durability
> add a note that it's ok not to get these atomically
Done


http://gerrit.cloudera.org:8080/#/c/4177/5/src/kudu/integration-tests/external_mini_cluster_fs_inspector.h
File src/kudu/integration-tests/external_mini_cluster_fs_inspector.h:

Line 70:   // on TS 'index'.
> nit no need to wrap
Done


http://gerrit.cloudera.org:8080/#/c/4177/5/src/kudu/tablet/tablet_peer.h
File src/kudu/tablet/tablet_peer.h:

Line 193:   // up peers.
> nit: wrapping
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icfc071270510f3dc3c65f88d615e93c6ffb26b12
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1065: [java client] Flexible Partition Pruning

2016-09-08 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1065: [java client] Flexible Partition Pruning
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4299/2/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java
File 
java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java:

Line 48: import java.util.concurrent.atomic.AtomicBoolean;
> Now unused?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib27b54841d87cf854175ab8cdfa8798b337d71f9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1065: [java client] Flexible Partition Pruning

2016-09-08 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4299

to look at the new patch set (#3).

Change subject: KUDU-1065: [java client] Flexible Partition Pruning
..

KUDU-1065: [java client] Flexible Partition Pruning

This commit introduces an internal utility ByteVec class which is a
mashup of the C++ std::string / Rust Vec types. KeyEncoder has been
transitioned to use this type instead of ByteArrayOutputStream. The
partition pruning algorithm incrementally builds up partition keys from
predicates, and requires cloning the keys as they are being built in
order to multiply over hash partition buckets. ByteArrayOutputStream
doesn't have a clone method. ByteArrayOutputStream is also synchronized
internally, which is dumb. Thus begat ByteVec.

This version of partition pruning only looks at predicates when
determining which partitions to prune. Constraints in the primary key
bounds are not considered, unless the table is range partitioned over
the primary key columns and not hash partitioned (simple partitioning).
This limits the pruned partitions in some pretty rare cases, but the
workaround of explicitly setting the predicate is not too onerous.

Finally, this commit changes the default test flags to remove mini
cluster verbose logging, since it is extremely noisy.

Change-Id: Ib27b54841d87cf854175ab8cdfa8798b337d71f9
---
M 
java/kudu-client/src/main/java/org/apache/kudu/client/AbstractKuduScannerBuilder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduPredicate.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanner.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduTable.java
M java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
A java/kudu-client/src/main/java/org/apache/kudu/client/PartitionPruner.java
M java/kudu-client/src/main/java/org/apache/kudu/client/PartitionSchema.java
A java/kudu-client/src/main/java/org/apache/kudu/util/ByteVec.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKeyEncoding.java
A java/kudu-client/src/test/java/org/apache/kudu/client/TestPartitionPruner.java
A java/kudu-client/src/test/java/org/apache/kudu/util/TestByteVec.java
M java/kudu-client/src/test/resources/flags
M src/kudu/common/partition_pruner-test.cc
17 files changed, 1,722 insertions(+), 198 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/4299/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib27b54841d87cf854175ab8cdfa8798b337d71f9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] tool: port kudu-admin to 'kudu cluster'

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: tool: port kudu-admin to 'kudu cluster'
..


Patch Set 4:

(2 comments)

didn't really look at the implementation because I assume that's just moved 
code and boilerplate, and imagine it's fine. But I still am -1 on some of the 
organizational stuff, and nervous about "we'll clean it up later" given this is 
a public API and we are counting the days to 1.0.0rc0 in single digits

http://gerrit.cloudera.org:8080/#/c/4180/4/src/kudu/tools/kudu-admin-test.cc
File src/kudu/tools/kudu-admin-test.cc:

PS4, Line 116: "cluster",
 : "change_config",
 : "add_server",
repeating my earlier comment that I don't think this grouping of commands makes 
sense. This sounds like you are "changing the configuration of the cluster to 
add a server" which is not at all what's happening.

I think 'kudu tablet change_config' or 'kudu tablet add_replica' or something 
would make a lot more sense what's going on.


Line 182: "delete_table",
still think "kudu table delete" and "kudu table list" would be more sensical, 
but this one isn't as egregious as the change_config one above.


-- 
To view, visit http://gerrit.cloudera.org:8080/4180
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id6e96fc5b0106946f29a2faee8e2667be738b740
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1420. client: remove unimplemented ApplyAsync() method

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1420. client: remove unimplemented ApplyAsync() method
..


KUDU-1420. client: remove unimplemented ApplyAsync() method

This was never implemented, and doesn't seem to have value as
documentation of a non-existent method.

Change-Id: If5006994039d13055a71235ebf4aa547329d35aa
Reviewed-on: http://gerrit.cloudera.org:8080/4332
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo 
---
M src/kudu/client/client.h
1 file changed, 0 insertions(+), 17 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4332
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: If5006994039d13055a71235ebf4aa547329d35aa
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1420. client: remove unimplemented ApplyAsync() method

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1420. client: remove unimplemented ApplyAsync() method
..


Patch Set 1:

OK, I'll commit this removal for now, can always add it back with the magic of 
source control.

-- 
To view, visit http://gerrit.cloudera.org:8080/4332
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If5006994039d13055a71235ebf4aa547329d35aa
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] API and style improvements to the Kudu Flume Sink

2016-09-08 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change.

Change subject: API and style improvements to the Kudu Flume Sink
..


Patch Set 3:

> > don't forget about RegexpKuduEventProducer,
 > > which I need to rebase and refactor as RegexpKuduOperationsProducer.
 > > Will try to do that later today.
 > 
 > That doesn't block this patch, does it? This can be committed and
 > you can rebase it onto master.

That's the plan.

-- 
To view, visit http://gerrit.cloudera.org:8080/4320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] API and style improvements to the Kudu Flume Sink

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has submitted this change and it was merged.

Change subject: API and style improvements to the Kudu Flume Sink
..


API and style improvements to the Kudu Flume Sink

This patch cleans up the Flume integration code a bit. Specifically:
1. s/EventProducer/OperationsProducer/ since the "KuduEventProducer"
consumed Flume Events and produced Kudu Operations.
2. The KuduOperationsProducer API was changed. Now the initialize
method takes just a KuduTable, and should be called once to initialize
the KuduOperationsProducer after it is configured. The getOperations
method now takes the Event instead.
3. The close method for a KuduOperationsProducer is now called when
the KuduSink is called. Previously this was implied so in a comment
but was not true.
4. General clean-up and style improvements.

Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
Reviewed-on: http://gerrit.cloudera.org:8080/4320
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy 
---
M docs/release_notes.adoc
R 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduOperationsProducer.java
D 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduEventProducer.java
A 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduOperationsProducer.java
M java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java
M 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSinkConfigurationConstants.java
D 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/SimpleKeyedKuduEventProducer.java
A 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/SimpleKeyedKuduOperationsProducer.java
R 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/SimpleKuduOperationsProducer.java
R java/kudu-flume-sink/src/test/avro/testAvroKuduOperationsProducer.avsc
R 
java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduOperationsProducerTest.java
R 
java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/KeyedKuduOperationsProducerTest.java
12 files changed, 375 insertions(+), 321 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] API and style improvements to the Kudu Flume Sink

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: API and style improvements to the Kudu Flume Sink
..


Patch Set 3:

OK perfect

-- 
To view, visit http://gerrit.cloudera.org:8080/4320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] API and style improvements to the Kudu Flume Sink

2016-09-08 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change.

Change subject: API and style improvements to the Kudu Flume Sink
..


Patch Set 3:

> > Do we also want something in the release notes about the API
 > change
 > > + AvroKuduOperationsProducer?
 > 
 > Looks good. Yes, please add to the release notes for 1.0.0 in the
 > Incompatibility section as well as new features.
 > 
 > I think the AvroKuduOperationsProducer would also make a good topic
 > for a blog post, if / when you have time, because it makes
 > integrating with Kudu so easy.

Added to release notes for 1.0.0. I think I can find time for a blog post. But 
also don't forget about RegexpKuduEventProducer, which I need to rebase and 
refactor as RegexpKuduOperationsProducer. Will try to do that later today.

-- 
To view, visit http://gerrit.cloudera.org:8080/4320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] API and style improvements to the Kudu Flume Sink

2016-09-08 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4320

to look at the new patch set (#3).

Change subject: API and style improvements to the Kudu Flume Sink
..

API and style improvements to the Kudu Flume Sink

This patch cleans up the Flume integration code a bit. Specifically:
1. s/EventProducer/OperationsProducer/ since the "KuduEventProducer"
consumed Flume Events and produced Kudu Operations.
2. The KuduOperationsProducer API was changed. Now the initialize
method takes just a KuduTable, and should be called once to initialize
the KuduOperationsProducer after it is configured. The getOperations
method now takes the Event instead.
3. The close method for a KuduOperationsProducer is now called when
the KuduSink is called. Previously this was implied so in a comment
but was not true.
4. General clean-up and style improvements.

Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
---
M docs/release_notes.adoc
R 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduOperationsProducer.java
D 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduEventProducer.java
A 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduOperationsProducer.java
M java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java
M 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSinkConfigurationConstants.java
D 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/SimpleKeyedKuduEventProducer.java
A 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/SimpleKeyedKuduOperationsProducer.java
R 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/SimpleKuduOperationsProducer.java
R java/kudu-flume-sink/src/test/avro/testAvroKuduOperationsProducer.avsc
R 
java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/AvroKuduOperationsProducerTest.java
R 
java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/KeyedKuduOperationsProducerTest.java
12 files changed, 375 insertions(+), 321 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/4320/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Inlined dispatch for predicate evaluation

2016-09-08 Thread Andrew Wong (Code Review)
Hello Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4164

to look at the new patch set (#10).

Change subject: Inlined dispatch for predicate evaluation
..

Inlined dispatch for predicate evaluation

In order to evaluate a predicate, the correct comparator must first be
determined. Batched evaluation, which calls
ColumnPredicate::Evaluate(), will make a function call to the correct
comparator for every row. The evaluation itself gets split into
batches, but each row in the batch still makes the function call,
which slows performance. To remediate this, this evaluation has been
templatized so there is only a single function call per batch.

Additionally, when decoder-level evaluation is enabled, rather than
occuring in batches, dispatch occurs for each cell at EvaluateCell,
which leads to poorer performance.  To remediate this, the dispatch
has been templatized in hopes that the dispatching and branching are
inlined.

This figure shows the performance of plain encoding for decoder-level
evaluation without this templating adjustment. The query selects a one
tenth the values out of a plain-encoded column containing 10M strings.
EvaluateCell (the Pushdown bar) in this implementation gets compiled
to a dispatched function call per row, which slows it down.  Evaluate
(the Normal Eval bar) also dispatches once per row.
https://raw.githubusercontent.com/anjuwong/kudu/695cbaa016a8e94f164105d84024ceaac4b62375/docs/images/SELECT_WHERE_EQUAL_without_inlining.png

Compare the above with the plot below, which is the result of the same
query, but with inlined dispatch. The comparator is known statically,
so calls to EvaluateCell will be inlined. Additionally, Evaluate only
dispatches once per batch.
https://raw.githubusercontent.com/anjuwong/kudu/695cbaa016a8e94f164105d84024ceaac4b62375/docs/images/SELECT_WHERE_EQUAL_with_inlining.png

Change-Id: Iccfac9bc899362b442337050795b5ca8c4101268
---
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/common/column_predicate.cc
M src/kudu/common/column_predicate.h
4 files changed, 57 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/4164/10
-- 
To view, visit http://gerrit.cloudera.org:8080/4164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iccfac9bc899362b442337050795b5ca8c4101268
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Inlined dispatch for predicate evaluation

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Inlined dispatch for predicate evaluation
..


Patch Set 10:

Build Started http://104.196.14.100/job/kudu-gerrit/3283/

-- 
To view, visit http://gerrit.cloudera.org:8080/4164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfac9bc899362b442337050795b5ca8c4101268
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] API and style improvements to the Kudu Flume Sink

2016-09-08 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: API and style improvements to the Kudu Flume Sink
..


Patch Set 2:

> Do we also want something in the release notes about the API change
 > + AvroKuduOperationsProducer?

Looks good. Yes, please add to the release notes for 1.0.0 in the 
Incompatibility section as well as new features.

I think the AvroKuduOperationsProducer would also make a good topic for a blog 
post, if / when you have time, because it makes integrating with Kudu so easy.

-- 
To view, visit http://gerrit.cloudera.org:8080/4320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] [server] clean-up: stringstream --> ostringstream

2016-09-08 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: [server] clean-up: stringstream --> ostringstream
..


[server] clean-up: stringstream --> ostringstream

Replaced std::stringstream with std::ostringstream in places
where only output stream functionality is required from the
stream buffer class.

Include iosfwd instead of stream headers for forward declarations.
Minor clean-up on header files include order to be in line
with the styling guide.

There are not any functional changes in this changelist.

Change-Id: I059c19bb287ee37f96e5ad42f4886015a1697d19
Reviewed-on: http://gerrit.cloudera.org:8080/4303
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
Reviewed-by: Will Berkeley 
---
M src/kudu/client/samples/sample.cc
M src/kudu/codegen/code_generator.cc
M src/kudu/codegen/module_builder.cc
M src/kudu/master/master-path-handlers.cc
M src/kudu/master/master-path-handlers.h
M src/kudu/server/default-path-handlers.cc
M src/kudu/server/pprof-path-handlers.cc
M src/kudu/server/rpcz-path-handler.cc
M src/kudu/server/server_base.cc
M src/kudu/server/tracing-path-handlers.cc
M src/kudu/server/tracing-path-handlers.h
M src/kudu/server/webserver-test.cc
M src/kudu/server/webserver.cc
M src/kudu/server/webserver.h
M src/kudu/server/webui_util.cc
M src/kudu/server/webui_util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tools/fs_dump-tool.cc
M src/kudu/tools/fs_list-tool.cc
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck_remote-test.cc
M src/kudu/tserver/tablet_server-stress-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/tserver/tserver-path-handlers.h
M src/kudu/util/jsonwriter.cc
M src/kudu/util/jsonwriter.h
M src/kudu/util/logging.cc
M src/kudu/util/mem_tracker.cc
M src/kudu/util/metrics-test.cc
M src/kudu/util/metrics.cc
M src/kudu/util/os-util.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/spinlock_profiling-test.cc
M src/kudu/util/spinlock_profiling.cc
M src/kudu/util/spinlock_profiling.h
M src/kudu/util/thread.cc
M src/kudu/util/thread.h
M src/kudu/util/trace.cc
M src/kudu/util/url-coding-test.cc
M src/kudu/util/url-coding.cc
M src/kudu/util/url-coding.h
M src/kudu/util/web_callback_registry.h
43 files changed, 228 insertions(+), 192 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Will Berkeley: Looks good to me, but someone else must approve
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I059c19bb287ee37f96e5ad42f4886015a1697d19
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Inlined dispatch for predicate evaluation

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Inlined dispatch for predicate evaluation
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4164/9/src/kudu/common/column_predicate.cc
File src/kudu/common/column_predicate.cc:

Line 312:   DCHECK_NOTNULL(sel);
nit: this will actually give an "unused result" compiler warning. Should just 
be DCHECK(sel)


-- 
To view, visit http://gerrit.cloudera.org:8080/4164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfac9bc899362b442337050795b5ca8c4101268
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Predicate evaluation pushdown

2016-09-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: Predicate evaluation pushdown
..


Predicate evaluation pushdown

The premise of this patch is to avoid the excessive use of CPU when
evaluating column predicates in specific cases. Dictionary blocks, for
instance, can evaluate predicates by checking whether a string's
codeword matches with the predicate, rather than by doing string
comparisons.

This patch uses a bitset to represent the set of codewords that match
a given predicate on dictionary-encoded columns. Certain decoders now
have the ability to evaluate a predicate without eagerly copying all
of their underlying data into a buffer first. Rather, the decoders can
first evaluate the predicate and copy only when needed.

Since dictionary encoding relies on plain encoding when a dictionary
gets too large, plain encoding also supports decoder-level evaluation.
While lacking the benefit from reduced string comparisons, this
optimization still improves scan speeds by avoiding excessive copies.

See the performance doc for a look into the performance differences
for dictionary encoding and plain encoding:
https://github.com/anjuwong/kudu/blob/pred-pushdown/docs/decoder-eval-perf.md

See the design-doc for predicate-eval-pushdown for a brief overview of the
considered implementations:
https://github.com/anjuwong/kudu/blob/sorted-dict-block/docs/design-docs/predicate-eval-pushdown.md

More in-depth analysis and benchmarking in upcoming blog post.

Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
Reviewed-on: http://gerrit.cloudera.org:8080/3990
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/cfile_util.cc
M src/kudu/cfile/encoding-test.cc
A src/kudu/common/column_materialization_context.h
M src/kudu/common/column_predicate.cc
M src/kudu/common/column_predicate.h
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/common/iterator.h
M src/kudu/common/rowblock.h
M src/kudu/common/schema.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set-test.cc
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/delta_applier.cc
M src/kudu/tablet/delta_applier.h
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.cc
M src/kudu/tablet/deltamemstore.h
A src/kudu/tablet/tablet-decoder-eval-test.cc
M src/kudu/tablet/tablet-test-util.h
35 files changed, 930 insertions(+), 124 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] API and style improvements to the Kudu Flume Sink

2016-09-08 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change.

Change subject: API and style improvements to the Kudu Flume Sink
..


Patch Set 1:

(8 comments)

Do we also want something in the release notes about the API change + 
AvroKuduOperationsProducer?

http://gerrit.cloudera.org:8080/#/c/4320/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduOperationsProducer.java
File 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/AvroKuduOperationsProducer.java:

Line 243:   String.format("Failed to coerce value for column `%s` 
to type %s",
> nit: if we are going to quote, let's use single-quotes, not backquotes here
Done


http://gerrit.cloudera.org:8080/#/c/4320/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduOperationsProducer.java
File 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduOperationsProducer.java:

Line 36: public interface KuduOperationsProducer extends Configurable {
> How about also: extends AutoCloseable
Done


http://gerrit.cloudera.org:8080/#/c/4320/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java
File 
java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java:

Line 101:   private KuduOperationsProducer eventProducer;
> Shouldn't this also be called operationsProducer instead of eventProducer?
Done


Line 145: eventProducer.close();
> I think we should wrap this in a try and log if it throws, then continue.
Done


Line 154:   throw new FlumeException("Error closing client", e);
> I think we should log errors but continue with shutdown here. Otherwise it 
Done


Line 158:   }
> I'd suggest throwing at the very end if we got any exceptions while shuttin
Done


Line 165: String.format("Missing master addresses. Please specify 
property '$s'.",
> nit: here and elsewhere, checkNotNull() supports substitution syntax: https
Done


http://gerrit.cloudera.org:8080/#/c/4320/1/java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/KeyedKuduOperationsProducerTest.java
File 
java/kudu-flume-sink/src/test/java/org/apache/kudu/flume/sink/KeyedKuduOperationsProducerTest.java:

Line 209: parameters.put(KuduSinkConfigurationConstants.PRODUCER, 
"org.apache.kudu.flume.sink.SimpleKeyedKuduOperationsProducer");
> nit: please wrap this long line. also prefer:
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I357df8cac7daa6ce105f9568cc3af09697032eb6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Ara Ebrahimi 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] Inlined dispatch for predicate evaluation

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Inlined dispatch for predicate evaluation
..


Patch Set 9:

Build Started http://104.196.14.100/job/kudu-gerrit/3281/

-- 
To view, visit http://gerrit.cloudera.org:8080/4164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfac9bc899362b442337050795b5ca8c4101268
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] Predicate evaluation pushdown

2016-09-08 Thread Andrew Wong (Code Review)
Hello Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3990

to look at the new patch set (#16).

Change subject: Predicate evaluation pushdown
..

Predicate evaluation pushdown

The premise of this patch is to avoid the excessive use of CPU when
evaluating column predicates in specific cases. Dictionary blocks, for
instance, can evaluate predicates by checking whether a string's
codeword matches with the predicate, rather than by doing string
comparisons.

This patch uses a bitset to represent the set of codewords that match
a given predicate on dictionary-encoded columns. Certain decoders now
have the ability to evaluate a predicate without eagerly copying all
of their underlying data into a buffer first. Rather, the decoders can
first evaluate the predicate and copy only when needed.

Since dictionary encoding relies on plain encoding when a dictionary
gets too large, plain encoding also supports decoder-level evaluation.
While lacking the benefit from reduced string comparisons, this
optimization still improves scan speeds by avoiding excessive copies.

See the performance doc for a look into the performance differences
for dictionary encoding and plain encoding:
https://github.com/anjuwong/kudu/blob/pred-pushdown/docs/decoder-eval-perf.md

See the design-doc for predicate-eval-pushdown for a brief overview of the
considered implementations:
https://github.com/anjuwong/kudu/blob/sorted-dict-block/docs/design-docs/predicate-eval-pushdown.md

More in-depth analysis and benchmarking in upcoming blog post.

Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/cfile_util.cc
M src/kudu/cfile/encoding-test.cc
A src/kudu/common/column_materialization_context.h
M src/kudu/common/column_predicate.cc
M src/kudu/common/column_predicate.h
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/common/iterator.h
M src/kudu/common/rowblock.h
M src/kudu/common/schema.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set-test.cc
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/delta_applier.cc
M src/kudu/tablet/delta_applier.h
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.cc
M src/kudu/tablet/deltamemstore.h
A src/kudu/tablet/tablet-decoder-eval-test.cc
M src/kudu/tablet/tablet-test-util.h
35 files changed, 930 insertions(+), 124 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/3990/16
-- 
To view, visit http://gerrit.cloudera.org:8080/3990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Predicate evaluation pushdown

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Predicate evaluation pushdown
..


Patch Set 16:

Build Started http://104.196.14.100/job/kudu-gerrit/3280/

-- 
To view, visit http://gerrit.cloudera.org:8080/3990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] tool: port kudu-fs dump, remove kudu-fs list (WIP)

2016-09-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: tool: port kudu-fs_dump, remove kudu-fs_list (WIP)
..


Patch Set 4:

Build Started http://104.196.14.100/job/kudu-gerrit/3279/

-- 
To view, visit http://gerrit.cloudera.org:8080/4305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1ec628b65613011d8c48b6239c13762276425966
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dinesh Bhat 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] tool: port kudu-fs dump, remove kudu-fs list (WIP)

2016-09-08 Thread Dinesh Bhat (Code Review)
Hello Adar Dembo, Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4305

to look at the new patch set (#4).

Change subject: tool: port kudu-fs_dump, remove kudu-fs_list (WIP)
..

tool: port kudu-fs_dump, remove kudu-fs_list (WIP)

This change also moves dump_cfile action under 'kudu fs dump cfile'.
kudu-fs_list tool has been removed altogether and
a 'kudu fs dump tree' has been added to dump the filesystem tree.
Majority of the kudu-fs_dump sub-commands which were relying on tablet
id as required parameter have been moved under 'kudu tablet'.

Also added tests under kudu-tool-test to exercise each of these fs tools.

Change-Id: I1ec628b65613011d8c48b6239c13762276425966
---
M docs/release_notes.adoc
M src/kudu/integration-tests/master_migration-itest.cc
M src/kudu/tools/CMakeLists.txt
D src/kudu/tools/fs_dump-tool.cc
D src/kudu/tools/fs_list-tool.cc
D src/kudu/tools/fs_tool.cc
D src/kudu/tools/fs_tool.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_fs.cc
M src/kudu/tools/tool_action_tablet.cc
13 files changed, 1,117 insertions(+), 1,151 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/4305/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1ec628b65613011d8c48b6239c13762276425966
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dinesh Bhat 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon