[kudu-CR] Allow for reserving disk space for non-Kudu processes

2016-05-26 Thread Mike Percy (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3135

to look at the new patch set (#3).

Change subject: Allow for reserving disk space for non-Kudu processes
..

Allow for reserving disk space for non-Kudu processes

Adds gflags to reserve disk space such that Kudu will not use more than
specified. Hadoop calls this functionality "du.reserved".

If a WAL preallocation is attempted while the log disk is past its
reservation limit then a crash will result.

The log block manager will use non-full disks if possible until all of
the disks are full. If a flush or compaction is attempted when all disks
are beyond their configured capacity then the process will crash.

This initial implementation provides a "best effort" approach. Disk
space checks are only done at preallocation time, and if writes continue
beyond the preallocated point (for both a WAL segment and a data block)
those writes will not be prevented. This makes it easier to provide a
"friendly" option where the block manager will divert new writes to
non-full disks, avoiding a hard crash when only one disk is past its
reservation limit.
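
As a rough illustration of the "best effort" check described above, here is a minimal Python sketch (not the patch's C++ code). It assumes a per-directory reservation in bytes and a caller-supplied function that reports free bytes; `DataDir`, `has_room`, and `pick_data_dir` are hypothetical names, and the exact comparison the real patch uses may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class DataDir:
    """Hypothetical stand-in for a data directory with a space reservation."""
    path: str
    reserved_bytes: int  # space that must stay free for non-Kudu processes


def has_room(free_bytes: int, reserved_bytes: int, requested_bytes: int) -> bool:
    """Return True if preallocating requested_bytes leaves the reservation intact."""
    return free_bytes - requested_bytes >= reserved_bytes


def pick_data_dir(
    dirs: Sequence[DataDir],
    free_bytes_fn: Callable[[str], int],
    requested_bytes: int,
) -> Optional[DataDir]:
    """Return the first directory that is not past its reservation, else None.

    The check happens only here, at preallocation time; writes that later grow
    past the preallocated size are not re-checked (the "best effort" part).
    """
    for d in dirs:
        if has_room(free_bytes_fn(d.path), d.reserved_bytes, requested_bytes):
            return d
    return None  # every directory is full; the caller would treat this as fatal
```

On a POSIX system the free-space function could be built on `os.statvfs`, for example `lambda p: os.statvfs(p).f_bavail * os.statvfs(p).f_frsize`. Keeping it injectable makes the selection logic easy to unit test.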

In the future, we may want to add "hard" and "soft" limits, such that
going beyond the soft limit will do what we do today, and going beyond
the hard limit (say, by writing a very large data block past its
preallocation point) will result in a crash.

This patch includes:

* Some unit tests.
* End-to-end test for compaction falling back to non-full disks due to
  disk space backpressure and finally crashing when there is no space
  left in any data dir.
* End-to-end test for writes failing due to WAL disk space backpressure,
  causing a crash.

Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/disk_reservation-itest.cc
M src/kudu/util/CMakeLists.txt
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
A src/kudu/util/env_util-test.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
M src/kudu/util/memenv/memenv.cc
M src/kudu/util/scoped_cleanup.h
16 files changed, 596 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3135/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Don't use InMemoryEnv in deltafile-test

2016-05-26 Thread Mike Percy (Code Review)
Mike Percy has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3235

Change subject: Don't use InMemoryEnv in deltafile-test
..

Don't use InMemoryEnv in deltafile-test

Get out the vote: #NeverMemEnv.

This is causing problems due to Status::NotSupported for StatVfs() in an
upcoming patch.

Change-Id: I380249e6a72a93e1fde86a551c9d4d32d35904da
---
M src/kudu/tablet/deltafile-test.cc
1 file changed, 3 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3235/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3235

Gerrit-MessageType: newchange
Gerrit-Change-Id: I380249e6a72a93e1fde86a551c9d4d32d35904da
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 


[kudu-CR] KUDU-1444. Get resource metrics of a scan.

2016-05-26 Thread zhen.zhang (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3013

to look at the new patch set (#8).

Change subject: KUDU-1444. Get resource metrics of a scan.
..

KUDU-1444. Get resource metrics of a scan.

This patch lets the client retrieve the resource metrics of a scan. The
resource metrics are sent back to the client in every scan RPC response. This
is useful for Impala to show these stats in a query profile.

For now, the resource metrics contain only cfile_cache_miss_bytes and
cfile_cache_hit_bytes. We may add more in the future as needed.
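
A minimal Python sketch of how a client might accumulate these per-response metrics, purely for illustration. It assumes each scan RPC response carries increments since the previous response (the message above does not say whether values are deltas or running totals), and `ScanResourceMetrics` is a hypothetical name rather than the class added by this patch.

```python
from collections import Counter
from typing import Dict


class ScanResourceMetrics:
    """Hypothetical client-side accumulator for per-scan resource metrics."""

    def __init__(self) -> None:
        self._totals: Counter = Counter()

    def update(self, response_metrics: Dict[str, int]) -> None:
        """Fold the metrics from one scan RPC response into the running totals."""
        for name, value in response_metrics.items():
            self._totals[name] += value

    def get(self, name: str) -> int:
        """Return the accumulated value, or 0 if the metric was never reported."""
        return self._totals[name]

    def as_dict(self) -> Dict[str, int]:
        """Return a plain-dict snapshot, e.g. for a query profile."""
        return dict(self._totals)
```

Keying the metrics by name rather than by fixed fields means new counters can be added later without changing the accumulator.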

Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/resource_metrics-internal.h
A src/kudu/client/resource_metrics.cc
A src/kudu/client/resource_metrics.h
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
M src/kudu/util/trace_metrics.h
13 files changed, 287 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/3013/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3013

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: zhen.zhang 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: zhen.zhang 


[kudu-CR] Allow for reserving disk space for non-Kudu processes

2016-05-26 Thread Mike Percy (Code Review)
Hello Adar Dembo, Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3135

to look at the new patch set (#2).

Change subject: Allow for reserving disk space for non-Kudu processes
..

Allow for reserving disk space for non-Kudu processes

Adds gflags to reserve disk space such that Kudu will not use more than
specified. Hadoop calls this functionality "du.reserved".

If a WAL preallocation is attempted while the log disk is past its
reservation limit then a crash will result.

The log block manager will use non-full disks if possible until all of
the disks are full. If a flush or compaction is attempted when all disks
are beyond their configured capacity then the process will crash.

This initial implementation provides a "best effort" approach. Disk
space checks are only done at preallocation time, and if writes continue
beyond the preallocated point (for both a WAL segment and a data block)
those writes will not be prevented. This makes it easier to provide a
"friendly" option where the block manager will divert new writes to
non-full disks, avoiding a hard crash when only one disk is past its
reservation limit.

In the future, we may want to add "hard" and "soft" limits, such that
going beyond the soft limit will do what we do today, and going beyond
the hard limit (say, by writing a very large data block past its
preallocation point) will result in a crash.

This patch includes:

* Some unit tests.
* End-to-end test for compaction falling back to non-full disks due to
  disk space backpressure and finally crashing when there is no space
  left in any data dir.
* End-to-end test for writes failing due to WAL disk space backpressure,
  causing a crash.

Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/disk_reservation-itest.cc
M src/kudu/util/CMakeLists.txt
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
A src/kudu/util/env_util-test.cc
M src/kudu/util/env_util.cc
M src/kudu/util/env_util.h
M src/kudu/util/memenv/memenv.cc
M src/kudu/util/scoped_cleanup.h
16 files changed, 596 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3135/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3135

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Make BuildLog() return Status

2016-05-26 Thread Mike Percy (Code Review)
Hello Adar Dembo, Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3134

to look at the new patch set (#2).

Change subject: Make BuildLog() return Status
..

Make BuildLog() return Status

This allows for writing a reasonable test related to du.reserved.

Change-Id: Icfa8ddba74b909f5697ce03b45e97c1733082b07
---
M src/kudu/consensus/log-test-base.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/mt-log-test.cc
M src/kudu/tablet/tablet_bootstrap-test.cc
4 files changed, 37 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/3134/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3134

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfa8ddba74b909f5697ce03b45e97c1733082b07
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log: Mark allocation finished even if allocation had an error

2016-05-26 Thread Mike Percy (Code Review)
Hello Adar Dembo, Todd Lipcon,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/3234

to review the following change.

Change subject: log: Mark allocation finished even if allocation had an error
..

log: Mark allocation finished even if allocation had an error

The problem this patch fixes is that if a disk preallocation fails we
will enter a "stuck" state where we cannot preallocate a new segment.
Failures to allocate or append should be fatal.

A test exercising this code path is in a follow-up patch related to disk
reservations. Since this is a bug fix, it seemed cleaner to separate it
out into its own commit.
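
The idea of marking allocation finished even when it fails can be sketched as follows. This is an illustrative Python model, not the code in log.cc; `AllocationState` and `SegmentAllocator` are hypothetical names, and the preallocation step is passed in as a callable.

```python
import enum
from typing import Callable, Optional


class AllocationState(enum.Enum):
    NOT_STARTED = 1
    IN_PROGRESS = 2
    FINISHED = 3


class SegmentAllocator:
    """Hypothetical model of a log segment allocator's state tracking."""

    def __init__(self, preallocate: Callable[[], None]) -> None:
        self._preallocate = preallocate
        self.state = AllocationState.NOT_STARTED
        self.last_error: Optional[OSError] = None

    def allocate(self) -> None:
        if self.state is AllocationState.IN_PROGRESS:
            raise RuntimeError("allocation already in progress")
        self.state = AllocationState.IN_PROGRESS
        try:
            self._preallocate()
        except OSError as e:
            # Record the failure and re-raise so the caller can treat it as fatal.
            self.last_error = e
            raise
        finally:
            # Always leave IN_PROGRESS, on success or failure, so the
            # allocator never gets stuck waiting on a failed attempt.
            self.state = AllocationState.FINISHED
```

Without the `finally` clause, a failed preallocation would leave the state at `IN_PROGRESS` and every later call would be rejected, which is the "stuck" behaviour described above.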

Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3
---
M src/kudu/consensus/log.cc
1 file changed, 9 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/3234/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3234

Gerrit-MessageType: newchange
Gerrit-Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fix compile error when compiling column_predicate-test.cc In env : boost 1.57, centos 6.5, compile failed because of : "operator<<: cannot bind lvalue to 'std::basic_ostream&&'" wr

2016-05-26 Thread song bruce zhang (Code Review)
song bruce zhang has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3233

Change subject:  fix compile error when compiling column_predicate-test.cc  In 
env : boost 1.57, centos 6.5, compile failed because of :  "operator<<: cannot 
bind lvalue to 'std::basic_ostream&&'"  write a operator<< in 
column_predicate.h , compile succeed.
..

Fix compile error when compiling column_predicate-test.cc
 In env: boost 1.57, CentOS 6.5, the compile failed because of:
 "operator<<: cannot bind lvalue to 'std::basic_ostream&&'"
 Writing an operator<< in column_predicate.h makes the compile succeed.

  Author:bruceSz song zhang
  Date:  Fri May 27 09:27:46 2016 +0800
 Change-Id: I1f6794b351e49d7a59c542a80161171b7e211093

Change-Id: I6f9a9634eeccd86616be80b004ecce596155bb57
---
M src/kudu/common/column_predicate.h
1 file changed, 10 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/3233/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3233

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6f9a9634eeccd86616be80b004ecce596155bb57
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: song bruce zhang 


[kudu-CR] KUDU-1470 Exceptions on getting a column value should return the column name not the column number

2016-05-26 Thread Ted Malaska (Code Review)
Ted Malaska has posted comments on this change.

Change subject: KUDU-1470 Exceptions on getting a column value should return 
the column name not the column number
..


Patch Set 2:

I made a second patch because I noticed my indentation was off.

-- 
To view, visit http://gerrit.cloudera.org:8080/3231

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ted Malaska 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Ted Malaska 
Gerrit-HasComments: No


[kudu-CR] KUDU-1470 Exceptions on getting a column value should return the column name not the column number

2016-05-26 Thread Ted Malaska (Code Review)
Ted Malaska has uploaded a new patch set (#2).

Change subject: KUDU-1470 Exceptions on getting a column value should return 
the column name not the column number
..

KUDU-1470 Exceptions on getting a column value should return the column name 
not the column number
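
To illustrate the intent of the change, here is a small Python sketch of an accessor whose error message names the column as well as its index. It is a hypothetical model, not the Java `RowResult` class; the type strings and the message wording are assumptions.

```python
from typing import Any, Sequence


class RowResult:
    """Hypothetical row wrapper that reports column names in type errors."""

    def __init__(
        self,
        column_names: Sequence[str],
        column_types: Sequence[str],
        values: Sequence[Any],
    ) -> None:
        self._names = list(column_names)
        self._types = list(column_types)
        self._values = list(values)

    def _check_type(self, index: int, expected: str) -> None:
        actual = self._types[index]
        if actual != expected:
            # Include the column name so the caller does not have to map
            # an index back to the schema by hand.
            raise TypeError(
                f"Column (name: {self._names[index]}, index: {index}) "
                f"is of type {actual} but was requested as {expected}"
            )

    def get_int(self, index: int) -> int:
        self._check_type(index, "int32")
        return self._values[index]
```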

Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5
---
M java/kudu-client/src/main/java/org/kududb/client/RowResult.java
1 file changed, 6 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3231/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3231

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ted Malaska 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-1470 Exceptions on getting a column value should return the column name not the column number

2016-05-26 Thread Ted Malaska (Code Review)
Ted Malaska has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3231

Change subject: KUDU-1470 Exceptions on getting a column value should return 
the column name not the column number
..

KUDU-1470 Exceptions on getting a column value should return the column name 
not the column number

Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5
---
M java/kudu-client/src/main/java/org/kududb/client/RowResult.java
1 file changed, 5 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3231/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3231

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ted Malaska 


[kudu-CR] Kudu 0.9.0 release notes edit

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3176

to look at the new patch set (#4).

Change subject: Kudu 0.9.0 release notes edit
..

Kudu 0.9.0 release notes edit

Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97
---
M docs/installation.adoc
M docs/release_notes.adoc
2 files changed, 42 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/3176/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3176

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Misty Stanley-Jones 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR](branch-0.9.x) Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Dan Burkert has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3229

Change subject: Remove default table partitioning
..

Remove default table partitioning

This commit removes the current default of creating tables with range
partitioning over the primary key columns with no splits. This default is
problematic because it results in a single tablet, which is a known
anti-pattern. Kudu can't predict appropriate split rows without knowledge of the
dataset, so creating default splits is not technically feasible.

A better default than range partitioning would be to hash partition on the primary
key columns with a number of buckets based on the number of tablet servers.
Unfortunately, it's similarly difficult to predict an appropriate number of hash
buckets without knowledge of the data set.

Since changing the default would be a breaking change, and we don't currently
have a bullet-proof default option, this commit changes the table creator in the
C++ and Java clients to force users to explicitly specify at least range or
hash partitioning. Users who really do want a table with no partitioning (a
single tablet) can still explicitly set the range partition columns to an
empty list and provide no split rows.
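
The rule described above, that table creation fails unless the caller makes an explicit partitioning choice, can be modelled roughly like this. The sketch is a hypothetical Python class for illustration only; it is not the C++, Java, or Python Kudu client, and the method names and error text are assumptions.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple


class TableCreator:
    """Hypothetical builder that refuses to create a table without partitioning."""

    def __init__(self, name: str) -> None:
        self._name = name
        # None means "never set"; an empty list is an explicit choice
        # of a single, unpartitioned tablet.
        self._range_columns: Optional[List[str]] = None
        self._hash_schemas: List[Tuple[List[str], int]] = []

    def set_range_partition_columns(self, columns: Sequence[str]) -> "TableCreator":
        self._range_columns = list(columns)
        return self

    def add_hash_partitions(self, columns: Sequence[str], num_buckets: int) -> "TableCreator":
        if num_buckets < 2:
            raise ValueError("num_buckets must be at least 2")
        self._hash_schemas.append((list(columns), num_buckets))
        return self

    def create(self) -> Dict[str, Any]:
        if self._range_columns is None and not self._hash_schemas:
            raise ValueError(
                "Table partitioning must be specified: set range partition "
                "columns or add hash partitions"
            )
        return {
            "name": self._name,
            "range_columns": self._range_columns or [],
            "hash_schemas": self._hash_schemas,
        }
```

Distinguishing "never set" (`None`) from "set to an empty list" is what lets a user still ask for a single tablet on purpose while catching the accidental case.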

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Reviewed-on: http://gerrit.cloudera.org:8080/3131
Tested-by: Kudu Jenkins
Reviewed-by: Misty Stanley-Jones 
(cherry picked from commit 0e7c257f950e8875fe9d6a541cc03918ae23912e)
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 226 insertions(+), 140 deletions(-)


  git pull 

[kudu-CR] Document advice about max columns and record size

2016-05-26 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Document advice about max columns and record size
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/2778

Gerrit-MessageType: comment
Gerrit-Change-Id: I70a82d59c431f69246128acc19227af3194fa15a
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Misty Stanley-Jones 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] Non-covering Range Partitions design doc

2016-05-26 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Non-covering Range Partitions design doc
..


Patch Set 11: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/2772

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e530eda60c00faf066c41b6bdb2b37f6d96a5dc
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Binglin Chang 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#17).

Change subject: Remove default table partitioning
..

Remove default table partitioning

This commit removes the current default of creating tables with range
partitioning over the primary key columns with no splits. This default is
problematic because it results in a single tablet, which is a known
anti-pattern. Kudu can't predict appropriate split rows without knowledge of the
dataset, so creating default splits is not technically feasible.

A better default than range partitioning would be to hash partition on the primary
key columns with a number of buckets based on the number of tablet servers.
Unfortunately, it's similarly difficult to predict an appropriate number of hash
buckets without knowledge of the data set.

Since changing the default would be a breaking change, and we don't currently
have a bullet-proof default option, this commit changes the table creator in the
C++ and Java clients to force users to explicitly specify at least range or
hash partitioning. Users who really do want a table with no partitioning (a
single tablet) can still explicitly set the range partition columns to an
empty list and provide no split rows.

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 226 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/17
-- 
To view, visit http://gerrit.cloudera.org:8080/3131

[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3131/15/docs/schema_design.adoc
File docs/schema_design.adoc:

Line 163: IMPORTANT: Kudu does not provide a default partitioning strategy when 
creating tables. It
> If you want to be able to link to this, here is how. You need to change it 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3131

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#16).

Change subject: Remove default table partitioning
..

Remove default table partitioning

This commit removes the current default of creating tables with range
partitioning over the primary key columns with no splits. This default is
problematic because it results in a single tablet, which is a known
anti-pattern. Kudu can't predict appropriate split rows without knowledge of the
dataset, so creating default splits is not technically feasible.

A better default than range partitioning would be to hash partition on the primary
key columns with a number of buckets based on the number of tablet servers.
Unfortunately, it's similarly difficult to predict an appropriate number of hash
buckets without knowledge of the data set.

Since changing the default would be a breaking change, and we don't currently
have a bullet-proof default option, this commit changes the table creator in the
C++ and Java clients to force users to explicitly specify at least range or
hash partitioning. Users who really do want a table with no partitioning (a
single tablet) can still explicitly set the range partition columns to an
empty list and provide no split rows.

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 221 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/16
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR] Remove default table partitioning

2016-05-26 Thread Misty Stanley-Jones (Code Review)
Misty Stanley-Jones has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3131/13/docs/release_notes.adoc
File docs/release_notes.adoc:

Line 69: - Default table partitioning has been removed. All tables must now be 
created
> The specifics of how to set partitioning depends on the client, so I'm not 
Maybe a link to how to do it in Impala? I just think that people might read 
this and say "Meh, I don't know how to do that so I'm going to ignore it." But 
maybe not.


http://gerrit.cloudera.org:8080/#/c/3131/13/docs/schema_design.adoc
File docs/schema_design.adoc:

Line 178: distribution keyspace. Range partitioning may be configured to use 
any subset of
> I updated the sentence, let me know if it makes more sense now.
Done


http://gerrit.cloudera.org:8080/#/c/3131/13/java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
File java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java:

Line 299:  "setRangePartitionColumns or 
addHashPartitions");
> I personally think documenting setRangePartitionColumns is enough given tha
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#15).

Change subject: Remove default table partitioning
..

Remove default table partitioning

This commit removes the current default of creating tables with range
partitioning over the primary key columns with no splits. This default is
problematic because it results in a single tablet, which is a known
anti-pattern. Kudu can't predict appropriate split rows without knowledge of the
dataset, so creating default splits is not technically feasible.

A better default than range partitioning would be to hash partition on the primary
key columns with a number of buckets based on the number of tablet servers.
Unfortunately, it's similarly difficult to predict an appropriate number of hash
buckets without knowledge of the data set.

Since changing the default would be a breaking change, and we don't currently
have a bullet-proof default option, this commit changes the table creator in the
C++ and Java clients to force users to explicitly specify at least range or
hash partitioning. Users who really do want a table with no partitioning (a
single tablet) can still explicitly set the range partition columns to an
empty list and provide no split rows.
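
The rule described above could be sketched roughly as follows. This is only
an illustration in Python: the class and function names are hypothetical
stand-ins modelled on the setRangePartitionColumns and addHashPartitions calls
named in the review comments, and the check is an assumption about the
behaviour, not the actual client code.

```python
class CreateTableOptions:
    """Hypothetical stand-in for a client's table-creation options."""

    def __init__(self):
        # None means the caller never chose range partitioning at all.
        self.range_columns = None
        # Each entry is (columns, num_buckets).
        self.hash_partitions = []

    def set_range_partition_columns(self, columns):
        # An explicit empty list is allowed: it asks for a single tablet.
        self.range_columns = list(columns)
        return self

    def add_hash_partitions(self, columns, num_buckets):
        self.hash_partitions.append((list(columns), num_buckets))
        return self


def validate(options):
    """Reject options that rely on the removed default partitioning."""
    if options.range_columns is None and not options.hash_partitions:
        raise ValueError(
            "Table partitioning must be specified using "
            "set_range_partition_columns or add_hash_partitions")
    return True
```

In this sketch an empty range column list still passes validation, which
mirrors the "single tablet" escape hatch described in the commit message.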

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 219 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/15
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR] Document advice about max columns and record size

2016-05-26 Thread Misty Stanley-Jones (Code Review)
Misty Stanley-Jones has posted comments on this change.

Change subject: Document advice about max columns and record size
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2778/3/docs/schema_design.adoc
File docs/schema_design.adoc:

Line 282: Size of Records:: There is no hard limit imposed by Kudu, but large 
values (10s of
> I don't think so. I think there is size of rows, number of columns, and siz
This was a terminology misunderstanding. I meant 'cells'. Changed.


-- 
To view, visit http://gerrit.cloudera.org:8080/2778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I70a82d59c431f69246128acc19227af3194fa15a
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Misty Stanley-Jones 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Document advice about max columns and record size

2016-05-26 Thread Misty Stanley-Jones (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/2778

to look at the new patch set (#4).

Change subject: Document advice about max columns and record size
..

Document advice about max columns and record size

Change-Id: I70a82d59c431f69246128acc19227af3194fa15a
---
M docs/schema_design.adoc
1 file changed, 14 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/2778/4
-- 
To view, visit http://gerrit.cloudera.org:8080/2778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I70a82d59c431f69246128acc19227af3194fa15a
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Misty Stanley-Jones 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR](gh-pages) Add support for anchors.js to get permalinks on all headers

2016-05-26 Thread Misty Stanley-Jones (Code Review)
Misty Stanley-Jones has uploaded a new patch set (#4).

Change subject: Add support for anchors.js to get permalinks on all headers
..

Add support for anchors.js to get permalinks on all headers

Change-Id: I17bb50f412d8214f91fc5cf8267823d07db03222
---
M _includes/bottom_common.html
1 file changed, 8 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/2739/4
-- 
To view, visit http://gerrit.cloudera.org:8080/2739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I17bb50f412d8214f91fc5cf8267823d07db03222
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-Owner: Misty Stanley-Jones 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#13).

Change subject: Remove default table partitioning
..

Remove default table partitioning

This commit removes the current default of creating tables with range
partitioning over the primary key columns with no splits. This default is
problematic because it results in a single tablet, which is a known
anti-pattern. Kudu can't predict appropriate split rows without knowledge of the
dataset, so creating default splits is not technically feasible.

A better default than range partitioning would be to hash partition on the primary
key columns with a number of buckets based on the number of tablet servers.
Unfortunately, it's similarly difficult to predict an appropriate number of hash
buckets without knowledge of the data set.

Since changing the default would be a breaking change, and we don't currently
have a bullet-proof default option, this commit changes the table creator in the
C++ and Java clients to force users to explicitly specify at least range or
hash partitioning. Users who really do want a table with no partitioning (a
single tablet) can still explicitly set the range partition columns to an
empty list and provide no split rows.

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 219 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/13
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR] Reduce verbosity of Java logs

2016-05-26 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has posted comments on this change.

Change subject: Reduce verbosity of Java logs
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] Reduce verbosity of Java logs

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3203

to look at the new patch set (#3).

Change subject: Reduce verbosity of Java logs
..

Reduce verbosity of Java logs

This commit changes the logging format to make it more compact. The date and
originating class have been removed from log lines; however, the time,
originating filename, and line remain. The minicluster has been changed to use
the relative binary name plus port as the thread name instead of the full path
to the binary (e.g. kudu-master:7051 instead of
/home/dan/kudu/build/debug/bin/kudu-master). The result is that a typical log
line originating from the mini cluster has been reduced from 349 to 214 columns.
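
As a rough illustration of the thread-naming change described above, the
following Python sketch derives a "binary name plus port" label from a full
binary path. The function name is hypothetical and this is not the
MiniKuduCluster code itself; the path and port are the ones given in the
commit message.

```python
import os.path


def minicluster_thread_name(binary_path, port):
    """Return '<binary name>:<port>' instead of the full binary path."""
    return "{}:{}".format(os.path.basename(binary_path), port)


# Prints "kudu-master:7051"
print(minicluster_thread_name(
    "/home/dan/kudu/build/debug/bin/kudu-master", 7051))
```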

Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d
---
M java/kudu-client/src/test/java/org/kududb/client/MiniKuduCluster.java
M java/kudu-client/src/test/resources/log4j.properties
M java/kudu-spark/src/test/resources/log4j.properties
3 files changed, 9 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/03/3203/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3131/9/java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
File java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java:

Line 24
> Nit: don't unroll.
Done


http://gerrit.cloudera.org:8080/#/c/3131/9/java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala:

Line 20: import java.util
> If you're not changing file contents, could you avoid changing the import o
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#12).

Change subject: Remove default table partitioning
..

Remove default table partitioning

This commit removes the current default of creating tables with range
partitioning over the primary key columns with no splits. This default is
problematic because it results in a single tablet, which is a known
anti-pattern. Kudu can't predict appropriate split rows without knowledge of the
dataset, so creating default splits is not technically feasible.

A better default than range partitioning would be to hash partition on the primary
key columns with a number of buckets based on the number of tablet servers.
Unfortunately, it's similarly difficult to predict an appropriate number of hash
buckets without knowledge of the data set.

Since changing the default would be a breaking change, and we don't currently
have a bullet-proof default option, this commit changes the table creator in the
C++ and Java clients to force users to explicitly specify at least range or
hash partitioning. Users who really do want a table with no partitioning (a
single tablet) can still explicitly set the range partition columns to an
empty list and provide no split rows.

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 218 insertions(+), 139 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/12
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR](branch-0.9.x) ksck: usability improvements in error messages

2016-05-26 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has posted comments on this change.

Change subject: ksck: usability improvements in error messages
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3cd6a6e0f82d03f890ac70af32814277dfd33c54
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-0.9.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR](branch-0.9.x) ksck: usability improvements in error messages

2016-05-26 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has submitted this change and it was merged.

Change subject: ksck: usability improvements in error messages
..


ksck: usability improvements in error messages

- print the address of tablet servers that fail to connect
- print the table names of bad tablets

This doesn't lend itself to new test assertions, but I verified the
changes both against a real cluster and by inspecting the output
of 'ksck-test'.

Change-Id: I3cd6a6e0f82d03f890ac70af32814277dfd33c54
Reviewed-on: http://gerrit.cloudera.org:8080/3225
Reviewed-by: Jean-Daniel Cryans
Tested-by: Kudu Jenkins
(cherry picked from commit 3ab902eb33bb6f0d7e5ac56892ddcf4d0f1f1300)
Reviewed-on: http://gerrit.cloudera.org:8080/3226
---
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck.h
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/ksck_remote.h
5 files changed, 47 insertions(+), 31 deletions(-)

Approvals:
  Jean-Daniel Cryans: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3cd6a6e0f82d03f890ac70af32814277dfd33c54
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: branch-0.9.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-1444. Get resource metrics of a scan.

2016-05-26 Thread zhen.zhang (Code Review)
zhen.zhang has posted comments on this change.

Change subject: KUDU-1444. Get resource metrics of a scan.
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3013/7/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

Line 272: // As all the data are in MRS, both cfile_cache_miss_bytes and 
cfile_cache_miss_bytes are 0
It seems that all inserted data is still in the MemRowSet, so cfile_reader is 
never called during the scan, and as a result both cfile_cache_miss_bytes and 
cfile_cache_hit_bytes are 0. What should I do to get the data flushed to a 
DiskRowSet?


-- 
To view, visit http://gerrit.cloudera.org:8080/3013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: zhen.zhang 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: zhen.zhang 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1444. Get resource metrics of a scan.

2016-05-26 Thread zhen.zhang (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3013

to look at the new patch set (#7).

Change subject: KUDU-1444. Get resource metrics of a scan.
..

KUDU-1444. Get resource metrics of a scan.

This patch adds support for getting the resource metrics of a scan on the
client side. The resource metrics are sent back to the client in every scan RPC
response. This is useful for Impala, which can show these stats in a query profile.

For now, the resource metrics only contain cfile_cache_miss_bytes and
cfile_cache_hit_bytes. We may add more in the future as needed.
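
One way to picture the client-side behaviour described above is a small
accumulator that sums the counters carried by each scan RPC response. The
Python sketch below is only an illustration: the class, its methods, and the
assumption that each response carries per-RPC increments are hypothetical,
the byte counts are made-up sample values, and only the two metric names come
from this patch.

```python
class ResourceMetrics:
    """Hypothetical client-side accumulator for per-scan metrics."""

    def __init__(self):
        self._totals = {}

    def add_from_response(self, response_metrics):
        # Assumption: each scan RPC response carries counter increments
        # for that RPC only, so the client sums them across responses.
        for name, value in response_metrics.items():
            self._totals[name] = self._totals.get(name, 0) + value

    def get(self, name):
        # Metrics never reported by the server read as 0.
        return self._totals.get(name, 0)


metrics = ResourceMetrics()
# Made-up sample values for two scan RPC responses.
metrics.add_from_response(
    {"cfile_cache_miss_bytes": 4096, "cfile_cache_hit_bytes": 0})
metrics.add_from_response(
    {"cfile_cache_miss_bytes": 0, "cfile_cache_hit_bytes": 8192})
```

A test along the lines suggested in review could then assert that at least
one of the two totals is greater than 0 after scanning data that has been
flushed to disk.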

Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/resource_metrics-internal.h
A src/kudu/client/resource_metrics.cc
A src/kudu/client/resource_metrics.h
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
M src/kudu/util/trace_metrics.h
13 files changed, 265 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/3013/7
-- 
To view, visit http://gerrit.cloudera.org:8080/3013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: zhen.zhang 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: zhen.zhang 


[kudu-CR] KUDU-1444. Get resource metrics of a scan.

2016-05-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1444. Get resource metrics of a scan.
..


Patch Set 6:

OK. The code now looks good, but it's missing a test. Sorry, I should have 
mentioned that previously. Maybe add an assertion somewhere in client-test.cc? 
There are lots of scanner tests there, so it should be possible to just check 
that the metrics show up and at least one of the two is > 0.

-- 
To view, visit http://gerrit.cloudera.org:8080/3013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: zhen.zhang 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: zhen.zhang 
Gerrit-HasComments: No