[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#17).

Change subject: Remove default table partitioning
..

Remove default table partitioning

This commit removes the current default of creating tables with range
partitioning over the primary key columns with no splits. This default is
problematic because it results in a single tablet, which is a known
anti-pattern. Kudu can't predict appropriate split rows without knowledge of the
dataset, so creating default splits is not technically feasible.

A better default than range partitioning would be to hash partition on the
primary key columns with a number of buckets based on the number of tablet
servers. Unfortunately, it's similarly difficult to predict an appropriate
number of hash buckets without knowledge of the data set.

Since changing the default would be a breaking change, and we don't currently
have a bullet-proof default option, this commit changes the table creator in the
C++ and Java clients to force users to explicitly specify at least range or
hash partitioning. Users who really do want a table with no partitioning (a
single tablet) can still explicitly set the range partition columns to an
empty list and provide no split rows.
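
As a rough illustration of the behavior described above, the following is a
minimal Python sketch of a table creator that refuses to create a table unless
partitioning was set explicitly. The TableCreator class, its method names, and
the error text are hypothetical stand-ins written for this example; they are
not the Kudu client API. The tablet count assumes no split rows, so range
partitioning contributes a single partition and each hash level multiplies the
count by its number of buckets.

```python
class TableCreator:
    """Hypothetical model of a client-side table creator (not the Kudu API)."""

    def __init__(self, name):
        self.name = name
        # None means "never set"; an empty list means "explicitly no columns".
        self.range_columns = None
        # Each entry is a (columns, num_buckets) pair.
        self.hash_partitions = []

    def set_range_partition_columns(self, columns):
        self.range_columns = list(columns)
        return self

    def add_hash_partitions(self, columns, num_buckets):
        self.hash_partitions.append((list(columns), num_buckets))
        return self

    def create(self):
        # No implicit default: the caller must have chosen range or hash.
        if self.range_columns is None and not self.hash_partitions:
            raise ValueError(
                "table partitioning must be specified using "
                "set_range_partition_columns or add_hash_partitions")
        # With no split rows, range partitioning yields one partition,
        # and every hash level multiplies the tablet count by its buckets.
        tablets = 1
        for _columns, num_buckets in self.hash_partitions:
            tablets *= num_buckets
        return {"name": self.name, "tablets": tablets}


# Explicitly unpartitioned table: one tablet, but only because it was asked for.
single = TableCreator("t1").set_range_partition_columns([]).create()

# Hash partitioned table: one tablet per bucket.
hashed = TableCreator("t2").add_hash_partitions(["key"], 4).create()
```

In this sketch, calling create() without either setter raises an error, which
mirrors the intent of forcing an explicit choice rather than silently producing
a single tablet.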

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 226 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/17
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3131/15/docs/schema_design.adoc
File docs/schema_design.adoc:

Line 163: IMPORTANT: Kudu does not provide a default partitioning strategy when creating tables. It
> If you want to be able to link to this, here is how. You need to change it 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#16).

Change subject: Remove default table partitioning
..

Remove default table partitioning

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 221 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/16
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR] Remove default table partitioning

2016-05-26 Thread Misty Stanley-Jones (Code Review)
Misty Stanley-Jones has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3131/13/docs/release_notes.adoc
File docs/release_notes.adoc:

Line 69: - Default table partitioning has been removed. All tables must now be created
> The specifics of how to set partitioning depends on the client, so I'm not 
Maybe a link to how to do it in Impala? I just think that people might read 
this and say "Meh, I don't know how to do that so I'm going to ignore it." But 
maybe not.


http://gerrit.cloudera.org:8080/#/c/3131/13/docs/schema_design.adoc
File docs/schema_design.adoc:

Line 178: distribution keyspace. Range partitioning may be configured to use any subset of
> I updated the sentence, let me know if it makes more sense now.
Done


http://gerrit.cloudera.org:8080/#/c/3131/13/java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
File java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java:

Line 299:  "setRangePartitionColumns or addHashPartitions");
> I personally think documenting setRangePartitionColumns is enough given tha
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#15).

Change subject: Remove default table partitioning
..

Remove default table partitioning

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 219 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/15
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#13).

Change subject: Remove default table partitioning
..

Remove default table partitioning

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 219 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/13
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3131/9/java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
File java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java:

Line 24
> Nit: don't unroll.
Done


http://gerrit.cloudera.org:8080/#/c/3131/9/java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala:

Line 20: import java.util
> If you're not changing file contents, could you avoid changing the import o
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Remove default table partitioning

2016-05-26 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#12).

Change subject: Remove default table partitioning
..

Remove default table partitioning

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/client.pyx
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 218 insertions(+), 139 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/12
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit 

[kudu-CR] Remove default table partitioning

2016-05-25 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 11:

Python tests are still broken, and it looks like there's an RpcBenchmark 
failure too?

Oh, and I think you missed my two comments from PS9.

-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] Remove default table partitioning

2016-05-25 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#11).

Change subject: Remove default table partitioning
..

Remove default table partitioning

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
56 files changed, 223 insertions(+), 141 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/11
-- 
To view, visit http://gerrit.cloudera.org:8080/3131

[kudu-CR] Remove default table partitioning

2016-05-25 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3131

to look at the new patch set (#10).

Change subject: Remove default table partitioning
..

Remove default table partitioning

This commit removes the current default of creating tables with range
partitioning over the primary key columns with no splits. This default is
problematic because it results in a single tablet, which is a known
anti-pattern. Kudu can't predict appropriate split rows without knowledge of the
dataset, so creating default splits is not technically feasible.

A better default than range partitioning would be to hash partition on the primary
key columns with a number of buckets based on the number of tablet servers.
Unfortunately, it's similarly difficult to predict an appropriate number of hash
buckets without knowledge of the data set.

Since changing the default would be a breaking change, and we don't currently
have a bullet-proof default option, this commit changes the table creator in the
C++ and Java clients to force users to explicitly specify at least range or
hash partitioning. Users who really do want a table with no partitioning (a
single tablet) can still explicitly set the range partition columns to an
empty list and provide no split rows.

Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
---
M docs/release_notes.adoc
M docs/schema_design.adoc
M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java
M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java
M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java
M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java
M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java
M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java
M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java
M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java
M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java
M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java
M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java
M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java
M java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/samples/sample.cc
M src/kudu/integration-tests/all_types-itest.cc
M src/kudu/integration-tests/alter_table-randomized-test.cc
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/create-table-itest.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/integration-tests/full_stack-insert-scan-test.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/remote_bootstrap-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/update_scan_delta_compact-test.cc
M src/kudu/integration-tests/write_throttling-itest.cc
M src/kudu/tools/ksck_remote-test.cc
54 files changed, 216 insertions(+), 139 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/10
-- 
To view, visit http://gerrit.cloudera.org:8080/3131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings


[kudu-CR] Remove default table partitioning

2016-05-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3131/9/java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java
File java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java:

Line 24
Nit: don't unroll.


http://gerrit.cloudera.org:8080/#/c/3131/9/java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala:

Line 20: import java.util
If you're not changing file contents, could you avoid changing the import order?


-- 
To view, visit http://gerrit.cloudera.org:8080/3131

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Remove default table partitioning

2016-05-24 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Remove default table partitioning
..


Patch Set 9:

Looks like only the Python tests are still failing now.

-- 
To view, visit http://gerrit.cloudera.org:8080/3131

Gerrit-MessageType: comment
Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Misty Stanley-Jones 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No