[kudu-CR] Allow for reserving disk space for non-Kudu processes
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3135 to look at the new patch set (#3). Change subject: Allow for reserving disk space for non-Kudu processes .. Allow for reserving disk space for non-Kudu processes Adds gflags to reserve disk space such that Kudu will not use more than specified. Hadoop calls this functionality "du.reserved". If a WAL preallocation is attempted while the log disk is past its reservation limit then a crash will result. The log block manager will use non-full disks if possible until all of the disks are full. If a flush or compaction is attempted when all disks are beyond their configured capacity then the process will crash. This initial implementation provides a "best effort" approach. Disk space checks are only done at preallocation time, and if writes continue beyond the preallocated point (for both a WAL segment and a data block) those writes will not be prevented. This makes it easier to provide a "friendly" option where the block manager will divert new writes to non-full disks, avoiding a hard crash when only one disk is past its reservation limit. In the future, we may want to add "hard" and "soft" limits, such that going beyond the soft limit will do what we do today, and going beyond the hard limit (say, by writing a very large data block past its preallocation point) will result in a crash. This patch includes: * Some unit tests. * End-to-end test for compaction falling back to non-full disks due to disk space backpressure and finally crashing when there is no space left in any data dir. * End-to-end test for writes failing due to WAL disk space backpressure, causing a crash. 
Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 --- M src/kudu/consensus/log-test.cc M src/kudu/consensus/log.cc M src/kudu/consensus/log_util.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/disk_reservation-itest.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/env.h M src/kudu/util/env_posix.cc A src/kudu/util/env_util-test.cc M src/kudu/util/env_util.cc M src/kudu/util/env_util.h M src/kudu/util/memenv/memenv.cc M src/kudu/util/scoped_cleanup.h 16 files changed, 596 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3135/3 -- To view, visit http://gerrit.cloudera.org:8080/3135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
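The reservation check described in this commit message can be sketched roughly as below. This is a minimal illustration, not the code from the patch: the names DirSpace, HasSpaceForPreallocation and PickDataDir are hypothetical, and counting the requested preallocation size against the limit is an assumption. The idea is that a preallocation is allowed only if the reserved amount stays free afterwards, and that the block manager moves on to another directory when one is full.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one data directory (not a Kudu type).
struct DirSpace {
  std::string path;
  int64_t free_bytes;  // bytes currently available on the underlying disk
};

// Returns true if preallocating request_bytes would still leave at least
// reserved_bytes free for non-Kudu processes. Assumption: the check is made
// once, at preallocation time, as the commit message describes.
bool HasSpaceForPreallocation(int64_t free_bytes, int64_t reserved_bytes,
                              int64_t request_bytes) {
  return free_bytes - request_bytes >= reserved_bytes;
}

// Returns the index of the first directory that is not past its reservation
// limit, or -1 if every directory is full. A caller would treat -1 as the
// "all disks are full" case, which the commit message says is fatal.
int PickDataDir(const std::vector<DirSpace>& dirs, int64_t reserved_bytes,
                int64_t request_bytes) {
  for (size_t i = 0; i < dirs.size(); ++i) {
    if (HasSpaceForPreallocation(dirs[i].free_bytes, reserved_bytes,
                                 request_bytes)) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

Because the check only runs at preallocation time, writes that continue past the preallocated region are not covered by it, which matches the "best effort" wording above.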
[kudu-CR] Don't use InMemoryEnv in deltafile-test
Mike Percy has uploaded a new change for review. http://gerrit.cloudera.org:8080/3235 Change subject: Don't use InMemoryEnv in deltafile-test .. Don't use InMemoryEnv in deltafile-test Get out the vote: #NeverMemEnv. This is causing problems due to Status::NotSupported for StatVfs() in an upcoming patch. Change-Id: I380249e6a72a93e1fde86a551c9d4d32d35904da --- M src/kudu/tablet/deltafile-test.cc 1 file changed, 3 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3235/1 -- To view, visit http://gerrit.cloudera.org:8080/3235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I380249e6a72a93e1fde86a551c9d4d32d35904da Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy
[kudu-CR] log: Mark allocation finished even if allocation had an error
Mike Percy has posted comments on this change. Change subject: log: Mark allocation finished even if allocation had an error .. Patch Set 1: Verified+1 Overriding flaky python test kudu.tests.test_scanner.TestScanner.test_scan_batch_by_batch (from pytest) -- To view, visit http://gerrit.cloudera.org:8080/3234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1444. Get resource metrics of a scan.
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3013 to look at the new patch set (#8). Change subject: KUDU-1444. Get resource metrics of a scan. .. KUDU-1444. Get resource metrics of a scan. This patch adds support for getting the resource metrics of a scan on the client side. The resource metrics will be sent back to the client in every scan RPC response. This is useful for Impala to show these stats in a query profile. For now, the resource metrics contain only cfile_cache_miss_bytes and cfile_cache_hit_bytes. We may add more in the future as needed. Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/client/CMakeLists.txt M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h A src/kudu/client/resource_metrics-internal.h A src/kudu/client/resource_metrics.cc A src/kudu/client/resource_metrics.h M src/kudu/client/scanner-internal.cc M src/kudu/client/scanner-internal.h M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tserver.proto M src/kudu/util/trace_metrics.h 13 files changed, 287 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/3013/8 -- To view, visit http://gerrit.cloudera.org:8080/3013 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: zhen.zhang Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: zhen.zhang
[kudu-CR] Make BuildLog() return Status
Hello Adar Dembo, Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3134 to look at the new patch set (#2). Change subject: Make BuildLog() return Status .. Make BuildLog() return Status This allows for writing a reasonable test related to du.reserved Change-Id: Icfa8ddba74b909f5697ce03b45e97c1733082b07 --- M src/kudu/consensus/log-test-base.h M src/kudu/consensus/log-test.cc M src/kudu/consensus/mt-log-test.cc M src/kudu/tablet/tablet_bootstrap-test.cc 4 files changed, 37 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/3134/2 -- To view, visit http://gerrit.cloudera.org:8080/3134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icfa8ddba74b909f5697ce03b45e97c1733082b07 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Hello Adar Dembo, Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3135 to look at the new patch set (#2). Change subject: Allow for reserving disk space for non-Kudu processes .. Allow for reserving disk space for non-Kudu processes Adds gflags to reserve disk space such that Kudu will not use more than specified. Hadoop calls this functionality "du.reserved". If a WAL preallocation is attempted while the log disk is past its reservation limit then a crash will result. The log block manager will use non-full disks if possible until all of the disks are full. If a flush or compaction is attempted when all disks are beyond their configured capacity then the process will crash. This initial implementation provides a "best effort" approach. Disk space checks are only done at preallocation time, and if writes continue beyond the preallocated point (for both a WAL segment and a data block) those writes will not be prevented. This makes it easier to provide a "friendly" option where the block manager will divert new writes to non-full disks, avoiding a hard crash when only one disk is past its reservation limit. In the future, we may want to add "hard" and "soft" limits, such that going beyond the soft limit will do what we do today, and going beyond the hard limit (say, by writing a very large data block past its preallocation point) will result in a crash. This patch includes: * Some unit tests. * End-to-end test for compaction falling back to non-full disks due to disk space backpressure and finally crashing when there is no space left in any data dir. * End-to-end test for writes failing due to WAL disk space backpressure, causing a crash. 
Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 --- M src/kudu/consensus/log-test.cc M src/kudu/consensus/log.cc M src/kudu/consensus/log_util.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/disk_reservation-itest.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/env.h M src/kudu/util/env_posix.cc A src/kudu/util/env_util-test.cc M src/kudu/util/env_util.cc M src/kudu/util/env_util.h M src/kudu/util/memenv/memenv.cc M src/kudu/util/scoped_cleanup.h 16 files changed, 596 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3135/2 -- To view, visit http://gerrit.cloudera.org:8080/3135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] log: Mark allocation finished even if allocation had an error
Hello Adar Dembo, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/3234 to review the following change. Change subject: log: Mark allocation finished even if allocation had an error .. log: Mark allocation finished even if allocation had an error The problem this patch fixes is that if a disk preallocation fails, we will enter a "stuck" state where we cannot preallocate a new segment. Failures to allocate or append should be fatal. A test exercising this code path is in a follow-up patch related to disk reservations. Since this is a bug fix, it seemed cleaner to separate it out into its own commit. Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3 --- M src/kudu/consensus/log.cc 1 file changed, 9 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/34/3234/1 -- To view, visit http://gerrit.cloudera.org:8080/3234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Todd Lipcon
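One way to picture the "stuck" state and its fix is the sketch below. It is an illustration under assumptions, not the change made to log.cc: SimpleStatus, ScopeExit, SegmentAllocator and the three-value AllocationState are all hypothetical names. The point is that the state is moved to "finished" on every return path, including the error path, so a failed preallocation cannot leave the allocator permanently "in progress".

```cpp
#include <functional>
#include <string>
#include <utility>

// Minimal stand-in for a status type (hypothetical, not kudu::Status).
struct SimpleStatus {
  bool ok;
  std::string message;
};

enum class AllocationState { kNotStarted, kInProgress, kFinished };

// Runs the given callback when the enclosing scope exits, on any return path.
class ScopeExit {
 public:
  explicit ScopeExit(std::function<void()> fn) : fn_(std::move(fn)) {}
  ~ScopeExit() { fn_(); }
  ScopeExit(const ScopeExit&) = delete;
  ScopeExit& operator=(const ScopeExit&) = delete;

 private:
  std::function<void()> fn_;
};

// Hypothetical allocator that wraps a preallocation callback.
class SegmentAllocator {
 public:
  explicit SegmentAllocator(std::function<SimpleStatus()> preallocate)
      : preallocate_(std::move(preallocate)) {}

  // Marks the allocation finished whether or not preallocation succeeded,
  // so the caller sees the error instead of waiting forever.
  SimpleStatus Allocate() {
    state_ = AllocationState::kInProgress;
    ScopeExit mark_finished([this] { state_ = AllocationState::kFinished; });
    return preallocate_();
  }

  AllocationState state() const { return state_; }

 private:
  std::function<SimpleStatus()> preallocate_;
  AllocationState state_ = AllocationState::kNotStarted;
};
```

With a callback that returns a failed status, Allocate() reports the failure and state() still reads kFinished afterwards; the caller can then decide to treat the error as fatal.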
[kudu-CR] fix compile error when compiling column_predicate-test.cc In env : boost 1.57, centos 6.5, compile failed because of : "operator<<: cannot bind lvalue to 'std::basic_ostream&&'" wr
song bruce zhang has uploaded a new change for review. http://gerrit.cloudera.org:8080/3233 Change subject: fix compile error when compiling column_predicate-test.cc In env : boost 1.57, centos 6.5, compile failed because of : "operator<<: cannot bind lvalue to 'std::basic_ostream&&'" write a operator<< in column_predicate.h , compile succeed. .. Fix compile error when compiling column_predicate-test.cc. Environment: Boost 1.57, CentOS 6.5. The compile failed with: "operator<<: cannot bind lvalue to 'std::basic_ostream&&'". Adding an operator<< in column_predicate.h makes the compile succeed. Author:bruceSz song zhang Date: Fri May 27 09:27:46 2016 +0800 Change-Id: I1f6794b351e49d7a59c542a80161171b7e211093 Change-Id: I6f9a9634eeccd86616be80b004ecce596155bb57 --- M src/kudu/common/column_predicate.h 1 file changed, 10 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/3233/1 -- To view, visit http://gerrit.cloudera.org:8080/3233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6f9a9634eeccd86616be80b004ecce596155bb57 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: song bruce zhang
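For readers unfamiliar with this class of error: it typically appears when some code streams a value whose type has no matching operator<<. The sketch below shows the general shape of such a fix, using a hypothetical demo::Predicate type rather than Kudu's ColumnPredicate; the actual ten lines added to column_predicate.h are not shown in this message. Declaring the overload in the same namespace as the type lets argument-dependent lookup find it.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace demo {

// Hypothetical stand-in for a predicate type (not kudu::ColumnPredicate).
struct Predicate {
  std::string column;
  int lower;
  int upper;
};

// Stream insertion overload, declared next to the type so that
// argument-dependent lookup finds it from test or logging code.
inline std::ostream& operator<<(std::ostream& os, const Predicate& p) {
  return os << "`" << p.column << "` in [" << p.lower << ", " << p.upper << ")";
}

}  // namespace demo

// Helper that exercises the overload through a string stream.
std::string ToString(const demo::Predicate& p) {
  std::ostringstream oss;
  oss << p;
  return oss.str();
}
```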
[kudu-CR] KUDU-1470 Exceptions on getting a column value should return the column name not the column number
Ted Malaska has posted comments on this change. Change subject: KUDU-1470 Exceptions on getting a column value should return the column name not the column number .. Patch Set 2: I made a second patch because I noticed my indentation was off. -- To view, visit http://gerrit.cloudera.org:8080/3231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Ted Malaska Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Ted Malaska Gerrit-HasComments: No
[kudu-CR] KUDU-1470 Exceptions on getting a column value should return the column name not the column number
Ted Malaska has uploaded a new patch set (#2). Change subject: KUDU-1470 Exceptions on getting a column value should return the column name not the column number .. KUDU-1470 Exceptions on getting a column value should return the column name not the column number Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5 --- M java/kudu-client/src/main/java/org/kududb/client/RowResult.java 1 file changed, 6 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3231/2 -- To view, visit http://gerrit.cloudera.org:8080/3231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Ted Malaska Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1470 Exceptions on getting a column value should return the column name not the column number
Ted Malaska has uploaded a new change for review. http://gerrit.cloudera.org:8080/3231 Change subject: KUDU-1470 Exceptions on getting a column value should return the column name not the column number .. KUDU-1470 Exceptions on getting a column value should return the column name not the column number Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5 --- M java/kudu-client/src/main/java/org/kududb/client/RowResult.java 1 file changed, 5 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3231/1 -- To view, visit http://gerrit.cloudera.org:8080/3231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie8bb3db1ed7b4e2027814815776f6252b0f749c5 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Ted Malaska
[kudu-CR] Kudu 0.9.0 release notes edit
Misty Stanley-Jones has posted comments on this change. Change subject: Kudu 0.9.0 release notes edit .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/3176/4/docs/installation.adoc File docs/installation.adoc: Line 654: The Kudu 0.9 client APIs require setting partitioning options explicitly during In Kudu 0.9 and higher, you must set partitioning options explicitly when creating a new table. If you do not specify partitioning options, an exception will be thrown and the table will not be created. This behavior change does not affect existing tables. -- To view, visit http://gerrit.cloudera.org:8080/3176 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Misty Stanley-Jones Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Misty Stanley-Jones Gerrit-HasComments: Yes
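The behavior described in this release note can be pictured with the sketch below. It is only an illustration: TableSpec and its methods are hypothetical and are not the Kudu client API, and the sketch reports the problem as an error string where the note speaks of an exception. The rule is that table creation is rejected unless the caller has made an explicit partitioning choice, and an empty range-partition column list still counts as a choice.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical table description (not the real Kudu table creator).
class TableSpec {
 public:
  TableSpec& AddHashPartitions(std::vector<std::string> columns,
                               int num_buckets) {
    hash_columns_ = std::move(columns);
    num_buckets_ = num_buckets;
    partitioning_set_ = true;
    return *this;
  }

  // Passing an empty list is an explicit request for a single tablet.
  TableSpec& SetRangePartitionColumns(std::vector<std::string> columns) {
    range_columns_ = std::move(columns);
    partitioning_set_ = true;
    return *this;
  }

  // Returns an empty string if the spec is acceptable, else an error message.
  std::string Validate() const {
    if (!partitioning_set_) {
      return "Table partitioning must be specified explicitly";
    }
    return "";
  }

 private:
  std::vector<std::string> hash_columns_;
  std::vector<std::string> range_columns_;
  int num_buckets_ = 0;
  bool partitioning_set_ = false;
};
```

A spec with no partitioning call fails validation, while one that calls SetRangePartitionColumns with an empty vector passes, which mirrors the "no partitioning (a single tablet)" escape hatch in the related commit message.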
[kudu-CR] Kudu 0.9.0 release notes edit
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3176 to look at the new patch set (#4). Change subject: Kudu 0.9.0 release notes edit .. Kudu 0.9.0 release notes edit Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97 --- M docs/installation.adoc M docs/release_notes.adoc 2 files changed, 42 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/3176/4 -- To view, visit http://gerrit.cloudera.org:8080/3176 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Misty Stanley-Jones Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] Kudu 0.9.0 release notes edit
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3176 to look at the new patch set (#3). Change subject: Kudu 0.9.0 release notes edit .. Kudu 0.9.0 release notes edit Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97 --- M docs/installation.adoc M docs/release_notes.adoc 2 files changed, 39 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/3176/3 -- To view, visit http://gerrit.cloudera.org:8080/3176 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Misty Stanley-Jones Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR](branch-0.9.x) Remove default table partitioning
Dan Burkert has submitted this change and it was merged. Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Reviewed-on: http://gerrit.cloudera.org:8080/3131 Tested-by: Kudu Jenkins Reviewed-by: Misty Stanley-Jones (cherry picked from commit 0e7c257f950e8875fe9d6a541cc03918ae23912e) Reviewed-on: http://gerrit.cloudera.org:8080/3229 Reviewed-by: Jean-Daniel Cryans --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M 
java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 226 insertions(+), 140 deletions(-) Approvals:
[kudu-CR](branch-0.9.x) Remove default table partitioning
Jean-Daniel Cryans has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) Remove default table partitioning
Dan Burkert has uploaded a new change for review. http://gerrit.cloudera.org:8080/3229 Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Reviewed-on: http://gerrit.cloudera.org:8080/3131 Tested-by: Kudu Jenkins Reviewed-by: Misty Stanley-Jones (cherry picked from commit 0e7c257f950e8875fe9d6a541cc03918ae23912e) --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 226 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/change
[kudu-CR] Remove default table partitioning
Misty Stanley-Jones has submitted this change and it was merged. Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Reviewed-on: http://gerrit.cloudera.org:8080/3131 Tested-by: Kudu Jenkins Reviewed-by: Misty Stanley-Jones --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 226 insertions(+), 140 deletions(-) Approvals: Misty Stanley-Jones: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsub
[kudu-CR] Remove default table partitioning
Misty Stanley-Jones has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 18: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 18 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Document advice about max columns and record size
Dan Burkert has posted comments on this change. Change subject: Document advice about max columns and record size .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/2778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I70a82d59c431f69246128acc19227af3194fa15a Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Misty Stanley-Jones Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Non-covering Range Partitions design doc
Dan Burkert has posted comments on this change. Change subject: Non-covering Range Partitions design doc .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/2772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e530eda60c00faf066c41b6bdb2b37f6d96a5dc Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Binglin Chang Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Non-covering Range Partitions design doc
Dan Burkert has submitted this change and it was merged. Change subject: Non-covering Range Partitions design doc .. Non-covering Range Partitions design doc Change-Id: I3e530eda60c00faf066c41b6bdb2b37f6d96a5dc Reviewed-on: http://gerrit.cloudera.org:8080/2772 Tested-by: Kudu Jenkins Reviewed-by: Dan Burkert --- M docs/design-docs/README.md A docs/design-docs/non-covering-range-partitions.md 2 files changed, 217 insertions(+), 0 deletions(-) Approvals: Dan Burkert: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/2772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3e530eda60c00faf066c41b6bdb2b37f6d96a5dc Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Binglin Chang Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [c++-client]: minimal changes to support tables with non-covering range partitions
Dan Burkert has abandoned this change. Change subject: [c++-client]: minimal changes to support tables with non-covering range partitions .. Abandoned I'm not confident that this change will work as advertised, so I'm going to not attempt to get it into 0.9. I'll figure out another way to disable non-covered range partitioned tables from old clients. I'll fold this patch and the associated review fixes into another patch. -- To view, visit http://gerrit.cloudera.org:8080/3177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Ib25b7a57b14b3d1e4e6dca75b88dad7c19ba7565 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Non-covering Range Partitions design doc
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/2772 to look at the new patch set (#11). Change subject: Non-covering Range Partitions design doc .. Non-covering Range Partitions design doc Change-Id: I3e530eda60c00faf066c41b6bdb2b37f6d96a5dc --- M docs/design-docs/README.md A docs/design-docs/non-covering-range-partitions.md 2 files changed, 217 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/2772/11 -- To view, visit http://gerrit.cloudera.org:8080/2772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3e530eda60c00faf066c41b6bdb2b37f6d96a5dc Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Binglin Chang Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Remove default table partitioning
Dan Burkert has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/3131/17/docs/schema_design.adoc File docs/schema_design.adoc: Line 163: [[no_default_partitioning] > missing last ending ] Done -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 17 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3131 to look at the new patch set (#18). Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 226 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/18 -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org
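The rule described in the commit message above (a table creator must be given range or hash partitioning explicitly, and an empty range column list is the opt-in for a single tablet) can be sketched as a small standalone model. This is only an illustration under stated assumptions: `TableOptions` and `validate_partitioning` are hypothetical names, not the Kudu client API, and the error text loosely mirrors the "setRangePartitionColumns or addHashPartitions" message quoted in the review comments.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TableOptions:
    """Hypothetical stand-in for a client's create-table options (not the Kudu API)."""

    # None means "never set"; an empty list means "explicitly unpartitioned".
    range_columns: Optional[List[str]] = None
    # Each entry is (columns, number of buckets).
    hash_partitions: List[Tuple[List[str], int]] = field(default_factory=list)

    def set_range_partition_columns(self, columns):
        self.range_columns = list(columns)
        return self

    def add_hash_partitions(self, columns, num_buckets):
        self.hash_partitions.append((list(columns), num_buckets))
        return self


def validate_partitioning(opts):
    """Raise ValueError if neither range nor hash partitioning was specified."""
    if opts.range_columns is None and not opts.hash_partitions:
        raise ValueError(
            "Table partitioning must be specified using "
            "set_range_partition_columns or add_hash_partitions"
        )
```

With this model, `validate_partitioning(TableOptions())` raises, while `TableOptions().set_range_partition_columns([])` passes validation because the caller has asked for a single tablet on purpose.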
[kudu-CR] Remove default table partitioning
Misty Stanley-Jones has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/3131/17/docs/schema_design.adoc File docs/schema_design.adoc: Line 163: [[no_default_partitioning] missing last ending ] -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 17 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3131 to look at the new patch set (#17). Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 226 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/17 -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org
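The commit message's observation that range partitioning with no split rows yields a single tablet, and that hash partitioning spreads rows over a chosen number of buckets, can be illustrated with a rough sketch. The function names are hypothetical, and `zlib.crc32` is only a stand-in hash chosen for illustration; it is not the hash function Kudu uses.

```python
import bisect
import zlib


def range_tablet_index(key, split_rows):
    """Return the index of the range tablet that holds `key`.

    `split_rows` must be sorted. n split rows define n + 1 tablets, and each
    split row is the inclusive lower bound of the tablet to its right.
    """
    return bisect.bisect_right(split_rows, key)


def hash_bucket(key, num_buckets):
    """Return the hash bucket for `key` (crc32 is an illustrative stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets
```

With an empty `split_rows` list every key maps to tablet 0, which is the single-tablet case the commit message calls an anti-pattern. Adding one split row such as `"m"` produces two tablets, and `hash_bucket` always returns a value in `range(num_buckets)` regardless of how the keys are distributed.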
[kudu-CR] Remove default table partitioning
Dan Burkert has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/3131/15/docs/schema_design.adoc File docs/schema_design.adoc: Line 163: IMPORTANT: Kudu does not provide a default partitioning strategy when creating tables. It > If you want to be able to link to this, here is how. You need to change it Done -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 15 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3131 to look at the new patch set (#16). Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 221 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/16 -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org
[kudu-CR] Remove default table partitioning
Misty Stanley-Jones has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/3131/15/docs/schema_design.adoc File docs/schema_design.adoc: Line 163: IMPORTANT: Kudu does not provide a default partitioning strategy when creating tables. It If you want to be able to link to this, here is how. You need to change it to an admonition block. [[no_default_partitioning]] [IMPORTANT] .No Default Partitioning Scheme The first line is the anchor. You can link to it by doing <<no_default_partitioning>> in the same file, or schema_design.html#no_default_partitioning in a sibling doc. The second line is the admonition macro. The third line is the title. If it's a block, you need one. The next line is the begin-block delimiter, followed by the body text, and the end-block delimiter. -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 15 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Misty Stanley-Jones has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 13: (3 comments) http://gerrit.cloudera.org:8080/#/c/3131/13/docs/release_notes.adoc File docs/release_notes.adoc: Line 69: - Default table partitioning has been removed. All tables must now be created > The specifics of how to set partitioning depends on the client, so I'm not Maybe a link to how to do it in Impala? I just think that people might read this and say "Meh, I don't know how to do that so I'm going to ignore it." But maybe not. http://gerrit.cloudera.org:8080/#/c/3131/13/docs/schema_design.adoc File docs/schema_design.adoc: Line 178: distribution keyspace. Range partitioning may be configured to use any subset of > I updated the sentence, let me know if it makes more sense now. Done http://gerrit.cloudera.org:8080/#/c/3131/13/java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java: Line 299: "setRangePartitionColumns or addHashPartitions"); > I personally think documenting setRangePartitionColumns is enough given tha Done -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Dan Burkert has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/3131/13/docs/schema_design.adoc File docs/schema_design.adoc: Line 178: distribution keyspace. Range partitioning may be configured to use any subset of > I'm not sure what it means to divide tablets, and the first sentence in the I updated the sentence, let me know if it makes more sense now. -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Dan Burkert has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 13: (5 comments) http://gerrit.cloudera.org:8080/#/c/3131/13/docs/release_notes.adoc File docs/release_notes.adoc: Line 69: - Default table partitioning has been removed. All tables must now be created > Should we link to where we show how to create tables? The specifics of how to set partitioning depends on the client, so I'm not sure where it should link to. http://gerrit.cloudera.org:8080/#/c/3131/13/docs/schema_design.adoc File docs/schema_design.adoc: Line 163: Kudu does not provide a default partitioning strategy when creating tables. It > Enclose this in an admonition by prefixing IMPORTANT: (just before 'Kudu do Done Line 178: distribution keyspace. Range partitioning may be configured to use any subset of > With range partitioning, you can divide your tablets using any subset of th I'm not sure what it means to divide tablets, and the first sentence in the paragraph already has that same structure. http://gerrit.cloudera.org:8080/#/c/3131/13/java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java: Line 299: "setRangePartitionColumns or addHashPartitions"); > Maybe a little more info in this exception? "To prevent the accidental crea I personally think documenting setRangePartitionColumns is enough given that it's explicitly named in the error. http://gerrit.cloudera.org:8080/#/c/3131/13/java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java File java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java: Line 103:* partitioning. If the table should only have a single partition (not > s/If the table should/To force the use of a single tablet, call... 
Done -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3131 to look at the new patch set (#15). Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 219 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/15 -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org
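The behavior described in the commit message above (table creation is rejected unless range or hash partitioning is given explicitly, while an empty range column list still yields a single tablet) can be sketched roughly as follows. This is a minimal illustration, not the Kudu client code: `SketchTableCreator`, its method names, and the error text are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table creator. This is NOT the Kudu client API;
// every name here is an assumption made for illustration.
class SketchTableCreator {
 public:
  // Request hash partitioning on `columns` with `num_buckets` buckets.
  SketchTableCreator& AddHashPartitions(std::vector<std::string> columns,
                                        int num_buckets) {
    assert(num_buckets >= 2);  // a single bucket would not partition anything
    hash_columns_ = std::move(columns);
    num_buckets_ = num_buckets;
    partitioning_specified_ = true;
    return *this;
  }

  // Request range partitioning on `columns`. An empty list is still an
  // explicit choice: it asks for an unpartitioned, single-tablet table.
  SketchTableCreator& SetRangePartitionColumns(std::vector<std::string> columns) {
    range_columns_ = std::move(columns);
    partitioning_specified_ = true;
    return *this;
  }

  // Returns an empty string on success, or an error message when the caller
  // never stated a partitioning choice.
  std::string Create() const {
    if (!partitioning_specified_) {
      return "Table partitioning must be specified using "
             "SetRangePartitionColumns or AddHashPartitions";
    }
    return "";
  }

 private:
  std::vector<std::string> hash_columns_;
  std::vector<std::string> range_columns_;
  int num_buckets_ = 0;
  bool partitioning_specified_ = false;
};
```

In this sketch, calling `Create()` on a fresh creator returns the error message, whereas calling `SetRangePartitionColumns({})` first succeeds. Tracking "was a choice made" separately from the column lists is what lets an empty list count as an explicit request for a single tablet.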
[kudu-CR] Kudu 0.9.0 release notes edit
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3176 to look at the new patch set (#2). Change subject: Kudu 0.9.0 release notes edit .. Kudu 0.9.0 release notes edit Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97 --- M docs/installation.adoc M docs/release_notes.adoc 2 files changed, 39 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/3176/2 -- To view, visit http://gerrit.cloudera.org:8080/3176 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6242089b099a7e220ce4094f3ba0377859338b97 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Misty Stanley-Jones Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR](branch-0.9.x) KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas
Jean-Daniel Cryans has submitted this change and it was merged. Change subject: KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas .. KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas When handling a zipfian update workload, we have a lot of delta records applied to a single row. In the compaction code path, we were previously using Mutation::AppendToList() to append each such delta to the end of a linked list. Thus, constructing the linked list was O(n^2) in the number of deltas corresponding to a row. Each such iteration also most likely was a CPU cache miss, making this quite slow. In fact, in a YCSB workload I'm currently testing, a single compaction has been running for over an hour and a half, spending all of its time in the AppendToList function. This patch changes the flow so that the deltas are collected by prepending them to the linked list (an O(1) operation). The resulting list is then reversed (an O(n) operation) before feeding into the rest of the unchanged compaction code path. This patch is also notable for being one of the few times in real life where a "reverse the linked list" algorithm was actually necessary. Due to the notoriously tricky nature of this foundational computer science problem (I once botched it on a job interview), and the fact that this was a so-called "real life situation", I googled the algorithm. Thus, I am confident of its correctness. Aside from such assurances, this code path is also very well covered by existing test cases which perform multiple updates against the same row (in particular fuzz-itest). 
Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5 Reviewed-on: http://gerrit.cloudera.org:8080/3221 Tested-by: Kudu Jenkins Reviewed-by: Adar Dembo (cherry picked from commit c7178e97e842f42e9ed9d5e9e2a4f521fbe70b6b) Reviewed-on: http://gerrit.cloudera.org:8080/3224 Reviewed-by: Jean-Daniel Cryans --- M src/kudu/tablet/compaction.cc M src/kudu/tablet/compaction.h M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_store.h M src/kudu/tablet/deltafile.cc M src/kudu/tablet/deltamemstore.cc M src/kudu/tablet/memrowset.cc M src/kudu/tablet/memrowset.h M src/kudu/tablet/mutation.cc M src/kudu/tablet/mutation.h 10 files changed, 40 insertions(+), 44 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
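The prepend-then-reverse approach described in the commit message above can be sketched as follows. This is a minimal illustration under assumptions: `Node`, `Prepend`, `Reverse`, `BuildInOrder`, and `ToVector` are hypothetical names, and the node type is a plain struct rather than Kudu's `Mutation` class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical singly linked node for illustration; not Kudu's Mutation class.
struct Node {
  int value;
  Node* next;
};

// O(1): put `node` at the front of the list and return the new head.
Node* Prepend(Node* head, Node* node) {
  assert(node != nullptr);
  node->next = head;
  return node;
}

// O(n): reverse the list in place and return the new head.
Node* Reverse(Node* head) {
  Node* prev = nullptr;
  while (head != nullptr) {
    Node* next = head->next;  // remember the rest of the list
    head->next = prev;        // point the current node backwards
    prev = head;
    head = next;
  }
  return prev;
}

// Links the nodes so the final list order matches the order of `nodes`:
// n prepends at O(1) each, then one O(n) reversal, instead of n appends
// that each walk to the tail.
Node* BuildInOrder(std::vector<Node>& nodes) {
  Node* head = nullptr;
  for (Node& n : nodes) {
    head = Prepend(head, &n);
  }
  return Reverse(head);
}

// Collects the list values into a vector, front to back.
std::vector<int> ToVector(const Node* head) {
  std::vector<int> out;
  for (const Node* n = head; n != nullptr; n = n->next) {
    out.push_back(n->value);
  }
  return out;
}
```

Given nodes holding 1, 2, 3, `BuildInOrder` produces a list that reads 1, 2, 3 from the head, the same result appending would give, but the total work is linear in the number of nodes rather than quadratic.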
[kudu-CR](branch-0.9.x) KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas
Jean-Daniel Cryans has posted comments on this change. Change subject: KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Document advice about max columns and record size
Misty Stanley-Jones has posted comments on this change. Change subject: Document advice about max columns and record size .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/2778/3/docs/schema_design.adoc File docs/schema_design.adoc: Line 282: Size of Records:: There is no hard limit imposed by Kudu, but large values (10s of > I don't think so. I think there is size of rows, number of columns, and siz This was a terminology misunderstanding. I meant 'cells'. Changed. -- To view, visit http://gerrit.cloudera.org:8080/2778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I70a82d59c431f69246128acc19227af3194fa15a Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Misty Stanley-Jones Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Document advice about max columns and record size
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/2778 to look at the new patch set (#4). Change subject: Document advice about max columns and record size .. Document advice about max columns and record size Change-Id: I70a82d59c431f69246128acc19227af3194fa15a --- M docs/schema_design.adoc 1 file changed, 14 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/2778/4 -- To view, visit http://gerrit.cloudera.org:8080/2778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I70a82d59c431f69246128acc19227af3194fa15a Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Misty Stanley-Jones Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1444. Get resource metrics of a scan.
Todd Lipcon has posted comments on this change. Change subject: KUDU-1444. Get resource metrics of a scan. .. Patch Set 7: (3 comments) http://gerrit.cloudera.org:8080/#/c/3013/7/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: Line 268: void DoTestScanReousrceMetrics(KuduScanner& scanner) { - typo: "Resource" - should be a const reference Line 272: // As all the data are in MRS, both cfile_cache_miss_bytes and cfile_cache_miss_bytes are 0 > It seems that all data inserted are in MemeoryRowSet, so when we scan, cfil Check out TestScanFaultTolerance for an example test where we manually Flush() the tablet peer. Line 273: ASSERT_TRUE(metrics["cfile_cache_miss_bytes"] + metrics["cfile_cache_hit_bytes"] == 0); use ASSERT_EQ -- To view, visit http://gerrit.cloudera.org:8080/3013 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: zhen.zhang Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: zhen.zhang Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Add support for anchors.js to get permalinks on all headers
Misty Stanley-Jones has uploaded a new patch set (#4). Change subject: Add support for anchors.js to get permalinks on all headers .. Add support for anchors.js to get permalinks on all headers Change-Id: I17bb50f412d8214f91fc5cf8267823d07db03222 --- M _includes/bottom_common.html 1 file changed, 8 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/2739/4 -- To view, visit http://gerrit.cloudera.org:8080/2739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I17bb50f412d8214f91fc5cf8267823d07db03222 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Misty Stanley-Jones Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones
[kudu-CR] Remove default table partitioning
Misty Stanley-Jones has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 13: (5 comments) http://gerrit.cloudera.org:8080/#/c/3131/13/docs/release_notes.adoc File docs/release_notes.adoc: Line 69: - Default table partitioning has been removed. All tables must now be created Should we link to where we show how to create tables? http://gerrit.cloudera.org:8080/#/c/3131/13/docs/schema_design.adoc File docs/schema_design.adoc: Line 163: Kudu does not provide a default partitioning strategy when creating tables. It Enclose this in an admonition by prefixing IMPORTANT: (just before 'Kudu does'). Line 178: distribution keyspace. Range partitioning may be configured to use any subset of With range partitioning, you can divide your tablets using any subset of the primary key columns. http://gerrit.cloudera.org:8080/#/c/3131/13/java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java: Line 299: "setRangePartitionColumns or addHashPartitions"); Maybe a little more info in this exception? "To prevent the accidental creation of only a single tablet, which will cause bad performance, table partitioning must be specified" http://gerrit.cloudera.org:8080/#/c/3131/13/java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java File java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java: Line 103:* partitioning. If the table should only have a single partition (not s/If the table should/To force the use of a single tablet, call... 
-- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3131 to look at the new patch set (#14). Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 219 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/14 -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org
[kudu-CR] Reduce verbosity of Java logs
Dan Burkert has submitted this change and it was merged. Change subject: Reduce verbosity of Java logs .. Reduce verbosity of Java logs This commit changes the logging format in order to make it more compact. Date and originating class have been removed from log lines, however the time, originating filename, and line remain. The minicluster has been changed to use the relative binary name plus port as the thread name instead of the full path to the binary (e.g. kudu-master:7051 instead of /home/dan/kudu/build/debug/bin/kudu-master). The result is that a typical log line originating from the mini cluster has been reduced from 349 to 214 columns. Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d Reviewed-on: http://gerrit.cloudera.org:8080/3203 Reviewed-by: Jean-Daniel Cryans Tested-by: Kudu Jenkins --- M java/kudu-client/src/test/java/org/kududb/client/MiniKuduCluster.java M java/kudu-client/src/test/resources/log4j.properties M java/kudu-spark/src/test/resources/log4j.properties 3 files changed, 9 insertions(+), 11 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3203 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] Remove default table partitioning
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3131 to look at the new patch set (#13). Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 219 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/13 -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org
[kudu-CR] Reduce verbosity of Java logs
Jean-Daniel Cryans has posted comments on this change. Change subject: Reduce verbosity of Java logs .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3203 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] Reduce verbosity of Java logs
Dan Burkert has posted comments on this change. Change subject: Reduce verbosity of Java logs .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/3203/2/java/kudu-client/src/test/java/org/kududb/client/MiniKuduCluster.java File java/kudu-client/src/test/java/org/kududb/client/MiniKuduCluster.java: Line 398: port > I agree with Adar that this can be unplumbed (is that even a word?). Done -- To view, visit http://gerrit.cloudera.org:8080/3203 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] Reduce verbosity of Java logs
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3203 to look at the new patch set (#3). Change subject: Reduce verbosity of Java logs .. Reduce verbosity of Java logs This commit changes the logging format in order to make it more compact. Date and originating class have been removed from log lines, however the time, originating filename, and line remain. The minicluster has been changed to use the relative binary name plus port as the thread name instead of the full path to the binary (e.g. kudu-master:7051 instead of /home/dan/kudu/build/debug/bin/kudu-master). The result is that a typical log line originating from the mini cluster has been reduced from 349 to 214 columns. Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d --- M java/kudu-client/src/test/java/org/kududb/client/MiniKuduCluster.java M java/kudu-client/src/test/resources/log4j.properties M java/kudu-spark/src/test/resources/log4j.properties 3 files changed, 9 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/03/3203/3 -- To view, visit http://gerrit.cloudera.org:8080/3203 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibf26651908f533859f16cb4293a06157f9577c7d Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] Remove default table partitioning
Dan Burkert has posted comments on this change. Change subject: Remove default table partitioning .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/3131/9/java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java File java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java: Line 24 > Nit: don't unroll. Done http://gerrit.cloudera.org:8080/#/c/3131/9/java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala File java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala: Line 20: import java.util > If you're not changing file contents, could you avoid changing the import o Done -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Misty Stanley-Jones Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Remove default table partitioning
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3131 to look at the new patch set (#12). Change subject: Remove default table partitioning .. Remove default table partitioning This commit removes the current default of creating tables with range partitioning over the primary key columns with no splits. This default is problematic because it results in a single tablet, which is a known anti-pattern. Kudu can't predict appropriate split rows without knowledge of the dataset, so creating default splits is not technically feasible. A better default than range partitioning would be to hash partition on the primary key columns with a number of buckets based on the number of tablet servers. Unfortunately, it's similarly difficult to predict an appropriate number of hash buckets without knowledge of the data set. Since changing the default would be a breaking change, and we don't currently have a bullet-proof default option, this commit changes the table creator in the C++ and Java clients to force users to explicitly specify at least range or hash partitioning. Users who really do want a table with no partitioning (a single tablet) can still explicitly set the range partition columns to an empty list and provide no split rows. 
Change-Id: I7021d7950f8dbb4918503ea6fab2e6ee35076064 --- M docs/release_notes.adoc M docs/schema_design.adoc M java/kudu-client-tools/src/main/java/org/kududb/mapreduce/tools/IntegrationTestBigLinkedList.java M java/kudu-client-tools/src/test/java/org/kududb/mapreduce/tools/ITImportCsv.java M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/kududb/client/CreateTableOptions.java M java/kudu-client/src/main/java/org/kududb/client/KuduClient.java M java/kudu-client/src/test/java/org/kududb/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestAsyncKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestFlexiblePartitioning.java M java/kudu-client/src/test/java/org/kududb/client/TestHybridTime.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduClient.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduSession.java M java/kudu-client/src/test/java/org/kududb/client/TestKuduTable.java M java/kudu-client/src/test/java/org/kududb/client/TestLeaderFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestMasterFailover.java M java/kudu-client/src/test/java/org/kududb/client/TestRowErrors.java M java/kudu-client/src/test/java/org/kududb/client/TestRowResult.java M java/kudu-client/src/test/java/org/kududb/client/TestScanPredicate.java M java/kudu-client/src/test/java/org/kududb/client/TestScannerMultiTablet.java M java/kudu-client/src/test/java/org/kududb/client/TestStatistics.java M java/kudu-client/src/test/java/org/kududb/client/TestTimeouts.java M java/kudu-flume-sink/src/test/java/org/kududb/flume/sink/KuduSinkTest.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableInputFormat.java M java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITKuduTableOutputFormat.java M 
java/kudu-mapreduce/src/test/java/org/kududb/mapreduce/ITOutputFormatJob.java M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/TestContext.scala M python/kudu/client.pyx M python/kudu/tests/common.py M python/kudu/tests/test_client.py M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/predicate-test.cc M src/kudu/client/samples/sample.cc M src/kudu/integration-tests/all_types-itest.cc M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/alter_table-test.cc M src/kudu/integration-tests/create-table-itest.cc M src/kudu/integration-tests/create-table-stress-test.cc M src/kudu/integration-tests/delete_table-test.cc M src/kudu/integration-tests/full_stack-insert-scan-test.cc M src/kudu/integration-tests/fuzz-itest.cc M src/kudu/integration-tests/linked_list-test-util.h M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/integration-tests/master_replication-itest.cc M src/kudu/integration-tests/remote_bootstrap-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/ts_itest-base.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/integration-tests/update_scan_delta_compact-test.cc M src/kudu/integration-tests/write_throttling-itest.cc M src/kudu/tools/ksck_remote-test.cc 56 files changed, 218 insertions(+), 139 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/3131/12 -- To view, visit http://gerrit.cloudera.org:8080/3131 To unsubscribe, visit http://gerrit.cloudera.org
[kudu-CR](branch-0.9.x) ksck: usability improvements in error messages
Jean-Daniel Cryans has posted comments on this change. Change subject: ksck: usability improvements in error messages .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3cd6a6e0f82d03f890ac70af32814277dfd33c54 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) ksck: usability improvements in error messages
Jean-Daniel Cryans has submitted this change and it was merged. Change subject: ksck: usability improvements in error messages .. ksck: usability improvements in error messages - print the address of tablet servers that fail to connect - print the table names of bad tablets This doesn't lend itself to new test assertions, but I verified the changes both against a real cluster and by inspecting the output of 'ksck-test'. Change-Id: I3cd6a6e0f82d03f890ac70af32814277dfd33c54 Reviewed-on: http://gerrit.cloudera.org:8080/3225 Reviewed-by: Jean-Daniel Cryans Tested-by: Kudu Jenkins (cherry picked from commit 3ab902eb33bb6f0d7e5ac56892ddcf4d0f1f1300) Reviewed-on: http://gerrit.cloudera.org:8080/3226 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote.cc M src/kudu/tools/ksck_remote.h 5 files changed, 47 insertions(+), 31 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3cd6a6e0f82d03f890ac70af32814277dfd33c54 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] WIP: KUDU-1469. Fix handling of fully-deduped requests after a leader change
David Ribeiro Alves has posted comments on this change. Change subject: WIP: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/3228/1/src/kudu/consensus/consensus.proto File src/kudu/consensus/consensus.proto: Line 221: last_received We should probably rename these: this one to something like log_head, and the one below to something like last_prepared. http://gerrit.cloudera.org:8080/#/c/3228/1/src/kudu/consensus/raft_consensus_state.cc File src/kudu/consensus/raft_consensus_state.cc: Line 646 We would need to change this method's name, since it's now updating it conditionally. -- To view, visit http://gerrit.cloudera.org:8080/3228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-1444. Get resource metrics of a scan.
zhen.zhang has posted comments on this change. Change subject: KUDU-1444. Get resource metrics of a scan. .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/3013/7/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: Line 272: // As all the data are in MRS, both cfile_cache_miss_bytes and cfile_cache_miss_bytes are 0 It seems that all of the inserted data is still in the MemRowSet, so when we scan, cfile_reader is never called, and both cfile_cache_hit_bytes and cfile_cache_miss_bytes are 0. What should I do to get the data flushed to a DiskRowSet? -- To view, visit http://gerrit.cloudera.org:8080/3013 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: zhen.zhang Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: zhen.zhang Gerrit-HasComments: Yes
[kudu-CR] KUDU-1444. Get resource metrics of a scan.
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3013 to look at the new patch set (#7). Change subject: KUDU-1444. Get resource metrics of a scan. .. KUDU-1444. Get resource metrics of a scan. This patch adds support for getting the resource metrics of a scan on the client side. The resource metrics are sent back to the client in every scan RPC response. This is useful for Impala to show these stats in a query profile. For now, the resource metrics contain only cfile_cache_miss_bytes and cfile_cache_hit_bytes. We may add more in the future as needed. Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/client/CMakeLists.txt M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h A src/kudu/client/resource_metrics-internal.h A src/kudu/client/resource_metrics.cc A src/kudu/client/resource_metrics.h M src/kudu/client/scanner-internal.cc M src/kudu/client/scanner-internal.h M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tserver.proto M src/kudu/util/trace_metrics.h 13 files changed, 265 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/3013/7 -- To view, visit http://gerrit.cloudera.org:8080/3013 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iedaf570a7601651c93275ae0a8565f1e33da842d Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: zhen.zhang Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: zhen.zhang
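As a rough illustration of the behavior described in the commit message above (every scan RPC response carries metrics, and the client keeps a running total), here is a minimal sketch. The class name ScanMetricsAccumulator and its methods are hypothetical and are not the API added by this patch; only the two counter names come from the message. It assumes each response reports per-RPC deltas, which the message does not state.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical client-side accumulator, for illustration only.
// This is NOT the ResourceMetrics class introduced by the patch.
class ScanMetricsAccumulator {
 public:
  // Adds a per-response delta to one named counter.
  void Increment(const std::string& name, int64_t delta) {
    counters_[name] += delta;
  }

  // Merges every counter reported in a single scan RPC response
  // (assumed here to be deltas for that RPC only).
  void MergeResponse(const std::map<std::string, int64_t>& response_metrics) {
    for (const auto& entry : response_metrics) {
      Increment(entry.first, entry.second);
    }
  }

  // Returns the running total, or 0 if the counter was never reported.
  int64_t Get(const std::string& name) const {
    auto it = counters_.find(name);
    return it == counters_.end() ? 0 : it->second;
  }

 private:
  std::map<std::string, int64_t> counters_;
};
```

With this shape, a scan served entirely from memory would simply never report either counter, and both totals would read as 0.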
[kudu-CR] KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas
Todd Lipcon has posted comments on this change. Change subject: KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas .. Patch Set 3: FWIW I tested this patch on my test cluster, and the compaction that was previously taking 1.5h+ (never finished before I killed it) took a handful of seconds with the patch -- To view, visit http://gerrit.cloudera.org:8080/3221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] WIP: KUDU-1469. Fix handling of fully-deduped requests after a leader change
Hello David Ribeiro Alves, Mike Percy, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/3228 to review the following change. Change subject: WIP: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. WIP: KUDU-1469. Fix handling of fully-deduped requests after a leader change This fixes KUDU-1469. WIP because an integration test would be great, but wanted to get a full test run on the change itself. Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd --- M src/kudu/consensus/consensus.proto M src/kudu/consensus/raft_consensus-test.cc M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus_state.cc 4 files changed, 17 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/28/3228/1 -- To view, visit http://gerrit.cloudera.org:8080/3228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy