[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 5 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 07:16:35 + Gerrit-HasComments: No
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 7: (7 comments) http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/hms/hms_client.cc File src/kudu/hms/hms_client.cc: http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/hms/hms_client.cc@204 PS5, Line 204: "Hive Metastore configuration is invalid: $0 must be set to false", Add "to support dropping columns" to specify the reason. http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/integration-tests/alter_table-randomized-test.cc File src/kudu/integration-tests/alter_table-randomized-test.cc: http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/integration-tests/alter_table-randomized-test.cc@79 PS5, Line 79: const char* kTableName = "default.test_table"; Do we need to add a comment that when enable_hive_metastore is true, the table name has to have databasename.tablename pattern? http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@176 PS6, Line 176: ASSERT_EQ(table->id(), hms_table.parameters[hms::HmsClient::kKuduTableIdKey]); Assert master addresses as well? http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@236 PS6, Line 236: // Drop the HMS table entry and rename the table. I do not quite follow why do we need to drop the hms table entry? Will the drop table in HMS also trigger drop table in Kudu? If you are testing the corner case when the entry is not present in HMS, maybe add more comment here to be more clear. Can we have a most common rename table test case without any corner cases? http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@245 PS6, Line 245: ASSERT_OK(table_alterer->RenameTo(renamed_table_name)->Alter()); Assert table id/master addresses/table schema. http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@285 PS6, Line 285: ASSERT_OK(hms_client_->AlterTable(hms_database_name, hms_table_name, hms_table)); So alter column through HMS would not affect the table schema stored in Kudu? http://gerrit.cloudera.org:8080/#/c/8312/7/src/kudu/master/hms_catalog.cc File src/kudu/master/hms_catalog.cc: http://gerrit.cloudera.org:8080/#/c/8312/7/src/kudu/master/hms_catalog.cc@164 PS7, Line 164: create it in the HMS I am not sure it is good to create a HMS entry if not present. What if the users make some mistakes when specifying the original table name? -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 7 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 06:40:50 + Gerrit-HasComments: Yes
[kudu-CR] iwyu: workaround missing IWYU results when using Ninja
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9554 ) Change subject: iwyu: workaround missing IWYU results when using Ninja .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9554/1/build-support/iwyu/iwyu_tool.py File build-support/iwyu/iwyu_tool.py: http://gerrit.cloudera.org:8080/#/c/9554/1/build-support/iwyu/iwyu_tool.py@190 PS1, Line 190: ) > maybe, make it possible to have spaces after the '-I' option but before the basically, I think about the following pattern: '(-isystem |-I\s*)([^\/\s]\S+)' -- To view, visit http://gerrit.cloudera.org:8080/9554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I092cab794fd20a39fcfbbbe52f4c325fac2d149c Gerrit-Change-Number: 9554 Gerrit-PatchSet: 1 Gerrit-Owner: Todd LipconGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 04:40:04 + Gerrit-HasComments: Yes
[kudu-CR] Remove namespace hack in hms client.h
Dan Burkert has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9553 ) Change subject: Remove namespace hack in hms_client.h .. Remove namespace hack in hms_client.h To avoid having to use the very long Apache.Hadoop.Hive namespace for all of the Thrift-generated HMS types, we aliased the namespace in hms_client.h to 'hive'. This doesn't really work well, I've seen issues in tests where for unknown reasons the hive namespace doesn't work. A better fix is to just edit hive_metastore.thrift to change the generated namespace, since it's vendored anyway. This exposed that our CMake thrift generator doesn't properly consider the input .thrift files to be a dependency of the code generation step, so I had to fix that as well to get this to compile without blowing away the build dir. Change-Id: I1b4d823a0329c63c9ce830f855c3d1021fbc08cf Reviewed-on: http://gerrit.cloudera.org:8080/9553 Reviewed-by: Adar DemboTested-by: Kudu Jenkins --- M cmake_modules/FindThrift.cmake M src/kudu/hms/hive_metastore.thrift M src/kudu/hms/hms_client.h 3 files changed, 5 insertions(+), 3 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9553 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I1b4d823a0329c63c9ce830f855c3d1021fbc08cf Gerrit-Change-Number: 9553 Gerrit-PatchSet: 2 Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] iwyu: workaround missing IWYU results when using Ninja
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9554 ) Change subject: iwyu: workaround missing IWYU results when using Ninja .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9554/1/build-support/iwyu/iwyu_tool.py File build-support/iwyu/iwyu_tool.py: http://gerrit.cloudera.org:8080/#/c/9554/1/build-support/iwyu/iwyu_tool.py@190 PS1, Line 190: ) maybe, make it possible to have spaces after the '-I' option but before the path? Also, maybe the better criterion for an absolute path is that it's first character is not '/'? -- To view, visit http://gerrit.cloudera.org:8080/9554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I092cab794fd20a39fcfbbbe52f4c325fac2d149c Gerrit-Change-Number: 9554 Gerrit-PatchSet: 1 Gerrit-Owner: Todd LipconGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 04:16:23 + Gerrit-HasComments: Yes
[kudu-CR](branch-1.6.x) [tablet] fix nullptr dereference while capturing iterators
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9549 ) Change subject: [tablet] fix nullptr dereference while capturing iterators .. Patch Set 2: Verified+1 Flakes in org.apache.kudu.client.TestKuduClient.testCloseShortlyAfterOpen -- To view, visit http://gerrit.cloudera.org:8080/9549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.6.x Gerrit-MessageType: comment Gerrit-Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff Gerrit-Change-Number: 9549 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy Gerrit-Comment-Date: Thu, 08 Mar 2018 04:00:25 + Gerrit-HasComments: No
[kudu-CR](branch-1.6.x) [tablet] fix nullptr dereference while capturing iterators
Alexey Serbin has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/9549 ) Change subject: [tablet] fix nullptr dereference while capturing iterators .. Removed reviewer Kudu Jenkins with the following votes: * Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/9549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.6.x Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff Gerrit-Change-Number: 9549 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Mike Percy
[kudu-CR] iwyu: workaround missing IWYU results when using Ninja
Todd Lipcon has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9554 Change subject: iwyu: workaround missing IWYU results when using Ninja .. iwyu: workaround missing IWYU results when using Ninja This adds a workaround so that IWYU reports the same results whether cmake is invoked with the Makefile generator or the Ninja generator. Tested by verifying that IWYU on my laptop now reports the same issues that Jenkins was reporting precommit on a recent in-flight patch. Change-Id: I092cab794fd20a39fcfbbbe52f4c325fac2d149c --- M build-support/iwyu/iwyu_tool.py 1 file changed, 27 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/9554/1 -- To view, visit http://gerrit.cloudera.org:8080/9554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I092cab794fd20a39fcfbbbe52f4c325fac2d149c Gerrit-Change-Number: 9554 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon
[kudu-CR] KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow
Hello Tidy Bot, Mike Percy, Kudu Jenkins, Adar Dembo, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9375 to look at the new patch set (#6). Change subject: KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow .. KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow In the past we've had several bugs and performance issues which manifest themselves as service queue overflows. It would be easier to pinpoint the root cause (eg slow IO vs lock contention) if we had a process-wide stack snapshot at the time of the overflow. This patch does some plumbing to trigger such a stack trace into the diagnostics log when a service queue overflow occurs. The logging is throttled to once every 5 seconds so that we don't contribute too much to the load itself. The plumbing is a bit ugly since the diagnostics log is in the server/ module whereas the overflows surface in the rpc/ module. Having rpc/ call back into server/ would be a cyclic dependency, so instead the rpc/ module exposes a std::function<> style callback for the server to hook into. Change-Id: I62019c71121d7ca4037cab86a7c2ea048a261ad8 --- M src/kudu/master/master-test.cc M src/kudu/master/mini_master.h M src/kudu/rpc/service_pool.cc M src/kudu/rpc/service_pool.h M src/kudu/server/diagnostics_log.cc M src/kudu/server/diagnostics_log.h M src/kudu/server/rpc_server.cc M src/kudu/server/rpc_server.h M src/kudu/server/server_base.cc M src/kudu/server/server_base.h 10 files changed, 185 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/9375/6 -- To view, visit http://gerrit.cloudera.org:8080/9375 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I62019c71121d7ca4037cab86a7c2ea048a261ad8 Gerrit-Change-Number: 9375 Gerrit-PatchSet: 6 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2297 (part 6): enable periodic stack sampling by default, avoid bias
Hello Mike Percy, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9523 to look at the new patch set (#3). Change subject: KUDU-2297 (part 6): enable periodic stack sampling by default, avoid bias .. KUDU-2297 (part 6): enable periodic stack sampling by default, avoid bias This changes the timing for the collection of stack samples into the metrics log to avoid two issues: 1) Inflexible configuration: in many cases we might want to sample stacks more or less frequently than metrics. The previous implementation used the same schedule for both. 2) Sampling bias: if we collect stack samples on a fixed schedule, we risk unintended correlation with background tasks that might run on a similar fixed schedule. For example, if we collect stacks exactly once a minute, and the user has some monitoring software which polls the HTTP server once a minute for some data, we might either line up perfectly with the polling (in which case we'd overestimate its impact) or line of perfectly away from the polling (in which case we'd never see its effects). This patch changes the wakeups for stacks and metrics to be decoupled. In addition, it adds randomness to the stack sampling. The configuration now represents the mean sampling rate rather than an exact sampling rate. I manually ran a master with -diagnostics-log-stack-traces-interval-ms=2000 and verified that the stack sample times were between 0 and 4 seconds apart while the metric dumps were still exactly one minute apart. I also plan to modify some local test clusters to configure this to be relatively frequent to try to suss out any races or bugs which might occur in a real workload. Change-Id: I538eb59f95a695fa2da50f24ed82148715b15e44 --- M src/kudu/server/diagnostics_log.cc M src/kudu/server/diagnostics_log.h 2 files changed, 77 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/9523/3 -- To view, visit http://gerrit.cloudera.org:8080/9523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I538eb59f95a695fa2da50f24ed82148715b15e44 Gerrit-Change-Number: 9523 Gerrit-PatchSet: 3 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2297 (part 6): enable periodic stack sampling by default, avoid bias
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9523 ) Change subject: KUDU-2297 (part 6): enable periodic stack sampling by default, avoid bias .. Patch Set 2: (8 comments) http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.h File src/kudu/server/diagnostics_log.h: http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.h@59 PS2, Line 59: enum WakeupType { > enum class is the new hotness. Done http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc File src/kudu/server/diagnostics_log.cc: http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@75 PS2, Line 75: If this is set to " : "a non-positive number, stack traces will be not be periodically logged. > How about making this a uint and using 0 to disable? Done http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@169 PS2, Line 169: Random rng(GetRandomSeed32()); > Why create a new PRNG each time? seemed easier than making it a class member, and this isn't perf critical enough that the seed computation matters much http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@182 PS2, Line 182: __builtin_unreachable(); > Fancy. If you're feeling generous, there are quite a few places in the code I suppose it coudl be. This isn't the first use of it, though. It's not supported by MSVC but I think we have so many Linux-isms that this is the least of our problems :) FWIW Clang on windows is also now pretty feasible (Chrome just switched to it) http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@204 PS2, Line 204: } else if (MonoTime::Now() > next_log) { > >= is more correct, no? Done http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@208 PS2, Line 208: wakeups.emplace(ComputeNextWakeup(what), what); > what what what Ack ack ack http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@214 PS2, Line 214: // Unlock the mutex while actually logging metrics or stacks since it's somewhat > This is much cleaner, thanks for the cleanup. Ack http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@222 PS2, Line 222: WARN_NOT_OK(LogStacks(reason), "Unable to collect stacks to diagnostics log"); > So error no longer feds into the next logging time? Didn't find it useful? yea, at first I tried to maintain it, but then it just added complexity and I couldn't really see the value relative to the extra 5-10 lines of code -- To view, visit http://gerrit.cloudera.org:8080/9523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I538eb59f95a695fa2da50f24ed82148715b15e44 Gerrit-Change-Number: 9523 Gerrit-PatchSet: 2 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 03:49:37 + Gerrit-HasComments: Yes
[kudu-CR] [WIP] KUDU-2289: Tablet deletion should be throttled
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9551 to look at the new patch set (#2). Change subject: [WIP] KUDU-2289: Tablet deletion should be throttled .. [WIP] KUDU-2289: Tablet deletion should be throttled When a table is deleted, the master eagerly sends DeleteTablet reuests for every replica of every tablet. Since DeleteTablet can be IO-heavy and DeleteTablet was run by service threads, deleting tables could harm other concurrent workloads. This changes DeleteTablet to run on a threadpool. The number of threads is capped by --num_tablets_to_delete_simultaneously, which default to the number of data dirs, a proxy for the number of disks. This should help throttle tablet deletions, both preventing them from monopolizing service threads and limiting their IO. WIP because I'm still doing some tests, and I haven't thought enough about whether the pool's queue size should be limited. Change-Id: I3819bf8a3acf8ea03a76cc6cacd92d85bb114998 --- M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h M src/kudu/util/threadpool.h 4 files changed, 133 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/9551/2 -- To view, visit http://gerrit.cloudera.org:8080/9551 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3819bf8a3acf8ea03a76cc6cacd92d85bb114998 Gerrit-Change-Number: 9551 Gerrit-PatchSet: 2 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins
[kudu-CR] Remove namespace hack in hms client.h
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/9553 ) Change subject: Remove namespace hack in hms_client.h .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9553/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9553/1//COMMIT_MSG@11 PS1, Line 11: I've seen issues : in tests where for unknown reasons the hive namespace doesn't work. > Could you elaborate? These were namespace resolution issues at compile time Yeah at compile time. I was able to work around it, and I don't recall all of the details now. IWYU also may be having issues with the namespace hack. -- To view, visit http://gerrit.cloudera.org:8080/9553 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b4d823a0329c63c9ce830f855c3d1021fbc08cf Gerrit-Change-Number: 9553 Gerrit-PatchSet: 1 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Comment-Date: Thu, 08 Mar 2018 03:21:43 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9375 ) Change subject: KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc File src/kudu/server/diagnostics_log.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@196 PS5, Line 196: WARN_NOT_OK(s, "Unable to collect stacks to diagnostics log"); > The next patch in this series gets rid of this anyway, so I'll skip this on Sure. http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@201 PS5, Line 201: dump_stacks_now_reason_ = boost::none; > I actually think that's somewhat advantageous because if you get "requested I don't think it's a big deal. The only rebuttal I can come up with is that if I'm troubleshooting I might find it weird to see N "service queue overflowed" messages but only N-1 or N-2 stack dumps. But that's pretty contrived, plus the relationship between overflows and dumps is already murky due to throttling. -- To view, visit http://gerrit.cloudera.org:8080/9375 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I62019c71121d7ca4037cab86a7c2ea048a261ad8 Gerrit-Change-Number: 9375 Gerrit-PatchSet: 5 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 03:20:44 + Gerrit-HasComments: Yes
[kudu-CR] Remove namespace hack in hms client.h
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9553 ) Change subject: Remove namespace hack in hms_client.h .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/9553/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9553/1//COMMIT_MSG@11 PS1, Line 11: I've seen issues : in tests where for unknown reasons the hive namespace doesn't work. Could you elaborate? These were namespace resolution issues at compile time? -- To view, visit http://gerrit.cloudera.org:8080/9553 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b4d823a0329c63c9ce830f855c3d1021fbc08cf Gerrit-Change-Number: 9553 Gerrit-PatchSet: 1 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Comment-Date: Thu, 08 Mar 2018 03:18:31 + Gerrit-HasComments: Yes
[kudu-CR] Remove namespace hack in hms client.h
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9553 to review the following change. Change subject: Remove namespace hack in hms_client.h .. Remove namespace hack in hms_client.h To avoid having to use the very long Apache.Hadoop.Hive namespace for all of the Thrift-generated HMS types, we aliased the namespace in hms_client.h to 'hive'. This doesn't really work well, I've seen issues in tests where for unknown reasons the hive namespace doesn't work. A better fix is to just edit hive_metastore.thrift to change the generated namespace, since it's vendored anyway. This exposed that our CMake thrift generator doesn't properly consider the input .thrift files to be a dependency of the code generation step, so I had to fix that as well to get this to compile without blowing away the build dir. Change-Id: I1b4d823a0329c63c9ce830f855c3d1021fbc08cf --- M cmake_modules/FindThrift.cmake M src/kudu/hms/hive_metastore.thrift M src/kudu/hms/hms_client.h 3 files changed, 5 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/9553/1 -- To view, visit http://gerrit.cloudera.org:8080/9553 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1b4d823a0329c63c9ce830f855c3d1021fbc08cf Gerrit-Change-Number: 9553 Gerrit-PatchSet: 1 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials .. Patch Set 5: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@140 PS3, Line 140: * the ticket cache to obtain a new ticket with a later expiration time. So, if > I don't think so because it is more like us trying to prevent a weird edge Ok, makes sense to me. -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 5 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 03:02:59 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9375 ) Change subject: KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow .. Patch Set 5: (7 comments) http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/master/master-test.cc File src/kudu/master/master-test.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/master/master-test.cc@829 PS5, Line 829: Env::Default() > KuduTest has an env_ member you can use. Below too. Done http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/master/master-test.cc@835 PS5, Line 835: return line.find("service queue overflowed for kudu.master.MasterService") != string::npos; > Would MatchPattern be more idiomatic? Won't have to worry about string::npo Done http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/rpc/service_pool.h File src/kudu/rpc/service_pool.h: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/rpc/service_pool.h@67 PS5, Line 67: void set_reject_too_busy_hook(std::functionhook) { > Nit: if this is supposed to look like a setter for too_busy_hook_, I think Done http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/rpc/service_pool.cc File src/kudu/rpc/service_pool.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/rpc/service_pool.cc@136 PS5, Line 136: if (too_busy_hook_) { : too_busy_hook_(); : } > You may get a less noisy stack if you call this before responding to the RP I think though it's best to respond as quickly as possible because it frees up the memory used by the inbound request http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc File src/kudu/server/diagnostics_log.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@196 PS5, Line 196: WARN_NOT_OK(s, "Unable to collect stacks to diagnostics log"); > Nit: "Will try again in ..." ? The next patch in this series gets rid of this anyway, so I'll skip this one if you don't mind. http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@201 PS5, Line 201: dump_stacks_now_reason_ = boost::none; > Why do this here and not on L187, right after storing a local copy in 'reas I actually think that's somewhat advantageous because if you get "requested" to dump stacks while you're already dumping stacks, there isn't really any need to wake up and dump stacks again (presumably your previous dump was already more accurate). Again if you disagree would rather change it in the follow-up patch which restructures this code path. In the end though it doesn't really matter much since it's not a super likely race and the only harm is an extra (or missing) stack dump which in most cases no one will ever even look at. http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/rpc_server.h File src/kudu/server/rpc_server.h: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/rpc_server.h@62 PS5, Line 62: void set_reject_too_busy_hook(std::function hook) { > Nit: same comment about setter naming. Done -- To view, visit http://gerrit.cloudera.org:8080/9375 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I62019c71121d7ca4037cab86a7c2ea048a261ad8 Gerrit-Change-Number: 9375 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 02:55:43 + Gerrit-HasComments: Yes
[kudu-CR] docs: update scaling guide with new thread information
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9552 ) Change subject: docs: update scaling guide with new thread information .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9552/1/docs/scaling_guide.adoc File docs/scaling_guide.adoc: http://gerrit.cloudera.org:8080/#/c/9552/1/docs/scaling_guide.adoc@175 PS1, Line 175: Note that all replicas may be considered hot at startup, so tablet servers' thread usage will : generally peak when started and settle down thereafter. Is this actually true though? Isn't the number of simultaneously bootstrapping replicas capped by the size of the bootstrap threadpool? -- To view, visit http://gerrit.cloudera.org:8080/9552 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6f6a39faa572e16805e1961105140d823bd82f40 Gerrit-Change-Number: 9552 Gerrit-PatchSet: 1 Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 02:51:51 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8312 to look at the new patch set (#6). Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. KUDU-2191 (6/n): Hive Metastore catalog manager integration This commit adds integration with the Hive Metastore to CatalogManager, which can be enabled with a new flag: 'hive-metastore-addresses'. When the flag is provided, the Master will push DDL changes in its catalog to the Hive Metastore. In particular: - Create table, delete table, and alter table operations in the Kudu master will synchronously update the corresponding entry in the HMS. - New table and column names are required to be valid according to the Hive identifier rules, which are much stricter than Kudu's existing identifier rules. I think the code-paths in hms_catalog are pretty well covered by the tests added in this commit, however there is a dearth of stress and chaos tests covering the actual CatalogManager integration. In particular I've left some TODOs of roll-back code that could benefit from more coverage, as well as a TODO in master-stress-test about enabling HMS integration. Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf --- M src/kudu/hms/hms_client-test.cc M src/kudu/hms/hms_client.cc M src/kudu/hms/hms_client.h M src/kudu/hms/mini_hms.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/master-stress-test.cc M src/kudu/integration-tests/master_failover-itest.cc A src/kudu/integration-tests/master_hms-itest.cc M src/kudu/master/CMakeLists.txt M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/hms_catalog-test.cc A src/kudu/master/hms_catalog.cc A src/kudu/master/hms_catalog.h M src/kudu/master/master.proto M src/kudu/mini-cluster/external_mini_cluster-test.cc M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h 20 files changed, 1,263 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/8312/6 -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 6 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] docs: update scaling guide with new thread information
Hello Alex Rodoni, Jean-Daniel Cryans, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9552 to review the following change. Change subject: docs: update scaling guide with new thread information .. docs: update scaling guide with new thread information Change-Id: I6f6a39faa572e16805e1961105140d823bd82f40 --- M docs/scaling_guide.adoc 1 file changed, 7 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/9552/1 -- To view, visit http://gerrit.cloudera.org:8080/9552 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I6f6a39faa572e16805e1961105140d823bd82f40 Gerrit-Change-Number: 9552 Gerrit-PatchSet: 1 Gerrit-Owner: Adar DemboGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials .. Patch Set 4: (11 comments) http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@140 PS3, Line 140: * the ticket cache to obtain a new ticket with a later expiration time. So, if > Do we need to mention that Kudu will refuse the re-login attempt if the pri I don't think so because it is more like us trying to prevent a weird edge case we expect to never happen. I think providing too much detail in this context may just confuse the reader. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java File java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@95 PS3, Line 95: static > 'static' is redundant for inner enums. Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@107 PS3, Line 107: . > Nit: add ')' Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@130 PS3, Line 130:* @param subject JAAS Subject that the client's credentials are stored in > Nit: remove the extra @param. Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@171 PS3, Line 171: public void refreshSubject() { > Nit: Maybe add a brief comment in front of the function since it is public. Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@191 PS3, Line 191: > Extra line. actually I like this extra line because the above comment refers to the rest of the function. If I remove this line it makes it look like it's referring to the if statement below. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@55 PS3, Line 55: static > Redundant. Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@118 PS3, Line 118: miniCluster.killMasters(); > I do not quite follow why do we have to kill the masters. Does it mean as l yep, because the connection has already been established, keeping it alive means you never need to re-authenticate. I'll clarify that point. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@273 PS3, Line 273: public void testRenewAndReacquireKeberosCredentials() throws Exception { > Nit: addd a comment here? Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@355 PS3, Line 355: Closeable c > Not used? For a "try-with-resources" block I think it's required to name the resource variable even if it's never used: https://stackoverflow.com/questions/16588843/why-does-try-with-resource-require-a-local-variable http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@386 PS3, Line 386: Closeable c > Same here. See above -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 4 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 02:44:54 + Gerrit-HasComments: Yes
[kudu-CR] env: generalize resource limits and add RLIMIT NPROC support
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9521 ) Change subject: env: generalize resource limits and add RLIMIT_NPROC support .. env: generalize resource limits and add RLIMIT_NPROC support A follow-on patch will use this to cap the max number of threads in some process-wide thread pools (see KUDU-1913). Change-Id: I567f581a6f8a85ac1f08878b61ac316cc2da36a0 Reviewed-on: http://gerrit.cloudera.org:8080/9521 Reviewed-by: David Ribeiro AlvesTested-by: Kudu Jenkins --- M src/kudu/fs/block_manager.cc M src/kudu/rpc/rpc-test.cc M src/kudu/util/env-test.cc M src/kudu/util/env.h M src/kudu/util/env_posix.cc 5 files changed, 106 insertions(+), 33 deletions(-) Approvals: David Ribeiro Alves: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9521 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I567f581a6f8a85ac1f08878b61ac316cc2da36a0 Gerrit-Change-Number: 9521 Gerrit-PatchSet: 3 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1913: cap number of threads on server-wide pools
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9522 ) Change subject: KUDU-1913: cap number of threads on server-wide pools .. KUDU-1913: cap number of threads on server-wide pools The last piece of work is to establish an upper bound on the number of threads that may be started in the Raft and Prepare server-wide threadpools. Such caps will make it easier for admins to reason about appropriate values for the configuration of the Kudu processes' RLIMIT_NPROC resource. KUDU-1913 proposed a cap of "number of cores + number of disks", but a lively Slack discussion yielded a better solution: set the cap at some percentage of the process' RLIMIT_NPROC value. Given that the rest of Kudu generally uses a constant number of threads, this should prevent spikes from ever exceeding the RLIMIT_NPROC and crashing the server due to an election storm. This patch implements a cap of 10% per pool and also provides a new gflag as an "escape hatch" (in case we were horribly wrong). Note: it's still possible for a massive number of "hot" replicas to exceed RLIMIT_NPROC by virtue of each replica's log append thread, but the server is more likely to run out of memory for MemRowSets before that happens. Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618 Reviewed-on: http://gerrit.cloudera.org:8080/9522 Tested-by: Kudu Jenkins Reviewed-by: David Ribeiro AlvesReviewed-by: Todd Lipcon --- M src/kudu/kserver/kserver.cc 1 file changed, 57 insertions(+), 11 deletions(-) Approvals: Kudu Jenkins: Verified David Ribeiro Alves: Looks good to me, approved Todd Lipcon: Looks good to me, but someone else must approve -- To view, visit http://gerrit.cloudera.org:8080/9522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618 Gerrit-Change-Number: 9522 Gerrit-PatchSet: 3 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1913: cap number of threads on server-wide pools
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9522 ) Change subject: KUDU-1913: cap number of threads on server-wide pools .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618 Gerrit-Change-Number: 9522 Gerrit-PatchSet: 2 Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 02:36:42 + Gerrit-HasComments: No
[kudu-CR](branch-1.6.x) KUDU-2274. RaftConsensus should not access cmeta when shutdown
Hello Mike Percy, Jean-Daniel Cryans, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9548 to look at the new patch set (#2). Change subject: KUDU-2274. RaftConsensus should not access cmeta when shutdown .. KUDU-2274. RaftConsensus should not access cmeta when shutdown An additional case of KUDU-2274 found during stress testing was that RaftConsensus will return a ConsensusStatePB from the ConsensusState() RPC method even when shutdown. This patch prevents that. Conflicts: src/kudu/tablet/tablet_replica.h src/kudu/tserver/tserver_path_handlers.cc Change-Id: Ib3f3f75674d5739b281e7c689ddb2d433b9b7415 Reviewed-on: http://gerrit.cloudera.org:8080/9266 Tested-by: Kudu Jenkins Reviewed-by: Alexey Serbin(cherry picked from commit d977d1cf16d5b8e82d84396a0a82351582258fa1) --- M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/master/catalog_manager.cc M src/kudu/master/sys_catalog.cc M src/kudu/tablet/tablet_replica.h M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/tserver_path_handlers.cc 12 files changed, 100 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/9548/2 -- To view, visit http://gerrit.cloudera.org:8080/9548 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.6.x Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib3f3f75674d5739b281e7c689ddb2d433b9b7415 Gerrit-Change-Number: 9548 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] WIP: exported kudu client static archives
Adar Dembo has abandoned this change. ( http://gerrit.cloudera.org:8080/9479 ) Change subject: WIP: exported kudu client static archives .. Abandoned Still not something we actually want to do. -- To view, visit http://gerrit.cloudera.org:8080/9479 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I233972163aabfe6bb36aba02616b3c03fc4041e0 Gerrit-Change-Number: 9479 Gerrit-PatchSet: 1 Gerrit-Owner: Adar DemboGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [WIP] KUDU-2289: Tablet deletion should be throttled
Will Berkeley has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9551 Change subject: [WIP] KUDU-2289: Tablet deletion should be throttled .. [WIP] KUDU-2289: Tablet deletion should be throttled When a table is deleted, the master eagerly sends DeleteTablet reuests for every replica of every tablet. Since DeleteTablet can be IO-heavy and DeleteTablet was run by service threads, deleting tables could harm other concurrent workloads. This changes DeleteTablet to run on a threadpool. The number of threads is capped by --num_tablets_to_delete_simultaneously, which default to the number of data dirs, a proxy for the number of disks. This should help throttle tablet deletions, both preventing them from monopolizing service threads and limiting their IO. WIP because I'm still doing some tests, and I haven't thought enough about whether the pool's queue size should be limited. Change-Id: I3819bf8a3acf8ea03a76cc6cacd92d85bb114998 --- M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h M src/kudu/util/threadpool.h 4 files changed, 131 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/9551/1 -- To view, visit http://gerrit.cloudera.org:8080/9551 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3819bf8a3acf8ea03a76cc6cacd92d85bb114998 Gerrit-Change-Number: 9551 Gerrit-PatchSet: 1 Gerrit-Owner: Will Berkeley
[kudu-CR](branch-1.6.x) KUDU-2274. RaftConsensus should not access cmeta when shutdown
Hello Kudu Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9548 to review the following change. Change subject: KUDU-2274. RaftConsensus should not access cmeta when shutdown .. KUDU-2274. RaftConsensus should not access cmeta when shutdown An additional case of KUDU-2274 found during stress testing was that RaftConsensus will return a ConsensusStatePB from the ConsensusState() RPC method even when shutdown. This patch prevents that. Conflicts: src/kudu/tserver/tserver_path_handlers.cc Change-Id: Ib3f3f75674d5739b281e7c689ddb2d433b9b7415 Reviewed-on: http://gerrit.cloudera.org:8080/9266 Tested-by: Kudu Jenkins Reviewed-by: Alexey Serbin(cherry picked from commit d977d1cf16d5b8e82d84396a0a82351582258fa1) --- M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/ts_tablet_manager-itest.cc M src/kudu/master/catalog_manager.cc M src/kudu/master/sys_catalog.cc M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/tserver_path_handlers.cc 11 files changed, 91 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/9548/1 -- To view, visit http://gerrit.cloudera.org:8080/9548 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.6.x Gerrit-MessageType: newchange Gerrit-Change-Id: Ib3f3f75674d5739b281e7c689ddb2d433b9b7415 Gerrit-Change-Number: 9548 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR](branch-1.6.x) [tablet service] remove useless call
Hello Mike Percy, Kudu Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9550 to review the following change. Change subject: [tablet_service] remove useless call .. [tablet_service] remove useless call Removed useless call of TabletReplica::permanent_uuid() in ConsensusServiceImpl::UpdateConsensus(). Change-Id: I3869b423a7a8f32b64201ec83d8ad4d61e2ee9f6 Reviewed-on: http://gerrit.cloudera.org:8080/9324 Reviewed-by: Mike PercyTested-by: Kudu Jenkins (cherry picked from commit 0396cca5bd7e2ba6b038813ed70fd82c707d9861) --- M src/kudu/tserver/tablet_service.cc 1 file changed, 0 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/9550/1 -- To view, visit http://gerrit.cloudera.org:8080/9550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.6.x Gerrit-MessageType: newchange Gerrit-Change-Id: I3869b423a7a8f32b64201ec83d8ad4d61e2ee9f6 Gerrit-Change-Number: 9550 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR](branch-1.6.x) [tablet] fix nullptr dereference while capturing iterators
Hello Mike Percy, Kudu Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9549 to review the following change. Change subject: [tablet] fix nullptr dereference while capturing iterators .. [tablet] fix nullptr dereference while capturing iterators Added a check into Tablet::CaptureConsistentIterators() to make sure the tablet is not stopped/shutdown. Before this patch in one test scenario I saw stack traces like below (built in DEBUG configuration): kudu-tserver: src/kudu/gutil/ref_counted.h:284: T *scoped_refptr::operator->() const [T = kudu::tablet::TabletComponents]: Assertion `ptr_ != __null' failed. *** Aborted at 1517534012 (unix time) try "date -d @1517534012" if you are using GNU date *** PC: @ 0x7ff9ad39cc37 gsignal *** SIGABRT (@0x3e8745f) received by PID 29791 (TID 0x7ff99a0bc700) from PID 29791; stack trace: *** @ 0x7ff9b5129330 (unknown) at ??:0 @ 0x7ff9ad39cc37 gsignal at ??:0 @ 0x7ff9ad3a0028 abort at ??:0 @ 0x7ff9ad395bf6 (unknown) at ??:0 @ 0x7ff9ad395ca2 __assert_fail at ??:0 @ 0x7ff9b7f2ce52 scoped_refptr<>::operator->() at ??:0 @ 0x7ff9b7f1bf6d kudu::tablet::Tablet::CaptureConsistentIterators() at ??:0 @ 0x7ff9b7f225f6 kudu::tablet::Tablet::Iterator::Init() at ??:0 @ 0x7ff9b94372e3 kudu::tserver::TabletServiceImpl::HandleNewScanRequest() at ??:0 @ 0x7ff9b943a906 kudu::tserver::TabletServiceImpl::Checksum() at ??:0 @ 0x7ff9b3d3c83d kudu::tserver::TabletServerServiceIf::TabletServerServiceIf()::$_11::operator()() at ??:0 @ 0x7ff9b3d3c682 std::_Function_handler<>::_M_invoke() at ??:0 @ 0x7ff9b2ea026b std::function<>::operator()() at ??:0 @ 0x7ff9b2e9fb2d kudu::rpc::GeneratedServiceIf::Handle() at ??:0 @ 0x7ff9b2ea1ee6 kudu::rpc::ServicePool::RunThread() at ??:0 @ 0x7ff9b2ea4499 boost::_mfi::mf0<>::operator()() at ??:0 @ 0x7ff9b2ea4400 boost::_bi::list1<>::operator()<>() at ??:0 @ 0x7ff9b2ea43aa boost::_bi::bind_t<>::operator()() at ??:0 @ 0x7ff9b2ea418d boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0 @ 0x7ff9b2e45f68 boost::function0<>::operator()() at ??:0 @ 0x7ff9b115162d kudu::Thread::SuperviseThread() at ??:0 @ 0x7ff9b5121184 start_thread at ??:0 @ 0x7ff9ad463ffd clone at ??:0 @0x0 (unknown) I used the following WIP stress test for the reproduction scenario: https://gerrit.cloudera.org/#/c/9255/ For DEBUG builds, without fix the issues appeared ~0.5% of cases. After the fix, the issue could not be reproduced: Without fix: http://dist-test.cloudera.org//job?job_id=aserbin.1518492521.137030 With fix: http://dist-test.cloudera.org//job?job_id=aserbin.1518492937.141401 Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff Reviewed-on: http://gerrit.cloudera.org:8080/9189 Tested-by: Kudu Jenkins Reviewed-by: Mike Percy(cherry picked from commit 5d10a56f9d06dc695f2a4469edbabce978912eb4) --- M src/kudu/tablet/tablet.cc 1 file changed, 11 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/9549/1 -- To view, visit http://gerrit.cloudera.org:8080/9549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.6.x Gerrit-MessageType: newchange Gerrit-Change-Id: Ia7600f006c8df7f445cc2551e99390177378bcff Gerrit-Change-Number: 9549 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-1704: add java client support for READ YOUR WRITES mode
Hello Dan Burkert, David Ribeiro Alves, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8847 to look at the new patch set (#9). Change subject: KUDU-1704: add java client support for READ_YOUR_WRITES mode .. KUDU-1704: add java client support for READ_YOUR_WRITES mode Change-Id: I6239521c022147257859e399f55c6f3f945af465 --- M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestScannerMultiTablet.java 4 files changed, 233 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/8847/9 -- To view, visit http://gerrit.cloudera.org:8080/8847 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6239521c022147257859e399f55c6f3f945af465 Gerrit-Change-Number: 8847 Gerrit-PatchSet: 9 Gerrit-Owner: Hao HaoGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 4: (7 comments) http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/CMakeLists.txt File src/kudu/integration-tests/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/CMakeLists.txt@79 PS2, Line 79: ADD_KUDU_TEST(fuzz-itest RUN_SERIAL true) > I think you missed this one. Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@1066 PS2, Line 1066: return s; > I think you missed this one. I just did it in reverse order of Init(), I'm not aware of any ordering issues per se, as long as Shutdown() isn't called concurrently with any other use of the hms catalog. http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@1598 PS2, Line 1598: } > Hmm, the new behavior treats HMS table entry dropping as fatal, so this new Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@2092 PS2, Line 2092: case AlterTableRequestPB::DROP_RANGE_PARTITION: { > I think you missed this. Done http://gerrit.cloudera.org:8080/#/c/8312/4/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/4/src/kudu/master/catalog_manager.cc@1561 PS4, Line 1561: // TODO(dan): figure out how to test this. > You can inject faults into SysCatalogTable::SyncWrite. Maybe that will help That's a good idea, I'm going to leave these here for now until I can write some fault tests. Until then, I've changed MasterFailoverTest to have the HMS enabled and it does appear to be triggering these codepaths. http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h File src/kudu/master/hms_catalog.h: http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@29 PS3, Line 29: #include "kudu/util/net/net_util.h" > Not using optional anymore? right, but HmsClient is now a field so I don't think it can be forward declared. http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@51 PS3, Line 51: ~HmsCatalog(); > Please doc the class and its methods. I'm especially interested in the sync I've documented the public api in the .h, there are already pretty extensive notes in the .cc about synchronization, retry behavior, etc. -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 4 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 00:47:12 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8312 to look at the new patch set (#5). Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. KUDU-2191 (6/n): Hive Metastore catalog manager integration This commit adds integration with the Hive Metastore to CatalogManager, which can be enabled with a new flag: 'hive-metastore-addresses'. When the flag is provided, the Master will push DDL changes in its catalog to the Hive Metastore. In particular: - Create table, delete table, and alter table operations in the Kudu master will synchronously update the corresponding entry in the HMS. - New table and column names are required to be valid according to the Hive identifier rules, which are much stricter than Kudu's existing identifier rules. I think the code-paths in hms_catalog are pretty well covered by the tests added in this commit, however there is a dearth of stress and chaos tests covering the actual CatalogManager integration. In particular I've left some TODOs of roll-back code that could benefit from more coverage, as well as a TODO in master-stress-test about enabling HMS integration. Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf --- M src/kudu/hms/hms_client-test.cc M src/kudu/hms/hms_client.cc M src/kudu/hms/hms_client.h M src/kudu/hms/mini_hms.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/master-stress-test.cc M src/kudu/integration-tests/master_failover-itest.cc A src/kudu/integration-tests/master_hms-itest.cc M src/kudu/master/CMakeLists.txt M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/hms_catalog-test.cc A src/kudu/master/hms_catalog.cc A src/kudu/master/hms_catalog.h M src/kudu/master/master.proto M src/kudu/mini-cluster/external_mini_cluster-test.cc M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h 20 files changed, 1,263 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/8312/5 -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 5 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Add CatalogManager::master consensus() accessor
Dan Burkert has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9541 ) Change subject: Add CatalogManager::master_consensus() accessor .. Add CatalogManager::master_consensus() accessor A patch I'm working on is going to start calling Master::GetMasterHostPorts from inside of CatalogManager::Init. GetMasterHostPorts calls into the catalog manager, and checks that it is initialized. To break this circular dependency this introduces a new accessor for the master tablet RaftConsensus instance which becomes available immediately after the tablet is initialized. A few call-sites are switched over to this accessor instead of drilling into catalog manager. Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Reviewed-on: http://gerrit.cloudera.org:8080/9541 Reviewed-by: Adar DemboReviewed-by: Alexey Serbin Tested-by: Kudu Jenkins --- M src/kudu/integration-tests/security-master-auth-itest.cc M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h M src/kudu/master/master.cc 4 files changed, 20 insertions(+), 14 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Alexey Serbin: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 4 Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add CatalogManager::master consensus() accessor
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9541 ) Change subject: Add CatalogManager::master_consensus() accessor .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 3 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 08 Mar 2018 00:25:49 + Gerrit-HasComments: No
[kudu-CR] Add CatalogManager::master consensus() accessor
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9541 ) Change subject: Add CatalogManager::master_consensus() accessor .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 3 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 08 Mar 2018 00:23:11 + Gerrit-HasComments: No
[kudu-CR] Add CatalogManager::master consensus() accessor
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/9541 ) Change subject: Add CatalogManager::master_consensus() accessor .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9541/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/9541/2/src/kudu/master/catalog_manager.cc@3759 PS2, Line 3759: std::lock_guard l(lock_); > Would shared_lock l(lock_) be a better choice here? Yes, good point. -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 2 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 08 Mar 2018 00:15:25 + Gerrit-HasComments: Yes
[kudu-CR] Add CatalogManager::master consensus() accessor
Hello Will Berkeley, Alexey Serbin, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9541 to look at the new patch set (#3). Change subject: Add CatalogManager::master_consensus() accessor .. Add CatalogManager::master_consensus() accessor A patch I'm working on is going to start calling Master::GetMasterHostPorts from inside of CatalogManager::Init. GetMasterHostPorts calls into the catalog manager, and checks that it is initialized. To break this circular dependency this introduces a new accessor for the master tablet RaftConsensus instance which becomes available immediately after the tablet is initialized. A few call-sites are switched over to this accessor instead of drilling into catalog manager. Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 --- M src/kudu/integration-tests/security-master-auth-itest.cc M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h M src/kudu/master/master.cc 4 files changed, 20 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/9541/3 -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 3 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add CatalogManager::master consensus() accessor
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9541 ) Change subject: Add CatalogManager::master_consensus() accessor .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9541/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/9541/2/src/kudu/master/catalog_manager.cc@3759 PS2, Line 3759: std::lock_guard l(lock_); Would shared_lock l(lock_) be a better choice here? -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 2 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 08 Mar 2018 00:01:42 + Gerrit-HasComments: Yes
[kudu-CR] Add CatalogManager::master consensus() accessor
Hello Will Berkeley, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9541 to look at the new patch set (#2). Change subject: Add CatalogManager::master_consensus() accessor .. Add CatalogManager::master_consensus() accessor A patch I'm working on is going to start calling Master::GetMasterHostPorts from inside of CatalogManager::Init. GetMasterHostPorts calls into the catalog manager, and checks that it is initialized. To break this circular dependency this introduces a new accessor for the master tablet RaftConsensus instance which becomes available immediately after the tablet is initialized. A few call-sites are switched over to this accessor instead of drilling into catalog manager. Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 --- M src/kudu/integration-tests/security-master-auth-itest.cc M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h M src/kudu/master/master.cc 4 files changed, 17 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/9541/2 -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 2 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] Add CatalogManager::master consensus() accessor
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9541 ) Change subject: Add CatalogManager::master_consensus() accessor .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 1 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 07 Mar 2018 23:36:00 + Gerrit-HasComments: No
[kudu-CR] [doc] Document the new decimal column type
Hello Alex Rodoni, Dan Burkert, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9432 to look at the new patch set (#3). Change subject: [doc] Document the new decimal column type .. [doc] Document the new decimal column type Change-Id: I9489613d35daad708648ea04d49e472d3149b33d --- M docs/developing.adoc M docs/known_issues.adoc M docs/kudu_impala_integration.adoc M docs/release_notes.adoc M docs/schema_design.adoc 5 files changed, 64 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/32/9432/3 -- To view, visit http://gerrit.cloudera.org:8080/9432 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9489613d35daad708648ea04d49e472d3149b33d Gerrit-Change-Number: 9432 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [doc] Document the new decimal column type
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9432 ) Change subject: [doc] Document the new decimal column type .. Patch Set 2: (11 comments) http://gerrit.cloudera.org:8080/#/c/9432/2/docs/developing.adoc File docs/developing.adoc: http://gerrit.cloudera.org:8080/#/c/9432/2/docs/developing.adoc@183 PS2, Line 183: , > nit: remove this comma (even if you are a fan of the oxford comma, you only Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/known_issues.adoc File docs/known_issues.adoc: http://gerrit.cloudera.org:8080/#/c/9432/2/docs/known_issues.adoc@56 PS2, Line 56: * Type and nullability of existing columns cannot be changed by altering the table. > maybe worth adding a note here that the precision and scale of DECIMAL colu Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/release_notes.adoc File docs/release_notes.adoc: http://gerrit.cloudera.org:8080/#/c/9432/2/docs/release_notes.adoc@53 PS2, Line 53: * Kudu now supports a decimal column type. The decimal type is a numeric data type > it's probably worth some note in compat section on what happens if you try Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/release_notes.adoc@53 PS2, Line 53: a > the Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/schema_design.adoc File docs/schema_design.adoc: http://gerrit.cloudera.org:8080/#/c/9432/2/docs/schema_design.adoc@99 PS2, Line 99: float > should we format 'float' and 'double' with `...` so it shows up more like a Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/schema_design.adoc@100 PS2, Line 100: int64 > same Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/schema_design.adoc@103 PS2, Line 103: a precision and scale > "a" seems misplaced Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/schema_design.adoc@108 PS2, Line 108: For example, representing : integer values up to , and fractional values up to 99.99, both require a : precision of 4 > I think this would read a little more clearly written as: Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/schema_design.adoc@116 PS2, Line 116: If precision and scale are equal, all the digits come after the decimal point, : making all the values between -0.999... and 0.999... > This sentence is a bit confusing. Perhaps: Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/schema_design.adoc@129 PS2, Line 129: * Decimal values with precision greater than 18 are stored in 16 bytes. > Is it worth adding a note that we don't currently support BITSHUFFLE encodi Done http://gerrit.cloudera.org:8080/#/c/9432/2/docs/schema_design.adoc@143 PS2, Line 143: bitshuffle > is bitshuffle the default for int128? I thought it wasnt supported but mayb I think you are thinking of run length. That is tracked by KUDU-2284. -- To view, visit http://gerrit.cloudera.org:8080/9432 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9489613d35daad708648ea04d49e472d3149b33d Gerrit-Change-Number: 9432 Gerrit-PatchSet: 2 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 23:33:53 + Gerrit-HasComments: Yes
[kudu-CR] Add CatalogManager::master consensus() accessor
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9541 ) Change subject: Add CatalogManager::master_consensus() accessor .. Patch Set 1: Code-Review+2 Maybe Will wants to take a look too. -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 1 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Comment-Date: Wed, 07 Mar 2018 23:23:38 + Gerrit-HasComments: No
[kudu-CR] Add CatalogManager::master consensus() accessor
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9541 to review the following change. Change subject: Add CatalogManager::master_consensus() accessor .. Add CatalogManager::master_consensus() accessor A patch I'm working on is going to start calling Master::GetMasterHostPorts from inside of CatalogManager::Init. GetMasterHostPorts calls into the catalog manager, and checks that it is initialized. To break this circular dependency this introduces a new accessor for the master tablet RaftConsensus instance which becomes available immediately after the tablet is initialized. A few call-sites are switched over to this accessor instead of drilling into catalog manager. Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 --- M src/kudu/integration-tests/security-master-auth-itest.cc M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h M src/kudu/master/master.cc 4 files changed, 17 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/9541/1 -- To view, visit http://gerrit.cloudera.org:8080/9541 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie6887900329e67222da129b4a1b532cfb0a364b4 Gerrit-Change-Number: 9541 Gerrit-PatchSet: 1 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo
[kudu-CR] KUDU-2129 make ksck less scary when copying
Hello Will Berkeley, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9528 to look at the new patch set (#3). Change subject: KUDU-2129 make ksck less scary when copying .. KUDU-2129 make ksck less scary when copying This patch changes some of the outputs for ksck to be less troubling when the cluster is recovering. Updates include: - Report tablets with replicas whose data states are TABLET_DATA_COPYING as RECOVERING instead of UNDER_REPLICATED. - Report tables with recovering tablets as RECOVERING instead of UNDER_REPLICATED. - Report replicas that aren't running as "not running" instead of "bad state". Most non-RUNNING states have a time and place during a healthy cluster's lifespan. - Don't log the error type with the errors when running the `ksck` tool. Our messages are generally good enough, and there isn't a perfect mapping to some errors. E.g. we should take the "Corruption" out of "Corruption: 1 out of 1 table(s) are bad" - Report "not healthy" instead of "bad" in logs like the one above Change-Id: Iba0c1f5b5a7083cbef99d3674dfebe1457075ce0 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/tool_action_cluster.cc 4 files changed, 189 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/28/9528/3 -- To view, visit http://gerrit.cloudera.org:8080/9528 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iba0c1f5b5a7083cbef99d3674dfebe1457075ce0 Gerrit-Change-Number: 9528 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-2129 make ksck less scary when copying
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9528 ) Change subject: KUDU-2129 make ksck less scary when copying .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/9528/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9528/1//COMMIT_MSG@25 PS1, Line 25: > This is the one change I have an issue with because it makes it less clear Yeah, I think essentially we want to convey the messages: - The table is good, DON'T WORRY ABOUT IT - The table is recovering, KEEP AN EYE ON IT - The table is under-replicated/unavailable, DO SOMETHING ABOUT IT I'll go with "not healthy" for now, hopefully its proximity with the table summary will be enough to connect some dots. http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck-test.cc@322 PS1, Line 322: > Howabout "are not healthy"? Sure, I wanted to avoid being so direct because this could easily be seen as "the table is unhealthy", but I see what you mean. http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck-test.cc@402 PS1, Line 402: RT_OK(ksck_->ChecksumData(ChecksumOptions())); : ASSERT_STR_CONTAINS(err_stream_.str(), : "0/9 replicas remaining (180B from disk, 90 rows summed)"); : > Are these the long lines you're complaining about? Maybe we can add a helpe Yeah, I'm going to just reuse the printing code we have in the Ksck class. This ended up not being as pretty as I'd hoped because TableSummary is private, and I'd rather not make it public for the sake of tests. Alternatively, I could FRIEND_TEST everything. http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck-test.cc@420 PS1, Line 420: RT_STR_CONTAINS(err_stream_.str(), ExpectedKsckTableSummary("test", : /*healthy_tables=*/ 3, : /*recovering_tables=*/ 0, : /*underreplicated_tables=*/ 0, : /*consensus_mismatch_tables=*/ 0, : /*unavailable_tables=*/ 0)); : } > On the other hand I think the consensus matrix literals are kind of a featu Yeah, I think the cmatrix stuff is fine, this patch doesn't really affect it and I agree it's helpful to have the full output. http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck.h@461 PS1, Line 461: // The tablet is healthy. > Can you add a blurb like this for all non-OK statuses? Maybe like Done http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck.h@489 PS1, Line 489: t TotalTablets() const { : return healthy_tablets + recovering_tablets + underreplicated_tablets + : consensus_mismatch_tablets + unavailable_tablets; : } : : // > Isn't under-replicated worse than recovering? Ah, missed that above comment. -- To view, visit http://gerrit.cloudera.org:8080/9528 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iba0c1f5b5a7083cbef99d3674dfebe1457075ce0 Gerrit-Change-Number: 9528 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 07 Mar 2018 21:57:06 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2129 make ksck less scary when copying
Hello Will Berkeley, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9528 to look at the new patch set (#2). Change subject: KUDU-2129 make ksck less scary when copying .. KUDU-2129 make ksck less scary when copying This patch changes some of the outputs for ksck to be less troubling when the cluster is recovering. Updates include: - Report tablets with replicas whose data states are TABLET_DATA_COPYING as RECOVERING instead of UNDER_REPLICATED. - Report tables with recovering tablets as RECOVERING instead of UNDER_REPLICATED. - Report replicas that aren't running as "not running" instead of "bad state". Most non-RUNNING states have a time and place during a healthy cluster's lifespan. - Don't log the error type with the errors when running the `ksck` tool. Our messages are generally good enough, and there isn't a perfect mapping to some errors. E.g. we should take the "Corruption" out of "Corruption: 1 out of 1 table(s) are bad" - Report "not healthy" instead of "bad" in logs like the one above Change-Id: Iba0c1f5b5a7083cbef99d3674dfebe1457075ce0 --- M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/tool_action_cluster.cc 4 files changed, 189 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/28/9528/2 -- To view, visit http://gerrit.cloudera.org:8080/9528 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iba0c1f5b5a7083cbef99d3674dfebe1457075ce0 Gerrit-Change-Number: 9528 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew WongGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] env: generalize resource limits and add RLIMIT NPROC support
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9521 ) Change subject: env: generalize resource limits and add RLIMIT_NPROC support .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9521 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I567f581a6f8a85ac1f08878b61ac316cc2da36a0 Gerrit-Change-Number: 9521 Gerrit-PatchSet: 2 Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 21:44:00 + Gerrit-HasComments: No
[kudu-CR] KUDU-1913: cap number of threads on server-wide pools
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9522 ) Change subject: KUDU-1913: cap number of threads on server-wide pools .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618 Gerrit-Change-Number: 9522 Gerrit-PatchSet: 2 Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 21:41:13 + Gerrit-HasComments: No
[kudu-CR] KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9375 ) Change subject: KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc File src/kudu/server/diagnostics_log.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@118 PS5, Line 118: MutexLock l(lock_); > just because dump_stacks_now_reason_ is mutated from the logger thread too, Ah right, nvm then. -- To view, visit http://gerrit.cloudera.org:8080/9375 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I62019c71121d7ca4037cab86a7c2ea048a261ad8 Gerrit-Change-Number: 9375 Gerrit-PatchSet: 5 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 21:19:35 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-1913: cap number of threads on server-wide pools
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9522 to look at the new patch set (#2). Change subject: KUDU-1913: cap number of threads on server-wide pools .. KUDU-1913: cap number of threads on server-wide pools The last piece of work is to establish an upper bound on the number of threads that may be started in the Raft and Prepare server-wide threadpools. Such caps will make it easier for admins to reason about appropriate values for the configuration of the Kudu processes' RLIMIT_NPROC resource. KUDU-1913 proposed a cap of "number of cores + number of disks", but a lively Slack discussion yielded a better solution: set the cap at some percentage of the process' RLIMIT_NPROC value. Given that the rest of Kudu generally uses a constant number of threads, this should prevent spikes from ever exceeding the RLIMIT_NPROC and crashing the server due to an election storm. This patch implements a cap of 10% per pool and also provides a new gflag as an "escape hatch" (in case we were horribly wrong). Note: it's still possible for a massive number of "hot" replicas to exceed RLIMIT_NPROC by virtue of each replica's log append thread, but the server is more likely to run out of memory for MemRowSets before that happens. Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618 --- M src/kudu/kserver/kserver.cc 1 file changed, 57 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/9522/2 -- To view, visit http://gerrit.cloudera.org:8080/9522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618 Gerrit-Change-Number: 9522 Gerrit-PatchSet: 2 Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] env: generalize resource limits and add RLIMIT NPROC support
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9521 to look at the new patch set (#2). Change subject: env: generalize resource limits and add RLIMIT_NPROC support .. env: generalize resource limits and add RLIMIT_NPROC support A follow-on patch will use this to cap the max number of threads in some process-wide thread pools (see KUDU-1913). Change-Id: I567f581a6f8a85ac1f08878b61ac316cc2da36a0 --- M src/kudu/fs/block_manager.cc M src/kudu/rpc/rpc-test.cc M src/kudu/util/env-test.cc M src/kudu/util/env.h M src/kudu/util/env_posix.cc 5 files changed, 106 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/9521/2 -- To view, visit http://gerrit.cloudera.org:8080/9521 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I567f581a6f8a85ac1f08878b61ac316cc2da36a0 Gerrit-Change-Number: 9521 Gerrit-PatchSet: 2 Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1913: cap number of threads on server-wide pools
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9522 ) Change subject: KUDU-1913: cap number of threads on server-wide pools .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9522/1/src/kudu/kserver/kserver.cc File src/kudu/kserver/kserver.cc: http://gerrit.cloudera.org:8080/#/c/9522/1/src/kudu/kserver/kserver.cc@39 PS1, Line 39: server_thread_pool_thread_limit > lgtm, only minor nit is the name of the flag server_thread_pool_thread_limi Sure, I'll change it to server_thread_pool_max_thread_count. -- To view, visit http://gerrit.cloudera.org:8080/9522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I194907a7f8a483c9cba71eba8caed6bc6090f618 Gerrit-Change-Number: 9522 Gerrit-PatchSet: 1 Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 21:04:57 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials .. Patch Set 4: (11 comments) http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@140 PS3, Line 140: * the ticket cache to obtain a new ticket with a later expiration time. So, if Do we need to mention that Kudu will refuse the re-login attempt if the principal switched? http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java File java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@95 PS3, Line 95: static 'static' is redundant for inner enums. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@107 PS3, Line 107: . Nit: add ')' http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@130 PS3, Line 130:* @param subject JAAS Subject that the client's credentials are stored in Nit: remove the extra @param. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@171 PS3, Line 171: public void refreshSubject() { Nit: Maybe add a brief comment in front of the function since it is public. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@191 PS3, Line 191: Extra line. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@55 PS3, Line 55: static Redundant. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@118 PS3, Line 118: miniCluster.killMasters(); I do not quite follow why do we have to kill the masters. Does it mean as long as the client has an existing connection, even though the credentials expired the client is still able to work fine? http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@273 PS3, Line 273: public void testRenewAndReacquireKeberosCredentials() throws Exception { Nit: addd a comment here? http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@355 PS3, Line 355: Closeable c Not used? http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java@386 PS3, Line 386: Closeable c Same here. -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 4 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 21:05:04 + Gerrit-HasComments: Yes
[kudu-CR] env: generalize resource limits and add RLIMIT NPROC support
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9521 ) Change subject: env: generalize resource limits and add RLIMIT_NPROC support .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/9521/1/src/kudu/util/env-test.cc File src/kudu/util/env-test.cc: http://gerrit.cloudera.org:8080/#/c/9521/1/src/kudu/util/env-test.cc@675 PS1, Line 675: KUDU-1798. > hum was the number of the jira wrong? Yes, the original number was wrong. This is the correct JIRA. http://gerrit.cloudera.org:8080/#/c/9521/1/src/kudu/util/env.h File src/kudu/util/env.h: http://gerrit.cloudera.org:8080/#/c/9521/1/src/kudu/util/env.h@319 PS1, Line 319: RUNNING_THREADS > hum, technically RLIMIT_NPROC is the limit of the total number of processes That's a good point; I'll clarify these. -- To view, visit http://gerrit.cloudera.org:8080/9521 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I567f581a6f8a85ac1f08878b61ac316cc2da36a0 Gerrit-Change-Number: 9521 Gerrit-PatchSet: 1 Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 21:01:50 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/9050/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9050/3//COMMIT_MSG@7 PS3, Line 7: java: automatically attempt to re-acquire Kerberos credentials before expiration > nit: this looks too long; maybe Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@113 PS3, Line 113: associate them with a {@link javax.security.auth.Subject} : * instance, and associate them with the current thread of execution > it will be a bit wordier, but I think you should avoid pronouns (them) in t Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@143 PS3, Line 143: * example by re-running 'kinit' once each key. > Probably a typo: "by re-running 'kinit' once each key" should probably read Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@180 PS3, Line 180: This > Specify that this is a function of UserGroupInformation Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@185 PS3, Line 185: The Kudu client emits : * DEB > wrapping Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@221 PS3, Line 221: {@link InterfaceAudience.Private > Are we giving up on ever changing an Unstable interface? Might be good to Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java File java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java@97 PS3, Line 97: options.put("debug", Boolean.toString(Boolean.getBoolean("kudu.jaas.debug"))); > Is this trick worth adding to the debug section of your new doc? Done http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java@170 PS3, Line 170: return millisUntilEnd * 1000 < REFRESH_BEFORE_EXPIRATION_SECS; > Shouldn't this be multiplying the expiration_secs by 1000 to make the units good catch, put the multiplication on the wrong side! -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 3 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 20:51:33 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials
Hello Alexey Serbin, Dan Burkert, Jean-Daniel Cryans, Kudu Jenkins, Adar Dembo, Hao Hao, Anonymous Coward #380, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9050 to look at the new patch set (#4). Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials .. KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials This fixes the Java client to notice when the Kerberos credentials it has are about to expire, and initiates a re-login when necessary. It also adds a long javadoc for AsyncKuduClient indicating the proper way to authenticate to a secured Kudu cluster for various scenarios, since this is a common question we've gotten. Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 --- M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/apache/kudu/client/AuthnTokenReacquirer.java M java/kudu-client/src/main/java/org/apache/kudu/client/KuduException.java M java/kudu-client/src/main/java/org/apache/kudu/client/KuduRpc.java M java/kudu-client/src/main/java/org/apache/kudu/client/Negotiator.java M java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java M java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestNegotiator.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestSecurity.java M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool.proto M src/kudu/tools/tool_action_test.cc 14 files changed, 786 insertions(+), 97 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/9050/4 -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 4 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9375 ) Change subject: KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc File src/kudu/server/diagnostics_log.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@118 PS5, Line 118: MutexLock l(lock_); > Why bother locking this? It seems to me like when RunThread() gives up the just because dump_stacks_now_reason_ is mutated from the logger thread too, and this may be called concurrently from multiple threads, so it would be unsafe to mutate it without a lock, right? std::string is not threadsafe -- To view, visit http://gerrit.cloudera.org:8080/9375 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I62019c71121d7ca4037cab86a7c2ea048a261ad8 Gerrit-Change-Number: 9375 Gerrit-PatchSet: 5 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 20:38:27 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9375 ) Change subject: KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/rpc/service_pool.cc File src/kudu/rpc/service_pool.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/rpc/service_pool.cc@136 PS5, Line 136: if (too_busy_hook_) { : too_busy_hook_(); : } You may get a less noisy stack if you call this before responding to the RPC. Since responding to the RPC would start off some reactor thread tasks asynchronously. But the trade off is that you'll reply to this RPC a little later. Not a big deal, but I just thought I'd bring it up. http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc File src/kudu/server/diagnostics_log.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@118 PS5, Line 118: MutexLock l(lock_); Why bother locking this? It seems to me like when RunThread() gives up the lock to do the actual logging, the 'dump_stacks_now_reason_' can be overwritten anyway by another thread calling DumpStacksNow(). Unless I'm missing something. -- To view, visit http://gerrit.cloudera.org:8080/9375 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I62019c71121d7ca4037cab86a7c2ea048a261ad8 Gerrit-Change-Number: 9375 Gerrit-PatchSet: 5 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Wed, 07 Mar 2018 20:34:51 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2297 (part 6): enable periodic stack sampling by default, avoid bias
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9523 ) Change subject: KUDU-2297 (part 6): enable periodic stack sampling by default, avoid bias .. Patch Set 2: (8 comments) http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.h File src/kudu/server/diagnostics_log.h: http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.h@59 PS2, Line 59: enum WakeupType { enum class is the new hotness. http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc File src/kudu/server/diagnostics_log.cc: http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@75 PS2, Line 75: If this is set to " : "a non-positive number, stack traces will be not be periodically logged. How about making this a uint and using 0 to disable? http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@169 PS2, Line 169: Random rng(GetRandomSeed32()); Why create a new PRNG each time? http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@182 PS2, Line 182: __builtin_unreachable(); Fancy. If you're feeling generous, there are quite a few places in the codebase where we add a comment like // unreachable instead of actually using this, and they could be fixed up. Also, should this be abstracted away by something in gutil/port.h? Or do you expect it to be the same everywhere, including in compilers we might not support yet (like MSVC)? http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@204 PS2, Line 204: } else if (MonoTime::Now() > next_log) { >= is more correct, no? http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@208 PS2, Line 208: wakeups.emplace(ComputeNextWakeup(what), what); what what what http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@214 PS2, Line 214: // Unlock the mutex while actually logging metrics or stacks since it's somewhat This is much cleaner, thanks for the cleanup. http://gerrit.cloudera.org:8080/#/c/9523/2/src/kudu/server/diagnostics_log.cc@222 PS2, Line 222: WARN_NOT_OK(LogStacks(reason), "Unable to collect stacks to diagnostics log"); So error no longer feds into the next logging time? Didn't find it useful? -- To view, visit http://gerrit.cloudera.org:8080/9523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I538eb59f95a695fa2da50f24ed82148715b15e44 Gerrit-Change-Number: 9523 Gerrit-PatchSet: 2 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Comment-Date: Wed, 07 Mar 2018 20:22:48 + Gerrit-HasComments: Yes
[kudu-CR] Fix RUN FLAKY ONLY build
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9533 ) Change subject: Fix RUN_FLAKY_ONLY build .. Fix RUN_FLAKY_ONLY build Now that our tests have shard numbers appended (foo-test.) according to ctest, the RUN_FLAKY_ONLY build regex was no longer applying properly. It was converting the flaky test 'foo-test' to the regex '^foo-test$' which would not match any of the shards. This simply removes the '$' from the end of the regex. I attempted to instead add a '(.\d+)?' to the regex, but ctest has a hard limit on the number of '()' pairs in a regex so it was failing to compile the regex. It's also not possible to pass multiple separate regexes on the command line. Given that we never have test names where one is a prefix of another, I think this is probably fine. Change-Id: I3574bb88c2f3c465e3c659b80b122c1109053648 Reviewed-on: http://gerrit.cloudera.org:8080/9533 Reviewed-by: Adar DemboTested-by: Kudu Jenkins --- M build-support/jenkins/build-and-test.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3574bb88c2f3c465e3c659b80b122c1109053648 Gerrit-Change-Number: 9533 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java File java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@86 PS3, Line 86: private final Object subjectLock = new Object(); > Subject could have been passed in by the caller, though, in which case we'd Good point. -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 3 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 20:12:07 + Gerrit-HasComments: Yes
[kudu-CR] Fix RUN FLAKY ONLY build
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9533 ) Change subject: Fix RUN_FLAKY_ONLY build .. Patch Set 1: Code-Review+2 (1 comment) It's perl, so automatic +2. http://gerrit.cloudera.org:8080/#/c/9533/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9533/1//COMMIT_MSG@18 PS1, Line 18: Given that we never have test names where one is a prefix of : another, I think this is probably fine. I'll be sure to add "find a complex test and write a test for it" to my TODO list. -- To view, visit http://gerrit.cloudera.org:8080/9533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3574bb88c2f3c465e3c659b80b122c1109053648 Gerrit-Change-Number: 9533 Gerrit-PatchSet: 1 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Comment-Date: Wed, 07 Mar 2018 20:08:41 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java File java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@86 PS3, Line 86: private final Object subjectLock = new Object(); > I don't know how you feel about this style, but since 'subject' is private, Subject could have been passed in by the caller, though, in which case we'd be using an external object as an internal lock which is a no-no IMO. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java File java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java@196 PS3, Line 196: principal.getRealm() + "@" + principal.getRealm()) > Do we always have the service and the client in the same domain? Maybe, it Even in a cross-realm situation, you end up with a service ticket to your local realm's KDC. This is called from the findTgt function above which loops over all your tickets. For example, I just logged into a cluster which has cross-realm trust from our corporate active directory to a cluster-local KDC, kinitted to active directory, and then connected to a kerberos-authenticated service on the cluster. kinit shows: [todd@xxx ~]$ klist Ticket cache: FILE:/tmp/krb5cc_2009 Default principal: todd@CLOUDERA.LOCAL Valid starting ExpiresService principal 03/07/18 19:56:23 03/08/18 05:56:25 krbtgt/CLOUDERA.LOCAL@CLOUDERA.LOCAL renew until 03/14/18 19:56:23 03/07/18 19:56:27 03/08/18 05:56:25 krbtgt/PROD.EDH@CLOUDERA.LOCAL renew until 03/14/18 19:56:23 03/07/18 19:56:27 03/08/18 05:56:25 impala/xxx.cloudera@prod.edh renew until 03/12/18 19:56:27 In this case, it's the krbtgt/CLOUDERA.LOCAL@CLOUDERA.LOCAL ticket that we're looking for (ie the TGT associated with the primary realm you authenticated to). I'll see if I can add some commentary. -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 3 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 20:07:00 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration .. Patch Set 3: (3 comments) I just did a first quick pass. Overall looks good, I'm thinking about re-visiting this patch once more time tonight. http://gerrit.cloudera.org:8080/#/c/9050/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9050/3//COMMIT_MSG@7 PS3, Line 7: java: automatically attempt to re-acquire Kerberos credentials before expiration nit: this looks too long; maybe java: automatically re-acquire Kerberos credentials http://gerrit.cloudera.org:8080/#/c/9050/3//COMMIT_MSG@12 PS3, Line 12: It also adds a long javadoc for AsyncKuduClient indicating the proper : way to authenticate to a secured Kudu cluster for various scenarios, : since this is a common question we've gotten. it's a nice addition http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java File java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java@196 PS3, Line 196: principal.getRealm() + "@" + principal.getRealm()) Do we always have the service and the client in the same domain? Maybe, it's better to use some pattern matching here instead? -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 3 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 19:57:30 + Gerrit-HasComments: Yes
[kudu-CR] Fix RUN FLAKY ONLY build
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9533 to review the following change. Change subject: Fix RUN_FLAKY_ONLY build .. Fix RUN_FLAKY_ONLY build Now that our tests have shard numbers appended (foo-test.) according to ctest, the RUN_FLAKY_ONLY build regex was no longer applying properly. It was converting the flaky test 'foo-test' to the regex '^foo-test$' which would not match any of the shards. This simply removes the '$' from the end of the regex. I attempted to instead add a '(.\d+)?' to the regex, but ctest has a hard limit on the number of '()' pairs in a regex so it was failing to compile the regex. It's also not possible to pass multiple separate regexes on the command line. Given that we never have test names where one is a prefix of another, I think this is probably fine. Change-Id: I3574bb88c2f3c465e3c659b80b122c1109053648 --- M build-support/jenkins/build-and-test.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/9533/1 -- To view, visit http://gerrit.cloudera.org:8080/9533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3574bb88c2f3c465e3c659b80b122c1109053648 Gerrit-Change-Number: 9533 Gerrit-PatchSet: 1 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo
[kudu-CR] KUDU-721: [Python] Add DECIMAL column type support
David Ribeiro Alves has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9496 ) Change subject: KUDU-721: [Python] Add DECIMAL column type support .. KUDU-721: [Python] Add DECIMAL column type support This patch adds basic support to the Python client to create, read, and write tables with DECIMAL columns. Change-Id: I8e0855100ab1ea891f990931ec94d0b98c0dece1 Reviewed-on: http://gerrit.cloudera.org:8080/9496 Tested-by: Kudu Jenkins Reviewed-by: David Ribeiro Alves--- M python/kudu/__init__.py M python/kudu/client.pyx M python/kudu/libkudu_client.pxd M python/kudu/schema.pxd M python/kudu/schema.pyx M python/kudu/tests/test_scanner.py M python/kudu/tests/test_scantoken.py M python/kudu/tests/test_schema.py A python/kudu/tests/test_util.py M python/kudu/tests/util.py M python/kudu/util.py 11 files changed, 368 insertions(+), 11 deletions(-) Approvals: Kudu Jenkins: Verified David Ribeiro Alves: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/9496 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I8e0855100ab1ea891f990931ec94d0b98c0dece1 Gerrit-Change-Number: 9496 Gerrit-PatchSet: 5 Gerrit-Owner: Grant Henke Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Jordan Birdsell Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Wes McKinney Gerrit-Reviewer: a...@phdata.io
[kudu-CR] KUDU-721: [Python] Add DECIMAL column type support
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9496 ) Change subject: KUDU-721: [Python] Add DECIMAL column type support .. Patch Set 4: merging this. we can follow up if we missed anything. -- To view, visit http://gerrit.cloudera.org:8080/9496 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8e0855100ab1ea891f990931ec94d0b98c0dece1 Gerrit-Change-Number: 9496 Gerrit-PatchSet: 4 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Jordan Birdsell Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Wes McKinney Gerrit-Reviewer: a...@phdata.io Gerrit-Comment-Date: Wed, 07 Mar 2018 19:46:55 + Gerrit-HasComments: No
[kudu-CR] KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9375 ) Change subject: KUDU-2297 (part 5): dump stacks into diagnostics log on service queue overflow .. Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/master/master-test.cc File src/kudu/master/master-test.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/master/master-test.cc@829 PS5, Line 829: Env::Default() KuduTest has an env_ member you can use. Below too. http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/master/master-test.cc@835 PS5, Line 835: return line.find("service queue overflowed for kudu.master.MasterService") != string::npos; Would MatchPattern be more idiomatic? Won't have to worry about string::npos comparisons then. http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/rpc/service_pool.h File src/kudu/rpc/service_pool.h: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/rpc/service_pool.h@67 PS5, Line 67: void set_reject_too_busy_hook(std::functionhook) { Nit: if this is supposed to look like a setter for too_busy_hook_, I think set_too_busy_hook() would be a better name. http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc File src/kudu/server/diagnostics_log.cc: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@196 PS5, Line 196: WARN_NOT_OK(s, "Unable to collect stacks to diagnostics log"); Nit: "Will try again in ..." ? http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/diagnostics_log.cc@201 PS5, Line 201: dump_stacks_now_reason_ = boost::none; Why do this here and not on L187, right after storing a local copy in 'reason'? By doing it down here it's possible for another thread to overwrite dump_stacks_now_reason_ during the stack logging. http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/rpc_server.h File src/kudu/server/rpc_server.h: http://gerrit.cloudera.org:8080/#/c/9375/5/src/kudu/server/rpc_server.h@62 PS5, Line 62: void set_reject_too_busy_hook(std::function hook) { Nit: same comment about setter naming. -- To view, visit http://gerrit.cloudera.org:8080/9375 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I62019c71121d7ca4037cab86a7c2ea048a261ad8 Gerrit-Change-Number: 9375 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Wed, 07 Mar 2018 19:27:44 + Gerrit-HasComments: Yes
[kudu-CR] build: don't name unsharded tests with a shard suffix
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9524 ) Change subject: build: don't name unsharded tests with a shard suffix .. Patch Set 1: (1 comment) No need to purge the flaky test DB since we track flakiness on a per-suite level rather than per-shard currently (see http://dist-test.cloudera.org:8080/) http://gerrit.cloudera.org:8080/#/c/9524/1/build-support/run-test.sh File build-support/run-test.sh: http://gerrit.cloudera.org:8080/#/c/9524/1/build-support/run-test.sh@68 PS1, Line 68: :-0 > Under what circumstances would we have more than one shard but GTEST_SHARD_ yea probably. It seems gtest will fail if you try it: todd@todd-laptop:/src/kudu$ GTEST_TOTAL_SHARDS=1 ./build/latest/bin/rpc-test Invalid environment variables: you have GTEST_TOTAL_SHARDS = 1, but have left GTEST_SHARD_INDEX unset. -- To view, visit http://gerrit.cloudera.org:8080/9524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I33611f9766381d367c7a5ac7b09a00f01c48fe74 Gerrit-Change-Number: 9524 Gerrit-PatchSet: 1 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 19:26:46 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@113 PS3, Line 113: associate them with a {@link javax.security.auth.Subject} : * instance, and associate them with the current thread of execution it will be a bit wordier, but I think you should avoid pronouns (them) in this sentence, it's not 100% clear what they refer to. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@180 PS3, Line 180: This Specify that this is a function of UserGroupInformation http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@185 PS3, Line 185: The Kudu client emits : * DEB wrapping http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@221 PS3, Line 221: {@link InterfaceAudience.Private Are we giving up on ever changing an Unstable interface? Might be good to add a link to InterfaceStability. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/Negotiator.java File java/kudu-client/src/main/java/org/apache/kudu/client/Negotiator.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/Negotiator.java@423 PS3, Line 423: if (s == null || : s.getPrivateCredentials(KerberosTicket.class).isEmpty()) { this can be one line now, which will read better http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java File java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/SecurityContext.java@86 PS3, Line 86: private final Object subjectLock = new Object(); I don't know how you feel about this style, but since 'subject' is private, you could just use it directly as the lock. http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java File java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java@97 PS3, Line 97: options.put("debug", Boolean.toString(Boolean.getBoolean("kudu.jaas.debug"))); Is this trick worth adding to the debug section of your new doc? http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/util/SecurityUtil.java@170 PS3, Line 170: return millisUntilEnd * 1000 < REFRESH_BEFORE_EXPIRATION_SECS; Shouldn't this be multiplying the expiration_secs by 1000 to make the units line up? -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 3 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 19:21:59 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration
Anonymous Coward #380 has posted comments on this change. ( http://gerrit.cloudera.org:8080/9050 ) Change subject: KUDU-2264. java: automatically attempt to re-acquire Kerberos credentials before expiration .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java: http://gerrit.cloudera.org:8080/#/c/9050/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@143 PS3, Line 143: * example by re-running 'kinit' once each key. Probably a typo: "by re-running 'kinit' once each key" should probably read: "by re-running 'kinit' once each day" s/key/day -- To view, visit http://gerrit.cloudera.org:8080/9050 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539 Gerrit-Change-Number: 9050 Gerrit-PatchSet: 3 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward #380 Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 19:20:43 + Gerrit-HasComments: Yes
[kudu-CR] [WIP] updated to use READ YOUR WRITES scan mode for Jepsen tests
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/9526 ) Change subject: [WIP] updated to use READ_YOUR_WRITES scan mode for Jepsen tests .. Patch Set 1: (4 comments) > (5 comments) > > we'll also need to test RYR, which I'm not sure the register test > can do, since it only writes a single row. > > cockroachdb has a good set of tests that we might want to look at: > https://github.com/jepsen-io/jepsen/tree/master/cockroachdb Thanks for sharing that! I will take a look. http://gerrit.cloudera.org:8080/#/c/9526/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9526/1//COMMIT_MSG@9 PS1, Line 9: This patches updated the Jespen tests to use READ_YOUR_WRITES scan mode. > we're supposed to have both eventually right? Right, this is just a WIP patch. Eventually I will add an option to select which mode to use. http://gerrit.cloudera.org:8080/#/c/9526/1//COMMIT_MSG@14 PS1, Line 14: https://drive.google.com/file/d/1jW1CeGjB9AdnJZ2XbhJrf0IyOwXnKC_p/view?usp=sharing > thanks for sharing this. So as discussed on slack I think the exception "Snapshot timestamp is earlier than the ancient history mark. Consider increasing the value of the configuration parameter --tablet_history_max_age_sec. Snapshot timestamp: P: 0 usec, L: 1 Ancient History Mark: P: 1520385210671765 usec, L: 0 Physical time difference: -1520385210.672s" may be caused by the safe time is not correctly advanced yet(similar to KUDU-2233). Do you think so? http://gerrit.cloudera.org:8080/#/c/9526/1/java/kudu-jepsen/src/main/clojure/jepsen/kudu/register.clj File java/kudu-jepsen/src/main/clojure/jepsen/kudu/register.clj: http://gerrit.cloudera.org:8080/#/c/9526/1/java/kudu-jepsen/src/main/clojure/jepsen/kudu/register.clj@36 PS1, Line 36: [table-created? kclient ktable] : (reify jepsen.client/Client : (setup! [_ test _] : "Create the client and the test table. Use the same Kudu client instance " : "across all test actors to achieve timestamp propagation for all " : "operations." : (let [kclient (kc/sync-client (:master-addresses test)) : table-name (:table-name test) : ktable (locking table-created? : (when (compare-and-set! table-created? false true) :(kc/create-table : kclient : table-name : kt/kv-table-schema : (let [ranges (:table-ranges test) :rep-factor (:num-replicas test)] :(if (nil? ranges) : (kt/kv-table-options-hash rep-factor (count (:tservers test))) : (kt/kv-table-options-range rep-factor ranges) : (kc/open-table kclient table-name))] : (client table-created? kclient ktable))) > this is just a revert of the changes we made to have 1 client per table, vs Yes, agree. http://gerrit.cloudera.org:8080/#/c/9526/1/java/kudu-jepsen/src/main/clojure/jepsen/kudu/table.clj File java/kudu-jepsen/src/main/clojure/jepsen/kudu/table.clj: http://gerrit.cloudera.org:8080/#/c/9526/1/java/kudu-jepsen/src/main/clojure/jepsen/kudu/table.clj@100 PS1, Line 100: READ_YOUR_WRITES > this will have to be a parameter Agree. -- To view, visit http://gerrit.cloudera.org:8080/9526 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I92d5c0e3b91af58576eb6cd408d922ec7c0fef6c Gerrit-Change-Number: 9526 Gerrit-PatchSet: 1 Gerrit-Owner: Hao HaoGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Comment-Date: Wed, 07 Mar 2018 19:06:13 + Gerrit-HasComments: Yes
[kudu-CR] build: don't name unsharded tests with a shard suffix
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9524 ) Change subject: build: don't name unsharded tests with a shard suffix .. Patch Set 1: (1 comment) Thanks. Does it make sense to also purge the flaky test database of "foo.0" entries for tests that were actually unsharded? http://gerrit.cloudera.org:8080/#/c/9524/1/build-support/run-test.sh File build-support/run-test.sh: http://gerrit.cloudera.org:8080/#/c/9524/1/build-support/run-test.sh@68 PS1, Line 68: :-0 Under what circumstances would we have more than one shard but GTEST_SHARD_INDEX is undefined? Seems like an error, no? -- To view, visit http://gerrit.cloudera.org:8080/9524 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I33611f9766381d367c7a5ac7b09a00f01c48fe74 Gerrit-Change-Number: 9524 Gerrit-PatchSet: 1 Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Comment-Date: Wed, 07 Mar 2018 18:22:50 + Gerrit-HasComments: Yes
[kudu-CR] WIP KUDU-2129 make ksck less scary when copying
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9528 ) Change subject: WIP KUDU-2129 make ksck less scary when copying .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/9528/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9528/1//COMMIT_MSG@25 PS1, Line 25: Report "non-HEALTHY" instead of "bad" in the above log This is the one change I have an issue with because it makes it less clear what ksck is saying. It tells you what the thing is not instead of what it is, which makes it harder to know how to respond. Admittedly "bad" isn't great either, since "bad" tables may actually be self-healing. Maybe we need to break down the results further into HEALTHY, RECOVERING, NEEDS_INTERVENTION, or something? http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck-test.cc@322 PS1, Line 322: with non-HEALTHY statuses Howabout "are not healthy"? http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck-test.cc@402 PS1, Line 402: "Table Summary\n" : " Name | Status | Total Tablets | Healthy | Recovering | Under-replicated | Unavailable\n" : "--+-+---+-++--+-\n" : " test | HEALTHY | 3 | 3 | 0 | 0| 0"); Are these the long lines you're complaining about? Maybe we can add a helper like ExpectedKsckTableSummary() that takes a container of expected CheckResults or just an expected count of each kind, and returns the table as a string. http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck-test.cc@420 PS1, Line 420: "The consensus matrix is:\n" : " Config source | Replicas | Current term | Config index | Committed?\n" : "---+--+--+--+\n" : " master| A* B C| | | Yes\n" : " A | A* B C D| 0| | Yes\n" : " B | A* B C| 0| | Yes\n" : " C | A* B C| 0| | Yes"); On the other hand I think the consensus matrix literals are kind of a feature, since as we test for detecting more situations they serve as a reference for how certain situations look in ksck. Also it'd be harder to write a helper. http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck.h@461 PS1, Line 461: // There was a discrepancy among the tablets' consensus configs and the master's. Can you add a blurb like this for all non-OK statuses? Maybe like // The tablet has less replicas than its table's replication factor. UNDER_REPLICATED // The tablet is missing a majority of its replicas and is unavailable for writes. If a majority cannot be brought back online then the tablet requires manual intervention to recover. UNAVAILABLE // The tablet is missing replicas but is in the process of self-healing. RECOVERING I also have weak preferences for: 1. Renaming OK to HEALTHY. 2. Reordering the enum values to go from better to worse. IIRC it's HEALTHY/OK RECOVERING UNDER_REPLICATED CONSENSUS_MISMATCH UNAVAILABLE http://gerrit.cloudera.org:8080/#/c/9528/1/src/kudu/tools/ksck.h@489 PS1, Line 489: if (recovering_tablets > 0) { : return CheckResult::RECOVERING; : } : if (underreplicated_tablets > 0) { : return CheckResult::UNDER_REPLICATED; : } Isn't under-replicated worse than recovering? -- To view, visit http://gerrit.cloudera.org:8080/9528 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iba0c1f5b5a7083cbef99d3674dfebe1457075ce0 Gerrit-Change-Number: 9528 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew WongGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 07 Mar 2018 18:04:32 + Gerrit-HasComments: Yes
[kudu-CR] [tools] KUDU-2331: Allow move tool to work when uninvolved tserver is down
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/9510 ) Change subject: [tools] KUDU-2331: Allow move tool to work when uninvolved tserver is down .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/kudu-admin-test.cc File src/kudu/tools/kudu-admin-test.cc: http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/kudu-admin-test.cc@327 PS3, Line 327: void AdminCliTest::DoTestMoveTabletKUDU2331(EnableKudu1097 enable_kudu_1097) { > It's a nice test. Does it make sense to add the 'negative' scenario as wel It's pretty clear it won't work if the source or destination servers are down, but it would be good to have a test that if a different replica is down then the move fails. http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/kudu-admin-test.cc@386 PS3, Line 386: // Skip the ClusterVerifier since one tablet server is down. > maybe, add Done http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/tool_action_tablet.cc File src/kudu/tools/tool_action_tablet.cc: http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/tool_action_tablet.cc@154 PS3, Line 154: ignore_result(ksck.FetchInfoFromTabletServers()); > Did you consider the approach of 'applying' the 'tablet_id_filters' for Fet Yes. It would complicate that code quite a bit. Even though a filter would let us do less work by only going to the tablet servers the master thinks host a replica of the tablet, in the general case for ksck we want to check every tablet server every time since an important part of ksck is checking for inconsistencies between master and tablet servers, so we don't want to rely just on what the master tells us. -- To view, visit http://gerrit.cloudera.org:8080/9510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id1fc60233f4f50f478da7ccb104e37df3067400c Gerrit-Change-Number: 9510 Gerrit-PatchSet: 3 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 07 Mar 2018 17:42:57 + Gerrit-HasComments: Yes
[kudu-CR] [tools] KUDU-2331: Allow move tool to work when uninvolved tserver is down
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9510 to look at the new patch set (#4). Change subject: [tools] KUDU-2331: Allow move tool to work when uninvolved tserver is down .. [tools] KUDU-2331: Allow move tool to work when uninvolved tserver is down The move tool requires a clean ksck for the tablet it's acting on for safety reasons, but the check for this failed if a tablet server was down, even if the tablet server didn't host a replica for the tablet and wasn't the destination server. This fixes the move command so a down tserver uninvolved in the move won't prevent the move. Change-Id: Id1fc60233f4f50f478da7ccb104e37df3067400c --- M src/kudu/tools/kudu-admin-test.cc M src/kudu/tools/tool_action_tablet.cc 2 files changed, 138 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/10/9510/4 -- To view, visit http://gerrit.cloudera.org:8080/9510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1fc60233f4f50f478da7ccb104e37df3067400c Gerrit-Change-Number: 9510 Gerrit-PatchSet: 4 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley
[kudu-CR] [tools] KUDU-2331: Allow move tool to work when uninvolved tserver is down
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9510 ) Change subject: [tools] KUDU-2331: Allow move tool to work when uninvolved tserver is down .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/kudu-admin-test.cc File src/kudu/tools/kudu-admin-test.cc: http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/kudu-admin-test.cc@327 PS3, Line 327: void AdminCliTest::DoTestMoveTabletKUDU2331(EnableKudu1097 enable_kudu_1097) { It's a nice test. Does it make sense to add the 'negative' scenario as well (i.e. when a tablet server hosting one of the replicas is down)? http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/kudu-admin-test.cc@386 PS3, Line 386: // Skip the ClusterVerifier since one tablet server is down. maybe, add NO_FATALS(cluster_->AssertNoCrashes()); http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/tool_action_tablet.cc File src/kudu/tools/tool_action_tablet.cc: http://gerrit.cloudera.org:8080/#/c/9510/3/src/kudu/tools/tool_action_tablet.cc@154 PS3, Line 154: ignore_result(ksck.FetchInfoFromTabletServers()); Did you consider the approach of 'applying' the 'tablet_id_filters' for FetchInfoFromTabletServers()? -- To view, visit http://gerrit.cloudera.org:8080/9510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id1fc60233f4f50f478da7ccb104e37df3067400c Gerrit-Change-Number: 9510 Gerrit-PatchSet: 3 Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Wed, 07 Mar 2018 15:39:01 + Gerrit-HasComments: Yes