[kudu-CR] [docs] Refresh and augment the known issues
Hello Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6699 to look at the new patch set (#3). Change subject: [docs] Refresh and augment the known issues .. [docs] Refresh and augment the known issues We've learned a lot about Kudu since people have started using it. I've gathered in this patch what I think should be the new recommendations we make to users. Change-Id: I5d8e817a402f419aeb5ed9d700a8207ad9f91e4d --- M docs/developing.adoc M docs/installation.adoc M docs/known_issues.adoc M docs/kudu_impala_integration.adoc 4 files changed, 101 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/6699/3 -- To view, visit http://gerrit.cloudera.org:8080/6699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5d8e817a402f419aeb5ed9d700a8207ad9f91e4d Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Jean-Daniel CryansGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [docs] Refresh and augment the known issues
Jean-Daniel Cryans has posted comments on this change. Change subject: [docs] Refresh and augment the known issues .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/6699/2/docs/known_issues.adoc File docs/known_issues.adoc: PS2, Line 135: Supported > This seems out of place in the Known Issues doc - There is a requirements s Well it's "Known Issues and Limitations", so you're limited to those. But I agree it should only be in one place. PS2, Line 168: == Spark > The Admin topic also has a list of limitations for Spark integration - coul I'll add that there. PS2, Line 172: Impala > Same comment as Spark - Consider maintaining in 1 place rather than two. Done -- To view, visit http://gerrit.cloudera.org:8080/6699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5d8e817a402f419aeb5ed9d700a8207ad9f91e4d Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Jean-Daniel CryansGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR](gh-pages) FAQ refresh
Hello Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6697 to look at the new patch set (#2). Change subject: FAQ refresh .. FAQ refresh Change-Id: Idd8a79e3067bedbef71b0dcf9921be673e9dfaa0 --- M faq.md 1 file changed, 7 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/6697/2 -- To view, visit http://gerrit.cloudera.org:8080/6697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idd8a79e3067bedbef71b0dcf9921be673e9dfaa0 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Jean-Daniel CryansGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon
[kudu-CR](gh-pages) FAQ refresh
Jean-Daniel Cryans has posted comments on this change. Change subject: FAQ refresh .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6697/1/faq.md File faq.md: PS1, Line 250: Kudu hasn't been officially tested with Jepsen but it can be setup using : [these instructions](https://github.com/apache/kudu/blob/master/java/kudu-jepsen/README.adoc). > what do you mean by officially? does it have to run by aphyr to be official > what do you mean by officially? does it have to run by aphyr to be official? > :) Yes! j/k, I think anyone posting a blog post with reproducible steps would count as "official". I can change the wording to "publicly" maybe? > Can you say instead that the jepsen tests are a work in progress, that the > tests we have run continusously and where people can find them? Yes, maybe, yes. Saying it's being run continuously is weird if we can't point to them IMO. "trust us!" -- To view, visit http://gerrit.cloudera.org:8080/6697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idd8a79e3067bedbef71b0dcf9921be673e9dfaa0 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Jean-Daniel CryansGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] WIP: KUDU-463. Add checksumming to cfile
Grant Henke has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/6630/6/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: Line 446: data_size -= checksum_size; > careful of underflow Done Line 472: if (block.size() != data_size && checksum.size() != checksum_size) { > || Done http://gerrit.cloudera.org:8080/#/c/6630/6/src/kudu/util/env.h File src/kudu/util/env.h: Line 21: #include > Would prefer not to leak this sort of platform-specific thing outside of en I agree. This was just a rough patch to see if the approach and use of preadv made sense before polishing. See my comment below as well. Line 390: virtual Status ReadV(uint64_t offset, Slice** results, > Would std::vector* be more intuitive for 'results'? That is, ReadV() I agree that a vector with all the combined parameters would be better in the end. Something that has slice, scratch, and length so we don't need to expose iovec. I planned to do this, I just posted this super rough version to see if the approach and use of preadv made sense before spending the time polishing, testing, etc. -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant HenkeGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Simplify MemTracker and move process throttling elsewhere
Adar Dembo has posted comments on this change. Change subject: Simplify MemTracker and move process throttling elsewhere .. Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/6620/3/src/kudu/integration-tests/raft_consensus-itest.cc File src/kudu/integration-tests/raft_consensus-itest.cc: PS3, Line 2254: since we can get accurate process memory : // usage statistic I presume you tested against the kNumOps of 1, and this new value made sense? http://gerrit.cloudera.org:8080/#/c/6620/3/src/kudu/server/default-path-handlers.cc File src/kudu/server/default-path-handlers.cc: Line 149: *output << "Total memory usage\n"; "Total" suggests that the per-subsystem stuff should add up to it. Perhaps "Process memory usage" would be more precise? Line 152: HumanReadableNumBytes::ToString(process_memory::CurrentConsumption())); Maybe add a warning if !TCMALLOC_ENABLED that this isn't accurate? http://gerrit.cloudera.org:8080/#/c/6620/3/src/kudu/tablet/multi_column_writer.cc File src/kudu/tablet/multi_column_writer.cc: Line 89: LOG(INFO) << "Opened CFile writers for " << cfile_writers_.size() << " column(s)"; Heh, got tired of this output? http://gerrit.cloudera.org:8080/#/c/6620/3/src/kudu/util/mem_tracker.cc File src/kudu/util/mem_tracker.cc: PS3, Line 229: // TODO: This might leave us with an allocated resource that we can't use. Do we need : // to adjust the consumption of the query tracker to stop the resource from never : // getting used by a subsequent TryConsume()? Probably irrelevant to us. Line 240: Consume(-bytes); Technically this can fail, yet we drop the failure on the ground. I wonder if we'd be better off just not allowing Consume(negative) and Release(negative). Line 251: process_memory::MaybeGCAfterRelease(bytes); Maybe this new bit should be documented in the header somewhere? Line 264: return CheckLimitExceeded(); Could we just remove one of these two variants if they're identical? http://gerrit.cloudera.org:8080/#/c/6620/3/src/kudu/util/mem_tracker.h File src/kudu/util/mem_tracker.h: PS3, Line 149: LogUsage() Not relevant anymore. Plus, 'id' is actually used for more than just cosmetic stuff. http://gerrit.cloudera.org:8080/#/c/6620/3/src/kudu/util/process_memory.cc File src/kudu/util/process_memory.cc: Line 20: #include > warning: #includes are not sorted properly [llvm-include-order] Nit: this should be in its own group ahead of gflags/gperftools since it's a "real" system include and not a "project" system include. Line 141: // Nothing to do if not using tcmalloc. Maybe this should be moved up into MaybeGCAfterRelease() so we can avoid the increment on g_released_memory_since_gc? -- To view, visit http://gerrit.cloudera.org:8080/6620 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id16bad7d9a29a83e820a38e9d703811391cffe90 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] tpch: allow hash partitioning
Adar Dembo has posted comments on this change. Change subject: tpch: allow hash partitioning .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6709/1/src/kudu/benchmarks/tpch/rpc_line_item_dao.h File src/kudu/benchmarks/tpch/rpc_line_item_dao.h: PS1, Line 47: PartitionStrategy partition_strategy, : int num_buckets, : std::vector tablet_splits = {}); Looks OK but it would be nicer if num_buckets and tablet_splits were dependent on partition_strategy in some way (i.e. if PartitionStrategy were a "variant" type and num_buckets/tablet_splits were encapsulated within each of the two variants). Feel free to punt if this is too much work. -- To view, visit http://gerrit.cloudera.org:8080/6709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icbbb643447dff58b32e67764f6fb632441a4f6e4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] Silence clang -Waddress-of-packed-member warning in concurrent btree.h
David Ribeiro Alves has posted comments on this change. Change subject: Silence clang -Waddress-of-packed-member warning in concurrent_btree.h .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6693/2/src/kudu/tablet/concurrent_btree.h File src/kudu/tablet/concurrent_btree.h: Line 62: #pragma clang diagnostic push > Unfortunately, gcc emits a warning on all "#pragma clang ..." I believe. He I'll try that. on the last question see end of file. -- To view, visit http://gerrit.cloudera.org:8080/6693 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I03518bcca43ea6c891a7f622972465c1ea87ce63 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Silence clang -Waddress-of-packed-member warning in concurrent btree.h
Adar Dembo has posted comments on this change. Change subject: Silence clang -Waddress-of-packed-member warning in concurrent_btree.h .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6693/2/src/kudu/tablet/concurrent_btree.h File src/kudu/tablet/concurrent_btree.h: Line 62: #pragma clang diagnostic push Unfortunately, gcc emits a warning on all "#pragma clang ..." I believe. Here's gcc 4.9.2: /data/jenkins/workspace/generic-package-centos64-7-0-impala/topdir/BUILD/kudu-1.4.0-cdh5.12.0-SNAPSHOT/src/kudu/consensus/consensus_peers.cc:46:0: warning: ignoring #pragma clang diagnostic [-Wunknown-pragmas] #pragma clang diagnostic ignored "-Wc++14-extensions" ^ I'm not sure whether there's any way to fix that. Maybe you can do first #pragma GCC diagnostic ignored "-Wpragmas"? Would that work? Also, doesn't this need a corresponding #pragma clang diagnostic pop? -- To view, visit http://gerrit.cloudera.org:8080/6693 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I03518bcca43ea6c891a7f622972465c1ea87ce63 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Silence clang -Waddress-of-packed-member warning in concurrent btree.h
Todd Lipcon has posted comments on this change. Change subject: Silence clang -Waddress-of-packed-member warning in concurrent_btree.h .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6693 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I03518bcca43ea6c891a7f622972465c1ea87ce63 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Simplify MemTracker and move process throttling elsewhere
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6620 to look at the new patch set (#3). Change subject: Simplify MemTracker and move process throttling elsewhere .. Simplify MemTracker and move process throttling elsewhere This takes a first step towards simplifying MemTracker: - Remove the "GC function" callbacks (we never used this) - Remove the 'ExpandLimits' code which was unimplemented. - Remove the logging functionality, which we've never used as far as I can remember. - Remove soft limiting. We only used this on the root tracker, so I just moved it to a new process_memory.{h,cc} - Remove 'consumption_func' and un-tie the root tracker from the global process memory usage. Now the root tracker is a simple sum of its descendents. For a stress/benchmark I ran a 500GB YCSB with a memory limit set to 10GB. Results showed no major difference with this patch (throughput was a few percent faster but within the realm of noise). Details at [1] [1] https://docs.google.com/document/d/1dOe5-L5BWUhF-uV4-AE5hduvfUWctXizlaoSLlp38yM/edit?usp=sharing Change-Id: Id16bad7d9a29a83e820a38e9d703811391cffe90 --- M src/kudu/consensus/raft_consensus.cc M src/kudu/integration-tests/raft_consensus-itest.cc M src/kudu/server/default-path-handlers.cc M src/kudu/tablet/multi_column_writer.cc M src/kudu/tserver/tablet_service.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/maintenance_manager-test.cc M src/kudu/util/maintenance_manager.cc M src/kudu/util/maintenance_manager.h M src/kudu/util/mem_tracker-test.cc M src/kudu/util/mem_tracker.cc M src/kudu/util/mem_tracker.h A src/kudu/util/process_memory.cc A src/kudu/util/process_memory.h 14 files changed, 349 insertions(+), 594 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/6620/3 -- To view, visit http://gerrit.cloudera.org:8080/6620 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id16bad7d9a29a83e820a38e9d703811391cffe90 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] tpch: allow hash partitioning
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6709 to review the following change. Change subject: tpch: allow hash partitioning .. tpch: allow hash partitioning This enables a hash-partitioning mode for tpch_real_world. In this mode, we still create one tablet per thread, but we use hash-partitioning so that we don't get the perfect one-to-one insert pattern that we get with range partitioning. This is a bit more realistic for many batch insert scenarios, in which many writers write locally-sorted (but overlapping) data ranges into a single hash-partitioned table. Change-Id: Icbbb643447dff58b32e67764f6fb632441a4f6e4 --- M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc M src/kudu/benchmarks/tpch/rpc_line_item_dao.h M src/kudu/benchmarks/tpch/tpch_real_world.cc 3 files changed, 50 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/6709/1 -- To view, visit http://gerrit.cloudera.org:8080/6709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Icbbb643447dff58b32e67764f6fb632441a4f6e4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo
[kudu-CR] Silence clang -Waddress-of-packed-member warning in concurrent btree.h
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6693 to look at the new patch set (#2). Change subject: Silence clang -Waddress-of-packed-member warning in concurrent_btree.h .. Silence clang -Waddress-of-packed-member warning in concurrent_btree.h macOS/clang 4.0 builds have become polluted with warnings like: In file included from ../../src/kudu/tablet/deltamemstore.cc:22: In file included from ../../src/kudu/tablet/deltafile.h:34: In file included from ../../src/kudu/tablet/deltamemstore.h:33: ../../src/kudu/tablet/concurrent_btree.h:386:33: warning: taking address of packed member 'version_' of class or structure 'kudu::tablet::btree::NodeBase' may result in an unaligned pointer value [-Waddress-of-packed-member] VersionField::SetInserting(_); ^~~~ ../../src/kudu/tablet/concurrent_btree.h:721:11: note: in instantiation of member function 'kudu::tablet::btree::NodeBase::SetInserting' requested here this->SetInserting(); ^ ../../src/kudu/tablet/concurrent_btree.h:707:12: note: in instantiation of member function 'kudu::tablet::btree::LeafNode::InsertNew' requested here return InsertNew(mut->idx(), mut->key(), val, mut->arena()); ^ ../../src/kudu/tablet/concurrent_btree.h:1316:19: note: in instantiation of member function 'kudu::tablet::btree::LeafNode::Insert' requested here switch (node->Insert(mutation, val)) { ^ ../../src/kudu/tablet/concurrent_btree.h:869:19: note: in instantiation of member function 'kudu::tablet::btree::CBTree::Insert' requested here return tree_->Insert(this, val); ^ ../../src/kudu/tablet/deltamemstore.cc:103:31: note: in instantiation of member function 'kudu::tablet::btree::PreparedMutation::Insert' requested here if (PREDICT_FALSE(!mutation.Insert(update.slice( { This is because of clang's -Waddress-of-packed-member warning which has been committed and reverted and committed again in the clang codebase. This patch silences this warning in concurrent_btree.h with a #pragma statement. Change-Id: I03518bcca43ea6c891a7f622972465c1ea87ce63 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 6 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/6693/2 -- To view, visit http://gerrit.cloudera.org:8080/6693 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I03518bcca43ea6c891a7f622972465c1ea87ce63 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] WIP: KUDU-463. Add checksumming to cfile
Adar Dembo has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/6630/6/src/kudu/util/env.h File src/kudu/util/env.h: Line 21: #include Would prefer not to leak this sort of platform-specific thing outside of env_posix.cc. An iovec is a pretty simple structure; perhaps you can define a Kudu version and adapt it to/from an iovec inside env_posix.cc? Line 390: virtual Status ReadV(uint64_t offset, Slice** results, Would std::vector* be more intuitive for 'results'? That is, ReadV() takes a caller-provided vector of Slices and replaces its contents with a couple Slices of its own creation? Might be interesting to combine 'scratches' and 'results' into the same structure, since I imagine the length of the two must be the same. -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant HenkeGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Tweak ListMasters RPC error handling
Dan Burkert has posted comments on this change. Change subject: Tweak ListMasters RPC error handling .. Patch Set 2: We discussed testing offline and decided to punt. A follow-up patch will be testing the success case. -- To view, visit http://gerrit.cloudera.org:8080/6704 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I28e175df2cecad4f62806bc3ea0c314e5b861e40 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](gh-pages) FAQ refresh
David Ribeiro Alves has posted comments on this change. Change subject: FAQ refresh .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6697/1/faq.md File faq.md: PS1, Line 250: Kudu hasn't been officially tested with Jepsen but it can be setup using : [these instructions](https://github.com/apache/kudu/blob/master/java/kudu-jepsen/README.adoc). what do you mean by officially? does it have to run by aphyr to be official? :) Can you say instead that the jepsen tests are a work in progress, that the tests we have run continusously and where people can find them? -- To view, visit http://gerrit.cloudera.org:8080/6697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idd8a79e3067bedbef71b0dcf9921be673e9dfaa0 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Jean-Daniel CryansGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Allow to pad UNIXTIME MICROS slots in scan results
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6623 to look at the new patch set (#10). Change subject: Allow to pad UNIXTIME_MICROS slots in scan results .. Allow to pad UNIXTIME_MICROS slots in scan results This changes the wire protocol to, upon request, pad the slots that contain values for UNIXTIME_MICROS columns by 8 bytes. This allows Impala to adopt the result of a Kudu scan as a whole while still having space to transform UNIXTIME_MICROS values, which occupy 8 bytes, to the Impala representation of timestamp, which occupies 16, in place and without copying memory. This patch doesn't include any de-serialization logic the reasoning being that Impala will have knowledge of the data format in order to perform de-serialization directly on the "raw" direct and indirect data. This patch doesn't introduce any branching in the serialization patch. It does move the memset() call that was performed once per nullable column, per row, to be performed on the whole block instead. While less cache friendly, this is also executed less times. The net gain is not significant, but it does not regress in the normal case. Results of running the wire_protocol-test benchmark in slow mode: Before (avg 3 runs): 3.076 After (avg 3 runs): 3.000 The difference is around -2%, which might be in the noise but demostrates no significant regression. Change-Id: I99fc6d3be089d19ebe2e70c938f2405c381578b4 --- M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/common/wire_protocol.h 3 files changed, 200 insertions(+), 48 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/6623/10 -- To view, visit http://gerrit.cloudera.org:8080/6623 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I99fc6d3be089d19ebe2e70c938f2405c381578b4 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1952 Remove round-robin for block placement
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6636 to look at the new patch set (#6). Change subject: KUDU-1952 Remove round-robin for block placement .. KUDU-1952 Remove round-robin for block placement This is the first of a multi-patch patchset to mitigate the single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's superblock will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dir_group.h M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 28 files changed, 688 insertions(+), 165 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/6 -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Allow to pad UNIXTIME MICROS slots in scan results
David Ribeiro Alves has posted comments on this change. Change subject: Allow to pad UNIXTIME_MICROS slots in scan results .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/6623/9/src/kudu/common/wire_protocol-test.cc File src/kudu/common/wire_protocol-test.cc: PS9, Line 261: as > nit: from Done PS9, Line 273: direct.ToString > maybe HexDump here and below? Done -- To view, visit http://gerrit.cloudera.org:8080/6623 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99fc6d3be089d19ebe2e70c938f2405c381578b4 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Allow to pad UNIXTIME MICROS slots in scan results
Todd Lipcon has posted comments on this change. Change subject: Allow to pad UNIXTIME_MICROS slots in scan results .. Patch Set 9: (2 comments) lgtm aside from a couple nits http://gerrit.cloudera.org:8080/#/c/6623/9/src/kudu/common/wire_protocol-test.cc File src/kudu/common/wire_protocol-test.cc: PS9, Line 261: as nit: from PS9, Line 273: direct.ToString maybe HexDump here and below? -- To view, visit http://gerrit.cloudera.org:8080/6623 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99fc6d3be089d19ebe2e70c938f2405c381578b4 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] WIP: KUDU-463. Add checksumming to cfile
Todd Lipcon has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile .. Patch Set 6: (2 comments) Just did a quick look at the approach using readv, it seems like a good direction to me. I didn't review the new header/footer checksum stuff yet. It may be worth checking for Adar's opinion on the API in Env, he tends to have opinions on that stuff. The current one doesn't seem super idiomatic to me but also don't really have a better suggestion :) http://gerrit.cloudera.org:8080/#/c/6630/6/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: Line 446: data_size -= checksum_size; careful of underflow PS6, Line 472: && || -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] WIP: Expose "raw" mode in KuduScanner and allow to pass flags
Todd Lipcon has posted comments on this change. Change subject: WIP: Expose "raw" mode in KuduScanner and allow to pass flags .. Patch Set 5: (8 comments) http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/client.h File src/kudu/client/client.h: Line 2018: enum { kDefaultRawModeFlags = -1 }; > This is confusing. If the idea is to provide a constant that logically mean yea, I'd expect the default to be 0 http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scan_configuration.cc File src/kudu/client/scan_configuration.cc: Line 182: raw_mode_flags_ = flags; we should probably check that there are no unsupported flags set http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scan_configuration.h File src/kudu/client/scan_configuration.h: Line 139: return raw_mode_flags_ != KuduScanner::kDefaultRawModeFlags; strikes me as odd, since this is -1 (all flags set) http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scanner-internal.cc File src/kudu/client/scanner-internal.cc: PS5, Line 536: raw_mode_flags_ != KuduScanner::kDefaultRawModeFlags yea, this defaultRawModeFlags thing is funny looking here - I'd expect the default "0" to mean not enabled, and maybe allocate a flag called 'RAW_MODE_V1" or something? That way if we ever changed our internal format, we could switch to setting RAW_MODE_V2 instead so the client can indicate its compatibility with the current version Line 550: LOG(FATAL) << "Cannot extract rows, \"raw\" mode"; would it be possible for a user in a release build to hit this case? Or would this really be indicative of a bug on our side? http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/scanners.h File src/kudu/tserver/scanners.h: Line 100: int64_t raw_mode_flags() const { return raw_mode_flags_; } > Why is this a ScannerManager thing? I thought it'd only be per scanner? yea http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: Line 410: void set_raw_mode_flags(int64_t raw_mode_flags) override { > This is unintuitive; since collectors are constructed per RPC, I would have would it be simpler if we pass the raw-mode flags on every ContinueScan RPC rather than storing it as part of the scanner object? Then we would be able to avoid the changes to the Scanner interface, as well as solve this problem http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/tserver.proto File src/kudu/tserver/tserver.proto: Line 270: // "Raw" mode flags. > I understand the desire for a "raw" mode client-side, so that users who sig +1, it shouldn't distinguish raw vs not-raw. At this level all scans are kind of "raw" Perhaps it should be called "format_flags" or something? As for a flag bitset vs separate booleans I dont have a strong opinion. I suppose separate booleans would be less dense in memory than a bitset, but doubt it matters. -- To view, visit http://gerrit.cloudera.org:8080/6624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I043b6514dc5fc307fc9c94eb41f3ae79796ba273 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1952 Don't use round-robin for block placement
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6636 to look at the new patch set (#5). Change subject: KUDU-1952 Don't use round-robin for block placement .. KUDU-1952 Don't use round-robin for block placement This is the first of a multi-patch patchset to mitigate the single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's superblock will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dir_group.h M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 28 files changed, 688 insertions(+), 165 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/5 -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] WIP: KUDU-463. Add checksumming to cfile
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6630 to look at the new patch set (#6). Change subject: WIP: KUDU-463. Add checksumming to cfile .. WIP: KUDU-463. Add checksumming to cfile Adds optional checksumming and validation to cfile blocks. Introduces 2 flags to control cfile checksumming: - cfile_write_checksums (default false) - cfile_verify_checksums (default true) cfile_write_checksums is used in the CFileWriter to enable computing and appending Crc32 checksums to the end of each cfile block, header and footer cfile_write_checksums is defaulted to false to ensure upgrades don't immediately result in performance degredation and incompatible data on downgrade. It can and likely should be defaulted to true in a later release. When cfile_write_checksums is set to true, the existing incompatible_features bitset in the cfile footer is used. A "checksum" bit is set to ensure a clear error message when older versions of Kudu try to read the file. If checksums are not written the incompatible_features "checksum" bit is not set. cfile_verify_checksums is used in the CFileReader to enable validating the data against the written checksum. cfile_verify_checksums is defaulted to true since validation only occurs if checksums are written. Any data that was written before checksumming was an option or when cfile_write_checksums was false will not be verified. TODO: - Finish testing - Change cfile block to page to avoid term overload? Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 --- M docs/design-docs/cfile.md M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/cfile/cfile_util.h M src/kudu/cfile/cfile_writer.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/fs-test-util.h M src/kudu/fs/log_block_manager.cc M src/kudu/util/crc.cc M src/kudu/util/crc.h M src/kudu/util/env-test.cc M src/kudu/util/env.h M src/kudu/util/env_posix.cc M src/kudu/util/env_util.h M src/kudu/util/file_cache.cc 17 files changed, 539 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/6630/6 -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [docs] Refresh and augment the known issues
Ambreen Kazi has posted comments on this change. Change subject: [docs] Refresh and augment the known issues .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/6699/2/docs/known_issues.adoc File docs/known_issues.adoc: PS2, Line 135: Supported This seems out of place in the Known Issues doc - There is a requirements section at the top of the installation doc that already lists the supported OSs. The supported filesystems could be added there. PS2, Line 168: == Spark The Admin topic also has a list of limitations for Spark integration - could you take a look and make sure they're still applicable? We should also maintain this information in 1 topic, rather than two. Perhaps move the content from the Admin topic to this one and just link back from the Admin topic to this section (or vice versa). https://kudu.apache.org/docs/developing.html#_spark_integration_known_issues_and_limitations PS2, Line 172: Impala Same comment as Spark - Consider maintaining in 1 place rather than two. https://kudu.apache.org/docs/kudu_impala_integration.html#_known_issues_and_limitations -- To view, visit http://gerrit.cloudera.org:8080/6699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5d8e817a402f419aeb5ed9d700a8207ad9f91e4d Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Jean-Daniel CryansGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Tweak ListMasters RPC error handling
Adar Dembo has posted comments on this change. Change subject: Tweak ListMasters RPC error handling .. Patch Set 2: Maybe add a small unit test in master-test? -- To view, visit http://gerrit.cloudera.org:8080/6704 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I28e175df2cecad4f62806bc3ea0c314e5b861e40 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] Tweak ListMasters RPC error handling
Adar Dembo has posted comments on this change. Change subject: Tweak ListMasters RPC error handling .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/6704/1/src/kudu/master/master_service.cc File src/kudu/master/master_service.cc: Line 365: resp->mutable_error()->mutable_status()->set_code(AppStatusPB::UNKNOWN_ERROR); Doesn't this overwrite the work of StatusToPB()? Line 370: resp->mutable_deprecated_error()->set_code(AppStatusPB::UNKNOWN_ERROR); This too? -- To view, visit http://gerrit.cloudera.org:8080/6704 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I28e175df2cecad4f62806bc3ea0c314e5b861e40 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] Add 'kudu master list' tool
Adar Dembo has posted comments on this change. Change subject: Add 'kudu master list' tool .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9e77d41b244143d1258b26b50c4d2a68dedd8f00 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] Add 'kudu tserver list' tool
Dan Burkert has submitted this change and it was merged. Change subject: Add 'kudu tserver list' tool .. Add 'kudu tserver list' tool This adds a new tool action to list tablet servers, and associated information. There are a few output formatting options: the default is a human-readable table in the psql style; the others are meant for machine parsing. The formatting and leader RPC proxy code is abstracted so that future list tools (e.g. table, tablet, and master) can reuse them. Change-Id: I047a7675c186a02dd5d8b405431ae885159fcfee Reviewed-on: http://gerrit.cloudera.org:8080/6654 Tested-by: Kudu Jenkins Reviewed-by: Adar Dembo--- M src/kudu/client/client.h M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action.cc M src/kudu/tools/tool_action.h M src/kudu/tools/tool_action_common.cc M src/kudu/tools/tool_action_common.h M src/kudu/tools/tool_action_tserver.cc 7 files changed, 447 insertions(+), 28 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I047a7675c186a02dd5d8b405431ae885159fcfee Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Add 'kudu tserver list' tool
Adar Dembo has posted comments on this change. Change subject: Add 'kudu tserver list' tool .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I047a7675c186a02dd5d8b405431ae885159fcfee Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] WIP KUDU-579 [java client] Scanner fault tolerance
Hao Hao has posted comments on this change. Change subject: WIP KUDU-579 [java_client] Scanner fault tolerance .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/6566/3/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java: Line 474: private final CallbackgotNextRow = > Changed it because nextRowErrback is returning Deferred obj, so to be consi Done Line 506: if (e instanceof ScannerExpiredException) { > My understanding of how Java client error handling works for tablet server Done Line 511: return Deferred.fromError(e); // Let the error propagate. > 'propagate' was more correct. Done Line 835: return new Pair (null, error); > Is this case being handled? Done -- To view, visit http://gerrit.cloudera.org:8080/6566 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I89d3634c4255b69e28f2de5412e6a5a9d34e931b Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] [docs] Refresh and augment the known issues
Hello Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6699 to look at the new patch set (#2). Change subject: [docs] Refresh and augment the known issues .. [docs] Refresh and augment the known issues We've learned a lot about Kudu since people have started using it. I've gathered in this patch what I think should be the new recommendations we make to users. Change-Id: I5d8e817a402f419aeb5ed9d700a8207ad9f91e4d --- M docs/known_issues.adoc 1 file changed, 111 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/6699/2 -- To view, visit http://gerrit.cloudera.org:8080/6699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5d8e817a402f419aeb5ed9d700a8207ad9f91e4d Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Jean-Daniel CryansGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1952 Don't use round-robin for block placement
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6636 to look at the new patch set (#4). Change subject: KUDU-1952 Don't use round-robin for block placement .. KUDU-1952 Don't use round-robin for block placement This is the first of a multi-patch patchset to mitigate the single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's superblock will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dir_group.h M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 29 files changed, 701 insertions(+), 164 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/4 -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1952 Don't use round-robin for block placement
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6636 to look at the new patch set (#3). Change subject: KUDU-1952 Don't use round-robin for block placement .. KUDU-1952 Don't use round-robin for block placement This is the first of a multi-patch patchset to mitigate the single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's superblock will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dir_group.h M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 29 files changed, 701 insertions(+), 164 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/3 -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1952 Don't use round-robin for block placement
Andrew Wong has abandoned this change. Change subject: KUDU-1952 Don't use round-robin for block placement .. Abandoned Duplicate of 6636. -- To view, visit http://gerrit.cloudera.org:8080/6706 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I3ede0ea2b0299fda097baf79d93eeac36c2e0765 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1952 Don't use round-robin for block placement
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6636 to look at the new patch set (#2). Change subject: KUDU-1952 Don't use round-robin for block placement .. KUDU-1952 Don't use round-robin for block placement This is the first of a multi-patch patchset to mitigate the single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's superblock will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dir_group.h M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 29 files changed, 702 insertions(+), 161 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/2 -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1952 Improve block placement
Andrew Wong has posted comments on this change. Change subject: KUDU-1952 Improve block placement .. Patch Set 1: (17 comments) http://gerrit.cloudera.org:8080/#/c/6636/1//COMMIT_MSG Commit Message: Line 10: single-disk failure. Throughout the code, the term "DataDir" refers to > would be good to link to the design doc somewhere here Done PS1, Line 14: This patch adds a mapping from tablet to a set of disks and uses it to : replace the existing round-robin placement of blocks. Tablets are : m > it would be good in the commit message to explain the impact of this patch. Good points about both regression and upgrade impact. I'll set a default to use all disks to avoid the performance regression. As for upgrades/backwards compatibility, this is something that I haven't covered and should. If no disk groups are defined for a tablet, I think we can assume (if they're not tombstoned) that the tablet data was written with a previous version of kudu, in which case just add all directories to the tablet's group on boostrap. http://gerrit.cloudera.org:8080/#/c/6636/1/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: Line 57: "Indicates the target number of data dirs to spread each " > what's the behavior here if the configuration is larger than the number of If this is larger than the number of data dirs, the data group size will be set to the max number of directories currently available when creating a new group. Fair point. Will set a default for now indicating the full range of data dirs should be used for each tablet, which does at least address KUDU-1952, although it does still have added memory/storage usage. Will add a note about this. Line 402: if (group_by_tablet_map_.find(tablet_id) == group_by_tablet_map_.end()) { > Take a look at map-util.h; it's got a lot of idiomatic methods for dealing Done. Used InsertOrReturnExisting; I don't think you can verify whether LookupOrInsert actually inserts. Line 403: group_by_tablet_map_[tablet_id] = std::move(group_to_add); > warning: std::move of the const variable has no effect; remove std::move() Done Line 428: if (group_for_tablet->size() == 0) { > warning: the 'empty' method should be used to check for emptiness instead o Done Line 524: } else if (iter2 == dir_uuids.size() || > warning: don't use else after return [readability-else-after-return] Done http://gerrit.cloudera.org:8080/#/c/6636/1/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: Line 202: bool GetDirForGroup(DataDirGroup& group, uint16_t* uuid); > warning: non-const reference parameter 'group', make it const or use a poin Done http://gerrit.cloudera.org:8080/#/c/6636/1/src/kudu/tablet/delta_compaction.cc File src/kudu/tablet/delta_compaction.cc: Line 43: using cfile::CFileIterator; > warning: using decl 'CFileIterator' is unused [misc-unused-using-decls] Done Line 44: using cfile::CFileReader; > warning: using decl 'CFileReader' is unused [misc-unused-using-decls] Done Line 45: using cfile::IndexTreeIterator; > warning: using decl 'IndexTreeIterator' is unused [misc-unused-using-decls] Done http://gerrit.cloudera.org:8080/#/c/6636/1/src/kudu/tablet/delta_compaction.h File src/kudu/tablet/delta_compaction.h: Line 67: Status Compact(const fs::CreateBlockOptions& opts); > A compaction is only ever for a single tablet; can we just get the tablet I Done http://gerrit.cloudera.org:8080/#/c/6636/1/src/kudu/tablet/multi_column_writer.h File src/kudu/tablet/multi_column_writer.h: Line 40: struct CreateBlockOptions; > In general it feels weird to use CreateBlockOptions as the vehicle for plum Done http://gerrit.cloudera.org:8080/#/c/6636/1/src/kudu/tablet/tablet_metadata.cc File src/kudu/tablet/tablet_metadata.cc: Line 60: using kudu::consensus::RaftConfigPB; > warning: using decl 'RaftConfigPB' is unused [misc-unused-using-decls] Done http://gerrit.cloudera.org:8080/#/c/6636/1/src/kudu/tablet/tablet_metadata.h File src/kudu/tablet/tablet_metadata.h: Line 270: TabletMetadata(TabletSuperBlockPB superblock); > warning: single-argument constructors must be marked explicit to avoid unin Unused. Will remove. Line 335: fs::DataDirGroup* data_dir_group_; > It would simplify a lot if we could avoid this link between the tablet meta Done. Now just calls dd_manager()->GetDirGroupForTablet() http://gerrit.cloudera.org:8080/#/c/6636/1/src/kudu/tserver/tablet_copy_client.cc File src/kudu/tserver/tablet_copy_client.cc: Line 81: using consensus::RaftPeerPB; > warning: using decl 'RaftPeerPB' is unused [misc-unused-using-decls] Done -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong
[kudu-CR] KUDU-1952 Don't use round-robin for block placement
Andrew Wong has uploaded a new change for review. http://gerrit.cloudera.org:8080/6706 Change subject: KUDU-1952 Don't use round-robin for block placement .. KUDU-1952 Don't use round-robin for block placement This is the first of a multi-patch patchset to mitigate the single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's superblock will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Use DataDirGroupPB and add locks Change-Id: Ifaf50772cfa71c3ee5f61db9c2a4ae85b0d6e2b2 No longer output DataDirGroup Favor Create call and then Get calls. Still TODO: handle upgrade case Change-Id: I4d42d1444c60450ced2450b4ae9c0a51a3a30ef2 Add use_all_dirs flag to CreateDataDirGroup Add a flag to CreateDataDirGroup to use all dirs in cases where old tablet data is being bootstrapped. Since we can't tell which dirs the old tablets are striped across, we must assume the data spans all tablets. Change-Id: Ibdaa2c53ef2c596e6d473a663c5c3d3e5593875b Avoid unnecessary CreateBlockOption plumbing Replace CreateBlockOption with tablet_id string where possible. Change-Id: I3ede0ea2b0299fda097baf79d93eeac36c2e0765 --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dir_group.h M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 29 files changed, 702 insertions(+), 161 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/6706/1 -- To view, visit http://gerrit.cloudera.org:8080/6706 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3ede0ea2b0299fda097baf79d93eeac36c2e0765 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong
[kudu-CR] Add 'kudu tserver list' tool
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6654 to look at the new patch set (#7). Change subject: Add 'kudu tserver list' tool .. Add 'kudu tserver list' tool This adds a new tool action to list tablet servers, and associated information. There are a few output formatting options: the default is a human-readable table in the psql style; the others are meant for machine parsing. The formatting and leader RPC proxy code is abstracted so that future list tools (e.g. table, tablet, and master) can reuse them. Change-Id: I047a7675c186a02dd5d8b405431ae885159fcfee --- M src/kudu/client/client.h M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action.cc M src/kudu/tools/tool_action.h M src/kudu/tools/tool_action_common.cc M src/kudu/tools/tool_action_common.h M src/kudu/tools/tool_action_tserver.cc 7 files changed, 447 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/6654/7 -- To view, visit http://gerrit.cloudera.org:8080/6654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I047a7675c186a02dd5d8b405431ae885159fcfee Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Tweak ListMasters RPC error handling
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6704 to review the following change. Change subject: Tweak ListMasters RPC error handling .. Tweak ListMasters RPC error handling Unlike other master-specific RPCs, the ListMasters response contains an AppStatusPB error type instead of a MasterErrorPB. This makes it harder to use ListMasters in a generic context with other master RPCs. This commit changes the error type, while preserving the old type as a deprecated field. There are no known users of the ListMasters RPC yet, but maintaing backwards compat is still part of our contract. Change-Id: I28e175df2cecad4f62806bc3ea0c314e5b861e40 --- M src/kudu/master/catalog_manager.cc M src/kudu/master/master.proto M src/kudu/master/master_service.cc 3 files changed, 12 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/04/6704/1 -- To view, visit http://gerrit.cloudera.org:8080/6704 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I28e175df2cecad4f62806bc3ea0c314e5b861e40 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo
[kudu-CR] Add 'kudu master list' tool
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6705 to review the following change. Change subject: Add 'kudu master list' tool .. Add 'kudu master list' tool Change-Id: I9e77d41b244143d1258b26b50c4d2a68dedd8f00 --- M src/kudu/client/client-internal.cc M src/kudu/tools/kudu-tool-test.cc M src/kudu/tools/tool_action_common.cc M src/kudu/tools/tool_action_master.cc 4 files changed, 126 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/6705/1 -- To view, visit http://gerrit.cloudera.org:8080/6705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I9e77d41b244143d1258b26b50c4d2a68dedd8f00 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo
[kudu-CR] Add 'kudu tserver list' tool
Dan Burkert has posted comments on this change. Change subject: Add 'kudu tserver list' tool .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6654/4/src/kudu/tools/tool_action_tserver.cc File src/kudu/tools/tool_action_tserver.cc: Line 189: .Description("Operate on a Kudu Tablet Server") > Sure, but you're missing the gist of my suggestion: the existing descriptio I think it's fine as is. -- To view, visit http://gerrit.cloudera.org:8080/6654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I047a7675c186a02dd5d8b405431ae885159fcfee Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] make site: only build necessary binaries
Jean-Daniel Cryans has submitted this change and it was merged. Change subject: make_site: only build necessary binaries .. make_site: only build necessary binaries make_site relies on these binaries to build docs for them, but doesn't rely on any other binaries (or the tests). This speeds up the site build. Change-Id: Icbcd5b0327a6419fe732daa05e064aa9249e8298 Reviewed-on: http://gerrit.cloudera.org:8080/6695 Reviewed-by: Jean-Daniel CryansTested-by: Kudu Jenkins --- M docs/support/scripts/make_site.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6695 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Icbcd5b0327a6419fe732daa05e064aa9249e8298 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] WIP KUDU-579 [java client] Scanner fault tolerance
Hao Hao has abandoned this change. Change subject: WIP KUDU-579 [java_client] Scanner fault tolerance .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/6701 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I2155060c25184d08efcd9f3351dba2adf6069175 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Kudu Jenkins
[kudu-CR] WIP KUDU-579 [java client] Scanner fault tolerance
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6566 to look at the new patch set (#5). Change subject: WIP KUDU-579 [java_client] Scanner fault tolerance .. WIP KUDU-579 [java_client] Scanner fault tolerance This patch adds java client support to restart scanners if they fail in the middle of table scanning. AsyncKuduScanner records the final primary key retrieved in the previous batch. If a tserver fails while scanning, the client marks the tserver as failed and retries the scan elsewhere, providing its last primary key. Change-Id: I89d3634c4255b69e28f2de5412e6a5a9d34e931b --- M java/kudu-client/src/main/java/org/apache/kudu/client/AbstractKuduScannerBuilder.java M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanner.java M java/kudu-client/src/main/java/org/apache/kudu/client/RowResultIterator.java A java/kudu-client/src/main/java/org/apache/kudu/client/ScannerExpiredException.java M java/kudu-client/src/main/java/org/apache/kudu/client/TabletClient.java M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/apache/kudu/client/ITScannerMultiTablet.java 8 files changed, 240 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6566/5 -- To view, visit http://gerrit.cloudera.org:8080/6566 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I89d3634c4255b69e28f2de5412e6a5a9d34e931b Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] WIP KUDU-579 [java client] Scanner fault tolerance
Hao Hao has uploaded a new change for review. http://gerrit.cloudera.org:8080/6701 Change subject: WIP KUDU-579 [java_client] Scanner fault tolerance .. WIP KUDU-579 [java_client] Scanner fault tolerance This patch adds java client support to restart scanners if they fail in the middle of table scanning. AsyncKuduScanner records the final primary key retrieved in the previous batch. If a tserver fails while scanning, the client marks the tserver as failed and retries the scan elsewhere, providing its last primary key. Change-Id: I2155060c25184d08efcd9f3351dba2adf6069175 --- M java/kudu-client/src/main/java/org/apache/kudu/client/AbstractKuduScannerBuilder.java M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanner.java M java/kudu-client/src/main/java/org/apache/kudu/client/RowResultIterator.java A java/kudu-client/src/main/java/org/apache/kudu/client/ScannerExpiredException.java M java/kudu-client/src/main/java/org/apache/kudu/client/TabletClient.java M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java M java/kudu-client/src/test/java/org/apache/kudu/client/ITScannerMultiTablet.java 8 files changed, 240 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/6701/1 -- To view, visit http://gerrit.cloudera.org:8080/6701 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I2155060c25184d08efcd9f3351dba2adf6069175 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao