[kudu-CR] Integrate the ResultTracker into the rpc subsystem and add a test
David Ribeiro Alves has posted comments on this change. Change subject: Integrate the ResultTracker into the rpc subsystem and add a test .. Patch Set 12: Verified+1 (12 comments) http://gerrit.cloudera.org:8080/#/c/3192/6//COMMIT_MSG Commit Message: Line 10: to specify a new method option when defining rpc service methods. > I think you missed this. hrm, you're right. I could swear I had fixed this. must have lost a patch along the way. Done http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/common/common.proto File src/kudu/common/common.proto: Line 316 > I dont think this should be in 'common'. 'common' is for data-model type st Done http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/CMakeLists.txt File src/kudu/rpc/CMakeLists.txt: Line 89: rpc_header_proto > oh, now I see why it builds. This dependency seems misplaced. (eg keep in m Done http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/protoc-gen-krpc.cc File src/kudu/rpc/protoc-gen-krpc.cc: Line 46: #include "kudu/rpc/rpc_header.pb.h" > nit: sorting Done Line 467: "mi->result_tracker.reset($result_tracker$);\n" > we have a separate ResultTracker per RPC type? that doesn't seem right. we do, why doesn't it seem right? easier to track where memory goes , maps will be smaller so faster. locks will be less contended... why wouldn't we want to have one per rpc? http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/rpc-stress-test.cc File src/kudu/rpc/rpc-stress-test.cc: PS11, Line 73: random amount > update Done Line 134: vectoradders; > use unique_ptr? Done http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/rpc-test-base.h File src/kudu/rpc/rpc-test-base.h: Line 288: std::atomic_int exactly_once_test_val_; > hrm, is this atomic? haven't seen this yeah it's typedefd, want me to change it? http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/rpc_context.cc File src/kudu/rpc/rpc_context.cc: PS11, Line 66: call_->RespondSuccess(*response_pb_); : delete this; : > per the comment on the other review, it seems weird we now have this code s well yeah, for the case when the result aren't being tracked. ideally we'd like to at least coalesce the log statements at least, was kinda bugging me but didn't spend too much time on it. Any ideas? http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/rtest.proto File src/kudu/rpc/rtest.proto: Line 129: } > add something to design-docs/rpc.md about this feature? Done http://gerrit.cloudera.org:8080/#/c/3192/6/src/kudu/rpc/service_if.cc File src/kudu/rpc/service_if.cc: PS6, Line 96: req.release(), :resp, : call->header().has_request_id() ? method_info->result_tracker : : nullptr); : > Missed this? Done Line 116: } else { > Log the state too. Done -- To view, visit http://gerrit.cloudera.org:8080/3192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1d624810350feceefe244e0319e22cac241bf0d6 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro Alves Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Add a ResultTracker class that will track server side results
David Ribeiro Alves has posted comments on this change. Change subject: Add a ResultTracker class that will track server side results .. Patch Set 13: Verified+1 unrelated flake ReplicatedAlterTableTest.TestReplicatedAlter -- To view, visit http://gerrit.cloudera.org:8080/3190 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6718951a9998a6c9b0db35e8f09ff8304591e8b1 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Add a RpcContext::RespondFailure() method
Kudu Jenkins has posted comments on this change. Change subject: Add a RpcContext::RespondFailure() method .. Patch Set 12: Build Started http://104.196.14.100/job/kudu-gerrit/1962/ -- To view, visit http://gerrit.cloudera.org:8080/3191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9f1387ba0f837046a8056e77f73a3982f06c73a2 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Integrate the ResultTracker into the rpc subsystem and add a test
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3192 to look at the new patch set (#12). Change subject: Integrate the ResultTracker into the rpc subsystem and add a test .. Integrate the ResultTracker into the rpc subsystem and add a test This integrates the ResultTracker into the rpc subsystem by allowing to specify a new method option when defining rpc service methods. When this method option is specificed _and_ the call's rpc header has a RequestIdPB the results for the call will be tracked and the call may have exactly once semantics. This allows to have the simplest form of exactly once semantics for single server calls. Of course limitations apply, like response persistence across restarts is not supported, neither is propagating/rebuilding responses to/on replicas for replicated calls. If support for this is needed it will have to be done ad-hoc, case by case. Change-Id: I1d624810350feceefe244e0319e22cac241bf0d6 --- M docs/design-docs/rpc.md M src/kudu/rpc/CMakeLists.txt M src/kudu/rpc/inbound_call.cc M src/kudu/rpc/protoc-gen-krpc.cc A src/kudu/rpc/rpc-stress-test.cc M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc_context.cc M src/kudu/rpc/rpc_context.h M src/kudu/rpc/rpc_header.proto M src/kudu/rpc/rtest.proto M src/kudu/rpc/service_if.cc M src/kudu/rpc/service_if.h 12 files changed, 356 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/3192/12 -- To view, visit http://gerrit.cloudera.org:8080/3192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1d624810350feceefe244e0319e22cac241bf0d6 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Integrate the ResultTracker into the rpc subsystem and add a test
Kudu Jenkins has posted comments on this change. Change subject: Integrate the ResultTracker into the rpc subsystem and add a test .. Patch Set 12: Build Started http://104.196.14.100/job/kudu-gerrit/1961/ -- To view, visit http://gerrit.cloudera.org:8080/3192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1d624810350feceefe244e0319e22cac241bf0d6 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Kudu Jenkins has posted comments on this change. Change subject: Allow for reserving disk space for non-Kudu processes .. Patch Set 7: Build Started http://104.196.14.100/job/kudu-gerrit/1960/ -- To view, visit http://gerrit.cloudera.org:8080/3135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3135 to look at the new patch set (#7). Change subject: Allow for reserving disk space for non-Kudu processes .. Allow for reserving disk space for non-Kudu processes Adds gflags to reserve disk space such that Kudu will not use more than specified. Hadoop calls this functionality "du.reserved". If a WAL preallocation is attempted while the log disk is past its reservation limit, the log write will fail. As a result, RaftConsensus will cause the process to crash. The log block manager will use non-full disks if possible until all of the disks are full. If a flush or compaction is attempted when all disks are beyond their configured capacity then the LogBlockManager will return an error. As a result, the maintenance manager task will cause the process to crash. This initial implementation provides a "best effort" approach. Disk space checks are only done at preallocation time, and if writes continue beyond the preallocated point (for both a WAL segment and a data block) those writes will not be prevented. This makes it easier to provide a "friendly" option where the block manager will divert new writes to non-full disks, avoiding a hard crash when only one disk is past its reservation limit. In the future, we may want to add "hard" and "soft" limits, such that going beyond the soft limit will do what we do today, and going beyond the hard limit (say, by writing a very large data block past its preallocation point) will result in a crash. This patch includes: * Unit tests. * End-to-end test for flushing / compaction falling back to non-full disks due to disk space backpressure and finally crashing when there is no space left in any data dir. * End-to-end test for writes failing due to WAL disk space backpressure, causing a crash. Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 --- M src/kudu/consensus/log-test.cc M src/kudu/consensus/log.cc M src/kudu/consensus/log_util.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/disk_reservation-itest.cc M src/kudu/tablet/tablet_peer_mm_ops.cc M src/kudu/util/CMakeLists.txt A src/kudu/util/env_util-test.cc M src/kudu/util/env_util.cc M src/kudu/util/env_util.h M src/kudu/util/scoped_cleanup.h 14 files changed, 610 insertions(+), 76 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3135/7 -- To view, visit http://gerrit.cloudera.org:8080/3135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Add a ResultTracker class that will track server side results
David Ribeiro Alves has posted comments on this change. Change subject: Add a ResultTracker class that will track server side results .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/3190/12/src/kudu/rpc/result_tracker.cc File src/kudu/rpc/result_tracker.cc: Line 221: } > lots of dup code in the above methods. any refactor doable? (even one that Done -- To view, visit http://gerrit.cloudera.org:8080/3190 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6718951a9998a6c9b0db35e8f09ff8304591e8b1 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Control how content is cached on the Kudu web site
Todd Lipcon has submitted this change and it was merged. Change subject: Control how content is cached on the Kudu web site .. Control how content is cached on the Kudu web site Change-Id: I08619464d8d458582282044a7db685f8fbbca483 Reviewed-on: http://gerrit.cloudera.org:8080/3462 Reviewed-by: Todd LipconTested-by: Todd Lipcon --- A .htaccess 1 file changed, 18 insertions(+), 0 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/3462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I08619464d8d458582282044a7db685f8fbbca483 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Mike Percy Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR](gh-pages) Control how content is cached on the Kudu web site
Todd Lipcon has posted comments on this change. Change subject: Control how content is cached on the Kudu web site .. Patch Set 1: Code-Review+2 Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/3462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I08619464d8d458582282044a7db685f8fbbca483 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Mike PercyGerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3135 to look at the new patch set (#6). Change subject: Allow for reserving disk space for non-Kudu processes .. Allow for reserving disk space for non-Kudu processes Adds gflags to reserve disk space such that Kudu will not use more than specified. Hadoop calls this functionality "du.reserved". If a WAL preallocation is attempted while the log disk is past its reservation limit, the log write will fail. As a result, RaftConsensus will cause the process to crash. The log block manager will use non-full disks if possible until all of the disks are full. If a flush or compaction is attempted when all disks are beyond their configured capacity then the LogBlockManager will return an error. As a result, the maintenance manager task will cause the process to crash. This initial implementation provides a "best effort" approach. Disk space checks are only done at preallocation time, and if writes continue beyond the preallocated point (for both a WAL segment and a data block) those writes will not be prevented. This makes it easier to provide a "friendly" option where the block manager will divert new writes to non-full disks, avoiding a hard crash when only one disk is past its reservation limit. In the future, we may want to add "hard" and "soft" limits, such that going beyond the soft limit will do what we do today, and going beyond the hard limit (say, by writing a very large data block past its preallocation point) will result in a crash. This patch includes: * Unit tests. * End-to-end test for flushing / compaction falling back to non-full disks due to disk space backpressure and finally crashing when there is no space left in any data dir. * End-to-end test for writes failing due to WAL disk space backpressure, causing a crash. Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 --- M src/kudu/consensus/log-test.cc M src/kudu/consensus/log.cc M src/kudu/consensus/log_util.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/disk_reservation-itest.cc M src/kudu/tablet/tablet_peer_mm_ops.cc M src/kudu/util/CMakeLists.txt A src/kudu/util/env_util-test.cc M src/kudu/util/env_util.cc M src/kudu/util/env_util.h M src/kudu/util/scoped_cleanup.h 14 files changed, 608 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3135/6 -- To view, visit http://gerrit.cloudera.org:8080/3135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Mike Percy has posted comments on this change. Change subject: Allow for reserving disk space for non-Kudu processes .. Patch Set 5: (24 comments) http://gerrit.cloudera.org:8080/#/c/3135/5/src/kudu/consensus/log.cc File src/kudu/consensus/log.cc: Line 111: DEFINE_int64(log_dir_reserved_bytes, 0, > Nit: I think this would be more clear as fs_wal_dir_reserved_bytes, as it w Oh, that's the one I had intended to mirror. Done. Line 113: TAG_FLAG(log_dir_reserved_bytes, runtime); > Hmm, I'm not sure if we should allow changing these values at runtime. If s That's ok. They are already "soft" in terms of enforcement. In any case, this one will cause a crash. I am using runtime modification in the integration tests. http://gerrit.cloudera.org:8080/#/c/3135/5/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: PS5, Line 951: We use a COARSE clock so we need to let a : // little bit of time pass so we get at least one unit of time greater than : // before when we call MonoTime::Now() again. > Let's switch to FINE clock so we won't have this limitation. I was hoping using COARSE would help keep the code paths fast but I agree it's probably overkill. http://gerrit.cloudera.org:8080/#/c/3135/5/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: Line 64: DEFINE_int32(full_disk_cache_seconds, 30, > Nit: please prefix with log_block_manager. It's super long, yes, but it's t Done Line 70: DEFINE_int64(data_dirs_reserved_bytes, 0, > Please rename to fs_data_dirs_reserved_bytes (see my comment on log_dir_res Done Line 96: "Number of unavailable log block containers", > Nit: Number of Unavailable Block Containers (to be consistent with the exis Done Line 98: "Number of non-full Block Containers that are under root paths that " > Nit: "Number of non-full log block containers that are under root paths who Done Line 265: RWFile* data_file() const { return data_file_.get(); } > We're only using this for the filename. Could you make narrow accessor that Done Line 267: int64_t preallocated_offset() const { return preallocated_offset_; } > Where is this used? Looks unused. Removed Line 273: Env* env() const { return block_manager_->env(); } > Is this ever used? Nope, also removed Line 312: int64_t preallocated_offset_ = 0; > Huh, didn't know C++11 allowed this. Could you make the same change for tot Done Line 1230: // Indexes of root paths that are below their reserved space threshold. > Can we defer this work to the case where GetAvailableContainer() has return I had to change the implementation because of a race I discovered in the cache expiration logic, so if we prepopulate this cache then it has to happen here. The race caused an infinite loop when the cache expiration was set to 0. Line 1250: return Status::IOError("Unable to allocate block: All data directories are full. " > We should add ENOSPC to this status, right? Done Line 1292: FLAGS_log_container_preallocate_bytes, > Preallocate() may not actually preallocate FLAGS_log_container_preallocate_ Yes, preallocation sounds a little bit broken. However, what you're describing further on is a "hard" limit, as opposed to a "soft" limit, which is what I've implemented. A hard limit is much harder (ha ha) to implement because the failure cases are broad. We have already allocated a block. Now we're applying backpressure on writes. Doing a reservation at block allocation time, which is what this patch does, is much easier because block manager API clients don't even know that it's happening. In short, if we want to do a hard limit, it's a whole 'nother thing, and could reasonably be implemented with a whole 'nother set of gflags. Also, if you hit a hard limit, the most reasonable thing to do is crash. That said, unless I missed your point (it's certainly possible), I don't think it's particularly bad to overshoot on verifying sufficient space here with FLAGS_log_container_preallocate_bytes since even if we intended to preallocate less space than that, we're close to the limit and it's reasonable to stop writing to this disk. Line 1413: MonoTime now = MonoTime::Now(MonoTime::COARSE); > FINE here too. And this can be done outside the lock. Done Line 1432: } // Release lock_. > Nit: FWIW, I think this is implied by the scope (i.e. don't really see the agree Line 1758: if (expires->ComesBefore(MonoTime::Now(MonoTime::COARSE))) return false; // Expired. > Use FINE here and below; that's what we default to when introducing timeout Done http://gerrit.cloudera.org:8080/#/c/3135/5/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: Line 156: // Returns true if the given 'root_path' has been marked full and 'now' is > Nit: 'now' (with single
[kudu-CR] catalog manager: prevent spurious dirty callbacks from crashing the process
Adar Dembo has uploaded a new patch set (#2). Change subject: catalog_manager: prevent spurious dirty callbacks from crashing the process .. catalog_manager: prevent spurious dirty callbacks from crashing the process The recent change to switch from LocalConsensus to RaftConsensus has led to an increased number of dirty callback calls when using just one master. Some of these calls occur with no term change; the catalog manager treats them as any other calls and reloads the on-disk metadata. In theory that's just unnecessary work, but CheckIsLeaderAndReady() doesn't provide adequate protection for the rest of the master when in this state, so nearly every RPC is allowed in during this time. That's an absolute disaster for correctness: imagine a GetTableLocations() returning only a subset of a table's tablets because the rest were still being loaded from disk. Luckily for us it can also manifest as a crash [1] so we noticed it quickly. I chose to fix it by ignoring these calls within CatalogManager::VisitTablesAndTabletsTask and not SysCatalogTable::SysCatalogStateChanged because the synchronization in the former is more straight forward thanks to the size of worker_pool_. The new test led to a crash 100% of the time without this fix. There's an argument to be made for changing TableInfo::TabletMap's raw pointers to shared pointers thus avoiding this crash altogether, but it's still an incorrect state to be in, so I don't see the value in doing that. While I was here, I snuck a few other changes in: - Remove a lock acquisition from ElectedAsLeaderCb; it did nothing. - Remove old_role_ from SysCatalogTable; it also did nothing. - Narrow the lock acquisition in VisitTablesAndTabletsTask; it only needs to protect the visiting logic. 1. Sample crash output: F0618 05:47:41.795367 9330 ref_counted.cc:74] Check failed: !in_dtor_ *** Check failure stack trace: *** @ 0x7f99fab05f7d google::LogMessage::Fail() at ??:0 @ 0x7f99fab07e7d google::LogMessage::SendToLog() at ??:0 @ 0x7f99fab05ab9 google::LogMessage::Flush() at ??:0 @ 0x7f99fab0891f google::LogMessageFatal::~LogMessageFatal() at ??:0 @ 0x7f99fae8a637 kudu::subtle::RefCountedThreadSafeBase::AddRef() at ??:0 @ 0x7f9a07c486d8 make_scoped_refptr<>() at ??:0 @ 0x7f9a07c2f7f7 kudu::master::TableInfo::GetTabletsInRange() at ??:0 @ 0x7f9a07c2ec35 kudu::master::CatalogManager::GetTableLocations() at ??:0 @ 0x51962e kudu::master::WaitForRunningTabletCount() at ... @ 0x51ce59 kudu::CreateTableStressTest_RestartMasterDuringCreation_Test::TestBody() at ... @ 0x7f99fc7f5b48 testing::internal::HandleExceptionsInMethodIfSupported<>() at ??:0 @ 0x7f99fc7ea012 testing::Test::Run() at ??:0 @ 0x7f99fc7ea158 testing::TestInfo::Run() at ??:0 @ 0x7f99fc7ea235 testing::TestCase::Run() at ??:0 @ 0x7f99fc7ea518 testing::internal::UnitTestImpl::RunAllTests() at ??:0 @ 0x7f99fc7f6058 testing::internal::HandleExceptionsInMethodIfSupported<>() at ??:0 @ 0x7f99fc7ea7fd testing::UnitTest::Run() at ??:0 @ 0x7f9a08520cf6 main at ??:0 @ 0x7f99f97a9d5d __libc_start_main at ??:0 @ 0x43e06d (unknown) at ??:0 Change-Id: I5bc602a51a4914ec13f926bfca555c816a2ee509 --- M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h M src/kudu/master/master-test.cc M src/kudu/master/sys_catalog.cc M src/kudu/master/sys_catalog.h 5 files changed, 71 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/3465/2 -- To view, visit http://gerrit.cloudera.org:8080/3465 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5bc602a51a4914ec13f926bfca555c816a2ee509 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] catalog manager: prevent spurious dirty callbacks from crashing the process
Hello Mike Percy, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/3465 to review the following change. Change subject: catalog_manager: prevent spurious dirty callbacks from crashing the process .. catalog_manager: prevent spurious dirty callbacks from crashing the process The recent change to switch from LocalConsensus to RaftConsensus has led to an increased number of dirty callback calls when using just one master. Some of these calls occur with no term change; the catalog manager treats them as any other calls and reloads the on-disk metadata. In theory that's just unnecessary work, but CheckIsLeaderAndReady() doesn't provide adequate protection for the rest of the master when in this state, so nearly every RPC is allowed in during this time. That's an absolute disaster for correctness: imagine a GetTableLocations() returning only a subset of a table's tablets because the rest were still being loaded from disk. Luckily for us it can also manifest as a crash [1] so we noticed it quickly. I chose to fix it by ignoring these calls within CatalogManager::VisitTablesAndTabletsTask and not SysCatalogTable::SysCatalogStateChanged because the synchronization in the former is more straight forward thanks to the size of worker_pool_. The new test led to a crash 100% of the time without this fix. There's an argument to be made for changing TableInfo::TabletMap's raw pointers to shared pointers thus avoiding this crash altogether, but it's still an incorrect state to be in, so I don't see the value in doing that. While I was here, I snuck a few other changes in: - Remove a lock acquisition from ElectedAsLeaderCb; it did nothing. - Remove old_role_ from SysCatalogTable; it also did nothing. - Narrow the lock acquisition in VisitTablesAndTabletsTask; it only needs to protect the visiting logic. 1. Sample crash output: F0618 05:47:41.795367 9330 ref_counted.cc:74] Check failed: !in_dtor_ *** Check failure stack trace: *** @ 0x7f99fab05f7d google::LogMessage::Fail() at ??:0 @ 0x7f99fab07e7d google::LogMessage::SendToLog() at ??:0 @ 0x7f99fab05ab9 google::LogMessage::Flush() at ??:0 @ 0x7f99fab0891f google::LogMessageFatal::~LogMessageFatal() at ??:0 @ 0x7f99fae8a637 kudu::subtle::RefCountedThreadSafeBase::AddRef() at ??:0 @ 0x7f9a07c486d8 make_scoped_refptr<>() at ??:0 @ 0x7f9a07c2f7f7 kudu::master::TableInfo::GetTabletsInRange() at ??:0 @ 0x7f9a07c2ec35 kudu::master::CatalogManager::GetTableLocations() at ??:0 @ 0x51962e kudu::master::WaitForRunningTabletCount() at /data1/jenkins-workspace/kudu@ 0x51ce59 kudu::CreateTableStressTest_RestartMasterDuringCreation_Test::TestBody() a@ 0x7f99fc7f5b48 testing::internal::HandleExceptionsInMethodIfSupported<>() at ??:0 @ 0x7f99fc7ea012 testing::Test::Run() at ??:0 @ 0x7f99fc7ea158 testing::TestInfo::Run() at ??:0 @ 0x7f99fc7ea235 testing::TestCase::Run() at ??:0 @ 0x7f99fc7ea518 testing::internal::UnitTestImpl::RunAllTests() at ??:0 @ 0x7f99fc7f6058 testing::internal::HandleExceptionsInMethodIfSupported<>() at ??:0 @ 0x7f99fc7ea7fd testing::UnitTest::Run() at ??:0 @ 0x7f9a08520cf6 main at ??:0 @ 0x7f99f97a9d5d __libc_start_main at ??:0 @ 0x43e06d (unknown) at ??:0 Change-Id: I5bc602a51a4914ec13f926bfca555c816a2ee509 --- M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h M src/kudu/master/master-test.cc M src/kudu/master/sys_catalog.cc M src/kudu/master/sys_catalog.h 5 files changed, 71 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/3465/1 -- To view, visit http://gerrit.cloudera.org:8080/3465 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I5bc602a51a4914ec13f926bfca555c816a2ee509 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-0.9.x) Release notes for 0.9.1
Jean-Daniel Cryans has posted comments on this change. Change subject: Release notes for 0.9.1 .. Patch Set 1: > Does this belong on the branch or on master? wasn't sure Doh saw this after writing my review. It should be both I think. -- To view, visit http://gerrit.cloudera.org:8080/3463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I797980d105e0e1000420aedf58e094f9505053c6 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup
Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/3464 Change subject: KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup .. KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup This fixes a bug seen in a recent YCSB stress test that I ran in which I was accidentally writing tens of thousands of duplicate keys per second. After a tablet server restarted, it failed to come up due to a pending commit which referred to no mutated stores (e.g. because all of the operations were duplicate key inserts). This patch tweaks the logic for this safety check: a commit with no mutated stores trivially has "no active stores". However, that's not the same as having "only inactive stores" -- the subtlety is in the difference in behavior when a commit has no stores at all. The patch adds a new targeted test in tablet_bootstrap-test as well as a more end-to-end test in ts_recovery-itest. Both reproduced the bug reliably before this patch. Change-Id: I8ecf8d780de1aa89fae4e0510d8291eb1f1cee11 Reviewed-on: http://gerrit.cloudera.org:8080/3321 Tested-by: Kudu Jenkins Reviewed-by: David Ribeiro Alves(cherry picked from commit 6894438a406a635dc8a8f3bd77862294163cc7fb) --- M src/kudu/consensus/log-test-base.h M src/kudu/integration-tests/raft_consensus-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/test_workload.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_bootstrap.cc 7 files changed, 148 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/3464/1 -- To view, visit http://gerrit.cloudera.org:8080/3464 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I8ecf8d780de1aa89fae4e0510d8291eb1f1cee11 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon
[kudu-CR](branch-0.9.x) Release notes for 0.9.1
Kudu Jenkins has posted comments on this change. Change subject: Release notes for 0.9.1 .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1954/ -- To view, visit http://gerrit.cloudera.org:8080/3463 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I797980d105e0e1000420aedf58e094f9505053c6 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) catalog manager: fix a locking error
Adar Dembo has submitted this change and it was merged. Change subject: catalog_manager: fix a locking error .. catalog_manager: fix a locking error This lock acquisition took the wrong lock, which meant it didn't add any protection against table/tablet visition races at all. See commit 1681d9a for context. Change-Id: I8db7a78309bb9182ae1fdb92485e8467b90a84b2 Reviewed-on: http://gerrit.cloudera.org:8080/3425 Tested-by: Kudu Jenkins Reviewed-by: David Ribeiro Alves(cherry picked from commit 5f7d79910d8da492195b5c22a6d9a7911f950107) Reviewed-on: http://gerrit.cloudera.org:8080/3461 Tested-by: Adar Dembo --- M src/kudu/master/catalog_manager.cc 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: David Ribeiro Alves: Looks good to me, approved Adar Dembo: Verified -- To view, visit http://gerrit.cloudera.org:8080/3461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I8db7a78309bb9182ae1fdb92485e8467b90a84b2 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves
[kudu-CR] KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup
David Ribeiro Alves has posted comments on this change. Change subject: KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/3321/1/src/kudu/tablet/tablet_bootstrap.cc File src/kudu/tablet/tablet_bootstrap.cc: Line 271: // At least one operation resulted in a mutation to a store, and at least > Changed the API as you suggested, though not sure what you mean about the " though there would be some cases where we'd do both. even if not like it better like this. thanks for fixing. -- To view, visit http://gerrit.cloudera.org:8080/3321 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8ecf8d780de1aa89fae4e0510d8291eb1f1cee11 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Integrate the ResultTracker into the rpc subsystem and add a test
Todd Lipcon has posted comments on this change. Change subject: Integrate the ResultTracker into the rpc subsystem and add a test .. Patch Set 11: (9 comments) http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/common/common.proto File src/kudu/common/common.proto: Line 316: optional bool track_rpc_result = 50006 [default=false]; I dont think this should be in 'common'. 'common' is for data-model type stuff. Perhaps 'rpc_header.proto' or a new proto file in the rpc/ subdir. (rpc doesn't depend on common, so actually slightly surprised this builds properly) http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/CMakeLists.txt File src/kudu/rpc/CMakeLists.txt: Line 89: kudu_common_proto oh, now I see why it builds. This dependency seems misplaced. (eg keep in mind that Impala's thinking about using krpc, so we shouldn't have upwards dependencies from RPC into common/) http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/protoc-gen-krpc.cc File src/kudu/rpc/protoc-gen-krpc.cc: Line 46: #include "kudu/common/common.pb.h" nit: sorting Line 467: "mi->result_tracker.reset($result_tracker$);\n" we have a separate ResultTracker per RPC type? that doesn't seem right. http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/rpc-stress-test.cc File src/kudu/rpc/rpc-stress-test.cc: PS11, Line 73: random amount update Line 134: vectoradders; use unique_ptr? http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/rpc-test-base.h File src/kudu/rpc/rpc-test-base.h: Line 288: std::atomic_int exactly_once_test_val_; hrm, is this atomic? haven't seen this http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/rpc_context.cc File src/kudu/rpc/rpc_context.cc: PS11, Line 66: call_->RespondSuccess(*response_pb_); : delete this; : per the comment on the other review, it seems weird we now have this code split between ResultTracker and here. http://gerrit.cloudera.org:8080/#/c/3192/11/src/kudu/rpc/rtest.proto File src/kudu/rpc/rtest.proto: Line 129: option (kudu.track_rpc_result) = true; add something to design-docs/rpc.md about this feature? -- To view, visit http://gerrit.cloudera.org:8080/3192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1d624810350feceefe244e0319e22cac241bf0d6 Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro Alves Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR](branch-0.9.x) catalog manager: fix a locking error
David Ribeiro Alves has posted comments on this change. Change subject: catalog_manager: fix a locking error .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8db7a78309bb9182ae1fdb92485e8467b90a84b2 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) KUDU-1469. Fix handling of fully-deduped requests after a leader change
Todd Lipcon has posted comments on this change. Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) Fix stray memory writes due to tcmalloc profiling
Todd Lipcon has posted comments on this change. Change subject: Fix stray memory writes due to tcmalloc profiling .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) KUDU-1469. Fix handling of fully-deduped requests after a leader change
Todd Lipcon has submitted this change and it was merged. Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. KUDU-1469. Fix handling of fully-deduped requests after a leader change This fixes KUDU-1469, a bug in which the leader and follower could get into a tight loop of RPCs making no progress replicating operations. The newly included test fails without the bug fix, and passes with it. See its comments for details on the bug itself. Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Reviewed-on: http://gerrit.cloudera.org:8080/3228 Tested-by: Kudu Jenkins Reviewed-by: David Ribeiro AlvesReviewed-by: Mike Percy (cherry picked from commit d685747426472d428b1d071df00d112d9f775117) Reviewed-on: http://gerrit.cloudera.org:8080/3455 Reviewed-by: Todd Lipcon --- M src/kudu/consensus/consensus.proto M src/kudu/consensus/consensus_queue.cc M src/kudu/consensus/log_cache.cc M src/kudu/consensus/raft_consensus-test.cc M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus_state.cc M src/kudu/consensus/raft_consensus_state.h M src/kudu/integration-tests/raft_consensus-itest.cc 8 files changed, 158 insertions(+), 24 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-0.9.x) [java-client] RPC timeout may sometimes be reported as max attempts violation
Todd Lipcon has submitted this change and it was merged. Change subject: [java-client] RPC timeout may sometimes be reported as max attempts violation .. [java-client] RPC timeout may sometimes be reported as max attempts violation Fixes an issue where an RPC timeout could be reported in the error message as too many attempts. Change-Id: I9c5676ab5bd05170505ef0313b919c5475ae6b37 Reviewed-on: http://gerrit.cloudera.org:8080/3330 Tested-by: Kudu Jenkins Reviewed-by: Dan Burkert(cherry picked from commit d002e3257d49fe8e420c3d50eac54bb2d1952722) Reviewed-on: http://gerrit.cloudera.org:8080/3457 Reviewed-by: Jean-Daniel Cryans --- M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java 1 file changed, 5 insertions(+), 4 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I9c5676ab5bd05170505ef0313b919c5475ae6b37 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-0.9.x) Fix stray memory writes due to tcmalloc profiling
Todd Lipcon has submitted this change and it was merged. Change subject: Fix stray memory writes due to tcmalloc profiling .. Fix stray memory writes due to tcmalloc profiling This fixes an issue that has been causing frequent crashes in JD.com's production cluster as well as various Cloudera test clusters. The crashes would be in various different places, but the key signature was that offset 120 in some data structure or array (eg the 16th element of a vector) would be corrupted. After doing a git bisect using an integration testing cluster running an ITBLL workload, I found that this was a regression caused by the introduction of tcmalloc contention profiling[1]. The short explanation is that, if we experienced contention while freeing a Trace object, we could in some cases increment offset 120 of some other allocation which occurred soon after the deallocation of the Trace. The issue is described in more detail in a new comment in trace.h. With this patch, I was unable to reproduce the issue on the test cluster. No new test is added since this is quite timing-dependent and not amenable to unit testing or even stress testing. [1] commit f6691e744b9cb796e1bbc6e07953f21f387c9a88 Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb Reviewed-on: http://gerrit.cloudera.org:8080/3445 Reviewed-by: David Ribeiro AlvesReviewed-by: Mike Percy Tested-by: Kudu Jenkins (cherry picked from commit 0bb59836fb3e4b92e2d88674e8b88546faac5c8b) Reviewed-on: http://gerrit.cloudera.org:8080/3456 Reviewed-by: Todd Lipcon --- M src/kudu/util/trace.h 1 file changed, 23 insertions(+), 3 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-0.9.x) [java-client] RPC timeout may sometimes be reported as max attempts violation
Jean-Daniel Cryans has posted comments on this change. Change subject: [java-client] RPC timeout may sometimes be reported as max attempts violation .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9c5676ab5bd05170505ef0313b919c5475ae6b37 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) catalog manager: fix a locking error
Kudu Jenkins has posted comments on this change. Change subject: catalog_manager: fix a locking error .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/1952/ -- To view, visit http://gerrit.cloudera.org:8080/3461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8db7a78309bb9182ae1fdb92485e8467b90a84b2 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Adar DemboGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) KUDU-1469. Fix handling of fully-deduped requests after a leader change
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/1950/ -- To view, visit http://gerrit.cloudera.org:8080/3455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) Fix stray memory writes due to tcmalloc profiling
Kudu Jenkins has posted comments on this change. Change subject: Fix stray memory writes due to tcmalloc profiling .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/1949/ -- To view, visit http://gerrit.cloudera.org:8080/3456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) [java-client] RPC timeout may sometimes be reported as max attempts violation
Kudu Jenkins has posted comments on this change. Change subject: [java-client] RPC timeout may sometimes be reported as max attempts violation .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/1951/ -- To view, visit http://gerrit.cloudera.org:8080/3457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9c5676ab5bd05170505ef0313b919c5475ae6b37 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) Bump version to 0.9.1-SNAPSHOT
Todd Lipcon has submitted this change and it was merged. Change subject: Bump version to 0.9.1-SNAPSHOT .. Bump version to 0.9.1-SNAPSHOT Change-Id: I472c2999c0b13a2c046366a9e1ad7474eedf68d0 Reviewed-on: http://gerrit.cloudera.org:8080/3458 Reviewed-by: Jean-Daniel CryansTested-by: Todd Lipcon --- M java/interface-annotations/pom.xml M java/kudu-client-tools/pom.xml M java/kudu-client/pom.xml M java/kudu-csd/pom.xml M java/kudu-flume-sink/pom.xml M java/kudu-mapreduce/pom.xml M java/kudu-spark/pom.xml M java/pom.xml M version.txt 9 files changed, 10 insertions(+), 10 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Todd Lipcon: Verified -- To view, visit http://gerrit.cloudera.org:8080/3458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I472c2999c0b13a2c046366a9e1ad7474eedf68d0 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-0.9.x) Bump version to 0.9.1-SNAPSHOT
Todd Lipcon has posted comments on this change. Change subject: Bump version to 0.9.1-SNAPSHOT .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/3458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I472c2999c0b13a2c046366a9e1ad7474eedf68d0 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) update dist-test configuration for new client.py location
Todd Lipcon has submitted this change and it was merged. Change subject: update dist-test configuration for new client.py location .. update dist-test configuration for new client.py location client.py has been moved to bin/client Change-Id: I48e4617222b8b503b21eeb560800dc5619d7ac26 Reviewed-on: http://gerrit.cloudera.org:8080/3332 Reviewed-by: Adar DemboTested-by: Kudu Jenkins (cherry picked from commit e4833d2df35721c03955ef0902a70c2de2536f99) Reviewed-on: http://gerrit.cloudera.org:8080/3459 Reviewed-by: Todd Lipcon --- M build-support/dist_test.py M build-support/jenkins/build-and-test.sh 2 files changed, 3 insertions(+), 3 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I48e4617222b8b503b21eeb560800dc5619d7ac26 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-0.9.x) update dist-test configuration for new client.py location
Todd Lipcon has posted comments on this change. Change subject: update dist-test configuration for new client.py location .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I48e4617222b8b503b21eeb560800dc5619d7ac26 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](gh-pages) Control how content is cached on the Kudu web site
Mike Percy has posted comments on this change. Change subject: Control how content is cached on the Kudu web site .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/3462/1/.htaccess File .htaccess: Line 1: # Don't cache content or styles / scripts served by us. > I forget, do we serve our own bootstrap/etc? or use a cdn? Mostly CDN, but it's a mix. We should probably try to get more from a CDN. e.g. right now we serve bootstrap, but jquery is on a CDN. If we had our own CDN, we could also put our own (versioned) files up there and have it cache the files indefinitely. Line 5: Header unset ETag > why not use etag based caching? ETags are not distributed; different servers will give different ETags for the same content. Since the upstream site appears to have mirrors (based on nslookup) then I don't think ETags will work. See also https://developer.yahoo.com/performance/rules.html#etags= However I didn't do a thorough investigation of how ASF load balancing is set up. -- To view, visit http://gerrit.cloudera.org:8080/3462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I08619464d8d458582282044a7db685f8fbbca483 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Mike PercyGerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1398 CFile index blocks can store shortest separating prefix
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3304 to look at the new patch set (#10). Change subject: KUDU-1398 CFile index blocks can store shortest separating prefix .. KUDU-1398 CFile index blocks can store shortest separating prefix (No changes: resubmitting to trigger Jenkins build) This changes the values stored as index keys to be a shortest key between the first key of the data block and the last key of the previous data block. However, this change does not apply to deltafiles, because deltafiles expect to be able to decode an index key into a DeltaKey. The way the change works is illustrated with the example from the JIRA, extended: Block 1: apple, banana, cardamom Block 2: carrot, epazote, fennel Block 3: fig, guava, kiwi Before: ['apple' -> block 1, 'carrot' -> block 2, 'fig' -> block 3] After: ['a' -> block 1, 'carr' -> block 2, 'fi' -> block 3] Change-Id: I68ae9146fabd4a19b17d103d118d2d60e28bb315 --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bloomfile.cc M src/kudu/cfile/bloomfile.h M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_util.cc M src/kudu/cfile/cfile_util.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/cfile_writer.h M src/kudu/cfile/gvint_block.cc M src/kudu/cfile/gvint_block.h M src/kudu/cfile/index-test.cc M src/kudu/cfile/index_block.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h M src/kudu/tablet/deltafile.cc 23 files changed, 234 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/04/3304/10 -- To view, visit http://gerrit.cloudera.org:8080/3304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I68ae9146fabd4a19b17d103d118d2d60e28bb315 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] KUDU-1491. Address TSAN warning for compaction
Todd Lipcon has submitted this change and it was merged. Change subject: KUDU-1491. Address TSAN warning for compaction .. KUDU-1491. Address TSAN warning for compaction This adds a missing memory barrier that was exposed when I removed TSAN suppressions for mutation lists. Without the patch, TSAN failed on mt-tablet-test a few percent of the time. With the patch, I looped it 50 times with no failures: http://dist-test.cloudera.org/job?job_id=todd.1466638821.4940 Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Reviewed-on: http://gerrit.cloudera.org:8080/3454 Reviewed-by: Adar DemboTested-by: Kudu Jenkins --- M src/kudu/tablet/compaction.cc 1 file changed, 4 insertions(+), 3 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR](gh-pages) Control how content is cached on the Kudu web site
Todd Lipcon has posted comments on this change. Change subject: Control how content is cached on the Kudu web site .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/3462/1/.htaccess File .htaccess: Line 1: # Don't cache content or styles / scripts served by us. I forget, do we serve our own bootstrap/etc? or use a cdn? Line 5: Header unset ETag why not use etag based caching? -- To view, visit http://gerrit.cloudera.org:8080/3462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I08619464d8d458582282044a7db685f8fbbca483 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Mike PercyGerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Control how content is cached on the Kudu web site
Hello Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/3462 to review the following change. Change subject: Control how content is cached on the Kudu web site .. Control how content is cached on the Kudu web site Change-Id: I08619464d8d458582282044a7db685f8fbbca483 --- A .htaccess 1 file changed, 18 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/62/3462/1 -- To view, visit http://gerrit.cloudera.org:8080/3462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I08619464d8d458582282044a7db685f8fbbca483 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-Owner: Mike PercyGerrit-Reviewer: Todd Lipcon
[kudu-CR] Avoid missing 'override' keyword warnings in raft consensus-test.cc
Todd Lipcon has posted comments on this change. Change subject: Avoid missing 'override' keyword warnings in raft_consensus-test.cc .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/3289/2/src/kudu/consensus/CMakeLists.txt File src/kudu/consensus/CMakeLists.txt: Line 140: set_source_files_properties(raft_consensus-test.cc OK. fair enough on not upgrading, but maybe drop a comment here explaining why we are suppressing? -- To view, visit http://gerrit.cloudera.org:8080/3289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I687a05ed560003721d4a05bd7a01261e03bd77d9 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1398 CFile index blocks can store shortest separating prefix
Todd Lipcon has posted comments on this change. Change subject: KUDU-1398 CFile index blocks can store shortest separating prefix .. Patch Set 9: Looks like a small compile error here -- To view, visit http://gerrit.cloudera.org:8080/3304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I68ae9146fabd4a19b17d103d118d2d60e28bb315 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] Allow to set RequestId in the RPC RequestHeader
Todd Lipcon has posted comments on this change. Change subject: Allow to set RequestId in the RPC RequestHeader .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I92566250683ee69fad57a9ae694842d4a0b17ab4 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Clarify the difference between 'call id' and 'request id'
Todd Lipcon has posted comments on this change. Change subject: Clarify the difference between 'call_id' and 'request_id' .. Patch Set 6: Code-Review+2 Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/3409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If040e79baea55f4d16b43ff64b7ec27a403fc18f Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Clarify the difference between 'call id' and 'request id'
Todd Lipcon has submitted this change and it was merged. Change subject: Clarify the difference between 'call_id' and 'request_id' .. Clarify the difference between 'call_id' and 'request_id' This adds a bit more information on the difference between 'call_id' and 'request_id' to reduce confusion. Change-Id: If040e79baea55f4d16b43ff64b7ec27a403fc18f Reviewed-on: http://gerrit.cloudera.org:8080/3409 Reviewed-by: Todd LipconTested-by: Todd Lipcon --- M src/kudu/rpc/rpc_header.proto 1 file changed, 9 insertions(+), 3 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/3409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: If040e79baea55f4d16b43ff64b7ec27a403fc18f Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro Alves Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-0.9.x) KUDU-1473: fix some tablet lock usage in CatalogManager
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1473: fix some tablet lock usage in CatalogManager .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1946/ -- To view, visit http://gerrit.cloudera.org:8080/3460 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8e24f6035f4d778995ea3f295396f5fbd760d6c6 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Adar DemboGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) catalog manager: fix a locking error
Hello David Ribeiro Alves, Kudu Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/3461 to review the following change. Change subject: catalog_manager: fix a locking error .. catalog_manager: fix a locking error This lock acquisition took the wrong lock, which meant it didn't add any protection against table/tablet visition races at all. See commit 1681d9a for context. Change-Id: I8db7a78309bb9182ae1fdb92485e8467b90a84b2 Reviewed-on: http://gerrit.cloudera.org:8080/3425 Tested-by: Kudu Jenkins Reviewed-by: David Ribeiro Alves(cherry picked from commit 5f7d79910d8da492195b5c22a6d9a7911f950107) --- M src/kudu/master/catalog_manager.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/3461/1 -- To view, visit http://gerrit.cloudera.org:8080/3461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I8db7a78309bb9182ae1fdb92485e8467b90a84b2 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins
[kudu-CR](branch-0.9.x) catalog manager: fix a locking error
Kudu Jenkins has posted comments on this change. Change subject: catalog_manager: fix a locking error .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1947/ -- To view, visit http://gerrit.cloudera.org:8080/3461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8db7a78309bb9182ae1fdb92485e8467b90a84b2 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Adar DemboGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) update dist-test configuration for new client.py location
Kudu Jenkins has posted comments on this change. Change subject: update dist-test configuration for new client.py location .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1945/ -- To view, visit http://gerrit.cloudera.org:8080/3459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I48e4617222b8b503b21eeb560800dc5619d7ac26 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) update dist-test configuration for new client.py location
Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/3459 Change subject: update dist-test configuration for new client.py location .. update dist-test configuration for new client.py location client.py has been moved to bin/client Change-Id: I48e4617222b8b503b21eeb560800dc5619d7ac26 Reviewed-on: http://gerrit.cloudera.org:8080/3332 Reviewed-by: Adar DemboTested-by: Kudu Jenkins (cherry picked from commit e4833d2df35721c03955ef0902a70c2de2536f99) --- M build-support/dist_test.py M build-support/jenkins/build-and-test.sh 2 files changed, 3 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/3459/1 -- To view, visit http://gerrit.cloudera.org:8080/3459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I48e4617222b8b503b21eeb560800dc5619d7ac26 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Dan Burkert
[kudu-CR] KUDU-1491. Address TSAN warning for compaction
Adar Dembo has posted comments on this change. Change subject: KUDU-1491. Address TSAN warning for compaction .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Integrate the request tracker with the client
Todd Lipcon has posted comments on this change. Change subject: Integrate the request tracker with the client .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/3080/8/src/kudu/rpc/retriable_rpc.h File src/kudu/rpc/retriable_rpc.h: Line 210: request_tracker_->RpcCompleted(sequence_number_); > nit: mid-comment as a safety check, I think we should probably set the seq number back to NO_SEQ_NO, and in the dtor, DCHECK that it's NO_SEQ_NO. Otherwise we might have some very tricky-to-catch "leaks" of RPCs in the request tracker. -- To view, visit http://gerrit.cloudera.org:8080/3080 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I94207c294452fcbdb3a7901fdb702674d47553ee Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Integrate the request tracker with the client
Todd Lipcon has posted comments on this change. Change subject: Integrate the request tracker with the client .. Patch Set 8: (10 comments) http://gerrit.cloudera.org:8080/#/c/3080/8//COMMIT_MSG Commit Message: Line 15: any specific tests. it's not possible to use RetriableRpc within rpc_stub-test to test the client part? http://gerrit.cloudera.org:8080/#/c/3080/8/src/kudu/client/batcher.cc File src/kudu/client/batcher.cc: Line 194:const scoped_refptr& request_tracker, what about storing this in the Batcher? or grabbing it via batcher->client->data->request_tracker as necessary? seems like an unnecessary extra parameter (and refcount incr/decr) http://gerrit.cloudera.org:8080/#/c/3080/8/src/kudu/client/client-internal.cc File src/kudu/client/client-internal.cc: Line 72: using rpc::RequestTracker; spurious changes in this file? http://gerrit.cloudera.org:8080/#/c/3080/8/src/kudu/client/client-internal.h File src/kudu/client/client-internal.h: Line 27: #include "kudu/rpc/request_tracker.h" can be a forward decl, no? http://gerrit.cloudera.org:8080/#/c/3080/8/src/kudu/client/meta_cache.h File src/kudu/client/meta_cache.h: Line 34: #include "kudu/rpc/request_tracker.h" spurious? http://gerrit.cloudera.org:8080/#/c/3080/8/src/kudu/rpc/retriable_rpc.h File src/kudu/rpc/retriable_rpc.h: Line 125: // in flight, so we retry with a delay. I think this merits a warning. Line 193: request_id->set_first_incomplete_seq_no(internal::NO_SEQ_NO); why not just leave it unset in the PB? Line 210: request_tracker_->RpcCompleted(sequence_number_); nit: mid-comment http://gerrit.cloudera.org:8080/#/c/3080/8/src/kudu/rpc/rpc_controller.cc File src/kudu/rpc/rpc_controller.cc: Line 103: return *request_id_.get(); nit: I think you dont need .get() http://gerrit.cloudera.org:8080/#/c/3080/8/src/kudu/rpc/rpc_controller.h File src/kudu/rpc/rpc_controller.h: Line 114: const RequestIdPB& request_id() const; I think this is somewhat error prone - IIRC the RPC layer "steals" the request ID out of the RpcController at send time, no? Seems like it's worth a doc comment that these are not accessible after an RPC has been sent. And maybe a DCHECK like some of the other accessors have? -- To view, visit http://gerrit.cloudera.org:8080/3080 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I94207c294452fcbdb3a7901fdb702674d47553ee Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1491. Address TSAN warning for compaction
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3454 to look at the new patch set (#2). Change subject: KUDU-1491. Address TSAN warning for compaction .. KUDU-1491. Address TSAN warning for compaction This adds a missing memory barrier that was exposed when I removed TSAN suppressions for mutation lists. Without the patch, TSAN failed on mt-tablet-test a few percent of the time. With the patch, I looped it 50 times with no failures: http://dist-test.cloudera.org/job?job_id=todd.1466638821.4940 Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 --- M src/kudu/tablet/compaction.cc 1 file changed, 4 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/3454/2 -- To view, visit http://gerrit.cloudera.org:8080/3454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1491. Address TSAN warning for compaction
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1491. Address TSAN warning for compaction .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/1944/ -- To view, visit http://gerrit.cloudera.org:8080/3454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1491. Address TSAN warning for compaction
Todd Lipcon has posted comments on this change. Change subject: KUDU-1491. Address TSAN warning for compaction .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/3454/1/src/kudu/tablet/compaction.cc File src/kudu/tablet/compaction.cc: Line 508: static void AdvanceToLastInList(const Mutation** m) { > Do the next() calls here need to be converted into acquire_next()? hm good catch... I think so. -- To view, visit http://gerrit.cloudera.org:8080/3454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR](branch-0.9.x) Bump version to 0.9.1-SNAPSHOT
Kudu Jenkins has posted comments on this change. Change subject: Bump version to 0.9.1-SNAPSHOT .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1943/ -- To view, visit http://gerrit.cloudera.org:8080/3458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I472c2999c0b13a2c046366a9e1ad7474eedf68d0 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) Bump version to 0.9.1-SNAPSHOT
Jean-Daniel Cryans has posted comments on this change. Change subject: Bump version to 0.9.1-SNAPSHOT .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I472c2999c0b13a2c046366a9e1ad7474eedf68d0 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) Bump version to 0.9.1-SNAPSHOT
Hello Jean-Daniel Cryans, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/3458 to review the following change. Change subject: Bump version to 0.9.1-SNAPSHOT .. Bump version to 0.9.1-SNAPSHOT Change-Id: I472c2999c0b13a2c046366a9e1ad7474eedf68d0 --- M java/interface-annotations/pom.xml M java/kudu-client-tools/pom.xml M java/kudu-client/pom.xml M java/kudu-csd/pom.xml M java/kudu-flume-sink/pom.xml M java/kudu-mapreduce/pom.xml M java/kudu-spark/pom.xml M java/pom.xml M version.txt 9 files changed, 10 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/58/3458/1 -- To view, visit http://gerrit.cloudera.org:8080/3458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I472c2999c0b13a2c046366a9e1ad7474eedf68d0 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Jean-Daniel Cryans
[kudu-CR] KUDU-1491. Address TSAN warning for compaction
Adar Dembo has posted comments on this change. Change subject: KUDU-1491. Address TSAN warning for compaction .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/3454/1/src/kudu/tablet/compaction.cc File src/kudu/tablet/compaction.cc: Line 508: static void AdvanceToLastInList(const Mutation** m) { Do the next() calls here need to be converted into acquire_next()? -- To view, visit http://gerrit.cloudera.org:8080/3454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR](branch-0.9.x) [java-client] RPC timeout may sometimes be reported as max attempts violation
Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/3457 Change subject: [java-client] RPC timeout may sometimes be reported as max attempts violation .. [java-client] RPC timeout may sometimes be reported as max attempts violation Fixes an issue where an RPC timeout could be reported in the error message as too many attempts. Change-Id: I9c5676ab5bd05170505ef0313b919c5475ae6b37 Reviewed-on: http://gerrit.cloudera.org:8080/3330 Tested-by: Kudu Jenkins Reviewed-by: Dan Burkert(cherry picked from commit d002e3257d49fe8e420c3d50eac54bb2d1952722) --- M java/kudu-client/src/main/java/org/kududb/client/AsyncKuduClient.java 1 file changed, 5 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/57/3457/1 -- To view, visit http://gerrit.cloudera.org:8080/3457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I9c5676ab5bd05170505ef0313b919c5475ae6b37 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Dan Burkert
[kudu-CR](branch-0.9.x) [java-client] RPC timeout may sometimes be reported as max attempts violation
Kudu Jenkins has posted comments on this change. Change subject: [java-client] RPC timeout may sometimes be reported as max attempts violation .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1942/ -- To view, visit http://gerrit.cloudera.org:8080/3457 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9c5676ab5bd05170505ef0313b919c5475ae6b37 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) Fix stray memory writes due to tcmalloc profiling
Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/3456 Change subject: Fix stray memory writes due to tcmalloc profiling .. Fix stray memory writes due to tcmalloc profiling This fixes an issue that has been causing frequent crashes in JD.com's production cluster as well as various Cloudera test clusters. The crashes would be in various different places, but the key signature was that offset 120 in some data structure or array (eg the 16th element of a vector) would be corrupted. After doing a git bisect using an integration testing cluster running an ITBLL workload, I found that this was a regression caused by the introduction of tcmalloc contention profiling[1]. The short explanation is that, if we experienced contention while freeing a Trace object, we could in some cases increment offset 120 of some other allocation which occurred soon after the deallocation of the Trace. The issue is described in more detail in a new comment in trace.h. With this patch, I was unable to reproduce the issue on the test cluster. No new test is added since this is quite timing-dependent and not amenable to unit testing or even stress testing. [1] commit f6691e744b9cb796e1bbc6e07953f21f387c9a88 Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb Reviewed-on: http://gerrit.cloudera.org:8080/3445 Reviewed-by: David Ribeiro AlvesReviewed-by: Mike Percy Tested-by: Kudu Jenkins (cherry picked from commit 0bb59836fb3e4b92e2d88674e8b88546faac5c8b) --- M src/kudu/util/trace.h 1 file changed, 23 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/56/3456/1 -- To view, visit http://gerrit.cloudera.org:8080/3456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon
[kudu-CR](branch-0.9.x) Fix stray memory writes due to tcmalloc profiling
Kudu Jenkins has posted comments on this change. Change subject: Fix stray memory writes due to tcmalloc profiling .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1941/ -- To view, visit http://gerrit.cloudera.org:8080/3456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9afca83d9cc24585960f6bf68d8996c4736ce6cb Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-0.9.x) KUDU-1469. Fix handling of fully-deduped requests after a leader change
Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/3455 Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. KUDU-1469. Fix handling of fully-deduped requests after a leader change This fixes KUDU-1469, a bug in which the leader and follower could get into a tight loop of RPCs making no progress replicating operations. The newly included test fails without the bug fix, and passes with it. See its comments for details on the bug itself. Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Reviewed-on: http://gerrit.cloudera.org:8080/3228 Tested-by: Kudu Jenkins Reviewed-by: David Ribeiro AlvesReviewed-by: Mike Percy (cherry picked from commit d685747426472d428b1d071df00d112d9f775117) --- M src/kudu/consensus/consensus.proto M src/kudu/consensus/consensus_queue.cc M src/kudu/consensus/log_cache.cc M src/kudu/consensus/raft_consensus-test.cc M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus_state.cc M src/kudu/consensus/raft_consensus_state.h M src/kudu/integration-tests/raft_consensus-itest.cc 8 files changed, 158 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/55/3455/1 -- To view, visit http://gerrit.cloudera.org:8080/3455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon
[kudu-CR] KUDU-1491. Address TSAN warning for compaction
Hello Mike Percy, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/3454 to review the following change. Change subject: KUDU-1491. Address TSAN warning for compaction .. KUDU-1491. Address TSAN warning for compaction This adds a missing memory barrier that was exposed when I removed TSAN suppressions for mutation lists. Without the patch, TSAN failed on mt-tablet-test a few percent of the time. With the patch, I looped it 50 times with no failures: http://dist-test.cloudera.org/job?job_id=todd.1466638821.4940 Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 --- M src/kudu/tablet/compaction.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/3454/1 -- To view, visit http://gerrit.cloudera.org:8080/3454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-1491. Address TSAN warning for compaction
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1491. Address TSAN warning for compaction .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1939/ -- To view, visit http://gerrit.cloudera.org:8080/3454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I42d2566eca096da172440ce3a88d0393dda9d324 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: No
[kudu-CR] KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup .. Patch Set 3: Build Started http://104.196.14.100/job/kudu-gerrit/1938/ -- To view, visit http://gerrit.cloudera.org:8080/3321 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8ecf8d780de1aa89fae4e0510d8291eb1f1cee11 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1469. Fix handling of fully-deduped requests after a leader change
Mike Percy has submitted this change and it was merged. Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. KUDU-1469. Fix handling of fully-deduped requests after a leader change This fixes KUDU-1469, a bug in which the leader and follower could get into a tight loop of RPCs making no progress replicating operations. The newly included test fails without the bug fix, and passes with it. See its comments for details on the bug itself. Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Reviewed-on: http://gerrit.cloudera.org:8080/3228 Tested-by: Kudu Jenkins Reviewed-by: David Ribeiro AlvesReviewed-by: Mike Percy --- M src/kudu/consensus/consensus.proto M src/kudu/consensus/consensus_queue.cc M src/kudu/consensus/log_cache.cc M src/kudu/consensus/raft_consensus-test.cc M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus_state.cc M src/kudu/consensus/raft_consensus_state.h M src/kudu/integration-tests/raft_consensus-itest.cc 8 files changed, 158 insertions(+), 24 deletions(-) Approvals: David Ribeiro Alves: Looks good to me, approved Mike Percy: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3321 to look at the new patch set (#2). Change subject: KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup .. KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup This fixes a bug seen in a recent YCSB stress test that I ran in which I was accidentally writing tens of thousands of duplicate keys per second. After a tablet server restarted, it failed to come up due to a pending commit which referred to no mutated stores (e.g. because all of the operations were duplicate key inserts). This patch tweaks the logic for this safety check: a commit with no mutated stores trivially has "no active stores". However, that's not the same as having "only inactive stores" -- the subtlety is in the difference in behavior when a commit has no stores at all. The patch adds a new targeted test in tablet_bootstrap-test as well as a more end-to-end test in ts_recovery-itest. Both reproduced the bug reliably before this patch. Change-Id: I8ecf8d780de1aa89fae4e0510d8291eb1f1cee11 --- M src/kudu/consensus/log-test-base.h M src/kudu/integration-tests/raft_consensus-itest.cc M src/kudu/integration-tests/test_workload.cc M src/kudu/integration-tests/test_workload.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_bootstrap.cc 7 files changed, 148 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/3321/2 -- To view, visit http://gerrit.cloudera.org:8080/3321 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8ecf8d780de1aa89fae4e0510d8291eb1f1cee11 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/1937/ -- To view, visit http://gerrit.cloudera.org:8080/3321 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8ecf8d780de1aa89fae4e0510d8291eb1f1cee11 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup
Todd Lipcon has posted comments on this change. Change subject: KUDU-1477. Pending COMMIT message for failed write operation can prevent tablet startup .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/3321/1//COMMIT_MSG Commit Message: Line 12: up due to a pending commit which referred to no mutated stores. > ... because all the rows had errors? please clarify Done http://gerrit.cloudera.org:8080/#/c/3321/1/src/kudu/consensus/log-test-base.h File src/kudu/consensus/log-test-base.h: Line 273: void AppendFailedCommit(const OpId& original_opid) { > what's a "FailedCommit" this seems to imply that a we're appending a commit Done http://gerrit.cloudera.org:8080/#/c/3321/1/src/kudu/integration-tests/test_workload.h File src/kudu/integration-tests/test_workload.h: Line 116: void set_write_pattern(WritePattern pattern) { > nit: blank line above Done http://gerrit.cloudera.org:8080/#/c/3321/1/src/kudu/integration-tests/ts_recovery-itest.cc File src/kudu/integration-tests/ts_recovery-itest.cc: Line 148:1, // TODO > TODO what? Done http://gerrit.cloudera.org:8080/#/c/3321/1/src/kudu/tablet/tablet_bootstrap.cc File src/kudu/tablet/tablet_bootstrap.cc: Line 271: bool AreAllStoresInactive(const CommitMsg& commit); > by "iterating through the stores" I meant through the ops on the commit mes Changed the API as you suggested, though not sure what you mean about the "multiple iterations". We only use this method for some sanity checks on orphan/pending commits. -- To view, visit http://gerrit.cloudera.org:8080/3321 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8ecf8d780de1aa89fae4e0510d8291eb1f1cee11 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Adar Dembo has posted comments on this change. Change subject: Allow for reserving disk space for non-Kudu processes .. Patch Set 5: (24 comments) http://gerrit.cloudera.org:8080/#/c/3135/5/src/kudu/consensus/log.cc File src/kudu/consensus/log.cc: Line 111: DEFINE_int64(log_dir_reserved_bytes, 0, Nit: I think this would be more clear as fs_wal_dir_reserved_bytes, as it would closely parallel fs_wal_dir. Not sure if that implies it ought to be defined in the fs module and declared here, I don't have a strong opinion on that. Line 113: TAG_FLAG(log_dir_reserved_bytes, runtime); Hmm, I'm not sure if we should allow changing these values at runtime. If someone increases the reservation, we'll stop allowing new containers (good), but we have no way of reducing Kudu's disk space usage to meet the reservation (bad). Meaning, we can't fully satisfy the operator's desired semantics for a runtime change of these flags. http://gerrit.cloudera.org:8080/#/c/3135/5/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: PS5, Line 951: We use a COARSE clock so we need to let a : // little bit of time pass so we get at least one unit of time greater than : // before when we call MonoTime::Now() again. Let's switch to FINE clock so we won't have this limitation. http://gerrit.cloudera.org:8080/#/c/3135/5/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: Line 64: DEFINE_int32(full_disk_cache_seconds, 30, Nit: please prefix with log_block_manager. It's super long, yes, but it's the only way we have to "namespace" the flags. Line 70: DEFINE_int64(data_dirs_reserved_bytes, 0, Please rename to fs_data_dirs_reserved_bytes (see my comment on log_dir_reserved_bytes for the rationale). Also indicate in the help that it only works with the log block manager. Line 96: "Number of unavailable log block containers", Nit: Number of Unavailable Block Containers (to be consistent with the existing container metrics). Line 98: "Number of non-full Block Containers that are under root paths that " Nit: "Number of non-full log block containers that are under root paths whose disks are full" (TBCWTECM). Line 265: RWFile* data_file() const { return data_file_.get(); } We're only using this for the filename. Could you make narrow accessor that just returns the data file's name? You could call it DataFilePath() to parallel MetadataFilePath(). As for the implementation, since it'll parallel MetadataFilePath(), could you use the same implementation for both? That is, either file.filename(), or append the right suffix to path_. Line 267: int64_t preallocated_offset() const { return preallocated_offset_; } Where is this used? Line 273: Env* env() const { return block_manager_->env(); } Is this ever used? Line 312: int64_t preallocated_offset_ = 0; Huh, didn't know C++11 allowed this. Could you make the same change for total_bytes_written_? Would be nice for both primitives to be initialized consistently. Line 1230: // Indexes of root paths that are below their reserved space threshold. Can we defer this work to the case where GetAvailableContainer() has returned null? That would speed up CreateBlock() for the common case. Line 1250: return Status::IOError("Unable to allocate block: All data directories are full. " We should add ENOSPC to this status, right? Line 1292: FLAGS_log_container_preallocate_bytes, Preallocate() may not actually preallocate FLAGS_log_container_preallocate_bytes bytes. When the container is first created it will, but after that, the preallocate base and bounds are total_bytes_written_ and FLAGS_log_container_preallocate_bytes respectively. Which means, each successive Preallocate() will only preallocate by the size of the last block to be written to the container. To be honest, I'm not even sure if the above produces desirable results. More specifically, suppose we have a container backed by a single ext4 extent that was preallocated when the container was created. Will subsequent preallocations "extend" that extent? Or will they add new extents, each the size of the last block written? If it's the latter, then the preallocation logic is hot garbage and should be replaced. It's not fair of me to ask you to fix preallocation (though you're certainly welcome to if you prefer to do that), but I do think we need to pass the right value to VerifySufficientDiskSpace(). In the context of preallocation here, I'm not really sure what it should be. As an alternative, what do you think of moving disk space enforcement into Container::WriteData()? It's not ideal because that's not where disk space is actually consumed, and we'd need to verify that a failure there doesn't orphan blocks (I think the failure should
[kudu-CR] Add Env::GetBytesFree() method
Mike Percy has submitted this change and it was merged. Change subject: Add Env::GetBytesFree() method .. Add Env::GetBytesFree() method This method returns the number of bytes free on the filesystem represented by its path argument. Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Reviewed-on: http://gerrit.cloudera.org:8080/3451 Reviewed-by: Adar DemboTested-by: Mike Percy --- M src/kudu/util/env-test.cc M src/kudu/util/env.h M src/kudu/util/env_posix.cc 3 files changed, 57 insertions(+), 1 deletion(-) Approvals: Mike Percy: Verified Adar Dembo: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Mike Percy
[kudu-CR] Add Env::GetBytesFree() method
Mike Percy has posted comments on this change. Change subject: Add Env::GetBytesFree() method .. Patch Set 3: Verified+1 Overriding unrelated failure -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Mike Percy Gerrit-HasComments: No
[kudu-CR] Add Env::GetBytesFree() method
Mike Percy has posted comments on this change. Change subject: Add Env::GetBytesFree() method .. Patch Set 3: Filed a JIRA for the data race discovered by Jenkins: https://issues.apache.org/jira/browse/KUDU-1491 -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: No
[kudu-CR] Add Env::GetBytesFree() method
Adar Dembo has posted comments on this change. Change subject: Add Env::GetBytesFree() method .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: No
[kudu-CR] KUDU-1469. Fix handling of fully-deduped requests after a leader change
Mike Percy has posted comments on this change. Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/3228/4/src/kudu/consensus/raft_consensus_state.cc File src/kudu/consensus/raft_consensus_state.cc: Line 653: last_received_op_id_current_leader_ = last_received_op_id_; > gotcha. ok Err. Is that something that would not be hard to fix? It seems like missing coverage. -- To view, visit http://gerrit.cloudera.org:8080/3228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1469. Fix handling of fully-deduped requests after a leader change
Mike Percy has posted comments on this change. Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/3228/4/src/kudu/consensus/raft_consensus_state.cc File src/kudu/consensus/raft_consensus_state.cc: Line 653: last_received_op_id_current_leader_ = last_received_op_id_; > yea, I think because restarting the server in the test reset last_received_ gotcha. ok -- To view, visit http://gerrit.cloudera.org:8080/3228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Add Env::GetBytesFree() method
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3451 to look at the new patch set (#3). Change subject: Add Env::GetBytesFree() method .. Add Env::GetBytesFree() method This method returns the number of bytes free on the filesystem represented by its path argument. Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff --- M src/kudu/util/env-test.cc M src/kudu/util/env.h M src/kudu/util/env_posix.cc 3 files changed, 57 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/3451/3 -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] Add Env::GetBytesFree() method
Mike Percy has posted comments on this change. Change subject: Add Env::GetBytesFree() method .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/3451/2/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: Line 930: *bytes_free = buf.f_frsize * buf.f_ffree; > f_ffree is wrong; that's the number of free inodes. I think you meant f_bfr Wow, embarrassing. Fixed. I'm not a fan of ternary in this case (for readability reasons). But I removed the PREDICT_FALSE, seemed like a bad place for it. I also made the unit test more strict so this error wouldn't pass the test. -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-1469. Fix handling of fully-deduped requests after a leader change
Todd Lipcon has posted comments on this change. Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/3228/4/src/kudu/consensus/raft_consensus_state.cc File src/kudu/consensus/raft_consensus_state.cc: Line 653: last_received_op_id_current_leader_ = last_received_op_id_; > Do you have any idea why it was passing tests before? yea, I think because restarting the server in the test reset last_received_op_id_ back down to 0 or somesuch -- To view, visit http://gerrit.cloudera.org:8080/3228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Add Env::GetBytesFree() method
Adar Dembo has posted comments on this change. Change subject: Add Env::GetBytesFree() method .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/3451/2/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: Line 930: *bytes_free = buf.f_frsize * buf.f_ffree; f_ffree is wrong; that's the number of free inodes. I think you meant f_bfree. Also, would be more concise (and hopefully not more confusing) as a ternary. -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-1469. Fix handling of fully-deduped requests after a leader change
Mike Percy has posted comments on this change. Change subject: KUDU-1469. Fix handling of fully-deduped requests after a leader change .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/3228/4/src/kudu/consensus/raft_consensus_state.cc File src/kudu/consensus/raft_consensus_state.cc: Line 653: last_received_op_id_current_leader_ = last_received_op_id_; > hm yea I think you're right. will dig Do you have any idea why it was passing tests before? -- To view, visit http://gerrit.cloudera.org:8080/3228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iced21ae1b69c1079efc9aa9cf23e2fa592b8bebd Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Add Env::GetBytesFree() method
Kudu Jenkins has posted comments on this change. Change subject: Add Env::GetBytesFree() method .. Patch Set 2: Build Started http://104.196.14.100/job/kudu-gerrit/1935/ -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: No
[kudu-CR] Add Env::GetBytesFree() method
Mike Percy has posted comments on this change. Change subject: Add Env::GetBytesFree() method .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/3451/1/src/kudu/util/env-test.cc File src/kudu/util/env-test.cc: Line 753: TEST_F(TestEnv, TestGetBytesFree) { > I appreciate the intent behind this test (i.e. every Env method should be t Good catch. I hadn't considered parallel execution when writing this. http://gerrit.cloudera.org:8080/#/c/3451/1/src/kudu/util/env.h File src/kudu/util/env.h: PS1, Line 173: // The implementation is likely to allocate bytes in sizes of blocks, so : // the value returned in 'bytes_free' will likely not smoothly decrease as : // bytes are written to the filesystem. > True, but the level of detail in the explanation is high enough to be a lit I think it's helpful to know. I'll make it more concise. http://gerrit.cloudera.org:8080/#/c/3451/1/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: Line 918: RETRY_ON_EINTR(ret, statvfs(path.c_str(), buf)); > Could you make sure statvfs() exists on OS X? I did: https://developer.apple.com/library/ios/documentation/System/Conceptual/ManPages_iPhoneOS/man3/statvfs.3.html Line 929: *bytes_free = buf.f_frsize * buf.f_bavail; > Do you think we should check if we're running as uid 0, and if we are, use I guess so. Done. Verified that the unit test still passes when run as sudo. -- To view, visit http://gerrit.cloudera.org:8080/3451 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica484be987312b275bafdd6e6fa79239ed3ac1ff Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] log: Mark allocation finished even if allocation had an error
Mike Percy has submitted this change and it was merged. Change subject: log: Mark allocation finished even if allocation had an error .. log: Mark allocation finished even if allocation had an error This fixes two bugs: 1. If a disk preallocation fails we will enter a "stuck" state where we cannot preallocate a new segment. Errors to allocate or append should be propagated up. 2. If anything fails in DoAppend() we will attempt to delete ReplicateMsg member objects in LogEntryPB twice, resulting in a segfault. Added a test that crashes without the code changes in log.cc Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3 Reviewed-on: http://gerrit.cloudera.org:8080/3234 Reviewed-by: Adar DemboTested-by: Kudu Jenkins --- M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/log-test-base.h M src/kudu/consensus/log-test.cc M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/consensus/log_reader.cc M src/kudu/consensus/log_reader.h M src/kudu/consensus/raft_consensus.cc 8 files changed, 50 insertions(+), 26 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Allow to set RequestId in the RPC RequestHeader
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3179 to look at the new patch set (#9). Change subject: Allow to set RequestId in the RPC RequestHeader .. Allow to set RequestId in the RPC RequestHeader This allows to set a RequestId in the RpcController that will in turn be set in the RequestHeader. This doesn't have specific tests, as other patches test this functionality, but having this be stand-alone allows to work on the client and the server simultaneously. Change-Id: I92566250683ee69fad57a9ae694842d4a0b17ab4 --- M src/kudu/rpc/outbound_call.cc M src/kudu/rpc/rpc_controller.cc M src/kudu/rpc/rpc_controller.h 3 files changed, 34 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/3179/9 -- To view, visit http://gerrit.cloudera.org:8080/3179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I92566250683ee69fad57a9ae694842d4a0b17ab4 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] WIP: Exactly once semantics for writes
David Ribeiro Alves has abandoned this change. Change subject: WIP: Exactly once semantics for writes .. Abandoned superceded by other patches -- To view, visit http://gerrit.cloudera.org:8080/3403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I99df757741057bc140272959576bd10cb63d7448 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Mike Percy has posted comments on this change. Change subject: Allow for reserving disk space for non-Kudu processes .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/3135/4/src/kudu/util/env_util.cc File src/kudu/util/env_util.cc: PS4, Line 37: DEFINE_int64(disk_reserved_bytes, 0, : "Number of bytes to reserve on each filesystem for non-Kudu usage"); > I don't know whether this will be sufficient. I'm thinking we may want to c As discussed, I added separate flags for log_dir and data_dirs. -- To view, visit http://gerrit.cloudera.org:8080/3135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Kudu Jenkins has posted comments on this change. Change subject: Allow for reserving disk space for non-Kudu processes .. Patch Set 5: Build Started http://104.196.14.100/job/kudu-gerrit/1932/ -- To view, visit http://gerrit.cloudera.org:8080/3135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] log: Mark allocation finished even if allocation had an error
Adar Dembo has posted comments on this change. Change subject: log: Mark allocation finished even if allocation had an error .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Allow for reserving disk space for non-Kudu processes
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3135 to look at the new patch set (#5). Change subject: Allow for reserving disk space for non-Kudu processes .. Allow for reserving disk space for non-Kudu processes Adds gflags to reserve disk space such that Kudu will not use more than specified. Hadoop calls this functionality "du.reserved". If a WAL preallocation is attempted while the log disk is past its reservation limit, the log write will fail. As a result, RaftConsensus will cause the process to crash. The log block manager will use non-full disks if possible until all of the disks are full. If a flush or compaction is attempted when all disks are beyond their configured capacity then the LogBlockManager will return an error. As a result, the maintenance manager task will cause the process to crash. This initial implementation provides a "best effort" approach. Disk space checks are only done at preallocation time, and if writes continue beyond the preallocated point (for both a WAL segment and a data block) those writes will not be prevented. This makes it easier to provide a "friendly" option where the block manager will divert new writes to non-full disks, avoiding a hard crash when only one disk is past its reservation limit. In the future, we may want to add "hard" and "soft" limits, such that going beyond the soft limit will do what we do today, and going beyond the hard limit (say, by writing a very large data block past its preallocation point) will result in a crash. This patch includes: * Unit tests. * End-to-end test for flushing / compaction falling back to non-full disks due to disk space backpressure and finally crashing when there is no space left in any data dir. * End-to-end test for writes failing due to WAL disk space backpressure, causing a crash. Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 --- M src/kudu/consensus/log-test.cc M src/kudu/consensus/log.cc M src/kudu/consensus/log_util.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/disk_reservation-itest.cc M src/kudu/tablet/tablet_peer_mm_ops.cc M src/kudu/util/CMakeLists.txt A src/kudu/util/env_util-test.cc M src/kudu/util/env_util.cc M src/kudu/util/env_util.h M src/kudu/util/scoped_cleanup.h 14 files changed, 573 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/3135/5 -- To view, visit http://gerrit.cloudera.org:8080/3135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] log: Mark allocation finished even if allocation had an error
Mike Percy has posted comments on this change. Change subject: log: Mark allocation finished even if allocation had an error .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/3234/4/src/kudu/consensus/log.cc File src/kudu/consensus/log.cc: Line 983: if (entry_batch_pb_) { > combine above two lines with && Done -- To view, visit http://gerrit.cloudera.org:8080/3234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] log: Mark allocation finished even if allocation had an error
Kudu Jenkins has posted comments on this change. Change subject: log: Mark allocation finished even if allocation had an error .. Patch Set 5: Build Started http://104.196.14.100/job/kudu-gerrit/1930/ -- To view, visit http://gerrit.cloudera.org:8080/3234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If22bf946a42d0ec32c35164acd9e6e6cef18dcc3 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] consensus: Crash if we fail to append to the WAL
Mike Percy has submitted this change and it was merged. Change subject: consensus: Crash if we fail to append to the WAL .. consensus: Crash if we fail to append to the WAL Being unable to write to the WAL is a serious problem with complicated consequences. Let's always crash when that happens. This patch adds a fault injection test and a callback that simply crashes on failure. Change-Id: I9c5ffa6d60f46ce4d6c3da4c6e7878ab07d99c65 Reviewed-on: http://gerrit.cloudera.org:8080/3406 Tested-by: Kudu Jenkins Reviewed-by: Todd LipconReviewed-by: Adar Dembo --- M src/kudu/consensus/consensus_queue.cc M src/kudu/consensus/log.cc M src/kudu/consensus/log_cache.cc M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus-itest.cc M src/kudu/tablet/transactions/transaction_driver.cc M src/kudu/util/status_callback.cc M src/kudu/util/status_callback.h 9 files changed, 139 insertions(+), 9 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3406 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I9c5ffa6d60f46ce4d6c3da4c6e7878ab07d99c65 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Integrate the result tracker with writes
David Ribeiro Alves has posted comments on this change. Change subject: Integrate the result tracker with writes .. Patch Set 1: this is good to be reviewed, but it's still missing a flag to disable it until we've finished garbage collection -- To view, visit http://gerrit.cloudera.org:8080/3449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1fa2f8db33653960f4749237b8993baba0929893 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] Integrate the request tracker with the client
David Ribeiro Alves has uploaded a new patch set (#8). Change subject: Integrate the request tracker with the client .. Integrate the request tracker with the client This integrates the request tracker with the client, making sure that write requests sent by the client are tracked in the RequestTracker and contain the information necessary for exactly once semantics. Since this is hard to test in isolation (without the server part) and it is already tested by the next patch in the sequence, this doesn't contain any specific tests. Change-Id: I94207c294452fcbdb3a7901fdb702674d47553ee --- M src/kudu/client/batcher.cc M src/kudu/client/client-internal.cc M src/kudu/client/client-internal.h M src/kudu/client/client.cc M src/kudu/client/meta_cache.h M src/kudu/rpc/retriable_rpc.h M src/kudu/rpc/rpc_controller.cc M src/kudu/rpc/rpc_controller.h 8 files changed, 74 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/3080/8 -- To view, visit http://gerrit.cloudera.org:8080/3080 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I94207c294452fcbdb3a7901fdb702674d47553ee Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Integrate the result tracker with writes
Kudu Jenkins has posted comments on this change. Change subject: Integrate the result tracker with writes .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/1929/ -- To view, visit http://gerrit.cloudera.org:8080/3449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1fa2f8db33653960f4749237b8993baba0929893 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro AlvesGerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] KUDU-1398 CFile index blocks can store shortest separating prefix
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3304 to look at the new patch set (#9). Change subject: KUDU-1398 CFile index blocks can store shortest separating prefix .. KUDU-1398 CFile index blocks can store shortest separating prefix (No changes: resubmitting to trigger Jenkins build) This changes the values stored as index keys to be a shortest key between the first key of the data block and the last key of the previous data block. However, this change does not apply to deltafiles, because deltafiles expect to be able to decode an index key into a DeltaKey. The way the change works is illustrated with the example from the JIRA, extended: Block 1: apple, banana, cardamom Block 2: carrot, epazote, fennel Block 3: fig, guava, kiwi Before: ['apple' -> block 1, 'carrot' -> block 2, 'fig' -> block 3] After: ['a' -> block 1, 'carr' -> block 2, 'fi' -> block 3] Change-Id: I68ae9146fabd4a19b17d103d118d2d60e28bb315 --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bloomfile.cc M src/kudu/cfile/bloomfile.h M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_util.cc M src/kudu/cfile/cfile_util.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/cfile_writer.h M src/kudu/cfile/gvint_block.cc M src/kudu/cfile/gvint_block.h M src/kudu/cfile/index-test.cc M src/kudu/cfile/index_block.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h M src/kudu/tablet/deltafile.cc 23 files changed, 234 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/04/3304/9 -- To view, visit http://gerrit.cloudera.org:8080/3304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I68ae9146fabd4a19b17d103d118d2d60e28bb315 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley