[kudu-CR] [tools] Fail ksck if fetching consensus state fails

2018-02-14 Thread Will Berkeley (Code Review)
Will Berkeley has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9316 )

Change subject: [tools] Fail ksck if fetching consensus state fails
..

[tools] Fail ksck if fetching consensus state fails

Commonly, users run ksck as someone other than the kudu superuser.
This means that ksck can't gather consensus state. It still does its
other checks, and will exit with an OK status if there are no
problems with them, just printing some easily-missed warnings at
the top of the output. This patch changes ksck so it fails when
it cannot gather all the information it needs to do all the checks.

Note that if the user specifies consensus=false, they can run ksck
without missing consensus checks causing a failure.

Change-Id: Id3efc9342c3cb3f9652bb8c9789fe20ecf12ff55
Reviewed-on: http://gerrit.cloudera.org:8080/9316
Reviewed-by: Alexey Serbin 
Tested-by: Kudu Jenkins
---
M src/kudu/tools/ksck.cc
1 file changed, 11 insertions(+), 11 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9316
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id3efc9342c3cb3f9652bb8c9789fe20ecf12ff55
Gerrit-Change-Number: 9316
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR](branch-1.5.x) KUDU-2238. DMS not flush under memory pressure.

2018-02-14 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins, Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9322

to review the following change.


Change subject: KUDU-2238. DMS not flush under memory pressure.
..

KUDU-2238. DMS not flush under memory pressure.

When we choose DMS to flush, now we always pick the DMS with
highest log retention. However, as KUDU-2238 shows, in some cases
DMS with highest log retention may only consume little memory, and
other DMSs may consume much more memory, but get no chance to be
flushed, even under memory pressure.

This patch gives the ability to take memory consumption into
consideration when we choose DMS.

Change-Id: I0c04d76aa0e3888352dc56eeb493a5437ef47e42
Reviewed-on: http://gerrit.cloudera.org:8080/8904
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
---
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
3 files changed, 26 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/9322/1
--
To view, visit http://gerrit.cloudera.org:8080/9322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0c04d76aa0e3888352dc56eeb493a5437ef47e42
Gerrit-Change-Number: 9322
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: zhen.zhang 


[kudu-CR](branch-1.6.x) KUDU-2238. DMS not flush under memory pressure.

2018-02-14 Thread Will Berkeley (Code Review)
Will Berkeley has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9323


Change subject: KUDU-2238. DMS not flush under memory pressure.
..

KUDU-2238. DMS not flush under memory pressure.

When we choose DMS to flush, now we always pick the DMS with
highest log retention. However, as KUDU-2238 shows, in some cases
DMS with highest log retention may only consume little memory, and
other DMSs may consume much more memory, but get no chance to be
flushed, even under memory pressure.

This patch gives the ability to take memory consumption into
consideration when we choose DMS.

Change-Id: I0c04d76aa0e3888352dc56eeb493a5437ef47e42
Reviewed-on: http://gerrit.cloudera.org:8080/8904
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
(cherry picked from commit f8d18693546cd518c8873b7a5f4c08579f85199a)
---
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
3 files changed, 18 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/9323/1
--
To view, visit http://gerrit.cloudera.org:8080/9323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.6.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0c04d76aa0e3888352dc56eeb493a5437ef47e42
Gerrit-Change-Number: 9323
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: zhen.zhang 


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Patch Set 5: Verified+1

(1 comment)

Unreleated flakes in

  DiskReservationITest.TestFillMultipleDisks
  DiskReservationITest.TestWalWriteToFullDiskAborts

due to NTP failure

F0214 06:53:26.686384 28842 master_main.cc:74] Check failed: _s.ok() Bad 
status: Service unavailable: Cannot initialize clock: Error reading clock. 
Clock considered unsynchronized

http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc
File src/kudu/integration-tests/raft_consensus_stress-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@113
PS3, Line 113:
> Since we are not using tablet_id_ etc, can we instead use ExternalMiniClust
This test uses tablet_servers_ and the code to populate that.  I'm also 
thinking that we might expand functionality of this test in the future, so I 
would prefer to keep RaftConsensusITestBase here.

However, if you feel strongly about that, I'll update this to use 
ExternalMiniClusterITestBase.



--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Wed, 14 Feb 2018 19:49:02 +
Gerrit-HasComments: Yes


[kudu-CR] [catalog manager] check re-replication scheme upon table creation

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9321


Change subject: [catalog_manager] check re-replication scheme upon table 
creation
..

[catalog_manager] check re-replication scheme upon table creation

Output actionable warning message upon creation of a new table when
the number of alive tablet servers is not enough to re-replicate
tablet replicas in case of a failure.

Since the 3-4-3 replication scheme is now enabled by default, this might
be useful in case if running a cluster with just N tablet servers when
newly created tables have replication factor of N
(e.g., imagine running with just 3 tablet servers total and creating
tables with the default replication factor of 3).

This is a follow-up for c40e0587bf6a6aa55e5bd72dd2dd9356b1507f2e.

Change-Id: I5a59451ad876d9864a90607187a05f8526cd8cbf
---
M src/kudu/master/catalog_manager.cc
1 file changed, 36 insertions(+), 16 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/9321/1
--
To view, visit http://gerrit.cloudera.org:8080/9321
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5a59451ad876d9864a90607187a05f8526cd8cbf
Gerrit-Change-Number: 9321
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [tablet service] remove useless call

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9324


Change subject: [tablet_service] remove useless call
..

[tablet_service] remove useless call

Removed useless call of TabletReplica::permanent_uuid()
in ConsensusServiceImpl::UpdateConsensus().

Change-Id: I3869b423a7a8f32b64201ec83d8ad4d61e2ee9f6
---
M src/kudu/tserver/tablet_service.cc
1 file changed, 0 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/24/9324/1
--
To view, visit http://gerrit.cloudera.org:8080/9324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3869b423a7a8f32b64201ec83d8ad4d61e2ee9f6
Gerrit-Change-Number: 9324
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [catalog manager] check re-replication scheme upon table creation

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9321 )

Change subject: [catalog_manager] check re-replication scheme upon table 
creation
..


Patch Set 1: Verified+1

Unrelated fake in AdminCliTest.TestMoveTablet_KUDU_1097:

Illegal state: Leader has not yet committed an operation in its own term
/home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/kudu-admin-test.cc:290:
 Failure
Failed   
Bad status: Runtime error: /tmp/dist-test-taskHH2URb/build/asan/bin/kudu: 
process exited with non-zero status 1


--
To view, visit http://gerrit.cloudera.org:8080/9321
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a59451ad876d9864a90607187a05f8526cd8cbf
Gerrit-Change-Number: 9321
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 22:20:48 +
Gerrit-HasComments: No


[kudu-CR] [tablet service] remove useless call

2018-02-14 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9324 )

Change subject: [tablet_service] remove useless call
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3869b423a7a8f32b64201ec83d8ad4d61e2ee9f6
Gerrit-Change-Number: 9324
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 22:31:18 +
Gerrit-HasComments: No


[kudu-CR] [catalog manager] check re-replication scheme upon table creation

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9321 )

Change subject: [catalog_manager] check re-replication scheme upon table 
creation
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/9321
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I5a59451ad876d9864a90607187a05f8526cd8cbf
Gerrit-Change-Number: 9321
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 


[kudu-CR] [tablet service] remove useless call

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9324 )

Change subject: [tablet_service] remove useless call
..

[tablet_service] remove useless call

Removed useless call of TabletReplica::permanent_uuid()
in ConsensusServiceImpl::UpdateConsensus().

Change-Id: I3869b423a7a8f32b64201ec83d8ad4d61e2ee9f6
Reviewed-on: http://gerrit.cloudera.org:8080/9324
Reviewed-by: Mike Percy 
Tested-by: Kudu Jenkins
---
M src/kudu/tserver/tablet_service.cc
1 file changed, 0 insertions(+), 2 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3869b423a7a8f32b64201ec83d8ad4d61e2ee9f6
Gerrit-Change-Number: 9324
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] gutil: properly hook up ANNOTATE HAPPENS BEFORE/AFTER

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Mike Percy,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9325

to review the following change.


Change subject: gutil: properly hook up ANNOTATE_HAPPENS_BEFORE/AFTER
..

gutil: properly hook up ANNOTATE_HAPPENS_BEFORE/AFTER

This updates dynamic_annotations to hook up ANNOTATE_HAPPENS_BEFORE and
ANNOTATE_HAPPENS_AFTER to the appropriate annotations in the TSAN
runtime library. According to [1] (committed in 2011) the functions have
been implemented in TSAN for quite some time, and it was an error that
we weren't using them.

The previous implementation of the function called some
condition-variable annotations in the TSAN runtime library instead.
Those functions actually turn out to be implemented as no-ops. So, I
looked for existing call sites to ANNOTATE_HAPPENS_BEFORE and AFTER and
removed them. This cleaned up atomic_refcount pretty substantially.
Again I found a matching change[2] in Chromium, from which this code is
derived.

[1] https://codereview.chromium.org/6982022
[2] https://codereview.chromium.org/580813002

Change-Id: Ida27aff6b9771c0009fba5e31ec7a0c7c53caa59
---
M src/kudu/gutil/atomic_refcount.h
M src/kudu/gutil/dynamic_annotations.c
M src/kudu/gutil/dynamic_annotations.h
M src/kudu/gutil/once.cc
M src/kudu/gutil/once.h
5 files changed, 20 insertions(+), 57 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/9325/1
--
To view, visit http://gerrit.cloudera.org:8080/9325
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ida27aff6b9771c0009fba5e31ec7a0c7c53caa59
Gerrit-Change-Number: 9325
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2291 (part 6) stacks: use libunwind directly for stack tracing

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9319

to look at the new patch set (#2).

Change subject: KUDU-2291 (part 6) stacks: use libunwind directly for stack 
tracing
..

KUDU-2291 (part 6) stacks: use libunwind directly for stack tracing

Previously we used glog's wrapper around libunwind for stack tracing.
However that has a deficiency that it assumes that, process wide, only
one thread can be inside libunwind at a time.

It appears that this is left over from some very old versions of
libunwind, or was already unnecessarily conservative. libunwind is meant
to be thread safe, and we have tests that will trigger if it is not.

This just extracts the function body of the glog function we were using
and does the same work manually.

Without this fix, the "collect from all the threads at the same time"
code path resulted in most of the threads collecting an empty trace
since they tried to call libunwind at the same time.

Change-Id: I3a53e55d7c4e7ee50bcac5b1e81267df56383634
---
M src/kudu/util/debug-util-test.cc
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
3 files changed, 36 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/9319/2
--
To view, visit http://gerrit.cloudera.org:8080/9319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a53e55d7c4e7ee50bcac5b1e81267df56383634
Gerrit-Change-Number: 9319
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] rw semaphore: dont include debug-util.h when not necessary

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Mike Percy,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9328

to review the following change.


Change subject: rw_semaphore: dont include debug-util.h when not necessary
..

rw_semaphore: dont include debug-util.h when not necessary

Just avoids an unnecessary include which was making my compilation slow
down every time I modified debug-util.h.

Change-Id: I8a194bbffbe413eaffe0449639e3b4aa35a89900
---
M src/kudu/util/rw_semaphore.h
1 file changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/28/9328/1
--
To view, visit http://gerrit.cloudera.org:8080/9328
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8a194bbffbe413eaffe0449639e3b4aa35a89900
Gerrit-Change-Number: 9328
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2297 (part 1): refactor metrics log out of ServerBase

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Mike Percy,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9326

to review the following change.


Change subject: KUDU-2297 (part 1): refactor metrics log out of ServerBase
..

KUDU-2297 (part 1): refactor metrics log out of ServerBase

This is preparing for putting more information into the metrics log
instead of just metrics snapshots. The logging code will get more
complex as it gains features, so this makes a new class for it.

This is slightly incompatible since I also changed the name on disk to
'diagnostics' instead of 'metrics' and updated the documentation, but I
am not aware of people using this in the wild. So long as we
release-note it, I think it's reasonable to expect any sophisticated
operators to adjust their scripting accordingly.

Change-Id: I1f7bc7335b1f4c2a5c62fc5fe672289d5027a10d
---
M docs/administration.adoc
M src/kudu/server/CMakeLists.txt
A src/kudu/server/diagnostics_log.cc
A src/kudu/server/diagnostics_log.h
M src/kudu/server/server_base.cc
M src/kudu/server/server_base.h
6 files changed, 221 insertions(+), 71 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/26/9326/1
--
To view, visit http://gerrit.cloudera.org:8080/9326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1f7bc7335b1f4c2a5c62fc5fe672289d5027a10d
Gerrit-Change-Number: 9326
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2297 (part 3): refactor process-wide stack collection out of /stacks

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Mike Percy,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9329

to review the following change.


Change subject: KUDU-2297 (part 3): refactor process-wide stack collection out 
of /stacks
..

KUDU-2297 (part 3): refactor process-wide stack collection out of /stacks

Previously a bunch of logic to collect all the stacks from the process
was in the /stacks path handler. This logic is relatively generic and
shouldn't be intermingled with the formatting code. In particular I'd
like to use it in the diagnostics log, where a different output format
is desirable.

Change-Id: Ibb7c6edd31254f3d7e0cbef1eaf575bde3570df6
---
M src/kudu/server/default_path_handlers.cc
M src/kudu/util/debug-util-test.cc
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
4 files changed, 178 insertions(+), 105 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/29/9329/1
--
To view, visit http://gerrit.cloudera.org:8080/9329
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibb7c6edd31254f3d7e0cbef1eaf575bde3570df6
Gerrit-Change-Number: 9329
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2297 (part 4): periodically dump stacks to diagnostics log

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Mike Percy,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9330

to review the following change.


Change subject: KUDU-2297 (part 4): periodically dump stacks to diagnostics log
..

KUDU-2297 (part 4): periodically dump stacks to diagnostics log

This modifies the diagnostics log to periodically dump stack traces.
This is slightly complicated by the fact that symbolized stack traces
can be relatively large. So, we separate the logging of symbols and
stack traces. When an address first appears in a log file, it is logged
as part of a symbol line. Later logs of the same address do not need
to re-log the symbol.

With this, a typical stack trace dump of an idle tserver is about 4KB
pre-compression, and a 'symbols' dump is about 6KB. So logging stacks
reasonably often should not use much disk space or IO.

Currently this is enabled on the same interval as the metrics log, but
only if a new experimental flag --diagnostics-log-stack-traces is
enabled. I'm planning to move it to a different flag in a later commit,
but wanted to keep this one simple and incremental.

Change-Id: Ic32abf2c48ac6a5f3c384e58838b34671bcaf147
---
M src/kudu/server/diagnostics_log.cc
M src/kudu/server/diagnostics_log.h
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/rolling_log.h
6 files changed, 213 insertions(+), 27 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/9330/1
--
To view, visit http://gerrit.cloudera.org:8080/9330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic32abf2c48ac6a5f3c384e58838b34671bcaf147
Gerrit-Change-Number: 9330
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2291 (part 5): allow collecting stack traces asynchronously

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9318

to look at the new patch set (#2).

Change subject: KUDU-2291 (part 5): allow collecting stack traces asynchronously
..

KUDU-2291 (part 5): allow collecting stack traces asynchronously

This removes the global variable previously used to communicate between
a stack trace collector and the stack trace target. The global variable
had the downside of allowing only a single stack trace operation in
flight at a time, which means that for use cases where we want to
capture the trace of a bunch of threads, we needed to do so serially.

For use cases like process-wide traces this could make debugging a bit
tricky. For example, because of the skew in time between collecting
the stack of thread A and the stack of thread B, it's possible that
the traces could show them holding the same lock, which we know to be
impossible. Getting the collection time as close as possible to
"simultaneous" can reduce the possibility of these kind of
strange-looking results.

The actual mechanism to do this is slightly tricky. Rather than
duplicate the full description here, check the comments in the
implementation. In particular, StackTraceCollector::RevokeSigData()
explains the complexity of not knowing whether our signal will ever be
delivered.

A new unit test shows the pattern, and this also updates the stack
servlet to send the signals to all of the threads as close as possible
to "the same time".

Change-Id: Ie4d7789bb1272db033df4685f46eaf8483ec7be1
---
M src/kudu/server/default_path_handlers.cc
M src/kudu/util/debug-util-test.cc
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
4 files changed, 306 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/9318/2
--
To view, visit http://gerrit.cloudera.org:8080/9318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie4d7789bb1272db033df4685f46eaf8483ec7be1
Gerrit-Change-Number: 9318
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2297 (part 2). Convert diagnostics log to a format closer to glog

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Mike Percy,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9327

to review the following change.


Change subject: KUDU-2297 (part 2). Convert diagnostics log to a format closer 
to glog
..

KUDU-2297 (part 2). Convert diagnostics log to a format closer to glog

Previously the metrics/diagnostics log used a custom log format designed
to be readable by scripts. The only timestamps were unix microtimes,
which are convenient for computers but not for people. In order to make
the log more easily greppable, this patch converts it over to use a
glog-compatible prefix on each log line.

The aim is that, if an admin sees some issues at 10/23 15:03 they can simply
grep the diagnostics log for '1021 15:0[0-6]' rather than having to go
figure out the appropriate range of numeric timestamps.

This patch also updates the docs accordingly.

Change-Id: I10d93c84f8c738c26850ae3d1ae6afbb5519c7ad
---
M docs/administration.adoc
M src/kudu/server/diagnostics_log.cc
M src/kudu/util/logging.cc
M src/kudu/util/logging.h
M src/kudu/util/trace.cc
5 files changed, 43 insertions(+), 24 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/9327/1
--
To view, visit http://gerrit.cloudera.org:8080/9327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I10d93c84f8c738c26850ae3d1ae6afbb5519c7ad
Gerrit-Change-Number: 9327
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-1704: add c++ client support for READ YOUR WRITES mode

2018-02-14 Thread Hao Hao (Code Review)
Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8823

to look at the new patch set (#7).

Change subject: KUDU-1704: add c++ client support for READ_YOUR_WRITES mode
..

KUDU-1704: add c++ client support for READ_YOUR_WRITES mode

Change-Id: I34214245a78aed172a28fbdb395ff5bccd0fc0e1
---
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/scan_token-internal.cc
M src/kudu/client/scan_token-test.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/consistency-itest.cc
M src/kudu/integration-tests/delete_table-itest.cc
10 files changed, 300 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/8823/7
--
To view, visit http://gerrit.cloudera.org:8080/8823
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34214245a78aed172a28fbdb395ff5bccd0fc0e1
Gerrit-Change-Number: 8823
Gerrit-PatchSet: 7
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-1704: add READ YOUR WRITES scan mode

2018-02-14 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Mike Percy, Dan Burkert, David Ribeiro Alves, 
Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8804

to look at the new patch set (#13).

Change subject: KUDU-1704: add READ_YOUR_WRITES scan mode
..

KUDU-1704: add READ_YOUR_WRITES scan mode

This patch adds a new read mode READ_YOUR_WRITES on tserver. In this mode,
the server will pick a snapshot in the past, subject to the propagated
timestamp bound, and perform a read. Moreover, the chosen snapshot
timestamp is returned back to the client.

The major difference between READ_AT_SNAPSHOT scan without a timestamp
and READ_YOUR_WRITES scan is the latter will choose the newest timestamp
within the staleness bound that allows execution of the reads without
being blocked by the in-flight transactions if possible. READ_YOUR_WRITES
mode is not repeatable. However, it allows client local
read-your-writes/read-your-reads.

Design doc: 
https://docs.google.com/document/d/1WRLzKdCmRxXjUpi-DZsz7l2gl6215BsNdORtDXrmgl0/edit?usp=sharing

Change-Id: I84ddb981a1a0f199d4e66f5d5097318f8c785a48
---
M src/kudu/common/common.proto
M src/kudu/tserver/tablet_server-test-base.cc
M src/kudu/tserver/tablet_server-test-base.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
6 files changed, 353 insertions(+), 107 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/04/8804/13
--
To view, visit http://gerrit.cloudera.org:8080/8804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84ddb981a1a0f199d4e66f5d5097318f8c785a48
Gerrit-Change-Number: 8804
Gerrit-PatchSet: 13
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1704: add READ YOUR WRITES scan mode

2018-02-14 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8804 )

Change subject: KUDU-1704: add READ_YOUR_WRITES scan mode
..


Patch Set 12:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/8804/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8804/12//COMMIT_MSG@15
PS12, Line 15:  choose the newest timestamp
 : within the staleness bound that allows execution of the reads 
without
 : being blocked by the in-flight transactions
> might be worth adding "if possible"
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto
File src/kudu/common/common.proto:

http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@236
PS12, Line 236: subject to the propagated timestamp bound
> This wording is pretty hard to a casual reader (or even me) to understand.
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@237
PS12, Line 237: newest timestamp within
> latest timestamp higher than
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@242
PS12, Line 242: the timestamp
> a timestamp
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@244
PS12, Line 244: should use it
> Is this something that end-users have to worry about, or can we word this s
Updated.

Currently, we use current clock time of the server as the propagation time. 
This would bump up the propagation time unnecessarily for the next read in this 
mode.


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@245
PS12, Line 245: wait
> nit: waiting
Done


http://gerrit.cloudera.org:8080/#/c/8804/11/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/8804/11/src/kudu/tserver/tablet_server-test.cc@185
PS11, Line 185:   void ScanYourWritesTest(uint64_t propagated_timestamp, 
ScanResponsePB* resp);
> warning: parameter 'propagated_timestamp' is const-qualified in the functio
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

PS12:
> We are missing a C++ client test in this patch. Is that excluded on purpose
C++ client test in the following CR https://gerrit.cloudera.org/#/c/8823/


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1924
PS12, Line 1924: TEST_F(TabletServerTest, TestScanYourWrites) {
> nit: add a comment at the top of this test describing what the test does
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1924
PS12, Line 1924: TestScanYourWrites
> It worries me a bit that most tests of this functionality overwhelmingly te
Added more integration tests in C++ client which would definitely fail with 
READ_LATEST.


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1924
PS12, Line 1924: TestScanYourWrites
> I agree that a deterministic test would be nice and I think it should be po
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1927
PS12, Line 1927: e
> nit: missing period
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1958
PS12, Line 1958: TEST_F(TabletServerTest, 
TestScanYourWrites_WithoutPropagatedTimestamp) {
> nit: add test comment
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1960
PS12, Line 1960: perform a write
> nit: use capitalization and punctuation for code comments per https://googl
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1972
PS12, Line 1972: perform a write
> punctuation
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.h
File src/kudu/tserver/tablet_service.h:

http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.h@155
PS12, Line 155:   // it exceeds the maximum allowed clock synchronization error 
time.
> Can we add a note about the context, i.e. why this matters?
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.h@158
PS12, Line 158: if
> s/if/that/
Done


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc@2113
PS12, Line 2113:
> add: DCHECK(s.ok()) ?
I think even for status that returns 'NotSupported' can reach here.


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc@2114
PS12, Line 2114:   // Note: if 'max_allowed_ts' is not obtained from 
clock_->GetGlobalLatest() it's guaranteed
   :   // to be higher than 'tmp_snap_timestamp'.
> I'm having trouble following this:
Good question. TBH, this is refactored from previous code, so I am not sure 
about 

[kudu-CR] gutil: properly hook up ANNOTATE HAPPENS BEFORE/AFTER

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9325

to look at the new patch set (#2).

Change subject: gutil: properly hook up ANNOTATE_HAPPENS_BEFORE/AFTER
..

gutil: properly hook up ANNOTATE_HAPPENS_BEFORE/AFTER

This updates dynamic_annotations to hook up ANNOTATE_HAPPENS_BEFORE and
ANNOTATE_HAPPENS_AFTER to the appropriate annotations in the TSAN
runtime library. According to [1] (committed in 2011) the functions have
been implemented in TSAN for quite some time, and it was an error that
we weren't using them.

The previous implementation of the function called some
condition-variable annotations in the TSAN runtime library instead.
Those functions actually turn out to be implemented as no-ops. So, I
looked for existing call sites to ANNOTATE_HAPPENS_BEFORE and AFTER and
removed them. This cleaned up atomic_refcount pretty substantially.
Again I found a matching change[2] in Chromium, from which this code is
derived.

[1] https://codereview.chromium.org/6982022
[2] https://codereview.chromium.org/580813002

Change-Id: Ida27aff6b9771c0009fba5e31ec7a0c7c53caa59
---
M src/kudu/gutil/atomic_refcount.h
M src/kudu/gutil/dynamic_annotations.c
M src/kudu/gutil/dynamic_annotations.h
M src/kudu/gutil/once.cc
M src/kudu/gutil/once.h
5 files changed, 22 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/9325/2
--
To view, visit http://gerrit.cloudera.org:8080/9325
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ida27aff6b9771c0009fba5e31ec7a0c7c53caa59
Gerrit-Change-Number: 9325
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2291 (part 5): allow collecting stack traces asynchronously

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9318

to look at the new patch set (#3).

Change subject: KUDU-2291 (part 5): allow collecting stack traces asynchronously
..

KUDU-2291 (part 5): allow collecting stack traces asynchronously

This removes the global variable previously used to communicate between
a stack trace collector and the stack trace target. The global variable
had the downside of allowing only a single stack trace operation in
flight at a time, which means that for use cases where we want to
capture the trace of a bunch of threads, we needed to do so serially.

For use cases like process-wide traces this could make debugging a bit
tricky. For example, because of the skew in time between collecting
the stack of thread A and the stack of thread B, it's possible that
the traces could show them holding the same lock, which we know to be
impossible. Getting the collection time as close as possible to
"simultaneous" can reduce the possibility of these kind of
strange-looking results.

The actual mechanism to do this is slightly tricky. Rather than
duplicate the full description here, check the comments in the
implementation. In particular, StackTraceCollector::RevokeSigData()
explains the complexity of not knowing whether our signal will ever be
delivered.

A new unit test shows the pattern, and this also updates the stack
servlet to send the signals to all of the threads as close as possible
to "the same time".

Change-Id: Ie4d7789bb1272db033df4685f46eaf8483ec7be1
---
M src/kudu/server/default_path_handlers.cc
M src/kudu/util/debug-util-test.cc
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
4 files changed, 308 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/9318/3
--
To view, visit http://gerrit.cloudera.org:8080/9318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie4d7789bb1272db033df4685f46eaf8483ec7be1
Gerrit-Change-Number: 9318
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2297 (part 3): refactor process-wide stack collection out of /stacks

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9329

to look at the new patch set (#2).

Change subject: KUDU-2297 (part 3): refactor process-wide stack collection out 
of /stacks
..

KUDU-2297 (part 3): refactor process-wide stack collection out of /stacks

Previously a bunch of logic to collect all the stacks from the process
was in the /stacks path handler. This logic is relatively generic and
shouldn't be intermingled with the formatting code. In particular I'd
like to use it in the diagnostics log, where a different output format
is desirable.

Change-Id: Ibb7c6edd31254f3d7e0cbef1eaf575bde3570df6
---
M src/kudu/server/default_path_handlers.cc
M src/kudu/server/diagnostics_log.cc
M src/kudu/server/diagnostics_log.h
M src/kudu/server/server_base.cc
M src/kudu/util/debug-util-test.cc
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
M src/kudu/util/logging.cc
M src/kudu/util/trace.cc
9 files changed, 200 insertions(+), 119 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/29/9329/2
--
To view, visit http://gerrit.cloudera.org:8080/9329
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb7c6edd31254f3d7e0cbef1eaf575bde3570df6
Gerrit-Change-Number: 9329
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2297 (part 4): periodically dump stacks to diagnostics log

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9330

to look at the new patch set (#2).

Change subject: KUDU-2297 (part 4): periodically dump stacks to diagnostics log
..

KUDU-2297 (part 4): periodically dump stacks to diagnostics log

This modifies the diagnostics log to periodically dump stack traces.
This is slightly complicated by the fact that symbolized stack traces
can be relatively large. So, we separate the logging of symbols and
stack traces. When an address first appears in a log file, it is logged
as part of a symbol line. Later logs of the same address do not need
to re-log the symbol.

With this, a typical stack trace dump of an idle tserver is about 4KB
pre-compression, and a 'symbols' dump is about 6KB. So logging stacks
reasonably often should not use much disk space or IO.

Currently this is enabled on the same interval as the metrics log, but
only if a new experimental flag --diagnostics-log-stack-traces is
enabled. I'm planning to move it to a different flag in a later commit,
but wanted to keep this one simple and incremental.

Change-Id: Ic32abf2c48ac6a5f3c384e58838b34671bcaf147
---
M src/kudu/server/diagnostics_log.cc
M src/kudu/server/diagnostics_log.h
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/rolling_log.h
6 files changed, 195 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/9330/2
--
To view, visit http://gerrit.cloudera.org:8080/9330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic32abf2c48ac6a5f3c384e58838b34671bcaf147
Gerrit-Change-Number: 9330
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2291 (part 6) stacks: use libunwind directly for stack tracing

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9319

to look at the new patch set (#3).

Change subject: KUDU-2291 (part 6) stacks: use libunwind directly for stack 
tracing
..

KUDU-2291 (part 6) stacks: use libunwind directly for stack tracing

Previously we used glog's wrapper around libunwind for stack tracing.
However that has a deficiency that it assumes that, process wide, only
one thread can be inside libunwind at a time[1]

It appears that this is left over from some very old versions of
libunwind, or was already unnecessarily conservative. libunwind is meant
to be thread safe, and we have tests that will trigger if it is not.

This just extracts the function body of the glog function we were using
and does the same work manually.

Without this fix, the "collect from all the threads at the same time"
code path resulted in most of the threads collecting an empty trace
since they tried to call libunwind at the same time.

[1] https://github.com/google/glog/issues/298
Change-Id: I3a53e55d7c4e7ee50bcac5b1e81267df56383634
---
M build-support/iwyu/iwyu.sh
A build-support/iwyu/mappings/libunwind.imp
M src/kudu/util/debug-util-test.cc
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
5 files changed, 75 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/9319/3
--
To view, visit http://gerrit.cloudera.org:8080/9319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a53e55d7c4e7ee50bcac5b1e81267df56383634
Gerrit-Change-Number: 9319
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [CMakeLists] un-break build with gcc 4.9.2

2018-02-14 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9333 )

Change subject: [CMakeLists] un-break build with gcc 4.9.2
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie02ae6229023101cca9f3a39292a648d20349a88
Gerrit-Change-Number: 9333
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 15 Feb 2018 05:33:40 +
Gerrit-HasComments: No


[kudu-CR] [CMakeLists] un-break build with gcc 4.9.2

2018-02-14 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9333 )

Change subject: [CMakeLists] un-break build with gcc 4.9.2
..


Patch Set 1:

I ran a test build on SLES12 with gcc 4.8.5 (Impala has 4.9.2) and everything 
worked fine. I am still curios why the check doesn't work in that environment.


--
To view, visit http://gerrit.cloudera.org:8080/9333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie02ae6229023101cca9f3a39292a648d20349a88
Gerrit-Change-Number: 9333
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 15 Feb 2018 05:33:09 +
Gerrit-HasComments: No


[kudu-CR] [CMakeLists] un-break build with gcc 4.9.2

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9333


Change subject: [CMakeLists] un-break build with gcc 4.9.2
..

[CMakeLists] un-break build with gcc 4.9.2

This is a quick fix to un-break build of Kudu as a part of Impala's
build with gcc 4.9.2.

For some reason, the compiler is happy with the -fsized-deallocation
option while compiling the small test, but then the compilation fails
with errors like below:

  g++: error: unrecognized command line option ‘-fsized-deallocation’
  g++: error: unrecognized command line option ‘-fno-sized-deallocation’

Change-Id: Ie02ae6229023101cca9f3a39292a648d20349a88
---
M CMakeLists.txt
1 file changed, 29 insertions(+), 12 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/9333/1
--
To view, visit http://gerrit.cloudera.org:8080/9333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie02ae6229023101cca9f3a39292a648d20349a88
Gerrit-Change-Number: 9333
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [CMakeLists] un-break build with gcc 4.9.2

2018-02-14 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9333 )

Change subject: [CMakeLists] un-break build with gcc 4.9.2
..


Patch Set 1:

(1 comment)

This change makes sense. I am testing on an SLES instance to see if I can 
understand why the check didn't work quickly too.

http://gerrit.cloudera.org:8080/#/c/9333/1/CMakeLists.txt
File CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9333/1/CMakeLists.txt@1102
PS1, Line 1102: set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} 
-fsized-deallocation")
Can this and line  be moved into  if(COMPILER_SUPPORTS_SIZED_DEALLOCATION) 
below?



--
To view, visit http://gerrit.cloudera.org:8080/9333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie02ae6229023101cca9f3a39292a648d20349a88
Gerrit-Change-Number: 9333
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 15 Feb 2018 03:47:51 +
Gerrit-HasComments: Yes


[kudu-CR] [CMakeLists] un-break build with gcc 4.9.2

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9333 )

Change subject: [CMakeLists] un-break build with gcc 4.9.2
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9333/1/CMakeLists.txt
File CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9333/1/CMakeLists.txt@1102
PS1, Line 1102:   endif()
> Can this and line  be moved into  if(COMPILER_SUPPORTS_SIZED_DEALLOCATI
Done



--
To view, visit http://gerrit.cloudera.org:8080/9333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie02ae6229023101cca9f3a39292a648d20349a88
Gerrit-Change-Number: 9333
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 15 Feb 2018 05:24:38 +
Gerrit-HasComments: Yes


[kudu-CR] [CMakeLists] un-break build with gcc 4.9.2

2018-02-14 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins, Grant Henke, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9333

to look at the new patch set (#2).

Change subject: [CMakeLists] un-break build with gcc 4.9.2
..

[CMakeLists] un-break build with gcc 4.9.2

This is a quick fix to un-break build of Kudu as a part of Impala's
build with gcc 4.9.2.

For some reason, the compiler is happy with the -fsized-deallocation
option while compiling the small test, but then the compilation fails
with errors like below:

  g++: error: unrecognized command line option ‘-fsized-deallocation’
  g++: error: unrecognized command line option ‘-fno-sized-deallocation’

Change-Id: Ie02ae6229023101cca9f3a39292a648d20349a88
---
M CMakeLists.txt
1 file changed, 27 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/9333/2
--
To view, visit http://gerrit.cloudera.org:8080/9333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie02ae6229023101cca9f3a39292a648d20349a88
Gerrit-Change-Number: 9333
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2297 (part 4): periodically dump stacks to diagnostics log

2018-02-14 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9330

to look at the new patch set (#3).

Change subject: KUDU-2297 (part 4): periodically dump stacks to diagnostics log
..

KUDU-2297 (part 4): periodically dump stacks to diagnostics log

This modifies the diagnostics log to periodically dump stack traces.
This is slightly complicated by the fact that symbolized stack traces
can be relatively large. So, we separate the logging of symbols and
stack traces. When an address first appears in a log file, it is logged
as part of a symbol line. Later logs of the same address do not need
to re-log the symbol.

With this, a typical stack trace dump of an idle tserver is about 4KB
pre-compression, and a 'symbols' dump is about 6KB. So logging stacks
reasonably often should not use much disk space or IO.

Currently this is enabled on the same interval as the metrics log, but
only if a new experimental flag --diagnostics-log-stack-traces is
enabled. I'm planning to move it to a different flag in a later commit,
but wanted to keep this one simple and incremental.

Change-Id: Ic32abf2c48ac6a5f3c384e58838b34671bcaf147
---
M src/kudu/server/diagnostics_log.cc
M src/kudu/server/diagnostics_log.h
M src/kudu/util/debug-util.cc
M src/kudu/util/debug-util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/rolling_log.h
6 files changed, 195 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/9330/3
--
To view, visit http://gerrit.cloudera.org:8080/9330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic32abf2c48ac6a5f3c384e58838b34671bcaf147
Gerrit-Change-Number: 9330
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley