[kudu-CR] docs: update docs for update dirs tool

2018-01-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9110 )

Change subject: docs: update docs for update_dirs tool
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9110/1/docs/administration.adoc
File docs/administration.adoc:

http://gerrit.cloudera.org:8080/#/c/9110/1/docs/administration.adoc@714
PS1, Line 714: NOTE: If removing a data directory, all tablets configured to 
use that
 : directory will fail upon startup and be replicated elsewhere.
Should probably mention --force and the check that happens before it is called.



--
To view, visit http://gerrit.cloudera.org:8080/9110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c139326aee35f72495a1cb1aaf8df1d58776cb
Gerrit-Change-Number: 9110
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 23 Jan 2018 23:18:03 +
Gerrit-HasComments: Yes


[kudu-CR] Upgrade to tcmalloc 2.6.3 and enabled sized-deallocation

2018-01-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9108 )

Change subject: Upgrade to tcmalloc 2.6.3 and enabled sized-deallocation
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9108/1/CMakeLists.txt
File CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9108/1/CMakeLists.txt@610
PS1, Line 610: Exported variants should conform to the C++11 ABI
Technically they conform to the pre-C++11 ABI, no? Well, at least all of the 
exported ABI symbols must adhere to the C++03 symbols; not sure about "hidden" 
symbols.



--
To view, visit http://gerrit.cloudera.org:8080/9108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b76c9d71b29d58f0b6dade3fc33b32d0c3de23b
Gerrit-Change-Number: 9108
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 23 Jan 2018 23:16:26 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-21 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. The DataDirManager and FsManager's update_on_disk boolean option has been
   converted into a tri-state. The new IGNORE_INCONSISTENCY state is used by
   the CLI tool to create a "speculative" FsManager using the new data dir
   configuration. This FsManager is used to check whether existing tablets
   depend on the data dir to be removed without making destructive changes
   to on-disk data structures.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Reviewed-on: http://gerrit.cloudera.org:8080/8978
Reviewed-by: Andrew Wong 
Tested-by: Kudu Jenkins
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
7 files changed, 441 insertions(+), 92 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 11
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-21 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8978

to look at the new patch set (#10).

Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. The DataDirManager and FsManager's update_on_disk boolean option has been
   converted into a tri-state. The new IGNORE_INCONSISTENCY state is used by
   the CLI tool to create a "speculative" FsManager using the new data dir
   configuration. This FsManager is used to check whether existing tablets
   depend on the data dir to be removed without making destructive changes
   to on-disk data structures.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
7 files changed, 441 insertions(+), 92 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/10
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 10
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-21 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8978/8/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8978/8/src/kudu/fs/data_dirs.cc@699
PS8, Line 699:
 :
 :   // All instances are present and accounted for. Time to create 
the in-memory
 :   // data directory structures.
> Sorry I wasn't explicit, I meant moving this outside of the if block and re
Oh, I understand what you mean now. Yes, I agree this makes more sense. Done.



--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 10
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 22 Jan 2018 03:44:50 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-21 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8978/8/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8978/8/src/kudu/fs/data_dirs.cc@699
PS8, Line 699: RETURN_NOT_OK_PREPEND(
 : 
PathInstanceMetadataFile::CheckIntegrity(loaded_instances),
 : Substitute("could not verify integrity of files after 
data directory update: $0",
 :JoinStrings(GetDataDirs(), ",")));
> nit: could merge this CheckIntegrity with the above with:
Yeah the added encapsulation isn't a bad thing, but bear in mind that we'll 
only end up down here if we're in UPDATE_ON_DISK mode (L659), so the 
consistency_check condition is guaranteed to be true.


http://gerrit.cloudera.org:8080/#/c/8978/8/src/kudu/fs/data_dirs.cc@812
PS8, Line 812: IGNORE_CONSISTENCY
> nit: IGNORE_INCONSISTENCY
Done



--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sun, 21 Jan 2018 20:14:51 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-21 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8978

to look at the new patch set (#9).

Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. The DataDirManager and FsManager's update_on_disk boolean option has been
   converted into a tri-state. The new IGNORE_INCONSISTENCY state is used by
   the CLI tool to create a "speculative" FsManager using the new data dir
   configuration. This FsManager is used to check whether existing tablets
   depend on the data dir to be removed without making destructive changes
   to on-disk data structures.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
7 files changed, 443 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/9
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 9
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] tablet: squelch unused variable warning

2018-01-21 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9087 )

Change subject: tablet: squelch unused variable warning
..


Patch Set 1: Verified+1

One alter_table-randomized-test timed out, certainly not related.


--
To view, visit http://gerrit.cloudera.org:8080/9087
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8302eb0902e005d1d193eea1bac87c83dfb27d22
Gerrit-Change-Number: 9087
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sun, 21 Jan 2018 20:12:47 +
Gerrit-HasComments: No


[kudu-CR] tablet: squelch unused variable warning

2018-01-21 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9087 )

Change subject: tablet: squelch unused variable warning
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/9087
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I8302eb0902e005d1d193eea1bac87c83dfb27d22
Gerrit-Change-Number: 9087
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-21 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8376 )

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..

fs: defer failure from metadata load to bootstrap when data dir is missing

An upcoming patch adds a CLI tool action to remove data directories. When a
data dir is removed, all tablets with data on it will fail. Today that
failure manifests as a FindOrDie in DataDirGroup::FromPB; we need to make
that a little bit more graceful.

This patch modifies the DataDirGroup FromPB/CopyToPB methods to return a
failure when a data dir is missing. It further changes TabletMetadata to
treat such failures as non-fatal, and adds checks to TabletBootstrap so that
the failures manifest there instead.

No tests in this patch because:
1. Andrew has already merged tablet-level tests for failed disks, and
2. The CLI tool patch adds coverage at the itest-level.

Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Reviewed-on: http://gerrit.cloudera.org:8080/8376
Reviewed-by: Todd Lipcon 
Reviewed-by: Andrew Wong 
Tested-by: Kudu Jenkins
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/ts_tablet_manager.cc
10 files changed, 136 insertions(+), 69 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Andrew Wong: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 10
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2265 CA-signed server certs for non-leader masters

2018-01-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9076 )

Change subject: KUDU-2265 CA-signed server certs for non-leader masters
..


Patch Set 6: Code-Review+1

Will defer to Dan/Todd since this is an authn-related change.


--
To view, visit http://gerrit.cloudera.org:8080/9076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3539d58d10ed319ad1d8686c1259c92822fb710
Gerrit-Change-Number: 9076
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 20 Jan 2018 01:01:21 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-19 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8978

to look at the new patch set (#8).

Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. The DataDirManager and FsManager's update_on_disk boolean option has been
   converted into a tri-state. The new IGNORE_INCONSISTENCY state is used by
   the CLI tool to create a "speculative" FsManager using the new data dir
   configuration. This FsManager is used to check whether existing tablets
   depend on the data dir to be removed without making destructive changes
   to on-disk data structures.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
7 files changed, 443 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/8
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] tablet: squelch unused variable warning

2018-01-19 Thread Adar Dembo (Code Review)
Hello Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9087

to review the following change.


Change subject: tablet: squelch unused variable warning
..

tablet: squelch unused variable warning

  ../../src/kudu/tablet/tablet.cc: In member function 
‘std::shared_ptr 
kudu::tablet::Tablet::FindBestDMSToFlush(const ReplaySizeMap&) const’:
  ../../src/kudu/tablet/tablet.cc:1893:11: warning: variable ‘retention_size’ 
set but not used [-Wunused-but-set-variable]
 int64_t retention_size = 0;
 ^

Change-Id: I8302eb0902e005d1d193eea1bac87c83dfb27d22
---
M src/kudu/tablet/tablet.cc
1 file changed, 0 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/9087/1
--
To view, visit http://gerrit.cloudera.org:8080/9087
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8302eb0902e005d1d193eea1bac87c83dfb27d22
Gerrit-Change-Number: 9087
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-19 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8376

to look at the new patch set (#9).

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..

fs: defer failure from metadata load to bootstrap when data dir is missing

An upcoming patch adds a CLI tool action to remove data directories. When a
data dir is removed, all tablets with data on it will fail. Today that
failure manifests as a FindOrDie in DataDirGroup::FromPB; we need to make
that a little bit more graceful.

This patch modifies the DataDirGroup FromPB/CopyToPB methods to return a
failure when a data dir is missing. It further changes TabletMetadata to
treat such failures as non-fatal, and adds checks to TabletBootstrap so that
the failures manifest there instead.

No tests in this patch because:
1. Andrew has already merged tablet-level tests for failed disks, and
2. The CLI tool patch adds coverage at the itest-level.

Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/ts_tablet_manager.cc
10 files changed, 136 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/8376/9
--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 9
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2265 CA-signed server certs for non-leader masters

2018-01-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9076 )

Change subject: KUDU-2265 CA-signed server certs for non-leader masters
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9076/5/src/kudu/integration-tests/security-master-certificates-itest.cc
File src/kudu/integration-tests/security-master-certificates-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9076/5/src/kudu/integration-tests/security-master-certificates-itest.cc@57
PS5, Line 57: opts.master_rpc_ports = { 55010, 55011, 55012, 55013, 55014, 
};
These are ephemeral ports which means they may be in use while the test runs. 
Don't we use 11010-11012 for our other multi-master tests?



--
To view, visit http://gerrit.cloudera.org:8080/9076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3539d58d10ed319ad1d8686c1259c92822fb710
Gerrit-Change-Number: 9076
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 20 Jan 2018 00:26:51 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8978/7/src/kudu/fs/fs_manager.h
File src/kudu/fs/fs_manager.h:

http://gerrit.cloudera.org:8080/#/c/8978/7/src/kudu/fs/fs_manager.h@104
PS7, Line 104:   // If 'update_on_disk' and 'read_only' are both true, the 
FsManager allows
 :   // 'data_roots' to deviate from their on-disk data structures 
without
 :   // updating the latter to match the former.
> I'm not really a fan of these semantics -- it seems somewhat confusing to b
I had been thinking of just renaming update_on_disk to something like 
ignore_inconsistencies (and doc that it may update the on-disk structures), but 
I like your suggestion more. Done.


http://gerrit.cloudera.org:8080/#/c/8978/7/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

http://gerrit.cloudera.org:8080/#/c/8978/7/src/kudu/fs/fs_manager.cc@347
PS7, Line 347:   if (!missing_roots.empty() && !opts_.read_only) {
> missing_roots will only be non-empty if opts.update_on_disk is true (as per
Yeah, I had added a DCHECK and comment to that effect to the equivalent change 
in data_dirs.cc, but not here:

  // Add new or remove existing data directories, if desired.
  if (!opts_.read_only &&
  (!missing_roots.empty() || has_missing_instance)) {
DCHECK(opts_.update_on_disk); // guaranteed by LoadInstances and
  // the ternary in CheckIntegrity above

Anyway, we don't need the read_only check any more because read_only + 
update_on_disk is forbidden once again. I'll update this.


http://gerrit.cloudera.org:8080/#/c/8978/7/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8978/7/src/kudu/tools/kudu-tool-test.cc@2251
PS7, Line 2251:   // second table should have all failed.
> can we add one more step in this test which re-adds the force-removed direc
It should work, but the current code fails in the integrity checks when 
re-adding a previously removed data directory.

So I modified DataDirManager::Open to elide all integrity checks when updating. 
This also allowed me to get rid of the CheckIntegrity changes which were pretty 
ugly to begin with. I also beefed up this test and added a couple more 
fs_manager-test cases.

Note that in real life, although the tserver can unfail these tablets, the 
master may kick off rereplication for them anyway.


http://gerrit.cloudera.org:8080/#/c/8978/7/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8978/7/src/kudu/tools/tool_action_fs.cc@330
PS7, Line 330:   // If the user specifies --force, we assume they know what 
they're doing and
 :   // skip this check.
> would it be possible to do the check in a new function, and something like:
Done



--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 20 Jan 2018 00:12:41 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2265 CA-signed server certs for non-leader masters

2018-01-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9076 )

Change subject: KUDU-2265 CA-signed server certs for non-leader masters
..


Patch Set 4:

> I'm not a huge fan

Sorry, I thought I moved this partial comment to catalog_manager.cc but 
evidently I did not. Please ignore.


--
To view, visit http://gerrit.cloudera.org:8080/9076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3539d58d10ed319ad1d8686c1259c92822fb710
Gerrit-Change-Number: 9076
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 19 Jan 2018 22:53:56 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2265 CA-signed server certs for non-leader masters

2018-01-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9076 )

Change subject: KUDU-2265 CA-signed server certs for non-leader masters
..


Patch Set 4:

(6 comments)

I'm not a huge fan

http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/integration-tests/CMakeLists.txt@98
PS4, Line 98: ADD_KUDU_TEST(security-master-certificates-itest RESOURCE_LOCK 
"master-rpc-ports")
Nit: retain alphabetical sorting.


http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/integration-tests/security-master-certificates-itest.cc
File src/kudu/integration-tests/security-master-certificates-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/integration-tests/security-master-certificates-itest.cc@49
PS4, Line 49: class SecurityMasterCertsTest : public KuduTest {
Is there a fixture you could inherit from that would set up the cluster for you?


http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/integration-tests/security-master-certificates-itest.cc@51
PS4, Line 51:   SecurityMasterCertsTest() {
:   }
Can omit?


http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/master/catalog_manager.h@668
PS4, Line 668:   bool NeedToPrepareFollower();
Doc?


http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/master/catalog_manager.cc@539
PS4, Line 539:   } else if (catalog_manager_->NeedToPrepareFollower() && 
l.owns_lock()) {
I'm not a huge fan of this poll-based approach to figuring out whether we need 
to read the CA cert from the master tablet; I would prefer something where the 
leader "publishes" the CA cert to the master tablet and the followers 
"subscribe" to it and receive a notification.

That said, what I prefer is far more work and would be a large-scale 
rearchitecture of the master, and since the poll is cheap, it seems fine for 
this use case.


http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/mini-cluster/internal_mini_cluster.h
File src/kudu/mini-cluster/internal_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/9076/4/src/kudu/mini-cluster/internal_mini_cluster.h@80
PS4, Line 80:   // Whether to wait while catalog manager is started and 
properly initialized
Nit: specify the default value here.



--
To view, visit http://gerrit.cloudera.org:8080/9076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3539d58d10ed319ad1d8686c1259c92822fb710
Gerrit-Change-Number: 9076
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 19 Jan 2018 22:53:05 +
Gerrit-HasComments: Yes


[kudu-CR] WIP [master] no half-baked responses on ConnectoToMaster

2018-01-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9052 )

Change subject: WIP [master] no half-baked responses on ConnectoToMaster
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9052/2/src/kudu/master/master_service.cc
File src/kudu/master/master_service.cc:

http://gerrit.cloudera.org:8080/#/c/9052/2/src/kudu/master/master_service.cc@463
PS2, Line 463: // Do not send half-baked responses from leader master when 
its catalog
 : // manager is not in proper state yet.
> I'm not sure this is by design.  My point is that in case if a leader is no
So it seems like we thought we had a world with the two states { FOLLOWER, 
LEADER } but we actually have a world with the three states { FOLLOWER, 
FOLLOWER_TRANSITIONING_TO_LEADER, LEADER }. This patch identifies the new 
state, but what I don't like about it is that it returns an error while in that 
state when no error is returned while in the other two states.

Instead of doing this, what if we treated role == LEADER && !leader_status.ok() 
the same as role == FOLLOWER? That is, do resp->set_role(FOLLOWER), do 
resp->add_master_addrs() as before, and skip the KUDU-1924 stuff below? It's 
not "wrong" per se; it just means this master will be considered a FOLLOWER by 
the client during the (short) transition to LEADER. Clients should already 
handle the scenario when all masters are FOLLOWER because they're doing a 
leader election; this just prolongs that state slightly.


http://gerrit.cloudera.org:8080/#/c/9052/2/src/kudu/master/master_service.cc@488
PS2, Line 488: // TODO(KUDU-1924): it seems there is some window when 
'role' is LEADER but
> That's about some different case, actually -- when leader_status.ok() but n
Hmm, but the message is consistent with the bug you're addressing ("a short 
window wherein the master is LEADER but isn't done initializing").



--
To view, visit http://gerrit.cloudera.org:8080/9052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9c906863f5f0e1995041281b122135e1b2cd3a4
Gerrit-Change-Number: 9052
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 19 Jan 2018 22:32:50 +
Gerrit-HasComments: Yes


[kudu-CR] docs: update docs for metadata dir

2018-01-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9068 )

Change subject: docs: update docs for metadata dir
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I333d32080106cd83b92ad2060f3239b9c44d201b
Gerrit-Change-Number: 9068
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Fri, 19 Jan 2018 18:51:39 +
Gerrit-HasComments: No


[kudu-CR] docs: update docs for metadata dir

2018-01-18 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9068 )

Change subject: docs: update docs for metadata dir
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/9068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I333d32080106cd83b92ad2060f3239b9c44d201b
Gerrit-Change-Number: 9068
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Fri, 19 Jan 2018 00:20:58 +
Gerrit-HasComments: No


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-18 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8376 )

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8376/8/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8376/8/src/kudu/fs/data_dirs.cc@327
PS8, Line 327: FindOrDie(uuid_by_uuid_idx, uuid_idx)
> is this not just 'uuid' above?
Yeah, not sure what happened here.


http://gerrit.cloudera.org:8080/#/c/8376/8/src/kudu/fs/data_dirs.cc@330
PS8, Line 330:   pb->Swap();
> I think now that we are on newish protobuf we could do '*pb = std::move(gro
Done


http://gerrit.cloudera.org:8080/#/c/8376/8/src/kudu/tablet/tablet_bootstrap.cc
File src/kudu/tablet/tablet_bootstrap.cc:

http://gerrit.cloudera.org:8080/#/c/8376/8/src/kudu/tablet/tablet_bootstrap.cc@557
PS8, Line 557:   // Ensure the tablet's data dirs are present and healthy 
before it is opened.
> not 100% sure I follow why you defer it all the way until bootstrap time vs
The old location was in TSTabletManager::OpenTablet, which is run out of the 
bootstrap threadpool too. If you're suggesting I do it  in 
TabletMetadata::LoadFromSuperBlock (called synchronously for each tablet in 
TSTabletManager::Init), it'd require more plumbing.

Why did I move it at all? My preference is for TSTabletManager to hold 
tserver-specific behavior only, and while the "disk failure handling" logic 
isn't particularly useful for the master, it could be in the future, so 
locating it in TabletBootstrap is an easy way to ensure it makes it into every 
place that uses a tablet. Tablet::Open would be equally reasonable, I think.


http://gerrit.cloudera.org:8080/#/c/8376/8/src/kudu/tablet/tablet_bootstrap.cc@561
PS8, Line 561: error retrieving tablet data dir group
> will this be easy enough for an operator to understand? perhaps we should s
Done


http://gerrit.cloudera.org:8080/#/c/8376/8/src/kudu/tablet/tablet_bootstrap.cc@563
PS8, Line 563: return Status::IOError("tablet data is in a failed 
directory");
> perhaps more accurate to say that _some_ tablet data is in a failed directo
I'll clarify with 'some', but if you follow the return chain up the stack 
you'll see that TSTabletManager::OpenTablet prefixes the message with the 
tablet ID before it logs it.



--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 18 Jan 2018 22:57:43 +
Gerrit-HasComments: Yes


[kudu-CR] WIP [master] no half-baked responses on ConnectoToMaster

2018-01-18 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9052 )

Change subject: WIP [master] no half-baked responses on ConnectoToMaster
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9052/2/src/kudu/master/master_service.cc
File src/kudu/master/master_service.cc:

http://gerrit.cloudera.org:8080/#/c/9052/2/src/kudu/master/master_service.cc@463
PS2, Line 463: // Do not send half-baked responses from leader master when 
its catalog
 : // manager is not in proper state yet.
Are you sure that's not by design? Todd added ConnectToMaster and I think the 
intent was for it to always "succeed" and supply enough information to explain 
the state the master was in.


http://gerrit.cloudera.org:8080/#/c/9052/2/src/kudu/master/master_service.cc@488
PS2, Line 488: // TODO(KUDU-1924): it seems there is some window when 
'role' is LEADER but
This comment seems relevant.



--
To view, visit http://gerrit.cloudera.org:8080/9052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9c906863f5f0e1995041281b122135e1b2cd3a4
Gerrit-Change-Number: 9052
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 18 Jan 2018 21:24:11 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1489: allow configuration of metadata dir

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9027 )

Change subject: KUDU-1489: allow configuration of metadata dir
..


Patch Set 12: Code-Review+2

Since this is a pretty significant change, maybe get another reviewer too?


--
To view, visit http://gerrit.cloudera.org:8080/9027
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I375c6b2eb283db5fa9c956135d98252c8781f5db
Gerrit-Change-Number: 9027
Gerrit-PatchSet: 12
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 18 Jan 2018 03:36:24 +
Gerrit-HasComments: No


[kudu-CR] WIP: KUDU-2264. Automatically attempt to re-acquire Kerberos credentials before expiration

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9050 )

Change subject: WIP: KUDU-2264. Automatically attempt to re-acquire Kerberos 
credentials before expiration
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9050/1/src/kudu/tools/tool_action_test.cc
File src/kudu/tools/tool_action_test.cc:

http://gerrit.cloudera.org:8080/#/c/9050/1/src/kudu/tools/tool_action_test.cc@283
PS1, Line 283: case ControlShellRequestPB::kKinit:
Just passing through, but could you add a test case for this in kudu-tool-test? 
There's a control shell test you can piggy-back on.



--
To view, visit http://gerrit.cloudera.org:8080/9050
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I514253e0a7f067dbc8ffe4eaf5a7a2c32900b539
Gerrit-Change-Number: 9050
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Thu, 18 Jan 2018 02:38:08 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9041 )

Change subject: KUDU-2148: do not crash on GetStatus during server startup
..

KUDU-2148: do not crash on GetStatus during server startup

The fix is straight-forward but there are backwards compatibility
implications for old clients who trigger the race: because they're not aware
of the new 'error' field in the protobuf, they'll just see the incomplete
'status' field. To be more specific, 'node_instance' will be set,
'version_info' won't be, and 'bound_{rpc,http}_addresses' may or may not be
set depending on exactly when the race was hit.

An alternative would be to fail the RPC itself in lieu of adding the 'error'
field. That would ensure that old clients register a failure, though it
comes at the expense of not providing rich error information for new
clients, which is why I went with the above solution instead.

Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Reviewed-on: http://gerrit.cloudera.org:8080/9041
Reviewed-by: Dan Burkert 
Tested-by: Kudu Jenkins
---
M src/kudu/server/generic_service.cc
M src/kudu/server/rpc_server.cc
M src/kudu/server/server_base.cc
M src/kudu/server/server_base.h
M src/kudu/server/server_base.proto
M src/kudu/server/webserver.cc
M src/kudu/tserver/tablet_server-test.cc
7 files changed, 90 insertions(+), 17 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] rpc: micro-optimize delayed task handling

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9048 )

Change subject: rpc: micro-optimize delayed task handling
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9048/1/src/kudu/rpc/periodic-test.cc
File src/kudu/rpc/periodic-test.cc:

http://gerrit.cloudera.org:8080/#/c/9048/1/src/kudu/rpc/periodic-test.cc@274
PS1, Line 274: [&] {
 :   // No-op.
 : },
Nit: combine into one line:

  [&] {}, // no-op


http://gerrit.cloudera.org:8080/#/c/9048/1/src/kudu/rpc/periodic-test.cc@278
PS1, Line 278: PeriodicTimer::Options()));
You can omit this.



--
To view, visit http://gerrit.cloudera.org:8080/9048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b6be5ef7e8f464f3bc4c62f904e2692b30ddc65
Gerrit-Change-Number: 9048
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 18 Jan 2018 00:34:40 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-17 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9041

to look at the new patch set (#5).

Change subject: KUDU-2148: do not crash on GetStatus during server startup
..

KUDU-2148: do not crash on GetStatus during server startup

The fix is straight-forward but there are backwards compatibility
implications for old clients who trigger the race: because they're not aware
of the new 'error' field in the protobuf, they'll just see the incomplete
'status' field. To be more specific, 'node_instance' will be set,
'version_info' won't be, and 'bound_{rpc,http}_addresses' may or may not be
set depending on exactly when the race was hit.

An alternative would be to fail the RPC itself in lieu of adding the 'error'
field. That would ensure that old clients register a failure, though it
comes at the expense of not providing rich error information for new
clients, which is why I went with the above solution instead.

Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
---
M src/kudu/server/generic_service.cc
M src/kudu/server/rpc_server.cc
M src/kudu/server/server_base.cc
M src/kudu/server/server_base.h
M src/kudu/server/server_base.proto
M src/kudu/server/webserver.cc
M src/kudu/tserver/tablet_server-test.cc
7 files changed, 90 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/9041/5
--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] rpc: avoid an extra copy of shared ptr for OutboundCall

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9047 )

Change subject: rpc: avoid an extra copy of shared_ptr for OutboundCall
..


Patch Set 1: Code-Review+1

Looks good to me, but I think Michael is more familiar with this code.


--
To view, visit http://gerrit.cloudera.org:8080/9047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89356bd9c59e612e64ffc4e1ad4a6bfda44512aa
Gerrit-Change-Number: 9047
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 18 Jan 2018 00:30:48 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9041 )

Change subject: KUDU-2148: do not crash on GetStatus during server startup
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9041/4/src/kudu/server/rpc_server.cc
File src/kudu/server/rpc_server.cc:

http://gerrit.cloudera.org:8080/#/c/9041/4/src/kudu/server/rpc_server.cc@213
PS4, Line 213: return Status::IllegalState(Substitute("bad state: $0", 
server_state_));
> I think this might be better as ServiceUnavailable, since it's expected to
I was trying to be consistent with Webserver::GetBoundAddresses, which returns 
IllegalState. But I suppose I can update both.



--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Thu, 18 Jan 2018 00:28:04 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-17 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9041

to look at the new patch set (#4).

Change subject: KUDU-2148: do not crash on GetStatus during server startup
..

KUDU-2148: do not crash on GetStatus during server startup

The fix is straight-forward but there are backwards compatibility
implications for old clients who trigger the race: because they're not aware
of the new 'error' field in the protobuf, they'll just see the incomplete
'status' field. To be more specific, 'node_instance' will be set,
'version_info' won't be, and 'bound_{rpc,http}_addresses' may or may not be
set depending on exactly when the race was hit.

An alternative would be to fail the RPC itself in lieu of adding the 'error'
field. That would ensure that old clients register a failure, though it
comes at the expense of not providing rich error information for new
clients, which is why I went with the above solution instead.

Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
---
M src/kudu/server/generic_service.cc
M src/kudu/server/rpc_server.cc
M src/kudu/server/server_base.cc
M src/kudu/server/server_base.h
M src/kudu/server/server_base.proto
M src/kudu/tserver/tablet_server-test.cc
6 files changed, 88 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/9041/4
--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9041 )

Change subject: KUDU-2148: do not crash on GetStatus during server startup
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9041/2/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/9041/2/src/kudu/tserver/tablet_server-test.cc@215
PS2, Line 215:  (s.ok()) {
> Could you then at least add resp.Clear() to pair controller.Reset() ?
Done



--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 17 Jan 2018 23:14:47 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-17 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9041

to look at the new patch set (#3).

Change subject: KUDU-2148: do not crash on GetStatus during server startup
..

KUDU-2148: do not crash on GetStatus during server startup

The fix is straight-forward but there are backwards compatibility
implications for old clients who trigger the race: because they're not aware
of the new 'error' field in the protobuf, they'll just see the incomplete
'status' field. To be more specific, 'node_instance' will be set,
'version_info' won't be, and 'bound_{rpc,http}_addresses' may or may not be
set depending on exactly when the race was hit.

An alternative would be to fail the RPC itself in lieu of adding the 'error'
field. That would ensure that old clients register a failure, though it
comes at the expense of not providing rich error information for new
clients, which is why I went with the above solution instead.

Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
---
M src/kudu/server/generic_service.cc
M src/kudu/server/rpc_server.cc
M src/kudu/server/server_base.cc
M src/kudu/server/server_base.h
M src/kudu/server/server_base.proto
M src/kudu/tserver/tablet_server-test.cc
6 files changed, 87 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/9041/3
--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9041 )

Change subject: KUDU-2148: do not crash on GetStatus during server startup
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9041/1/src/kudu/server/server_base.cc
File src/kudu/server/server_base.cc:

http://gerrit.cloudera.org:8080/#/c/9041/1/src/kudu/server/server_base.cc@522
PS1, Line 522: rpc
> nit: RPC
Done


http://gerrit.cloudera.org:8080/#/c/9041/1/src/kudu/server/server_base.cc@525
PS1, Line 525: rpc
> nit: RPC
Done


http://gerrit.cloudera.org:8080/#/c/9041/2/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/9041/2/src/kudu/tserver/tablet_server-test.cc@211
PS2, Line 211: controller.Reset();
> nit: maybe, just move RpcController under the scope of the 'while() {}' loo
I appreciate the suggestion, but stylistically I prefer the "declare req, resp, 
and controller together" idiom we commonly use.


http://gerrit.cloudera.org:8080/#/c/9041/2/src/kudu/tserver/tablet_server-test.cc@215
PS2, Line 215: CHECK(resp.has_status());
> nit: since there are many iterations of calling the GetStatus() method, may
See above.


http://gerrit.cloudera.org:8080/#/c/9041/2/src/kudu/tserver/tablet_server-test.cc@230
PS2, Line 230: ASSERT_OK(mini_server_->Restart());
> nit: consider adding some scope-related mechanics to join the thread if thi
Done



--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 17 Jan 2018 22:37:16 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-17 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8978

to look at the new patch set (#7).

Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. PathInstanceMetadataFile::CheckIntegrity can now be run without checking
   for missing instance files. This is used when removing a data dir (its
   instance file will be missing).
2. The DataDirManager and FsManager may now be constructed with both
   read_only=true and update_on_disk=true. The CLI tool leverages this
   functionality to create a "speculative" FsManager using the new data dir
   configuration that is used to check whether existing tablets depend on
   the data dir to be removed.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
10 files changed, 344 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/7
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-17 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8376

to look at the new patch set (#8).

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..

fs: defer failure from metadata load to bootstrap when data dir is missing

An upcoming patch adds a CLI tool action to remove data directories. When a
data dir is removed, all tablets with data on it will fail. Today that
failure manifests as a FindOrDie in DataDirGroup::FromPB; we need to make
that a little bit more graceful.

This patch modifies the DataDirGroup FromPB/CopyToPB methods to return a
failure when a data dir is missing. It further changes TabletMetadata to
treat such failures as non-fatal, and adds checks to TabletBootstrap so that
the failures manifest there instead.

No tests in this patch because:
1. Andrew has already merged tablet-level tests for failed disks, and
2. The CLI tool patch adds coverage at the itest-level.

Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/ts_tablet_manager.cc
10 files changed, 136 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/8376/8
--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8376 )

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8376/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8376/6//COMMIT_MSG@9
PS6, Line 9: remove data di
> nit: just remove at this point
Done


http://gerrit.cloudera.org:8080/#/c/8376/6//COMMIT_MSG@20
PS6, Line 20: has already merged
> nit: has already worked
Done



--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 17 Jan 2018 22:32:09 +
Gerrit-HasComments: Yes


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-17 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8376

to look at the new patch set (#7).

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..

fs: defer failure from metadata load to bootstrap when data dir is missing

An upcoming patch adds CLI tool actions to remove data directories. When a
data dir is removed, all tablets with data on it will fail. Today that
failure manifests as a FindOrDie in DataDirGroup::FromPB; we need to make
that a little bit more graceful.

This patch modifies the DataDirGroup FromPB/CopyToPB methods to return a
failure when a data dir is missing. It further changes TabletMetadata to
treat such failures as non-fatal, and adds checks to TabletBootstrap so that
the failures manifest there instead.

No tests in this patch because:
1. Andrew has already merged tablet-level tests for failed disks, and
2. The CLI tool patch adds coverage at the itest-level.

Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/ts_tablet_manager.cc
10 files changed, 136 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/8376/7
--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1913: LIFO wake up ordering for threadpool worker threads

2018-01-17 Thread Adar Dembo (Code Review)
Hello Dan Burkert, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9021

to look at the new patch set (#2).

Change subject: KUDU-1913: LIFO wake up ordering for threadpool worker threads
..

KUDU-1913: LIFO wake up ordering for threadpool worker threads

This patch changes the wake-up ordering for idle pool worker threads
to be LIFO. Previously all worker threads idled on a single
ConditionVariable which, by virtue of using pthread_cond_t under the hood,
was FIFO ordered.

In the abstract, FIFO ordering ensures a fair distribution of work amongst a
set of threads, but that's totally undesirable in a thread pool where the
goal is to get work done as quickly and with as few threads as possible. For
example, a fast stream of cheap tasks (e.g. RPCs) should be serviceable via
a single thread or something close to that.

The new unit test was looped 1000 times in TSAN mode with 8 stress threads
to shake out any flakes. Additionally, I ran through a variation of Todd's
test from a past code review [1]. I spun up three tservers, two of which had
failure detection disabled so that the third would get all of the leader
replicas. I created 240 tablets and looked at the size of the Raft thread
pool on that tserver. The steady state thread count was mildly lower at
first (4 vs. 8) but was much lower (4 vs. 30) after a SIGSTOP+SIGCONT cycle.

1. https://gerrit.cloudera.org/c/7331

Change-Id: I8036bf0d15f9ffcb3fa76579e3bfe0a340d38320
---
M src/kudu/util/test_util.cc
M src/kudu/util/test_util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
5 files changed, 114 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/9021/2
--
To view, visit http://gerrit.cloudera.org:8080/9021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8036bf0d15f9ffcb3fa76579e3bfe0a340d38320
Gerrit-Change-Number: 9021
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1913: LIFO wake up ordering for threadpool worker threads

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9021 )

Change subject: KUDU-1913: LIFO wake up ordering for threadpool worker threads
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9021/1/src/kudu/util/test_util.h
File src/kudu/util/test_util.h:

http://gerrit.cloudera.org:8080/#/c/9021/1/src/kudu/util/test_util.h@115
PS1, Line 115:   // Sleep for a millisecond while looping.
> nit: it doesn't check every millisecond, but rather sleeps one millisecond
Thanks, will clarify.


http://gerrit.cloudera.org:8080/#/c/9021/1/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

http://gerrit.cloudera.org:8080/#/c/9021/1/src/kudu/util/threadpool.cc@621
PS1, Line 621:   SCOPED_CLEANUP({
> can you use the SCOPED_CLEANUP macro instead if you don't need the named va
I had been cancelling at one point, then didn't need to anymore but forgot to 
revert back to SCOPED_CLEANUP. Done.



--
To view, visit http://gerrit.cloudera.org:8080/9021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8036bf0d15f9ffcb3fa76579e3bfe0a340d38320
Gerrit-Change-Number: 9021
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 17 Jan 2018 22:30:04 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1489: allow configuration of metadata dir

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9027 )

Change subject: KUDU-1489: allow configuration of metadata dir
..


Patch Set 11:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/fs/fs_manager-test.cc
File src/kudu/fs/fs_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/fs/fs_manager-test.cc@265
PS11, Line 265:   // We should be able to explicate that the metadata is in the 
WAL root.
Nit: is 'explicate' the right choice here? I substitute it with 'analyze' or 
'explain' and the sentence doesn't make sense.


http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/fs/fs_manager.h
File src/kudu/fs/fs_manager.h:

http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/fs/fs_manager.h@268
PS11, Line 268:   friend class 
tserver::MiniTabletServerTest_TestFsLayoutEndToEnd_Test;
Isn't this exactly the same as FRIEND_TEST(MiniTabletServerTest...) above?

Actually, maybe the above has no effect because you should be doing 
FRIEND_TEST(tserver::MiniTabletServerTest, ...)?


http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/tools/kudu-tool-test.cc@2214
PS11, Line 2214:   env_->CreateDir(kTestDir);
ASSERT_OK


http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/tools/kudu-tool-test.cc@2229
PS11, Line 2229:   RunTool(Substitute("fs dump uuid --fs_wal_dir=$0", 
opts.wal_root), nullptr, , {}, {});
ASSERT on the expected status?


http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/tools/tool_action_fs.cc@764
PS11, Line 764:   .AddOptionalParameter("fs_metadata_dir")
Since you're adding these, mind reordering the optional parameters to be 
alphabetically sorted? No backwards compatibility concerns; it'll just show 
them differently in the help.

Same for the other files.


http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/tserver/mini_tablet_server-test.cc
File src/kudu/tserver/mini_tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/9027/11/src/kudu/tserver/mini_tablet_server-test.cc@60
PS11, Line 60: younger
Nit: older



--
To view, visit http://gerrit.cloudera.org:8080/9027
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I375c6b2eb283db5fa9c956135d98252c8781f5db
Gerrit-Change-Number: 9027
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 17 Jan 2018 22:24:44 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2195 (part 1): always sync PBC-format metadata files

2018-01-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9043 )

Change subject: KUDU-2195 (part 1): always sync PBC-format metadata files
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9043/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9043/1//COMMIT_MSG@15
PS1, Line 15: So, adding the explicit fsync isn't likely to slow down ext4
: much.
Could you test this hypothesis?


http://gerrit.cloudera.org:8080/#/c/9043/1/src/kudu/util/pb_util.cc
File src/kudu/util/pb_util.cc:

http://gerrit.cloudera.org:8080/#/c/9043/1/src/kudu/util/pb_util.cc@493
PS1, Line 493:   RETURN_NOT_OK_PREPEND(env->SyncDir(DirName(path)), "Failed to 
SyncDir() parent of " + path);
I think this part is the only real loss of expressivity in the patch, because 
we lose the ability to create several PB files in the same directory and "batch 
up" the syncing of the directory by only passing pb_util::SYNC in the last 
call. Of course, with the old code you'd need to sync all but the last newly 
created PB file out-of-band so this metaphor certainly wasn't perfect (and 
likely unused by Kudu). Probably not worth keeping.


http://gerrit.cloudera.org:8080/#/c/9043/1/src/kudu/util/pb_util.cc@993
PS1, Line 993:   // TODO(todd) consider using O_DIRECT to avoid the "write, 
fsync, rename, syncdir" dance.
I scanned the linked e-mail thread but it wasn't clear to me whether Ted's 
suggestion was specific to ext4 or applies to any POSIX filesystem (which would 
presumably include macOS as well). Do you know?



--
To view, visit http://gerrit.cloudera.org:8080/9043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4f0c911662b2ff35fcf3915790248ba85bf6026f
Gerrit-Change-Number: 9043
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Wed, 17 Jan 2018 21:52:53 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-16 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9041

to look at the new patch set (#2).

Change subject: KUDU-2148: do not crash on GetStatus during server startup
..

KUDU-2148: do not crash on GetStatus during server startup

The fix is straight-forward but there are backwards compatibility
implications for old clients who trigger the race: because they're not aware
of the new 'error' field in the protobuf, they'll just see the incomplete
'status' field. To be more specific, 'node_instance' will be set,
'version_info' won't be, and 'bound_{rpc,http}_addresses' may or may not be
set depending on exactly when the race was hit.

An alternative would be to fail the RPC itself in lieu of adding the 'error'
field. That would ensure that old clients register a failure, though it
comes at the expense of not providing rich error information for new
clients, which is why I went with the above solution instead.

Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
---
M src/kudu/server/generic_service.cc
M src/kudu/server/rpc_server.cc
M src/kudu/server/server_base.cc
M src/kudu/server/server_base.h
M src/kudu/server/server_base.proto
M src/kudu/tserver/tablet_server-test.cc
6 files changed, 82 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/9041/2
--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-2148: do not crash on GetStatus during server startup

2018-01-16 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Dan Burkert,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9041

to review the following change.


Change subject: KUDU-2148: do not crash on GetStatus during server startup
..

KUDU-2148: do not crash on GetStatus during server startup

The fix is straight-forward but there are backwards compatibility
implications for old clients who trigger the race: because they're not aware
of the new 'error' field in the protobuf, they'll just see the incomplete
'status' field. To be more specific, 'node_instance' will be set,
'version_info' won't be, and 'bound_{rpc,http}_addresses' may or may not be
set depending on exactly when the race was hit.

An alternative would be to fail the RPC itself in lieu of adding the 'error'
field. That would ensure that old clients register a failure, though it
comes at the expense of not providing rich error information for new
clients, which is why I went with the above solution instead.

Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
---
M src/kudu/server/generic_service.cc
M src/kudu/server/rpc_server.cc
M src/kudu/server/server_base.cc
M src/kudu/server/server_base.h
M src/kudu/server/server_base.proto
M src/kudu/tserver/tablet_server-test.cc
6 files changed, 82 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/9041/1
--
To view, visit http://gerrit.cloudera.org:8080/9041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3b997999d8a1228244e0be0ea61705f2c12e3426
Gerrit-Change-Number: 9041
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 


[kudu-CR] KUDU-1489: allow configuration of metadata dir

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9027 )

Change subject: KUDU-1489: allow configuration of metadata dir
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9027/7/src/kudu/fs/fs_manager-test.cc
File src/kudu/fs/fs_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/9027/7/src/kudu/fs/fs_manager-test.cc@534
PS7, Line 534:   // Adding a new data dir elsewhere in the list is OK.
This isn't what the test does anymore though; it's just shuffling the order of 
the directories in data_roots. Is that worth testing? If so, reword the comment 
further. If not, remove this part of the test.


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc@256
PS4, Line 256: const string meta_dir_in_data_root = 
JoinPathSegments(canonicalized_data_fs_roots_[0].path,
> Is your concern that if the FS layout already partially exists, but not com
My concern is with the following sequence:
  1. I'm going to create an FS with wal_dir=/a, metadata_dir="", data_dirs=/b.
  2. First I do "mkdir -p /b/tablet-meta".
  3. Then I do FSManager.CreateInitialFileSystemLayout.
  4. Then I do FSManager.Open.

I think I'll wind up with metadata_dir=/b/tablet-meta instead of 
/a/tablet-meta, but maybe this is too contrived due to step #2. What would be 
interesting is if step #2 could be substituted with an otherwise valid Kudu 
operation.

Anyway, my question was "will step #3 fail because of the directory created in 
step #2"? If so, this is a non-issue.



--
To view, visit http://gerrit.cloudera.org:8080/9027
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I375c6b2eb283db5fa9c956135d98252c8781f5db
Gerrit-Change-Number: 9027
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 16 Jan 2018 21:23:46 +
Gerrit-HasComments: Yes


[kudu-CR] mini-cluster: rename data root to cluster root

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9033 )

Change subject: mini-cluster: rename data_root to cluster_root
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8ccc9e232f47a684d6b9226fd84639c4c94d0c3
Gerrit-Change-Number: 9033
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 16 Jan 2018 21:12:44 +
Gerrit-HasComments: No


[kudu-CR] build: Move fake XML file generation to run-test.sh

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8984 )

Change subject: build: Move fake XML file generation to run-test.sh
..


Patch Set 3:

(1 comment)

I can think of another reason why it may have been done in build-and-test.sh: 
during precommit, run-test.sh is run on individual distributed test slaves 
rather than the orchestrating Jenkins slave. Could you check that "generated" 
JUnit XML files make their way from dist-test slaves back to the orchestrator?

http://gerrit.cloudera.org:8080/#/c/8984/3/build-support/run-test.sh
File build-support/run-test.sh:

http://gerrit.cloudera.org:8080/#/c/8984/3/build-support/run-test.sh@221
PS3, Line 221: if [ ! -f "$XMLFILE" ]; then
Should this be conditioned on STATUS != 0?



--
To view, visit http://gerrit.cloudera.org:8080/8984
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa89c806039a96ac0a6b7262ded74f70f49f87ac
Gerrit-Change-Number: 8984
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 16 Jan 2018 19:37:53 +
Gerrit-HasComments: Yes


[kudu-CR] heavy-update-compaction-itest

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9037 )

Change subject: heavy-update-compaction-itest
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9037/1/src/kudu/integration-tests/heavy-update-compaction-itest.cc
File src/kudu/integration-tests/heavy-update-compaction-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9037/1/src/kudu/integration-tests/heavy-update-compaction-itest.cc@92
PS1, Line 92: 
b.AddColumn("val_a")->Type(KuduColumnSchema::STRING)->NotNull();
Maybe parameterize (or at least store in a class constant) the number of string 
rows? Would also help avoid the "i <= 5" magic numbers below.


http://gerrit.cloudera.org:8080/#/c/9037/1/src/kudu/integration-tests/heavy-update-compaction-itest.cc@225
PS1, Line 225: const auto &
Nit: const auto&


http://gerrit.cloudera.org:8080/#/c/9037/1/src/kudu/integration-tests/heavy-update-compaction-itest.cc@230
PS1, Line 230:   EXPECT_EQ(actual_val, 
final_values[final_values_offset++]);
Nit: just curious why you used EXPECT here and not ASSERT, given that the other 
checks in this test are all ASSERT?



--
To view, visit http://gerrit.cloudera.org:8080/9037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2e30c63e1fedafe1eeb672346700ee8c5e2bb2c
Gerrit-Change-Number: 9037
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 16 Jan 2018 19:34:37 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-16 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8978

to look at the new patch set (#5).

Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. PathInstanceMetadataFile::CheckIntegrity can now be run without checking
   for missing instance files. This is used when removing a data dir (its
   instance file will be missing).
2. The DataDirManager and FsManager may now be constructed with both
   read_only=true and update_on_disk=true. The CLI tool leverages this
   functionality to create a "speculative" FsManager using the new data dir
   configuration that is used to check whether existing tablets depend on
   the data dir to be removed.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
10 files changed, 344 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/5
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8978/4/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8978/4/src/kudu/fs/data_dirs.cc@810
PS4, Line 810: list of data directories now
 : // includes the missing directory.
> nit: If I'm understanding this correctly, the update should have removed th
Ah yes, will update.



--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 16 Jan 2018 19:17:50 +
Gerrit-HasComments: Yes


[kudu-CR] mini-cluster: rename data root to cluster root

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9033 )

Change subject: mini-cluster: rename data_root to cluster_root
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9033/3/src/kudu/mini-cluster/external_mini_cluster.h
File src/kudu/mini-cluster/external_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/9033/3/src/kudu/mini-cluster/external_mini_cluster.h@90
PS3, Line 90:   // Directory in which to store data.
Maybe update this?



--
To view, visit http://gerrit.cloudera.org:8080/9033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8ccc9e232f47a684d6b9226fd84639c4c94d0c3
Gerrit-Change-Number: 9033
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 16 Jan 2018 19:14:58 +
Gerrit-HasComments: Yes


[kudu-CR](branch-1.5.x) KUDU-2193 (part 2): avoid holding TSTabletManager::lock for a long time

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8526 )

Change subject: KUDU-2193 (part 2): avoid holding TSTabletManager::lock_ for a 
long time
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I0f645775e8347a18af112b308dba56a3b4a2c681
Gerrit-Change-Number: 8526
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 16 Jan 2018 19:10:29 +
Gerrit-HasComments: No


[kudu-CR](branch-1.4.x) KUDU-2193 (part 2): avoid holding TSTabletManager::lock for a long time

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8528 )

Change subject: KUDU-2193 (part 2): avoid holding TSTabletManager::lock_ for a 
long time
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8528
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.4.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I0f645775e8347a18af112b308dba56a3b4a2c681
Gerrit-Change-Number: 8528
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 16 Jan 2018 19:10:36 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1489: allow configuration of metadata dir

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9027 )

Change subject: KUDU-1489: allow configuration of metadata dir
..


Patch Set 6:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/9027/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9027/4//COMMIT_MSG@20
PS4, Line 20: It is up to the user to take caution in changing this flag; 
updating the
: flag without also manually moving any existing metadata will 
cause Kudu
: to fail at startup.
Isn't the same true of fs_wal_dir and fs_data_dirs? Or is changing 
fs_metadata_dir somehow more dangerous?

I guess what I'm asking is: why call this out explicitly if it works the same 
way as those other flags?


http://gerrit.cloudera.org:8080/#/c/9027/4//COMMIT_MSG@29
PS4, Line 29: cases
Nit: case


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/block_manager_util.cc
File src/kudu/fs/block_manager_util.cc:

http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/block_manager_util.cc@176
PS4, Line 176:   DCHECK_NE(first_healthy, -1); // Guaranteed by 
DataDirManager::LoadInstnaces().
Can we not depend on this, and instead return a bad status if there are no 
healthy instances? Seems like a weird dependency since this is a "util" 
function that ostensibly could be called from anywhere.


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager-test.cc
File src/kudu/fs/fs_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager-test.cc@258
PS4, Line 258:   env_->DeleteRecursively(GetTestPath("wal"));
 :   env_->DeleteRecursively(GetTestPath("data"));
Maybe use different tests so you needn't clean up in between?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager-test.cc@271
PS4, Line 271:   opts.metadata_root = "";
Maybe opts.metadata_root.clear() would be more idiomatic?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager-test.cc@284
PS4, Line 284:   ASSERT_OK(fs_manager()->Open());
After this could you check that the WAL root and the metadata root aren't the 
same?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager-test.cc@288
PS4, Line 288:   opts.metadata_root = "";
.clear() here too.


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager-test.cc@517
PS4, Line 517:   // Try to open with a new data dir at the front of the list.
Nit: the other "Try to ..." comments here are all followed up by "this should 
fail", but this no longer fails. Reword?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager-test.cc@524
PS4, Line 524:   // But adding a new data dir elsewhere in the list is OK.
And reword this too.


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.h
File src/kudu/fs/fs_manager.h:

http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.h@90
PS4, Line 90: or the first configured data root if it already contains
:   // existing metadata.
Would it be clearer if we additionally specified that this fallback behavior 
only takes effect when opening an existing filesystem? That is, when creating a 
new filesystem, it's just "verbatim, or WAL dir if empty"?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc@93
PS4, Line 93: DEFINE_string(fs_metadata_dir, "",
There should be a mention here of the fallback behavior for existing systems.


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc@94
PS4, Line 94: Must be equivalent to fs_wal_dir or a
I thought this wasn't true anymore?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc@95
PS4, Line 95: fs_dataA_dirs
fs_data_dirs


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc@253
PS4, Line 253: // Check the first data root for metadata.
Could you LOG here what we actually used for the metadata directory, a la 
L243-244?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc@256
PS4, Line 256: // If there is no metadata in the first data root, use the 
WAL root.
Hmm, but we wouldn't want to do this fallback when creating a brand new 
filesystem. Does it actually matter? Is it possible for CreateFileSystemLayout 
to succeed if meta_dir_in_data_root already exists?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc@265
PS4, Line 265: const auto& canonicalized_metadata_root = 
canonicalized_metadata_fs_root_;
Why do we need this local?


http://gerrit.cloudera.org:8080/#/c/9027/4/src/kudu/fs/fs_manager.cc@421
PS4, Line 421:   if 
(dd_manager_->FindUuidIndexByRoot(canonicalized_metadata_fs_root_.path, 
_idx) &&
I'm confused; why would the DataDirManager be aware of the metadata root, since 
it's no longer guaranteed to be colocated with the data directory?

Oh, this is just for the case 

[kudu-CR] tools: Add debug mode to pb dump tool

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9024 )

Change subject: tools: Add debug mode to pb dump tool
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9024/1//COMMIT_MSG
Commit Message:

PS1:
Could you add a short test to kudu-tool-test? A run with --debug and a check 
for the presence of additional output should suffice.


http://gerrit.cloudera.org:8080/#/c/9024/1//COMMIT_MSG@13
PS1, Line 13: TODO: Should this be called "verbose" mode instead of debug mode?
I don't think it really matters, since the additional information is just for 
debugging.


http://gerrit.cloudera.org:8080/#/c/9024/1/src/kudu/tools/tool_action_pbc.cc
File src/kudu/tools/tool_action_pbc.cc:

http://gerrit.cloudera.org:8080/#/c/9024/1/src/kudu/tools/tool_action_pbc.cc@92
PS1, Line 92:   }
Nit: could you add an else that does LOG(FATAL) or something like that?


http://gerrit.cloudera.org:8080/#/c/9024/1/src/kudu/tools/tool_action_pbc.cc@232
PS1, Line 232:   .AddOptionalParameter("oneline")
Should add debug here.


http://gerrit.cloudera.org:8080/#/c/9024/1/src/kudu/util/pb_util.cc
File src/kudu/util/pb_util.cc:

http://gerrit.cloudera.org:8080/#/c/9024/1/src/kudu/util/pb_util.cc@937
PS1, Line 937: "---"
Nit: if this is now being used in three places, perhaps make it a constant so 
there's no worry about the number of dashes being different in each of the 
three places?



--
To view, visit http://gerrit.cloudera.org:8080/9024
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica2a74a2e252d140cf7704ff68037a64db19cd80
Gerrit-Change-Number: 9024
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 16 Jan 2018 18:41:35 +
Gerrit-HasComments: Yes


[kudu-CR] mini-cluster: rename data root to cluster root

2018-01-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9033 )

Change subject: mini-cluster: rename data_root to cluster_root
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9033/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9033/1//COMMIT_MSG@17
PS1, Line 17: As such, this patch renames ExternalMiniClusterOptions's 
data_root to
Could you make the same change to the InternalMiniCluster too?



--
To view, visit http://gerrit.cloudera.org:8080/9033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8ccc9e232f47a684d6b9226fd84639c4c94d0c3
Gerrit-Change-Number: 9033
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 16 Jan 2018 18:34:30 +
Gerrit-HasComments: Yes


[kudu-CR] fs: update default data dir group size

2018-01-13 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8995 )

Change subject: fs: update default data dir group size
..


Patch Set 5: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2dd0d3f8cf140f684318c0dffb45a1091302ecdc
Gerrit-Change-Number: 8995
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sun, 14 Jan 2018 06:16:11 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1913: LIFO wake up ordering for threadpool worker threads

2018-01-12 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9021 )

Change subject: KUDU-1913: LIFO wake up ordering for threadpool worker threads
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/9021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I8036bf0d15f9ffcb3fa76579e3bfe0a340d38320
Gerrit-Change-Number: 9021
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1913: LIFO wake up ordering for threadpool worker threads

2018-01-12 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9021 )

Change subject: KUDU-1913: LIFO wake up ordering for threadpool worker threads
..


Patch Set 1: Verified+1

One ASAN test timed out in table creation, which I think is unrelated to this 
commit.


--
To view, visit http://gerrit.cloudera.org:8080/9021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8036bf0d15f9ffcb3fa76579e3bfe0a340d38320
Gerrit-Change-Number: 9021
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 13 Jan 2018 00:58:49 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1913: LIFO wake up ordering for threadpool worker threads

2018-01-12 Thread Adar Dembo (Code Review)
Hello Dan Burkert, Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9021

to review the following change.


Change subject: KUDU-1913: LIFO wake up ordering for threadpool worker threads
..

KUDU-1913: LIFO wake up ordering for threadpool worker threads

This patch changes the wake-up ordering for idle pool worker threads
to be LIFO. Previously all worker threads idled on a single
ConditionVariable which, by virtue of using pthread_cond_t under the hood,
was FIFO ordered.

In the abstract, FIFO ordering ensures a fair distribution of work amongst a
set of threads, but that's totally undesirable in a thread pool where the
goal is to get work done as quickly and with as few threads as possible. For
example, a fast stream of cheap tasks (e.g. RPCs) should be serviceable via
a single thread or something close to that.

The new unit test was looped 1000 times in TSAN mode with 8 stress threads
to shake out any flakes. Additionally, I ran through a variation of Todd's
test from a past code review [1]. I spun up three tservers, two of which had
failure detection disabled so that the third would get all of the leader
replicas. I created 240 tablets and looked at the size of the Raft thread
pool on that tserver. The steady state thread count was mildly lower at
first (4 vs. 8) but was much lower (4 vs. 30) after a SIGSTOP+SIGCONT cycle.

1. https://gerrit.cloudera.org/c/7331

Change-Id: I8036bf0d15f9ffcb3fa76579e3bfe0a340d38320
---
M src/kudu/util/test_util.cc
M src/kudu/util/test_util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
5 files changed, 114 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/9021/1
--
To view, visit http://gerrit.cloudera.org:8080/9021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8036bf0d15f9ffcb3fa76579e3bfe0a340d38320
Gerrit-Change-Number: 9021
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-10 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8978

to look at the new patch set (#4).

Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. PathInstanceMetadataFile::CheckIntegrity can now be run without checking
   for missing instance files. This is used when removing a data dir (its
   instance file will be missing).
2. The DataDirManager and FsManager may now be constructed with both
   read_only=true and update_on_disk=true. The CLI tool leverages this
   functionality to create a "speculative" FsManager using the new data dir
   configuration that is used to check whether existing tablets depend on
   the data dir to be removed.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
10 files changed, 343 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/4
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-10 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8978/3/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/8978/3/src/kudu/fs/data_dirs.h@222
PS3, Line 222: without updating the latter
> nit: the naming here gives me pause since the name is no longer accurate, b
Yeah, I couldn't think of a better name either.


http://gerrit.cloudera.org:8080/#/c/8978/3/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8978/3/src/kudu/fs/data_dirs.cc@810
PS3, Line 810:  the missing directory made its way
 : // into the list of data directories.
> nit: at first glance, I read this as, "We have a missing directory and it i
Done



--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 11 Jan 2018 03:47:40 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-10 Thread Adar Dembo (Code Review)
Hello Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8978

to look at the new patch set (#3).

Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. PathInstanceMetadataFile::CheckIntegrity can now be run without checking
   for missing instance files. This is used when removing a data dir (its
   instance file will be missing).
2. The DataDirManager and FsManager may now be constructed with both
   read_only=true and update_on_disk=true. The CLI tool leverages this
   functionality to create a "speculative" FsManager using the new data dir
   configuration that is used to check whether existing tablets depend on
   the data dir to be removed.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
10 files changed, 343 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/3
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-10 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/fs/data_dirs.h@220
PS1, Line 220: the DataDirManager is
 :   // opened as per the new contents of of the provided 
directories but
> nit: extra "of"
Changed to:

  // If 'update_on_disk' and 'read_only' are both true, the directory manager
  // allows the provided directories to deviate from their on-disk data
  // structures without updating the latter to match the former.


http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/fs/data_dirs.h@223
PS1, Line 223:   bool update_on_disk;
> nit: should still note the default
Done


http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/fs/data_dirs.cc@657
PS1, Line 657: Note: If 'has_missing_instance' is true, opts_.update_on_disk is
 :   // guaranteed to be true.
> nit: Maybe note that this is due to the ternary parameterization of CheckIn
Hmm, I originally included this so that it isn't confusing that L659-660 omits 
a check for opts_.update_on_disk.

I'll change it as per your first suggestion.


http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/fs/data_dirs.cc@805
PS1, Line 805: // This uuid's directory is missing outright, which can happen 
when
 : // opts_.read_only and opts_.update_on_disk are both 
true.
> After another look, I think this is ok.
OK, I took another stab at this comment. Let me know what you think.


http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/fs/fs_manager.h
File src/kudu/fs/fs_manager.h:

http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/fs/fs_manager.h@102
PS1, Line 102:   bool update_on_disk;
> nit: note the default
Done


http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/tools/kudu-tool-test.cc@2232
PS1, Line 2232: ASSERT_OK(mts->WaitStarted());
> Could we check that we still have all the tablets we expected?
We could, but is it necessary? We wouldn't expect tablets to just disappear, 
right? We don't check them after adding a data directory above.


http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/tools/kudu-tool-test.cc@2240
PS1, Line 2240:   // Reconfigure the tserver to drop the data directory and 
restart it.
> nit: mind adding to the comment something along the lines of, "waiting for
Yeah, it's not intuitive that WaitStarted will return the first bootstrap 
failure. I'll note that.


http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8978/1/src/kudu/tools/tool_action_fs.cc@854
PS1, Line 854: "Updates the set of data directories in an existing Kudu 
filesystem")
 :   .ExtraDescription("If a data directory is in use by a 
tablet and is "
 :   "removed, the operation will fail unless --force is 
also used")
 :   .AddOptionalParameter("fs_wal_dir")
 :   .AddOptionalParameter("fs_data_dirs")
 :   .AddOptionalParameter("force", boost::none, string("If 
true, permits "
 :   "the removal of a data directory that is configured 
for use by "
 :   "existing tablets. Those tablets will fail the next 
time the server "
 :   "is started"))
> micro-nit: I know in docs we favor being explicit, that local replicas are
I'm not sure; as you can see, List uses 'tablets' instead of 'local replicas'.



--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 11 Jan 2018 00:13:40 +
Gerrit-HasComments: Yes


[kudu-CR] data dirs: fix logging message

2018-01-10 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8999 )

Change subject: data_dirs: fix logging message
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8135e0ad2411524fcb74b3e349bea7a5828a4237
Gerrit-Change-Number: 8999
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 10 Jan 2018 22:50:20 +
Gerrit-HasComments: No


[kudu-CR] fs: update default data dir group size

2018-01-10 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8995 )

Change subject: fs: update default data dir group size
..


Patch Set 2: Code-Review+1

(1 comment)

Since this change has non-trivial impact it would be nice to get it reviewed by 
1-2 more folks.

http://gerrit.cloudera.org:8080/#/c/8995/2/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8995/2/src/kudu/fs/data_dirs.cc@73
PS2, Line 73: TAG_FLAG(fs_target_data_dirs_per_tablet, evolving);
Not stable?



--
To view, visit http://gerrit.cloudera.org:8080/8995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2dd0d3f8cf140f684318c0dffb45a1091302ecdc
Gerrit-Change-Number: 8995
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 10 Jan 2018 22:32:20 +
Gerrit-HasComments: Yes


[kudu-CR] data dirs: fix logging message

2018-01-10 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8999 )

Change subject: data_dirs: fix logging message
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8999/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8999/1//COMMIT_MSG@10
PS1, Line 10:
Nit: got an extra space here


http://gerrit.cloudera.org:8080/#/c/8999/1/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8999/1/src/kudu/fs/data_dirs.cc@834
PS1, Line 834:   LOG(INFO) << Substitute(msg);
Why INFO and not WARNING?



--
To view, visit http://gerrit.cloudera.org:8080/8999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8135e0ad2411524fcb74b3e349bea7a5828a4237
Gerrit-Change-Number: 8999
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 10 Jan 2018 22:29:30 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-09 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..


Patch Set 2: Verified+1

The test failures were a batch of clock synchronization errors on one 
distributed slave.


--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 10 Jan 2018 00:12:00 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-09 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/8978 )

Change subject: KUDU-2202: support for removing data directories (take two)
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [raft consensus-itest] fix for Test KUDU 1735

2018-01-09 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8987 )

Change subject: [raft_consensus-itest] fix for Test_KUDU_1735
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d96b66d528f90f7611ef32726401e3208b0e47d
Gerrit-Change-Number: 8987
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:17:41 +
Gerrit-HasComments: No


[kudu-CR] [tests] update tests for replication scheme consistency

2018-01-09 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8979 )

Change subject: [tests] update tests for replication scheme consistency
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f5bbb581abeb3a781b522dff90711b2ffc988df
Gerrit-Change-Number: 8979
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 09 Jan 2018 20:28:58 +
Gerrit-HasComments: No


[kudu-CR] [tests] update tests for replication scheme consistency

2018-01-09 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8979 )

Change subject: [tests] update tests for replication scheme consistency
..


Patch Set 1: Code-Review+2

What about the raft_consensus-itest failure? Is that unrelated? I saw it 
locally, and it also cropped up in our nightly tests.

Here's the output from my local run: 
https://gist.github.com/adembo/3b28823884b457f0c4019e9e17a0ab6a.


--
To view, visit http://gerrit.cloudera.org:8080/8979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f5bbb581abeb3a781b522dff90711b2ffc988df
Gerrit-Change-Number: 8979
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 09 Jan 2018 20:22:58 +
Gerrit-HasComments: No


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-09 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8376

to look at the new patch set (#3).

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..

fs: defer failure from metadata load to bootstrap when data dir is missing

An upcoming patch adds CLI tool actions to add and remove data directories.
When a data dir is removed, all tablets with data on it will fail. Today
that failure manifests as a FindOrDie in DataDirGroup::FromPB; we need to
make that a little bit more graceful.

This patch modifies the DataDirGroup FromPB/CopyToPB methods to return a
failure when a data dir is missing. It further changes TabletMetadata to
treat such failures as non-fatal, and adds checks to TabletBootstrap so that
the failures manifest there instead.

No tests in this patch because:
1. Andrew is already working on tablet-level tests for failed disks, and
2. The CLI tool patch adds coverage at the itest-level.

Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/ts_tablet_manager.cc
10 files changed, 136 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/8376/3
--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2202: support for removing data directories (take two)

2018-01-09 Thread Adar Dembo (Code Review)
Hello Andrew Wong, Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/8978

to review the following change.


Change subject: KUDU-2202: support for removing data directories (take two)
..

KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. PathInstanceMetadataFile::CheckIntegrity can now be run without checking
   for missing instance files. This is used when removing a data dir (its
   instance file will be missing).
2. The DataDirManager and FsManager may now be constructed with both
   read_only=true and update_on_disk=true. The CLI tool leverages this
   functionality to create a "speculative" FsManager using the new data dir
   configuration that is used to check whether existing tablets depend on
   the data dir to be removed.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
10 files changed, 317 insertions(+), 81 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/8978/1
--
To view, visit http://gerrit.cloudera.org:8080/8978
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Gerrit-Change-Number: 8978
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-09 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8376 )

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/fs/data_dirs.h@81
PS2, Line 81:   Status LoadFromPB(const UuidIndexByUuidMap& uuid_idx_by_uuid,
> worth short docs explaining the cases of bad-status here.
Done


http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/fs/data_dirs.h@84
PS2, Line 84:   Status CopyToPB(const UuidByUuidIndexMap& uuid_by_uuid_idx,
> and here
Done


http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/fs/data_dirs.h@276
PS2, Line 276: data dir
 :   // is missing.
> only looked at the headers so far (not the impl) but this left me a little
Given my response to your other comment (in the impl), is this still relevant?


http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/fs/data_dirs.cc@318
PS2, Line 318: if (!FindCopy(uuid_by_uuid_idx, uuid_idx, )) {
 :   return Status::NotFound(Substitute(
 :   "could not find data dir with uuid index $0", 
uuid_idx));
> would this not be a programmer error? how would you end up with a uuid_idx
It is, but I thought the symmetry with LoadFromPB was worthwhile because it 
makes the class' behavior more predictable.


http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/fs/data_dirs.cc@831
PS2, Line 831:   RETURN_NOT_OK(group->CopyToPB(uuid_by_idx_, pb));
> I guess my confusion in the header was wrong, but per my comments above, I
Indeed, but since I want CopyToPB to be symmetric with LoadFromPB, a 
RETURN_NOT_OK here seemed more appropriate.


http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8376/2/src/kudu/tablet/tablet_metadata.cc@440
PS2, Line 440: DeleteOrphanedBlocks(orphaned_blocks);
> pondering this a bit more, I wonder whether we need to actually start stori
This was a legitimate issue that Andrew has since fixed in commit 
47b81c452194e75da7fd966f07766de4bdcdeab0.



--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 09 Jan 2018 20:02:40 +
Gerrit-HasComments: Yes


[kudu-CR] fs: defer failure from metadata load to bootstrap when data dir is missing

2018-01-09 Thread Adar Dembo (Code Review)
Adar Dembo has restored this change. ( http://gerrit.cloudera.org:8080/8376 )

Change subject: fs: defer failure from metadata load to bootstrap when data dir 
is missing
..


Restored
--
To view, visit http://gerrit.cloudera.org:8080/8376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: restore
Gerrit-Change-Id: I1e8d5697c2bb08287cce11fbdab6fb8d6e37d1ad
Gerrit-Change-Number: 8376
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

2018-01-08 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 08 Jan 2018 21:34:30 +
Gerrit-HasComments: No


[kudu-CR] fs: fix cleanup after failure updating data dirs

2018-01-05 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8957 )

Change subject: fs: fix cleanup after failure updating data dirs
..

fs: fix cleanup after failure updating data dirs

The typo here meant that new fs instance files were not being deleted in the
event of an error during the data dir update process. This also prevented
new data dirs from being deleted too, since they weren't empty.

Change-Id: Ic4ec8dc913f9f6dee81a7404b0c0c25f738c9ff1
Reviewed-on: http://gerrit.cloudera.org:8080/8957
Reviewed-by: Andrew Wong 
Tested-by: Kudu Jenkins
---
M src/kudu/fs/fs_manager.cc
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8957
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic4ec8dc913f9f6dee81a7404b0c0c25f738c9ff1
Gerrit-Change-Number: 8957
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] Add 'kudu fs list' tool

2018-01-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 10
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Sat, 06 Jan 2018 00:50:15 +
Gerrit-HasComments: No


[kudu-CR] Add 'kudu fs list' tool

2018-01-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8911/8/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8911/8/src/kudu/tools/kudu-tool-test.cc@409
PS8, Line 409: "list.*List metadata for on-disk tablets, rowsets, 
blocks"
Nit: could you preserve the alphabetical ordering that was here previously?



--
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 9
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Sat, 06 Jan 2018 00:33:41 +
Gerrit-HasComments: Yes


[kudu-CR] fs: fix cleanup after failure updating data dirs

2018-01-05 Thread Adar Dembo (Code Review)
Hello Andrew Wong,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/8957

to review the following change.


Change subject: fs: fix cleanup after failure updating data dirs
..

fs: fix cleanup after failure updating data dirs

The typo here meant that new fs instance files were not being deleted in the
event of an error during the data dir update process. This also prevented
new data dirs from being deleted too, since they weren't empty.

Change-Id: Ic4ec8dc913f9f6dee81a7404b0c0c25f738c9ff1
---
M src/kudu/fs/fs_manager.cc
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/57/8957/1
--
To view, visit http://gerrit.cloudera.org:8080/8957
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic4ec8dc913f9f6dee81a7404b0c0c25f738c9ff1
Gerrit-Change-Number: 8957
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 


[kudu-CR] KUDU-2251: rowset size can overflow int in RowSetInfo

2018-01-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8951 )

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc
File src/kudu/integration-tests/heavy-update-compaction-test.cc:

PS4:
We talked about this a little bit in person: I think it'd be good to produce a 
second, smaller test that's targeted for this particular regression. Hopefully 
such a test could work without actually writing 2 GB to disk, so it could 
really be a simple unit test. The problem is that the generality of this test 
makes it hard to actually ensure that the bug doesn't regress.


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@92
PS4, Line 92: gscoped_ptr 
table_creator(client_->NewTableCreator());
unique_ptr in new code.


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@116
PS4, Line 116:   void MakeRow(int64_t key, KuduPartialRow* row);
For simplicity's sake, perhaps inline the definition of MakeRow here?


http://gerrit.cloudera.org:8080/#/c/8951/4/src/kudu/integration-tests/heavy-update-compaction-test.cc@155
PS4, Line 155:   ASSERT_NO_FATAL_FAILURE(CreateTable());
Nit: use the shorter NO_FATALS macro.



--
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 05 Jan 2018 21:55:13 +
Gerrit-HasComments: Yes


[kudu-CR](branch-1.4.x) KUDU-2193 (part 1): switch to a waiting mutex in TSTabletManager

2018-01-04 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8527 )

Change subject: KUDU-2193 (part 1): switch to a waiting mutex in TSTabletManager
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.4.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I763abddd74d8b1dabb618318dc84256b533077e3
Gerrit-Change-Number: 8527
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 04 Jan 2018 19:43:50 +
Gerrit-HasComments: No


[kudu-CR](branch-1.5.x) KUDU-2193 (part 1): switch to a waiting mutex in TSTabletManager

2018-01-04 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8525 )

Change subject: KUDU-2193 (part 1): switch to a waiting mutex in TSTabletManager
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I763abddd74d8b1dabb618318dc84256b533077e3
Gerrit-Change-Number: 8525
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Thu, 04 Jan 2018 19:43:47 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2191 (5/n): Add Kerberos SASL support to the HMS client

2018-01-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8692 )

Change subject: KUDU-2191 (5/n): Add Kerberos SASL support to the HMS client
..


Patch Set 4:

(2 comments)

I didn't review sasl_client_transport or the krpc changes.

http://gerrit.cloudera.org:8080/#/c/8692/4/src/kudu/hms/hms_client.cc
File src/kudu/hms/hms_client.cc:

http://gerrit.cloudera.org:8080/#/c/8692/4/src/kudu/hms/hms_client.cc@140
PS4, Line 140: // TODO(dan): check necessary?
Perhaps move this logic (and thus the construction of protocol and client_) 
into Start(), so that you can return a bad Status?

Alternatively, you could make HmsClient use two-phase initialization and add an 
Init() method that can return a bad Status.


http://gerrit.cloudera.org:8080/#/c/8692/4/src/kudu/hms/mini_hms.h
File src/kudu/hms/mini_hms.h:

http://gerrit.cloudera.org:8080/#/c/8692/4/src/kudu/hms/mini_hms.h@58
PS4, Line 58:   // Configures the mini HMS to use Kerberos.
Why build this as a setter vs. adding it to the constructor? There's no use 
case for calling it multiple times, right? Or for calling it after Start?



--
To view, visit http://gerrit.cloudera.org:8080/8692
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f217ae05fd36c8ee88fe20eeccd73d49233a345
Gerrit-Change-Number: 8692
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 04 Jan 2018 00:44:27 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2191 (4/n): HMS Thrift client fault handling

2018-01-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8494 )

Change subject: KUDU-2191 (4/n): HMS Thrift client fault handling
..


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8494/7/src/kudu/hms/hms_client.h
File src/kudu/hms/hms_client.h:

http://gerrit.cloudera.org:8080/#/c/8494/7/src/kudu/hms/hms_client.h@51
PS7, Line 51: the
to


http://gerrit.cloudera.org:8080/#/c/8494/7/src/kudu/hms/hms_client.h@84
PS7, Line 84: explicit
No longer needed now that this is multi-arg?


http://gerrit.cloudera.org:8080/#/c/8494/7/src/kudu/hms/hms_client.h@85
PS7, Line 85:  MonoDelta send_timeout = 
MonoDelta::FromSeconds(60),
Would be more ergonomic to create an HmsClientOptions struct and stuff these 
(with documentation) in there.


http://gerrit.cloudera.org:8080/#/c/8494/7/src/kudu/hms/mini_hms.cc
File src/kudu/hms/mini_hms.cc:

http://gerrit.cloudera.org:8080/#/c/8494/7/src/kudu/hms/mini_hms.cc@111
PS7, Line 111: "-p", std::to_string(port_),
Was this change made so that you could use Start() to restart and have it use 
the same port as before? Might want to document that behavior.

Also, how do we prevent test flakiness caused by another application binding to 
our previously bound ephemeral port?


http://gerrit.cloudera.org:8080/#/c/8494/7/src/kudu/hms/mini_hms.cc@131
PS7, Line 131: WARN_NOT_OK
Why not RETURN_NOT_OK?


http://gerrit.cloudera.org:8080/#/c/8494/7/src/kudu/hms/mini_hms.cc@139
PS7, Line 139: KUDU
You can drop this prefix. In Continue() too.



--
To view, visit http://gerrit.cloudera.org:8080/8494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic48bbb833bbae39b383ae239054b9710da3c746d
Gerrit-Change-Number: 8494
Gerrit-PatchSet: 7
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 04 Jan 2018 00:36:29 +
Gerrit-HasComments: Yes


[kudu-CR] Add 'kudu fs list' tool

2018-01-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8911 )

Change subject: Add 'kudu fs list' tool
..


Patch Set 6:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/kudu-tool-test.cc@404
PS6, Line 404: const vector kFsModeRegexes = {
This should be updated.


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@396
PS6, Line 396:   switch (group) {
What happens if I add an entry to FieldGroup but forgot to update this switch? 
Do I get a compile error? A warning? Nothing?

If it's not a compile error, is there anything we should add here to guarantee 
good behavior? Like a default statement, a LOG(FATAL), or something like that?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@415
PS6, Line 415:   static const Field kFieldVariants[] = {
Could you use the enum macros from gutil/casts.h to avoid this?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@483
PS6, Line 483: // Returns rowset info for the field.
Worth logging the min/max keys of the rowset?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@584
PS6, Line 584: CHECK_OK(fs_manager->OpenBlock(block, _block));
CHECK_OK seems wrong for a CLI tool; why not RETURN_NOT_OK and return an error 
from List if one of these fails?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@620
PS6, Line 620: tablet_ids.emplace_back(FLAGS_tablet_id);
Do we need to ToLowerCase this?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@633
PS6, Line 633: WARN_NOT_OK(TabletMetadata::Load(_manager, tablet_id, 
_metadata),
Why not RETURN_NOT_OK on this?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@652
PS6, Line 652:   if (FLAGS_rowset_id > 0 && FLAGS_rowset_id != rowset.id()) 
{
Shouldn't this be FLAGS_rowset_id != -1 && FLAGS_rowset_id != rowset.id() ?


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@688
PS6, Line 688:   // TODO(dan): should orphaned blocks be included, perhaps 
behind a flag?
Perhaps, but this comment is slightly misplaced; orphaned blocks are a 
tablet-level thing, not a rowset-level thing, so the comment should be outside 
the inner (rowset_metadata) loop.


http://gerrit.cloudera.org:8080/#/c/8911/6/src/kudu/tools/tool_action_fs.cc@786
PS6, Line 786:"Possible values: table, 
table-id, tablet-id, partition, "
Seems like it might be easy to accidentally omit an entry; could we construct 
this list on-the-fly by iterating on the Field enum class?



--
To view, visit http://gerrit.cloudera.org:8080/8911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5a63e636d95e3ee55bb4955cece7f5d0b7532d
Gerrit-Change-Number: 8911
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 04 Jan 2018 00:22:53 +
Gerrit-HasComments: Yes


[kudu-CR] periodic: fix a comment

2018-01-03 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8931 )

Change subject: periodic: fix a comment
..

periodic: fix a comment

Change-Id: Id04bff4f6505595f60f26abe88de58cdded4d9d0
Reviewed-on: http://gerrit.cloudera.org:8080/8931
Reviewed-by: Alexey Serbin 
Tested-by: Kudu Jenkins
---
M src/kudu/rpc/periodic.h
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8931
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id04bff4f6505595f60f26abe88de58cdded4d9d0
Gerrit-Change-Number: 8931
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] periodic: fix a comment

2018-01-03 Thread Adar Dembo (Code Review)
Hello Mike Percy, Alexey Serbin,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/8931

to review the following change.


Change subject: periodic: fix a comment
..

periodic: fix a comment

Change-Id: Id04bff4f6505595f60f26abe88de58cdded4d9d0
---
M src/kudu/rpc/periodic.h
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/8931/1
--
To view, visit http://gerrit.cloudera.org:8080/8931
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id04bff4f6505595f60f26abe88de58cdded4d9d0
Gerrit-Change-Number: 8931
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-721: Support for Decimal type, Part 1 (Server, C++ Client)

2018-01-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8830 )

Change subject: KUDU-721: Support for Decimal type, Part 1 (Server, C++ Client)
..


Patch Set 6:

(4 comments)

I only looked at the public API changes.

http://gerrit.cloudera.org:8080/#/c/8830/6/src/kudu/client/schema.h
File src/kudu/client/schema.h:

http://gerrit.cloudera.org:8080/#/c/8830/6/src/kudu/client/schema.h@64
PS6, Line 64: public:
Nit: indentation


http://gerrit.cloudera.org:8080/#/c/8830/6/src/kudu/client/schema.h@70
PS6, Line 70:   explicit KuduColumnTypeAttributes(int32_t precision, int32_t 
scale)
The 'explicit' keyword is only needed to avoid implicit conversions with 
single-arg constructors, no?


http://gerrit.cloudera.org:8080/#/c/8830/6/src/kudu/client/schema.h@75
PS6, Line 75:   const int32_t precision() const {
These accessors are returning an int32_t 'copy', so what value does the 'const' 
keyword add (as in 'const int32_t ...')?


http://gerrit.cloudera.org:8080/#/c/8830/6/src/kudu/client/schema.h@84
PS6, Line 84:  private:
Do you anticipate KuduColumnTypeAttributes evolving for other use cases in the 
future? If so, you should PIMPL the class: define a private Data nested class, 
store precision/scale within it, and have the public class merely store an 
allocated pointer to the nested class.



--
To view, visit http://gerrit.cloudera.org:8080/8830
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b06142f43c66973f36376bd2c88ca6f8d9f7632
Gerrit-Change-Number: 8830
Gerrit-PatchSet: 6
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Wed, 03 Jan 2018 19:18:24 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2216. Post process gtest generated xml to include the output from the *.txt files

2018-01-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8757 )

Change subject: KUDU-2216. Post process gtest generated xml to include the 
output from the *.txt files
..


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8757/12/build-support/jenkins/add_std_out_to_junit.py
File build-support/jenkins/add_std_out_to_junit.py:

http://gerrit.cloudera.org:8080/#/c/8757/12/build-support/jenkins/add_std_out_to_junit.py@68
PS12, Line 68:  Globs for *.gz ands *. xml logs in log_location, then sorts 
the logs
I'm missing a ton of context, but why not run this script out of run-test.sh 
instead of build-and-test.sh? That way it'll run once per test and the precise 
gz/xml file pair will be known, right?


http://gerrit.cloudera.org:8080/#/c/8757/12/build-support/jenkins/add_std_out_to_junit.py@69
PS12, Line 69:  I
Comment style nit: you'll find the pronouns "we" and "us" more commonly used 
than "I" and "me".



--
To view, visit http://gerrit.cloudera.org:8080/8757
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4a9a147f29e37e380cb32ff5af43e1290a1a70
Gerrit-Change-Number: 8757
Gerrit-PatchSet: 12
Gerrit-Owner: Edward Fancher 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 03 Jan 2018 19:07:51 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2229. consensus: Leader should not start FD

2018-01-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8711 )

Change subject: KUDU-2229. consensus: Leader should not start FD
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8711/4/src/kudu/rpc/periodic.h
File src/kudu/rpc/periodic.h:

http://gerrit.cloudera.org:8080/#/c/8711/4/src/kudu/rpc/periodic.h@144
PS4, Line 144:   // Returns true iff the failure detected has been started.
FWIW, this comment is wrong: it refers to failure detection rather than to this 
generic periodic timer system.



--
To view, visit http://gerrit.cloudera.org:8080/8711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2ec9c5499e8e4c1659333bd53dd2d7f6dae81f5
Gerrit-Change-Number: 8711
Gerrit-PatchSet: 4
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Wed, 03 Jan 2018 18:51:52 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2202 avoid block ID reuse for missing dirs

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8465 )

Change subject: KUDU-2202 avoid block ID reuse for missing dirs
..


Patch Set 7:

(3 comments)

Looks good, just some nits on comments.

http://gerrit.cloudera.org:8080/#/c/8465/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8465/7//COMMIT_MSG@24
PS7, Line 24: the
Double the


http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/fs/block_manager.h@265
PS7, Line 265:   // externally-referenced ids that they may not have previously 
found.
Getting better. Perhaps it would be good to add an example that will help 
people understand how the block manager may have previously not found a block? 
Like "...avoid reusing externally-referenced ids that they may not have 
previously found (e.g. because those ids' blocks were on a data directory that 
failed)".


http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/8465/7/src/kudu/tablet/tablet_metadata.cc@404
PS7, Line 404: if a disk
 : // was not read
Can you reword this example so it's more relatable to the relevant Kudu 
concepts? Like, it should probably talk about data directories rather than 
disks, and maybe it should explain why a directory would not be "read" (because 
it failed).



--
To view, visit http://gerrit.cloudera.org:8080/8465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I06f10e95278dafdedcab432a7d4d1dc5c59bf4cc
Gerrit-Change-Number: 8465
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 04 Nov 2017 05:16:04 +
Gerrit-HasComments: Yes


[kudu-CR] tool: new action for updating the set of data directories (take 2)

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/8352 )

Change subject: tool: new action for updating the set of data directories (take 
2)
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/8352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I6ddbdd6cc6231996e7802a622a8b4691527a0643
Gerrit-Change-Number: 8352
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] tool: new action for updating the set of data directories (take 2)

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8352 )

Change subject: tool: new action for updating the set of data directories (take 
2)
..


Patch Set 8: Verified+1

Overriding Jenkins, unrelated test failures.


--
To view, visit http://gerrit.cloudera.org:8080/8352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ddbdd6cc6231996e7802a622a8b4691527a0643
Gerrit-Change-Number: 8352
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 04 Nov 2017 05:09:44 +
Gerrit-HasComments: No


[kudu-CR] fs: replace CreateDirWithMissing with env util calls

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8469 )

Change subject: fs: replace CreateDirWithMissing with env_util calls
..

fs: replace CreateDirWithMissing with env_util calls

Change-Id: I9983466e31f694d3b1c748d548ab3dafe2cbb1c0
Reviewed-on: http://gerrit.cloudera.org:8080/8469
Reviewed-by: Andrew Wong 
Tested-by: Adar Dembo 
---
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_copy_client-test.cc
6 files changed, 14 insertions(+), 15 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Adar Dembo: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9983466e31f694d3b1c748d548ab3dafe2cbb1c0
Gerrit-Change-Number: 8469
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: replace CreateDirWithMissing with env util calls

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/8469 )

Change subject: fs: replace CreateDirWithMissing with env_util calls
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/8469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I9983466e31f694d3b1c748d548ab3dafe2cbb1c0
Gerrit-Change-Number: 8469
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: replace CreateDirWithMissing with env util calls

2017-11-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8469 )

Change subject: fs: replace CreateDirWithMissing with env_util calls
..


Patch Set 1: Verified+1

Overriding Jenkins, unrelated test failures.


--
To view, visit http://gerrit.cloudera.org:8080/8469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9983466e31f694d3b1c748d548ab3dafe2cbb1c0
Gerrit-Change-Number: 8469
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 04 Nov 2017 05:08:48 +
Gerrit-HasComments: No


[kudu-CR] tool: new action for updating the set of data directories (take 2)

2017-11-03 Thread Adar Dembo (Code Review)
Hello Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8352

to look at the new patch set (#8).

Change subject: tool: new action for updating the set of data directories (take 
2)
..

tool: new action for updating the set of data directories (take 2)

This patch includes support for updating the set of data directories in an
existing Kudu filesystem, though it only supports adding for the time being.
The only user-facing bit is a new FsManager option that, when set, augments
Open() to also look for missing fs roots. If any are found, they will be
created, and the existing data directory instance files updated to recognize
these new roots. Also included is a new tool action that opens an FsManager
with this option set.

Updating the set of data directories is a complex, multi-step operation, and
a single error could leave the filesystem in a difficult-to-repair state. As
such, I also wrote some fairly gnarly rollback code that attempts to undo the
changes made in the event of an error.

This logic can be extended to remove data directories, but we'll need to
address KUDU-2202 first.

Change-Id: I6ddbdd6cc6231996e7802a622a8b4691527a0643
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_fs.cc
11 files changed, 688 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/8352/8
--
To view, visit http://gerrit.cloudera.org:8080/8352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ddbdd6cc6231996e7802a622a8b4691527a0643
Gerrit-Change-Number: 8352
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] fs: replace CreateDirWithMissing with env util calls

2017-11-03 Thread Adar Dembo (Code Review)
Hello Andrew Wong, Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/8469

to review the following change.


Change subject: fs: replace CreateDirWithMissing with env_util calls
..

fs: replace CreateDirWithMissing with env_util calls

Change-Id: I9983466e31f694d3b1c748d548ab3dafe2cbb1c0
---
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_copy_client-test.cc
6 files changed, 14 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/69/8469/1
--
To view, visit http://gerrit.cloudera.org:8080/8469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9983466e31f694d3b1c748d548ab3dafe2cbb1c0
Gerrit-Change-Number: 8469
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


  1   2   3   4   5   6   7   8   9   10   >