Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/17974
to look at the new patch set (#9).
Change subject: [encryption] KUDU-3331 Encrypt file system
......................................................................
[encryption] KUDU-3331 Encrypt file system
de02a34 introduced encryption support to Env in a self-contained way,
but it wasn't used across Kudu.

This commit integrates that encryption support into the project and
modifies several test suites to also run tests with encryption enabled.
I also renamed "encrypted" to "is_sensitive" in *FileOption, as a file
with this flag will be encrypted only if encryption is enabled for the
process.
When encryption is enabled, the following files are encrypted:
- WAL segments
- LBM blocks and metadata
- FBM blocks
- tablet and consensus metadata

Logs, rolling logs, instance and block manager instance files, and
configuration files in integration tests are not encrypted.

As FileCache is not used to access instance files, it only supports
handling sensitive files and can't be used to access unencrypted files.

As the PBC CLI tool can be used to dump both encrypted (metadata) and
unencrypted (instance) files, it needs to be able to determine whether
a file is encrypted. As encryption headers are not yet implemented, I
introduced a hack which checks the file name and treats the file as
unencrypted if it ends with "instance" and encrypted otherwise.
I ran some benchmarks to compare running Kudu with encryption enabled
and disabled.

The following are StartupBenchmark tests run with KUDU_ALLOW_SLOW_TESTS
set to true, which uses a block count of 1,000,000. It seems enabling
encryption adds around 20% overhead on startup in a typical use case
with no deletes. All tests below were run in release mode.
Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/0' (10 runs):

      40391.075316 task-clock (msec)    #    2.021 CPUs utilized            ( +-  1.05% )
            11,089 context-switches     #    0.275 K/sec                    ( +-  9.87% )
               280 cpu-migrations       #    0.007 K/sec                    ( +-  1.58% )
           593,982 page-faults          #    0.015 M/sec                    ( +-  2.13% )
   110,595,311,391 cycles               #    2.738 GHz                      ( +-  1.03% )
    90,580,214,722 instructions         #    0.82  insn per cycle           ( +-  0.14% )
    16,449,237,957 branches             #  407.249 M/sec                    ( +-  0.15% )
        67,169,915 branch-misses        #    0.41% of all branches          ( +-  0.49% )

      19.988553457 seconds time elapsed                                     ( +-  0.58% )
Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/0' (10 runs):

      51317.845606 task-clock (msec)    #    2.133 CPUs utilized            ( +-  0.90% )
            13,214 context-switches     #    0.257 K/sec                    ( +-  4.03% )
               292 cpu-migrations       #    0.006 K/sec                    ( +-  1.76% )
           737,815 page-faults          #    0.014 M/sec                    ( +-  1.49% )
   144,898,246,536 cycles               #    2.824 GHz                      ( +-  1.08% )
   126,702,271,070 instructions         #    0.87  insn per cycle           ( +-  0.05% )
    24,116,649,584 branches             #  469.947 M/sec                    ( +-  0.05% )
       106,793,688 branch-misses        #    0.44% of all branches          ( +-  0.35% )

      24.055824830 seconds time elapsed                                     ( +-  0.89% )
With deletes, the difference seems to decrease to about 14% when 90% of
the blocks are deleted.
Performance counter stats for './bin/log_block_manager-test --gtest_filter=*StartupBenchmark/1' (10 runs):

      53247.212289 task-clock (msec)    #    1.494 CPUs utilized            ( +-  0.69% )
            94,868 context-switches     #    0.002 M/sec                    ( +-  0.13% )
               530 cpu-migrations       #    0.010 K/sec                    ( +-  1.48% )
           399,284 page-faults          #    0.007 M/sec                    ( +-  1.66% )
   145,147,457,046 cycles               #    2.726 GHz                      ( +-  0.48% )
   141,892,983,444 instructions         #    0.98  insn per cycle           ( +-  0.04% )
    26,167,495,753 branches             #  491.434 M/sec                    ( +-  0.04% )
        59,986,442 branch-misses        #    0.23% of all branches          ( +-  0.33% )

      35.648681894 seconds time elapsed                                     ( +-  1.40% )
Performance counter stats for './bin/log_block_manager-test --encrypt_data_at_rest=1 --gtest_filter=*StartupBenchmark/1' (10 runs):

      70616.598642 task-clock (msec)    #    1.737 CPUs utilized            ( +-  0.81% )
            95,082 context-switches     #    0.001 M/sec                    ( +-  0.28% )
               523 cpu-migrations       #    0.007 K/sec                    ( +-  1.69% )
           679,834 page-faults          #    0.010 M/sec                    ( +-  1.66% )
   203,066,615,244 cycles               #    2.876 GHz                      ( +-  1.05% )
   209,355,734,267 instructions         #    1.03  insn per cycle           ( +-  0.08% )
    40,477,560,095 branches             #  573.202 M/sec                    ( +-  0.07% )
       133,637,310 branch-misses        #    0.33% of all branches          ( +-  1.48% )

      40.653406472 seconds time elapsed                                     ( +-  1.52% )
The delete tablet benchmark takes less than a second to run, so I ran it
1000 times with encryption disabled and enabled. It seems encryption
adds about 30% overhead in this case.
Performance counter stats for './bin/tablet_server-test --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        735.800649 task-clock (msec)    #    0.994 CPUs utilized            ( +-  0.33% )
             3,613 context-switches     #    0.005 M/sec                    ( +-  0.15% )
               178 cpu-migrations       #    0.242 K/sec                    ( +-  0.29% )
            10,722 page-faults          #    0.015 M/sec                    ( +-  0.08% )
     1,316,404,469 cycles               #    1.789 GHz                      ( +-  0.19% )
     1,629,691,550 instructions         #    1.24  insn per cycle           ( +-  0.21% )
       337,778,107 branches             #  459.062 M/sec                    ( +-  0.19% )
         6,340,956 branch-misses        #    1.88% of all branches          ( +-  0.21% )

       0.739940005 seconds time elapsed                                     ( +-  2.33% )
Performance counter stats for './bin/tablet_server-test --encrypt_data_at_rest=1 --gtest_filter=TabletServerTest.TestDeleteTabletBenchmark' (1000 runs):

        769.368354 task-clock (msec)    #    0.792 CPUs utilized            ( +-  0.34% )
             3,633 context-switches     #    0.005 M/sec                    ( +-  0.13% )
               183 cpu-migrations       #    0.238 K/sec                    ( +-  0.29% )
            10,737 page-faults          #    0.014 M/sec                    ( +-  0.07% )
     1,356,327,815 cycles               #    1.763 GHz                      ( +-  0.14% )
     1,635,206,270 instructions         #    1.21  insn per cycle           ( +-  0.06% )
       338,261,840 branches             #  439.662 M/sec                    ( +-  0.06% )
         6,486,125 branch-misses        #    1.92% of all branches          ( +-  0.21% )

       0.971974609 seconds time elapsed                                     ( +-  2.42% )
I also wanted to run dense_node-itest with -num_seconds=240 and
-num_tablets=1000, but the amount of data written (both in terms of
number of blocks and bytes) was all over the place, and so was the time
spent reopening the tablet server (seemingly without correlation to the
amount of data under management by the LBM). This was true both with
and without encryption enabled, so it's hard to draw any conclusions
from this benchmark.
$ for x in {0..1}; do
    perf stat -r 5 --log-fd=3 ./bin/dense_node-itest \
        -num_tablets=1000 -num_seconds=240 \
        --gtest_filter=DenseNodeTest.RunTest/$x \
        3>&1 2> >(grep "dense_node-itest.cc") > >(grep "====")
  done
[==========] Running 1 test from 1 test suite.
I1110 05:23:51.307804 107446 dense_node-itest.cc:223] Time spent restarting
master: real 0.083s user 0.000s sys 0.001s
I1110 05:24:19.118248 107446 dense_node-itest.cc:226] Time spent restarting
tserver: real 27.810s user 0.017s sys 0.100s
I1110 05:24:19.118268 107446 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 05:24:19.465764 107446 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 38136
I1110 05:24:19.465798 107446 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 250252129
I1110 05:24:19.465804 107446 dense_node-itest.cc:242]
log_block_manager_containers: 9892
I1110 05:24:19.465808 107446 dense_node-itest.cc:242]
log_block_manager_full_containers: 4017
I1110 05:24:19.465812 107446 dense_node-itest.cc:242] threads_running: 1334
[==========] 1 test from 1 test suite ran. (608499 ms total)
[==========] Running 1 test from 1 test suite.
I1110 05:33:58.751773 121520 dense_node-itest.cc:223] Time spent restarting
master: real 0.053s user 0.000s sys 0.001s
I1110 05:41:09.045049 121520 dense_node-itest.cc:226] Time spent restarting
tserver: real 430.293s user 0.291s sys 1.523s
I1110 05:41:09.045073 121520 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 05:41:09.257555 121520 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 29376
I1110 05:41:09.257593 121520 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 199511663
I1110 05:41:09.257599 121520 dense_node-itest.cc:242]
log_block_manager_containers: 11464
I1110 05:41:09.257604 121520 dense_node-itest.cc:242]
log_block_manager_full_containers: 1319
I1110 05:41:09.257608 121520 dense_node-itest.cc:242] threads_running: 226
[==========] 1 test from 1 test suite ran. (947330 ms total)
[==========] Running 1 test from 1 test suite.
I1110 05:50:01.919380 143199 dense_node-itest.cc:223] Time spent restarting
master: real 0.094s user 0.000s sys 0.003s
I1110 05:51:13.537168 143199 dense_node-itest.cc:226] Time spent restarting
tserver: real 71.618s user 0.057s sys 0.249s
I1110 05:51:13.537186 143199 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 05:51:13.581818 143199 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 32557
I1110 05:51:13.581848 143199 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 189720720
I1110 05:51:13.581853 143199 dense_node-itest.cc:242]
log_block_manager_containers: 9910
I1110 05:51:13.581857 143199 dense_node-itest.cc:242]
log_block_manager_full_containers: 2070
I1110 05:51:13.581861 143199 dense_node-itest.cc:242] threads_running: 225
[==========] 1 test from 1 test suite ran. (630376 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:00:18.327955 161886 dense_node-itest.cc:223] Time spent restarting
master: real 0.094s user 0.001s sys 0.001s
I1110 06:00:36.625900 161886 dense_node-itest.cc:226] Time spent restarting
tserver: real 18.298s user 0.011s sys 0.063s
I1110 06:00:36.625918 161886 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 06:00:36.779762 161886 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 34068
I1110 06:00:36.779801 161886 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 236467612
I1110 06:00:36.779808 161886 dense_node-itest.cc:242]
log_block_manager_containers: 9880
I1110 06:00:36.779814 161886 dense_node-itest.cc:242]
log_block_manager_full_containers: 1410
I1110 06:00:36.779817 161886 dense_node-itest.cc:242] threads_running: 1332
[==========] 1 test from 1 test suite ran. (481328 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:08:12.651594 176648 dense_node-itest.cc:223] Time spent restarting
master: real 0.084s user 0.001s sys 0.002s
I1110 06:09:14.436517 176648 dense_node-itest.cc:226] Time spent restarting
tserver: real 61.785s user 0.045s sys 0.222s
I1110 06:09:14.436537 176648 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 06:09:14.732786 176648 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 32334
I1110 06:09:14.732823 176648 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 225235870
I1110 06:09:14.732829 176648 dense_node-itest.cc:242]
log_block_manager_containers: 10119
I1110 06:09:14.732833 176648 dense_node-itest.cc:242]
log_block_manager_full_containers: 5070
I1110 06:09:14.732837 176648 dense_node-itest.cc:242] threads_running: 1313
[==========] 1 test from 1 test suite ran. (506357 ms total)
Performance counter stats for './bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/0' (5 runs):

    2995824.938107 task-clock (msec)    #    4.719 CPUs utilized            ( +-  5.08% )
        51,210,606 context-switches     #    0.017 M/sec                    ( +-  8.77% )
        10,794,492 cpu-migrations       #    0.004 M/sec                    ( +- 12.18% )
         5,334,419 page-faults          #    0.002 M/sec                    ( +-  9.46% )
 6,394,831,534,947 cycles               #    2.135 GHz                      ( +-  9.25% )
 3,981,645,181,601 instructions         #    0.62  insn per cycle           ( +-  9.86% )
   792,397,266,216 branches             #  264.501 M/sec                    ( +-  9.75% )
     6,963,950,058 branch-misses        #    0.88% of all branches          ( +-  7.25% )

     634.804501857 seconds time elapsed                                     ( +- 13.11% )
[==========] Running 1 test from 1 test suite.
I1110 06:16:46.444833 194734 dense_node-itest.cc:223] Time spent restarting
master: real 0.075s user 0.000s sys 0.002s
I1110 06:17:29.511780 194734 dense_node-itest.cc:226] Time spent restarting
tserver: real 43.067s user 0.031s sys 0.151s
I1110 06:17:29.511799 194734 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 06:17:29.575773 194734 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 33354
I1110 06:17:29.575809 194734 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 162500445
I1110 06:17:29.575814 194734 dense_node-itest.cc:242]
log_block_manager_containers: 9946
I1110 06:17:29.575817 194734 dense_node-itest.cc:242]
log_block_manager_full_containers: 2847
I1110 06:17:29.575821 194734 dense_node-itest.cc:242] threads_running: 228
[==========] 1 test from 1 test suite ran. (468746 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:24:34.692034 14506 dense_node-itest.cc:223] Time spent restarting
master: real 4.389s user 0.003s sys 0.017s
I1110 06:25:16.629412 14506 dense_node-itest.cc:226] Time spent restarting
tserver: real 41.937s user 0.027s sys 0.146s
I1110 06:25:16.629434 14506 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 06:25:22.050498 14506 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 47634
I1110 06:25:22.050537 14506 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 409166624
I1110 06:25:22.050544 14506 dense_node-itest.cc:242]
log_block_manager_containers: 9779
I1110 06:25:22.050549 14506 dense_node-itest.cc:242]
log_block_manager_full_containers: 3405
I1110 06:25:22.050552 14506 dense_node-itest.cc:242] threads_running: 1342
[==========] 1 test from 1 test suite ran. (434683 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:31:46.989178 26886 dense_node-itest.cc:223] Time spent restarting
master: real 0.093s user 0.000s sys 0.003s
I1110 06:32:04.775068 26886 dense_node-itest.cc:226] Time spent restarting
tserver: real 17.786s user 0.010s sys 0.048s
I1110 06:32:04.775091 26886 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 06:32:04.830790 26886 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 34068
I1110 06:32:04.830832 26886 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 225249875
I1110 06:32:04.830838 26886 dense_node-itest.cc:242]
log_block_manager_containers: 10352
I1110 06:32:04.830842 26886 dense_node-itest.cc:242]
log_block_manager_full_containers: 1198
I1110 06:32:04.830849 26886 dense_node-itest.cc:242] threads_running: 1307
[==========] 1 test from 1 test suite ran. (401113 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:38:28.355651 39523 dense_node-itest.cc:223] Time spent restarting
master: real 0.348s user 0.001s sys 0.003s
I1110 06:39:14.934329 39523 dense_node-itest.cc:226] Time spent restarting
tserver: real 46.579s user 0.034s sys 0.161s
I1110 06:39:14.934355 39523 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 06:39:15.086436 39523 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 42942
I1110 06:39:15.086474 39523 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 337224147
I1110 06:39:15.086480 39523 dense_node-itest.cc:242]
log_block_manager_containers: 10706
I1110 06:39:15.086484 39523 dense_node-itest.cc:242]
log_block_manager_full_containers: 2511
I1110 06:39:15.086489 39523 dense_node-itest.cc:242] threads_running: 1326
[==========] 1 test from 1 test suite ran. (542365 ms total)
[==========] Running 1 test from 1 test suite.
I1110 06:47:30.400804 54331 dense_node-itest.cc:223] Time spent restarting
master: real 0.125s user 0.001s sys 0.002s
I1110 06:51:28.288359 54331 dense_node-itest.cc:226] Time spent restarting
tserver: real 237.888s user 0.160s sys 0.855s
I1110 06:51:28.288374 54331 dense_node-itest.cc:237] not waiting for
bootstrapping tablets (flag disabled)
I1110 06:51:28.633086 54331 dense_node-itest.cc:242]
log_block_manager_blocks_under_management: 35292
I1110 06:51:28.633126 54331 dense_node-itest.cc:242]
log_block_manager_bytes_under_management: 198352642
I1110 06:51:28.633132 54331 dense_node-itest.cc:242]
log_block_manager_containers: 10015
I1110 06:51:28.633139 54331 dense_node-itest.cc:242]
log_block_manager_full_containers: 2514
I1110 06:51:28.633143 54331 dense_node-itest.cc:242] threads_running: 1300
[==========] 1 test from 1 test suite ran. (709361 ms total)
Performance counter stats for './bin/dense_node-itest -num_tablets=1000 -num_seconds=240 --gtest_filter=DenseNodeTest.RunTest/1' (5 runs):

    3115638.620173 task-clock (msec)    #    6.094 CPUs utilized            ( +-  2.07% )
        50,595,541 context-switches     #    0.016 M/sec                    ( +-  7.87% )
        10,972,126 cpu-migrations       #    0.004 M/sec                    ( +-  9.15% )
         5,985,924 page-faults          #    0.002 M/sec                    ( +- 10.89% )
 6,073,807,994,303 cycles               #    1.949 GHz                      ( +- 10.17% )
 4,094,332,719,732 instructions         #    0.67  insn per cycle           ( +-  7.76% )
   807,628,648,546 branches             #  259.218 M/sec                    ( +-  7.83% )
     7,158,875,684 branch-misses        #    0.89% of all branches          ( +-  6.38% )

     511.281338094 seconds time elapsed                                     ( +- 10.71% )
Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log_index.cc
M src/kudu/consensus/log_util.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/dir_manager.cc
M src/kudu/fs/dir_util.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/dense_node-itest.cc
M src/kudu/integration-tests/mini_cluster_fs_inspector.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/external_mini_cluster.h
M src/kudu/postgres/mini_postgres.cc
M src/kudu/ranger/ranger_client.cc
M src/kudu/security/test/mini_kdc.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_pbc.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env.cc
M src/kudu/util/env.h
M src/kudu/util/env_posix.cc
M src/kudu/util/env_util.cc
M src/kudu/util/file_cache-test.cc
M src/kudu/util/file_cache.cc
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
M src/kudu/util/rolling_log.cc
M src/kudu/util/yamlreader-test.cc
41 files changed, 458 insertions(+), 188 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/17974/9
--
To view, visit http://gerrit.cloudera.org:8080/17974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I909d0c4af0c1fca0d14c99a6627842dbe2ed7524
Gerrit-Change-Number: 17974
Gerrit-PatchSet: 9
Gerrit-Owner: Attila Bukor <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)