[jira] [Commented] (KUDU-636) optimization: we spend a lot of time in alloc/free
[ https://issues.apache.org/jira/browse/KUDU-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155725#comment-17155725 ]

ASF subversion and git services commented on KUDU-636:
------------------------------------------------------

Commit a600f386aa2c341522638acb9af53fd45c469431 in kudu's branch refs/heads/master from Todd Lipcon
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=a600f38 ]

KUDU-636. Use Arena for EncodedKeys

This updates EncodedKeyBuilder, RowSetKeyProbe, and EncodedKey to always
allocate from an Arena instead of from the heap. This reduces allocator
contention on the write path significantly and improves memory locality.

I measured by running a tserver under 'perf stat' while using perf
loadgen to insert 80M rows total using 8 client threads. The CPU time on
the tserver was reduced by about 20%.

Before:

 Performance counter stats for './build/latest/bin/kudu tserver run -fs-wal-dir /tmp/ts':

        269853.10 msec task-clock                #    6.862 CPUs utilized
           293066      context-switches          #    0.001 M/sec
            44541      cpu-migrations            #    0.165 K/sec
          2846435      page-faults               #    0.011 M/sec
    1110190206891      cycles                    #    4.114 GHz                      (83.33%)
     201895623339      stalled-cycles-frontend   #   18.19% frontend cycles idle     (83.33%)
     137095475307      stalled-cycles-backend    #   12.35% backend cycles idle      (83.32%)
     894201276095      instructions              #    0.81  insn per cycle
                                                 #    0.23  stalled cycles per insn  (83.33%)
     159095264762      branches                  #  589.562 M/sec                    (83.35%)
        639216492      branch-misses             #    0.40% of all branches          (83.35%)

    255.178068000 seconds user
     14.913394000 seconds sys

After:

 Performance counter stats for './build/latest/bin/kudu tserver run -fs-wal-dir /tmp/ts':

        227730.62 msec task-clock                #    6.212 CPUs utilized
           263824      context-switches          #    0.001 M/sec
            45470      cpu-migrations            #    0.200 K/sec
          3165436      page-faults               #    0.014 M/sec
     931840588715      cycles                    #    4.092 GHz                      (83.25%)
     183214671009      stalled-cycles-frontend   #   19.66% frontend cycles idle     (83.40%)
     111864991317      stalled-cycles-backend    #   12.00% backend cycles idle      (83.35%)
     832636863971      instructions              #    0.89  insn per cycle
                                                 #    0.22  stalled cycles per insn  (83.40%)
     148228107120      branches                  #  650.892 M/sec                    (83.24%)
        563344647      branch-misses             #    0.38% of all branches          (83.35%)

    211.361472000 seconds user
     16.635265000 seconds sys

Change-Id: Ib46d0e2c31e03a7f319ceb0bf742e08ff74d7683
Reviewed-on: http://gerrit.cloudera.org:8080/16162
Reviewed-by: Alexey Serbin
Tested-by: Todd Lipcon

> optimization: we spend a lot of time in alloc/free
> --------------------------------------------------
>
>                 Key: KUDU-636
>                 URL: https://issues.apache.org/jira/browse/KUDU-636
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf
>    Affects Versions: Public beta
>            Reporter: Todd Lipcon
>            Priority: Major
>
> Looking at a workload in the cluster, several of the top 10 lines of perf
> report are tcmalloc-related. It seems like we don't do a good job of making
> use of the per-thread free-lists, and we end up in a lot of contention on the
> central free list. There are a few low-hanging fruit things we could do to
> improve this for a likely perf boost.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
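As a quick cross-check of the figures quoted in the commit message above, the relative change in each counter can be worked out directly from the two 'perf stat' runs. This is only a small arithmetic sketch: the dictionary names and the `reduction` helper are illustrative and are not part of Kudu or perf.

```python
# Counter values copied from the "Before" and "After" 'perf stat' runs
# quoted in the commit message.
before = {
    "task_clock_msec": 269853.10,
    "cycles": 1110190206891,
    "instructions": 894201276095,
    "user_seconds": 255.178068,
}
after = {
    "task_clock_msec": 227730.62,
    "cycles": 931840588715,
    "instructions": 832636863971,
    "user_seconds": 211.361472,
}


def reduction(old, new):
    """Fractional decrease from old to new (0.25 means 25% lower)."""
    return (old - new) / old


# Relative decrease per counter, keyed by counter name.
reductions = {name: reduction(before[name], after[name]) for name in before}

for name, value in reductions.items():
    print(f"{name}: {value:.1%} lower")
```

With these inputs the task-clock comes out roughly 16% lower and user time roughly 17% lower, in the same ballpark as the "about 20%" stated in the commit message.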
[jira] [Updated] (KUDU-3119) ToolTest.TestFsAddRemoveDataDirEndToEnd reports race under TSAN
[ https://issues.apache.org/jira/browse/KUDU-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Serbin updated KUDU-3119:
--------------------------------
    Attachment: kudu-tool-test.3.txt.xz

> ToolTest.TestFsAddRemoveDataDirEndToEnd reports race under TSAN
> ---------------------------------------------------------------
>
>                 Key: KUDU-3119
>                 URL: https://issues.apache.org/jira/browse/KUDU-3119
>             Project: Kudu
>          Issue Type: Bug
>          Components: CLI, test
>            Reporter: Alexey Serbin
>            Priority: Minor
>         Attachments: kudu-tool-test.20200709.txt.xz, kudu-tool-test.3.txt.xz, kudu-tool-test.log.xz
>
> Sometimes the {{TestFsAddRemoveDataDirEndToEnd}} scenario of the {{ToolTest}}
> reports races for TSAN builds:
> {noformat}
> /data0/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-tool-test.cc:266: Failure
> Failed
> Bad status: Runtime error: /tmp/dist-test-taskIZqSmU/build/tsan/bin/kudu: process exited with non-zero status 66
> Google Test trace:
> /data0/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/kudu-tool-test.cc:265: W0506 17:56:02.744191 4432 flags.cc:404] Enabled unsafe flag: --never_fsync=true
> I0506 17:56:02.780252 4432 fs_manager.cc:263] Metadata directory not provided
> I0506 17:56:02.780442 4432 fs_manager.cc:269] Using write-ahead log directory (fs_wal_dir) as metadata directory
> I0506 17:56:02.789638 4432 fs_manager.cc:399] Time spent opening directory manager: real 0.007s user 0.005s sys 0.002s
> I0506 17:56:02.789986 4432 env_posix.cc:1676] Not raising this process' open files per process limit of 1048576; it is already as high as it can go
> I0506 17:56:02.790426 4432 file_cache.cc:465] Constructed file cache lbm with capacity 419430
> ==================
> WARNING: ThreadSanitizer: data race (pid=4432)
> ...
> {noformat}
> The log is attached.
[jira] [Commented] (KUDU-3119) ToolTest.TestFsAddRemoveDataDirEndToEnd reports race under TSAN
[ https://issues.apache.org/jira/browse/KUDU-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155693#comment-17155693 ]

Alexey Serbin commented on KUDU-3119:
-------------------------------------

One more TSAN trace: [^kudu-tool-test.3.txt.xz]
[jira] [Commented] (KUDU-2200) Sanity-check that users specify the right number of masters when connecting
[ https://issues.apache.org/jira/browse/KUDU-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155691#comment-17155691 ]

Bankim Bhavsar commented on KUDU-2200:
--------------------------------------

Noting the commit: https://github.com/apache/kudu/commit/3e86797e6e73365c26b8826083c447494c7fa7eb

> Sanity-check that users specify the right number of masters when connecting
> ---------------------------------------------------------------------------
>
>                 Key: KUDU-2200
>                 URL: https://issues.apache.org/jira/browse/KUDU-2200
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client, master, supportability
>    Affects Versions: 1.6.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Major
>             Fix For: 1.6.0
>
> A common issue I've seen is that users set up an HA master setup (3 masters)
> but then in various cases only specify one of the masters when they try to
> connect using the client. This currently will work if it happens that they
> picked the leader master, and otherwise will return a somewhat confusing "no
> leader" error message.
> We should improve usability here by having the master send back a list of the
> master addresses in the case that it isn't the leader, and the client can
> use this to provide a more actionable error message like "Client connection
> specified only a subset of the cluster's masters" or somesuch.
> I wouldn't want to automatically reconfigure the client and reconnect
> because this puts the client in a configuration state that will fail once the
> one master they specified goes down.
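The check proposed in the issue description above might look roughly like the following. This is a minimal sketch under stated assumptions, not the Kudu client's actual code: `check_master_addresses` and its parameters are hypothetical names, and `cluster_masters` stands in for the address list a non-leader master would send back.

```python
def check_master_addresses(specified, cluster_masters):
    """Return None if the client listed every master, else an error string.

    Both arguments are iterables of "host:port" strings. `cluster_masters`
    is a stand-in (hypothetical) for the list reported by a master.
    Addresses are compared as plain strings; no DNS resolution is attempted.
    """
    specified_set = set(specified)
    cluster_set = set(cluster_masters)
    missing = sorted(cluster_set - specified_set)
    if not missing:
        return None
    return (
        "Client connection specified only a subset of the cluster's masters: "
        f"specified {len(specified_set)} of {len(cluster_set)}; "
        f"missing {', '.join(missing)}"
    )


# Example: the client was configured with one of three masters.
error = check_master_addresses(
    ["m1:7051"], ["m1:7051", "m2:7051", "m3:7051"]
)
print(error)
```

Note that the sketch only reports the mismatch; in line with the issue description, it does not reconfigure the client with the full list.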
[jira] [Commented] (KUDU-3154) RangerClientTestBase.TestLogging sometimes fails
[ https://issues.apache.org/jira/browse/KUDU-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155663#comment-17155663 ]

ASF subversion and git services commented on KUDU-3154:
-------------------------------------------------------

Commit 5291221ecea6b892364720342eb15c64bcf41e95 in kudu's branch refs/heads/master from Andrew Wong
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=5291221 ]

KUDU-3154: add policy to Ranger before authorizing

For whatever reason, in some environments, we've seen the Kudu Ranger
plugin hang when downloading policies. It isn't exactly clear why this
happens but one workaround we've seen is to add a policy to Ranger before
authorizing any requests.

This patch adds the workaround to the only Ranger test that suffers from
this.

Change-Id: Ice0b3842335e5854835542e18b41c54509c2fc33
Reviewed-on: http://gerrit.cloudera.org:8080/16166
Reviewed-by: Alexey Serbin
Tested-by: Kudu Jenkins
Reviewed-by: Attila Bukor

> RangerClientTestBase.TestLogging sometimes fails
> ------------------------------------------------
>
>                 Key: KUDU-3154
>                 URL: https://issues.apache.org/jira/browse/KUDU-3154
>             Project: Kudu
>          Issue Type: Bug
>          Components: ranger, test
>    Affects Versions: 1.13.0
>            Reporter: Alexey Serbin
>            Priority: Major
>         Attachments: kudu-3154_jstacks.txt, ranger_client-test.txt, ranger_client-test.txt.xz
>
> The {{RangerClientTestBase.TestLogging}} scenario of the
> {{ranger_client-test}} sometimes fails (all types of builds) with error
> message like below:
> {noformat}
> src/kudu/ranger/ranger_client-test.cc:398: Failure
> Failed
> Bad status: Timed out: timed out while in flight
> I0620 07:06:02.907177 1140 server.cc:247] Received an EOF from the subprocess
> I0620 07:06:02.910923 1137 server.cc:317] get failed, inbound queue shut down: Aborted:
> I0620 07:06:02.910964 1141 server.cc:380] outbound queue shut down: Aborted:
> I0620 07:06:02.910995 1138 server.cc:317] get failed, inbound queue shut down: Aborted:
> I0620 07:06:02.910984 1139 server.cc:317] get failed, inbound queue shut down: Aborted:
> {noformat}
> The log is attached.
[jira] [Resolved] (KUDU-2857) Rewrite docker build script in python
[ https://issues.apache.org/jira/browse/KUDU-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke resolved KUDU-2857.
-------------------------------
    Fix Version/s: 1.13.0
       Resolution: Fixed

> Rewrite docker build script in python
> -------------------------------------
>
>                 Key: KUDU-2857
>                 URL: https://issues.apache.org/jira/browse/KUDU-2857
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: Grant Henke
>            Assignee: Grant Henke
>            Priority: Major
>              Labels: build, docker
>             Fix For: 1.13.0
>
> The docker build bash script has gotten sufficiently complicated that it
> should be rewritten in python.
[jira] [Commented] (KUDU-2857) Rewrite docker build script in python
[ https://issues.apache.org/jira/browse/KUDU-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155608#comment-17155608 ]

ASF subversion and git services commented on KUDU-2857:
-------------------------------------------------------

Commit 341605597f7ed1ab7a36bfefb2e181102c173302 in kudu's branch refs/heads/master from Grant Henke
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=3416055 ]

[docker] KUDU-2857: Rewrite docker build script in python

This patch rewrites the Docker script in Python while maintaining
the same functionality. Not only will this make future development
easier, but it makes usage more intuitive as well.

Now instead of using environment variables, flags are used to
configure the build. Additionally the flags are validated against
valid values and `--help` output is available to show all the options.

I also added basic resource validation checks to ensure users have
configured enough CPU and memory resources to have a successful build
and fail fast if they don’t.

Change-Id: Ie678967a6b64f53682c636b8d1cdcb2f5467e608
Reviewed-on: http://gerrit.cloudera.org:8080/16161
Reviewed-by: Attila Bukor
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin
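The pattern described in the commit message above (flags validated against a set of allowed values, generated `--help` output, and a fail-fast resource check) can be sketched with Python's standard `argparse` module. The flag names, the list of base images, and the CPU threshold below are all hypothetical placeholders for illustration; they are not the options of the actual Kudu Docker build script.

```python
import argparse

# Hypothetical values for illustration; the real script defines its own.
VALID_BASES = ["centos:7", "ubuntu:bionic"]
MIN_CPUS = 4  # assumed threshold, not taken from the Kudu script


def parse_args(argv=None):
    """Parse build flags; argparse rejects values outside `choices`
    and generates the --help text from the help strings."""
    parser = argparse.ArgumentParser(description="Build Docker images (sketch).")
    parser.add_argument("--base", choices=VALID_BASES, default=VALID_BASES[0],
                        help="base OS image to build from")
    parser.add_argument("--skip-resource-check", action="store_true",
                        help="do not verify available CPUs before building")
    return parser.parse_args(argv)


def check_resources(cpus, min_cpus=MIN_CPUS):
    """Fail fast if too few CPUs are available for a successful build."""
    if cpus < min_cpus:
        raise SystemExit(f"need at least {min_cpus} CPUs, found {cpus}")


# Example: parse an explicit argument list instead of sys.argv.
args = parse_args(["--base", "ubuntu:bionic"])
print(f"would build from {args.base}")
```

Passing a value outside `choices` makes `argparse` print a usage error and exit with a non-zero status, which is the validation behaviour the commit message refers to.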
[jira] [Commented] (KUDU-3090) Add owner concept in Kudu
[ https://issues.apache.org/jira/browse/KUDU-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155503#comment-17155503 ]

ASF subversion and git services commented on KUDU-3090:
-------------------------------------------------------

Commit 59813014099a3c433734c7c58268ddfb80d69055 in kudu's branch refs/heads/master from Attila Bukor
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=5981301 ]

KUDU-3090 Ownership support in Java client

Change-Id: I083ad9750ce1b3ae31bb510b700d1204fcdf291d
Reviewed-on: http://gerrit.cloudera.org:8080/16125
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke

> Add owner concept in Kudu
> -------------------------
>
>                 Key: KUDU-3090
>                 URL: https://issues.apache.org/jira/browse/KUDU-3090
>             Project: Kudu
>          Issue Type: New Feature
>          Components: authz, security
>            Reporter: Hao Hao
>            Assignee: Attila Bukor
>            Priority: Major
>              Labels: roadmap-candidate
>
> As mentioned in the Ranger integration design doc, Ranger supports ownership
> privilege by creating a default policy that allows \{OWNER} of a resource to
> access it without creating additional policy manually. Unless Kudu actually
> has a full support for owner, ownership privilege is not possible with Ranger
> integration.
[jira] [Commented] (KUDU-3090) Add owner concept in Kudu
[ https://issues.apache.org/jira/browse/KUDU-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155504#comment-17155504 ]

ASF subversion and git services commented on KUDU-3090:
-------------------------------------------------------

Commit fad779cf7a73722fe670d4eee2f67b3691602bf0 in kudu's branch refs/heads/master from Attila Bukor
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=fad779c ]

KUDU-3090 Support backing up ownership info

Change-Id: I963db0a36cd4b7f080944ed46fc4119b1e055143
Reviewed-on: http://gerrit.cloudera.org:8080/16126
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke
[jira] [Commented] (KUDU-3090) Add owner concept in Kudu
[ https://issues.apache.org/jira/browse/KUDU-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155505#comment-17155505 ]

ASF subversion and git services commented on KUDU-3090:
-------------------------------------------------------

Commit 880ffe4db635e11e45b22256c7626c80b81ccbc0 in kudu's branch refs/heads/master from Attila Bukor
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=880ffe4 ]

KUDU-3090 Add delegate admin privilege

Database systems usually require ALL WITH GRANT OPTION privilege to
change ownership of a table. Apache Ranger supports a special
quasi-access type called "delegate admin" which grants the user
permissions to grant and revoke permissions which is similar to GRANT
OPTION in other systems.

This patch adds support for this delegate option flag which is required
in addition to "ALL" to create a table with a different owner. It also
refactors the Ranger subprocess to allow evaluating ALL and delegate
admin in a single RangerRequestPB.

Change-Id: If8ba018dac568a1ab74cf2d5657221579636ac1c
Reviewed-on: http://gerrit.cloudera.org:8080/16071
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke
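The rule stated in the commit message above, that creating a table with a different owner needs both "ALL" and the delegate admin flag, can be illustrated with a small sketch. The `Grant` type and the `may_set_different_owner` function are hypothetical names invented for this example; they do not reflect Kudu's or Ranger's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """Hypothetical stand-in for what a policy gives a user on a resource."""
    actions: frozenset            # e.g. frozenset({"ALL"})
    delegate_admin: bool = False  # Ranger's "delegate admin" quasi-access type


def may_set_different_owner(grant):
    """Both conditions are evaluated together, mirroring the idea of
    checking ALL and delegate admin in a single request."""
    return "ALL" in grant.actions and grant.delegate_admin


# Example: ALL alone is not enough; ALL plus delegate admin is.
print(may_set_different_owner(Grant(frozenset({"ALL"}))))
print(may_set_different_owner(Grant(frozenset({"ALL"}), delegate_admin=True)))
```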
[jira] [Commented] (KUDU-2612) Implement multi-row transactions
[ https://issues.apache.org/jira/browse/KUDU-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155221#comment-17155221 ]

ASF subversion and git services commented on KUDU-2612:
-------------------------------------------------------

Commit efd8c4f165460b7fa337b8ebd1856b10bc274311 in kudu's branch refs/heads/master from Andrew Wong
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=efd8c4f ]

KUDU-2612 p2: introduce transaction status management

This introduces the TxnStatusManager, which is backed by the
TxnStatusTablet that exposes the following APIs that will be called via
RPC, and will serve as many of the building blocks for orchestrating
two-phase commit:
- BeginTransaction: adds a new transaction under management of the
  TxnStatusManager
- BeginCommitTransaction: transitions the state of a transaction from
  OPEN to COMMIT_IN_PROGRESS
- AbortTransaction: transitions the state of a transaction from OPEN or
  COMMIT_IN_PROGRESS to ABORTED
- RegisterParticipant: adds a participant to be associated with a
  specific transaction ID

For completeness sake w.r.t defining the transaction state's enums, the
following API is also added, which will be called by the
TxnStatusManager itself upon determining a transaction has been
completed.
- FinalizeCommitTransaction: transitions the state of a transaction from
  COMMIT_IN_PROGRESS to COMMITTED

This new abstraction mirrors that used by the CatalogManager, which uses
copy-on-write locking to protect concurrent access to metadata while
writes to the underlying TabletReplica (i.e. SysCatalogTable, or in this
case, TxnStatusTablet) are being replicated.

This is at least enough of a jumping off point that we can begin
plumbing this into the tablet servers and defining an RPC service around
it -- there are still no facilities to create a TxnStatusManager.

It should be noted that end-users will not call these methods directly,
but rather through some layer of indirection (e.g. clients won't request
a specific transaction ID, they'll just request to begin a transaction,
and some intermediary layer will be in charge of getting an appropriate
transaction ID). This should give us flexibility in changing the
TxnStatusManager's interface moving forward.

Change-Id: I371bb200cf65073ae3ac7cb311ab9a0b8344a636
Reviewed-on: http://gerrit.cloudera.org:8080/16044
Reviewed-by: Alexey Serbin
Tested-by: Andrew Wong

> Implement multi-row transactions
> --------------------------------
>
>                 Key: KUDU-2612
>                 URL: https://issues.apache.org/jira/browse/KUDU-2612
>             Project: Kudu
>          Issue Type: Task
>            Reporter: Mike Percy
>            Priority: Major
>              Labels: roadmap-candidate
>
> Tracking Jira to implement multi-row / multi-table transactions in Kudu.
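The state transitions listed in the commit message above can be modelled as a small in-memory state machine. This is only an illustrative sketch: the real TxnStatusManager is C++ code backed by a replicated tablet, and the class, method, and table names below (`TxnStatusSketch`, `apply`, `TRANSITIONS`) are hypothetical. Only the API names, the state names, and the allowed transitions are taken from the commit message.

```python
from enum import Enum


class TxnState(Enum):
    OPEN = "OPEN"
    COMMIT_IN_PROGRESS = "COMMIT_IN_PROGRESS"
    COMMITTED = "COMMITTED"
    ABORTED = "ABORTED"


# API name -> (allowed source states, resulting state), per the commit message.
TRANSITIONS = {
    "BeginCommitTransaction": ({TxnState.OPEN}, TxnState.COMMIT_IN_PROGRESS),
    "AbortTransaction": ({TxnState.OPEN, TxnState.COMMIT_IN_PROGRESS},
                         TxnState.ABORTED),
    "FinalizeCommitTransaction": ({TxnState.COMMIT_IN_PROGRESS},
                                  TxnState.COMMITTED),
}


class TxnStatusSketch:
    """In-memory toy model; no replication, locking, or persistence."""

    def __init__(self):
        self._states = {}        # txn_id -> TxnState
        self._participants = {}  # txn_id -> set of participant ids

    def begin_transaction(self, txn_id):
        if txn_id in self._states:
            raise ValueError(f"transaction {txn_id} already exists")
        self._states[txn_id] = TxnState.OPEN
        self._participants[txn_id] = set()

    def register_participant(self, txn_id, participant_id):
        # Raises KeyError for an unknown transaction ID.
        self._participants[txn_id].add(participant_id)

    def apply(self, txn_id, api_name):
        """Apply one of the state-changing APIs, rejecting illegal moves."""
        allowed, target = TRANSITIONS[api_name]
        current = self._states[txn_id]
        if current not in allowed:
            raise ValueError(f"{api_name} not allowed from {current.name}")
        self._states[txn_id] = target
        return target

    def state(self, txn_id):
        return self._states[txn_id]


# Example: the happy path of a two-phase commit for transaction 1.
manager = TxnStatusSketch()
manager.begin_transaction(1)
manager.register_participant(1, "tablet-a")
manager.apply(1, "BeginCommitTransaction")
manager.apply(1, "FinalizeCommitTransaction")
print(manager.state(1).name)
```

Keeping the allowed transitions in a single table makes it easy to see that COMMITTED and ABORTED are terminal in this model: no entry lists them as a source state.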