[
https://issues.apache.org/jira/browse/HDFS-12691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214722#comment-16214722
]
Yiqun Lin commented on HDFS-12691:
----------------------------------
Hi [~cheersyang], you raise up a good point, but also I have not a very good
idea in my mind. But even if we find a appropriate value for each task, it
seems too complex for users to understand and use.
When I read the codes, I found following place isn't right now:
bq. // Default container size is 5G and block size is 256MB, a full
container...
After HDFS-12543, we can create key without specfying size. That mean the real
size is used rather than default size. So in one container, there might be a
large number of blocks contained if we put lots of small size keys.
One thing I most want to say is that the value
{{SCMBlockDeletingService#BLOCK_DELETE_TX_PER_REQUEST_LIMIT}} can be increased
to {{OZONE_KEY_DELETING_LIMIT_PER_TASK_DEFAULT}}. From my understanding of
this, we use this limited umber to get txs. Because one tx is corresponding to
one block, we can set {{OZONE_KEY_DELETING_LIMIT_PER_TASK_DEFAULT}} to
{{SCMBlockDeletingService#BLOCK_DELETE_TX_PER_REQUEST_LIMIT}}.
Please correct me if I am wrong. Thanks.
> Ozone: Decrease interval time of SCMBlockDeletingService for improving the
> efficiency
> -------------------------------------------------------------------------------------
>
> Key: HDFS-12691
> URL: https://issues.apache.org/jira/browse/HDFS-12691
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone, performance
> Affects Versions: HDFS-7240
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
>
> After logging elapsed time of each block deletion task, found some places we
> can make some improvements after testing. The logging during testing:
> {noformat}
> 2017-10-20 17:02:55,168 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:02:56,169 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:02:56,451 [SCMBlockDeletingService#0] INFO
> utils.BackgroundService (BackgroundService.java:run(99)) - Running background
> service : SCMBlockDeletingService
> 2017-10-20 17:02:56,755 [KeyDeletingService#0] INFO utils.BackgroundService
> (BackgroundService.java:run(99)) - Running background service :
> KeyDeletingService
> 2017-10-20 17:02:56,758 [KeyDeletingService#1] INFO ksm.KeyDeletingService
> (KeyDeletingService.java:call(99)) - Found 11 to-delete keys in KSM
> 2017-10-20 17:02:56,775 [IPC Server handler 19 on 52342] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:deleteKeyBlocks(870)) - SCM is informed by
> KSM to delete 11 blocks
> 2017-10-20 17:02:57,182 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:02:57,640 [KeyDeletingService#1] INFO ksm.KeyDeletingService
> (KeyDeletingService.java:call(125)) - Number of key deleted from KSM DB:
> 11, task elapsed time: 885ms
> 2017-10-20 17:02:58,168 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:03:03,178 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> ...
> 2017-10-20 17:03:04,167 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:03:05,173 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:03:06,095 [BlockDeletingService#0] INFO
> utils.BackgroundService (BackgroundService.java:run(99)) - Running background
> service : BlockDeletingService
> 2017-10-20 17:03:06,095 [BlockDeletingService#0] INFO
> background.BlockDeletingService (BlockDeletingService.java:getTasks(109))
> - Plan to choose 10 containers for block deletion, actually returns 0 valid
> containers.
> 2017-10-20 17:03:06,171 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> ...
> 2017-10-20 17:03:54,279 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:03:55,267 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:03:56,282 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:03:56,461 [SCMBlockDeletingService#0] INFO
> utils.BackgroundService (BackgroundService.java:run(99)) - Running background
> service : SCMBlockDeletingService
> 2017-10-20 17:03:56,467 [SCMBlockDeletingService#1] INFO
> block.SCMBlockDeletingService (SCMBlockDeletingService.java:call(129)) -
> Totally added 11 delete blocks command for 1 datanodes, task elapsed time: 6ms
> 2017-10-20 17:03:57,265 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:03:57,645 [KeyDeletingService#0] INFO utils.BackgroundService
> (BackgroundService.java:run(99)) - Running background service :
> KeyDeletingService
> 2017-10-20 17:03:58,278 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:03:58,522 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:handle(87)) - Start to delete container
> blocks, TXIDs=[1(0),2(0),3(0),4(0),5(0),6(0),7(0),8(0),9(0),10(0),11(0)],
> numOfContainers=1, numOfBlocks=11
> 2017-10-20 17:03:58,851 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block 29352ed9-cee2-4e05-beee-50284ea300c2 to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:58,879 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block 88903455-756f-4640-8deb-069a5b2266fe to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:58,906 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block d2ec1e39-f768-4432-88a9-acc2c5d7a7f2 to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:58,934 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block 2785467a-f05c-49a1-ac9a-fc2819f73eba to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:58,962 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block c4b39df9-d350-4a02-92eb-8ab3719e70c7 to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:58,990 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block 0e85a4d3-2c6a-4822-9b9e-947451a8f645 to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:59,017 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block b022c3bb-059c-47ac-83ca-49ae281765fb to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:59,045 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block 4b76df3d-8b23-46f4-bbc1-98847e7c82bd to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:59,073 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block 6474e09e-4626-4352-89e5-2e11e6063203 to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:59,101 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block 4ec42696-5388-4869-958a-714b81e8e5fb to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:59,235 [Command processor thread] INFO
> commandhandler.DeleteBlocksCommandHandler
> (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168)) - Transited
> Block 2fdb437c-dc51-4d43-8dfb-9aff0c319712 to DELETING state in container
> 1bfa7356-b07c-4d25-97b2-e66eaff19584
> 2017-10-20 17:03:59,236 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=1 from block deletion log
> 2017-10-20 17:03:59,236 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=2 from block deletion log
> 2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=3 from block deletion log
> 2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=4 from block deletion log
> 2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=5 from block deletion log
> 2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=6 from block deletion log
> 2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=7 from block deletion log
> 2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=8 from block deletion log
> 2017-10-20 17:03:59,238 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=9 from block deletion log
> 2017-10-20 17:03:59,238 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=10 from block deletion log
> 2017-10-20 17:03:59,238 [IPC Server handler 3 on 52328] INFO
> scm.StorageContainerManager
> (StorageContainerManager.java:sendContainerBlocksDeletionACK(779)) -
> Purging TXID=11 from block deletion log
> 2017-10-20 17:03:59,265 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:00,279 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:01,266 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:02,266 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:03,268 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:04,278 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:05,275 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:06,099 [BlockDeletingService#1] INFO
> utils.BackgroundService (BackgroundService.java:run(99)) - Running background
> service : BlockDeletingService
> 2017-10-20 17:04:06,099 [BlockDeletingService#1] INFO
> background.BlockDeletingService (BlockDeletingService.java:getTasks(109))
> - Plan to choose 10 containers for block deletion, actually returns 1 valid
> containers.
> 2017-10-20 17:04:06,185 [BlockDeletingService#0] INFO
> background.BlockDeletingService (BlockDeletingService.java:call(224)) -
> Container: 1bfa7356-b07c-4d25-97b2-e66eaff19584, deleted blocks: 11, task
> elapsed time: 84ms
> 2017-10-20 17:04:06,265 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:07,271 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> 2017-10-20 17:04:08,269 [Datanode State Machine Thread - 0] INFO
> Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit
> for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
> {noformat}
> Now three deletion tasks reuse the same interval time for executing
> periodically, but actually this will lead some problem. From the elapsed time
> showed in log, service {{KeyDeletingService}} will cost the most time, then
> {{BlockDeletingService}} and {{SCMBlockDeletingService}} costs least time.
> We should make {{SCMBlockDeletingService}} executed frequently than other two
> service, then the service {{BlockDeletingService}} can work more effectively
> since it relies on {{SCMBlockDeletingService}}'s completion.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]