[jira] [Created] (HDDS-10909) OM stops
Attila Doroszlai created HDDS-10909: --- Summary: OM stops Key: HDDS-10909 URL: https://issues.apache.org/jira/browse/HDDS-10909 Project: Apache Ozone Issue Type: Bug Reporter: Attila Doroszlai Trying to run Ozone in a single Docker container using the command from [wiki|https://cwiki.apache.org/confluence/display/OZONE/Running+via+DockerHub]: {code} docker run -d -p 9878:9878 -p 9876:9876 apache/ozone {code} OM stops due to misconfiguration: {code} 2024-05-24 05:35:05 WARN ServerUtils:323 - Storage directory for Ratis is not configured. It is a good idea to map this to an SSD disk. Falling back to ozone.metadata.dirs 2024-05-24 05:35:05 WARN OzoneManagerRatisUtils:473 - ozone.om.ratis.snapshot.dir is not configured. Falling back to ozone.metadata.dirs config 2024-05-24 05:35:05 ERROR OzoneManagerStarter:76 - OM start failed with exception java.io.IOException: Ratis group Dir on disk tmp does not match with RaftGroupIDbf265839-605b-3f16-9796-c5ba1605619e generated from service id omServiceIdDefault. Looks like there is a change to ozone.om.service.ids value after the cluster is setup. Currently change to this value is not supported. at org.apache.hadoop.ozone.om.OzoneManager.initializeRatisDirs(OzoneManager.java:1476) at org.apache.hadoop.ozone.om.OzoneManager.(OzoneManager.java:697) at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:771) at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.start(OzoneManagerStarter.java:189) at org.apache.hadoop.ozone.om.OzoneManagerStarter.startOm(OzoneManagerStarter.java:86) at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:74) at org.apache.hadoop.hdds.cli.GenericCli.call(GenericCli.java:38) ... at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:58) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
[jira] [Updated] (HDDS-10909) OM stops in single-node Docker container with default settings
[ https://issues.apache.org/jira/browse/HDDS-10909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-10909: Summary: OM stops in single-node Docker container with default settings (was: OM stops)
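The exception above says the Ratis group id was "generated from service id", and the version-3 UUID in the message suggests a name-based (MD5) UUID. A minimal sketch of why a changed `ozone.om.service.ids` can never match an existing on-disk Ratis group directory; `groupIdFor` is a hypothetical helper, not Ozone's actual code:

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class RaftGroupIdSketch {
    // Name-based (version 3, MD5) UUID: the same service id always yields
    // the same group id, so the group dir created at cluster setup encodes
    // the old service id and cannot match a new one.
    static UUID groupIdFor(String omServiceId) {
        return UUID.nameUUIDFromBytes(omServiceId.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(groupIdFor("omServiceIdDefault"));
    }
}
```

Because the derivation is deterministic, the only supported path is to keep the service id stable once the cluster is initialized.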
Re: [PR] HDDS-10488. Datanode OOO due to run out of mmap handler [ozone]
ChenSammi commented on PR #6690: URL: https://github.com/apache/ozone/pull/6690#issuecomment-2128427394 > @ChenSammi if I understand correctly, Datanode does not free up the mmap handles as aggressively as we would like. Thus, we need to introduce this change. I would recommend the following choices over adding complexity. > > 1. Can we bump up the limit in the OS? The Datanode should stabilize at a certain upper bound based on GC frequency and concurrency settings. > > 2. We can revert to non-mmap handling if we hit the limit. > > > I agree with @duongkame that the correct solution might be to implement a new read API using Netty, using the better abstractions it offers to manage direct buffers and NIO capabilities. @kerneltime , the proposed solution is exactly putting an upper bound limit and falling back to a non-mmap buffer. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HDDS-10750) Intermittent fork timeout while stopping Ratis server
[ https://issues.apache.org/jira/browse/HDDS-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849128#comment-17849128 ] Tsz-wo Sze commented on HDDS-10750: --- [~wfps1210], assigned it to you. Thanks! > Intermittent fork timeout while stopping Ratis server > - > > Key: HDDS-10750 > URL: https://issues.apache.org/jira/browse/HDDS-10750 > Project: Apache Ozone > Issue Type: Sub-task >Reporter: Attila Doroszlai >Priority: Critical > Attachments: 2024-04-21T16-53-06_683-jvmRun1.dump, > 2024-05-03T11-31-12_561-jvmRun1.dump, > org.apache.hadoop.fs.ozone.TestOzoneFileChecksum-output.txt, > org.apache.hadoop.hdds.scm.TestSCMInstallSnapshot-output.txt, > org.apache.hadoop.ozone.client.rpc.TestECKeyOutputStreamWithZeroCopy-output.txt, > org.apache.hadoop.ozone.container.TestECContainerRecovery-output-1.txt, > org.apache.hadoop.ozone.container.TestECContainerRecovery-output.txt, > org.apache.hadoop.ozone.om.TestOzoneManagerPrepare-output.txt > > > {code:title=https://github.com/adoroszlai/ozone-build-results/blob/master/2024/04/21/30803/it-client/output.log} > [INFO] Running > org.apache.hadoop.ozone.client.rpc.TestECKeyOutputStreamWithZeroCopy > [INFO] > [INFO] Results: > ... > ... There was a timeout or other error in the fork > {code} > {code} > "main" >java.lang.Thread.State: WAITING > at java.lang.Object.wait(Native Method) > at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:405) > ... > at > org.apache.hadoop.ozone.MiniOzoneClusterImpl.stopDatanodes(MiniOzoneClusterImpl.java:473) > at > org.apache.hadoop.ozone.MiniOzoneClusterImpl.stop(MiniOzoneClusterImpl.java:414) > at > org.apache.hadoop.ozone.MiniOzoneClusterImpl.shutdown(MiniOzoneClusterImpl.java:400) > at > org.apache.hadoop.ozone.client.rpc.AbstractTestECKeyOutputStream.shutdown(AbstractTestECKeyOutputStream.java:160) > "ForkJoinPool.commonPool-worker-7" >java.lang.Thread.State: TIMED_WAITING > ... 
> at > java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475) > at > org.apache.ratis.util.ConcurrentUtils.shutdownAndWait(ConcurrentUtils.java:144) > at > org.apache.ratis.util.ConcurrentUtils.shutdownAndWait(ConcurrentUtils.java:136) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$close$9(RaftServerProxy.java:438) > ... > at > org.apache.ratis.util.LifeCycle.checkStateAndClose(LifeCycle.java:304) > at > org.apache.ratis.server.impl.RaftServerProxy.close(RaftServerProxy.java:415) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.stop(XceiverServerRatis.java:603) > at > org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.stop(OzoneContainer.java:484) > at > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.close(DatanodeStateMachine.java:447) > at > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.stopDaemon(DatanodeStateMachine.java:637) > at > org.apache.hadoop.ozone.HddsDatanodeService.stop(HddsDatanodeService.java:550) > at > org.apache.hadoop.ozone.MiniOzoneClusterImpl.stopDatanode(MiniOzoneClusterImpl.java:479) > at > org.apache.hadoop.ozone.MiniOzoneClusterImpl$$Lambda$2077/645273703.accept(Unknown > Source) > "c7edee5d-bf3c-45a7-a783-e11562f208dc-impl-thread2" >java.lang.Thread.State: WAITING > ... 
> at > java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1947) > at > org.apache.ratis.server.impl.RaftServerImpl.lambda$close$3(RaftServerImpl.java:543) > at > org.apache.ratis.server.impl.RaftServerImpl$$Lambda$1925/263251010.run(Unknown > Source) > at > org.apache.ratis.util.LifeCycle.lambda$checkStateAndClose$7(LifeCycle.java:306) > at org.apache.ratis.util.LifeCycle$$Lambda$1204/655954062.get(Unknown > Source) > at > org.apache.ratis.util.LifeCycle.checkStateAndClose(LifeCycle.java:326) > at > org.apache.ratis.util.LifeCycle.checkStateAndClose(LifeCycle.java:304) > at > org.apache.ratis.server.impl.RaftServerImpl.close(RaftServerImpl.java:525) > {code}
[jira] [Commented] (HDDS-10750) Intermittent fork timeout while stopping Ratis server
[ https://issues.apache.org/jira/browse/HDDS-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849125#comment-17849125 ] Chung En Lee commented on HDDS-10750: - [~szetszwo], I'll work on it. Could you assign RATIS-2100 to me? Thanks.
[jira] [Commented] (HDDS-10750) Intermittent fork timeout while stopping Ratis server
[ https://issues.apache.org/jira/browse/HDDS-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849123#comment-17849123 ] Tsz-wo Sze commented on HDDS-10750: --- [~wfps1210], thanks a lot for digging out the problem! Are you going to submit a pull request? If not, I can work on it.
Re: [PR] HDDS-10488. Datanode OOO due to run out of mmap handler [ozone]
kerneltime commented on PR #6690: URL: https://github.com/apache/ozone/pull/6690#issuecomment-2128042743 @ChenSammi if I understand correctly, Datanode does not free up the mmap handles as aggressively as we would like. Thus, we need to introduce this change. I would recommend the following choices over adding complexity. 1. Can we bump up the limit in the OS? The Datanode should stabilize at a certain upper bound based on GC frequency and concurrency settings. 2. We can revert to non-mmap handling if we hit the limit. I agree with @duongkame that the correct solution might be to implement a new read API using Netty, using the better abstractions it offers to manage direct buffers and NIO capabilities.
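The approach discussed above — cap the number of mapped buffers and fall back to ordinary buffers once the cap is hit — can be sketched as follows. This is a hypothetical illustration, not Ozone's actual implementation: `BoundedMmapPool` and its methods are invented names, and `allocateDirect` stands in for a real `FileChannel.map` call.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.Semaphore;

public class BoundedMmapPool {
    private final Semaphore mappedPermits;

    public BoundedMmapPool(int maxMappedBuffers) {
        this.mappedPermits = new Semaphore(maxMappedBuffers);
    }

    // Under the cap: hand out a "mapped" buffer (allocateDirect stands in
    // for FileChannel.map here). At the cap: fall back to an ordinary
    // heap buffer instead of exhausting the OS mmap handle limit.
    public ByteBuffer readBuffer(int size) {
        if (mappedPermits.tryAcquire()) {
            return ByteBuffer.allocateDirect(size);
        }
        return ByteBuffer.allocate(size);
    }

    // Real code would release the permit when the mapped buffer is unmapped.
    public void release() {
        mappedPermits.release();
    }
}
```

The semaphore gives a hard upper bound regardless of GC timing, which is the property the OS-limit approach alone cannot guarantee.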
[jira] [Commented] (HDDS-10750) Intermittent fork timeout while stopping Ratis server
[ https://issues.apache.org/jira/browse/HDDS-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849111#comment-17849111 ] Chung En Lee commented on HDDS-10750: - [~adoroszlai] , I think it happens when closing a LogAppender that is still in the NEW state. I created Jira issue RATIS-2100 for this.
Re: [PR] HDDS-10488. Datanode OOO due to run out of mmap handler [ozone]
duongkame commented on code in PR #6690: URL: https://github.com/apache/ozone/pull/6690#discussion_r1612104239 ## hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBufferImplWithByteBuffer.java: ## @@ -163,4 +175,12 @@ public String toString() { return getClass().getSimpleName() + ":limit=" + buffer.limit() + "@" + Integer.toHexString(hashCode()); } + + @Override + protected void finalize() throws Throwable { Review Comment: Please see LeakDetector (HDDS-9528) @ChenSammi . Yet, relying on GC activity to perform resource closure may not be a good idea, because we have no control over when GC would clean up unreachable objects. Usually people only rely on finalizer/WeakReference to detect leaks.
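The pattern described above — use a reference-based hook to *detect* a missed close() rather than to perform the cleanup — can be sketched with `java.lang.ref.Cleaner`. This is a hypothetical stand-in for Ozone's LeakDetector (HDDS-9528); `TrackedBuffer` and `LEAK_COUNT` are invented names:

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

public class TrackedBuffer implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();
    static final AtomicInteger LEAK_COUNT = new AtomicInteger();

    // The cleanup state must not reference the TrackedBuffer itself,
    // or the buffer would never become unreachable.
    private static final class State implements Runnable {
        volatile boolean closed;

        @Override
        public void run() {
            if (!closed) {
                // Reached GC without close(): report the leak. Do not
                // rely on this path to actually free the resource.
                LEAK_COUNT.incrementAndGet();
            }
        }
    }

    private final State state = new State();
    private final Cleaner.Cleanable cleanable = CLEANER.register(this, state);

    @Override
    public void close() {
        state.closed = true;
        cleanable.clean(); // deterministic cleanup path
    }
}
```

Unlike overriding finalize(), this keeps the deterministic close() path as the only real cleanup, while the GC-driven hook only counts objects that slipped through.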
[jira] [Resolved] (HDDS-10860) Intermittent failure in TestLeaseRecovery.testFinalizeBlockFailure
[ https://issues.apache.org/jira/browse/HDDS-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDDS-10860. Fix Version/s: HDDS-7593 Resolution: Fixed > Intermittent failure in TestLeaseRecovery.testFinalizeBlockFailure > -- > > Key: HDDS-10860 > URL: https://issues.apache.org/jira/browse/HDDS-10860 > Project: Apache Ozone > Issue Type: Sub-task >Reporter: Ashish Kumar >Assignee: Ashish Kumar >Priority: Major > Labels: pull-request-available > Fix For: HDDS-7593 > > > > {code:java} > Error: Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 283.789 s <<< FAILURE! - in org.apache.hadoop.fs.ozone.TestLeaseRecovery > 2292Error: > org.apache.hadoop.fs.ozone.TestLeaseRecovery.testFinalizeBlockFailure Time > elapsed: 25.541 s <<< FAILURE! > 2293org.opentest4j.AssertionFailedError: expected: but was: > 2294 at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > 2295 at > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > 2296 at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63) > 2297 at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36) > 2298 at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31) > 2299 at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183) > 2300 at > org.apache.hadoop.fs.ozone.TestLeaseRecovery.testFinalizeBlockFailure(TestLeaseRecovery.java:277) > 2301 at java.base/java.lang.reflect.Method.invoke(Method.java:568) > 2302 at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) > 2303 at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) {code} > Failure link: > https://github.com/apache/ozone/actions/runs/9079532056/job/24949734316
Re: [PR] HDDS-10860. Fix Intermittent failure in TestLeaseRecovery.testFinalizeBlockFailure [ozone]
jojochuang merged PR #6707: URL: https://github.com/apache/ozone/pull/6707
Re: [PR] HDDS-10905. Implement getHomeDirectory() method in OzoneFileSystem implementations [ozone]
adoroszlai commented on code in PR #6718: URL: https://github.com/apache/ozone/pull/6718#discussion_r1612112237 ## hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/AbstractRootedOzoneFileSystemTest.java: ## @@ -299,6 +299,11 @@ void testOzoneFsServiceLoader() throws IOException { OzoneConsts.OZONE_OFS_URI_SCHEME, confTestLoader), RootedOzoneFileSystem.class); } + @Test + void testUserHomeDirectory() { +assertEquals(new Path("/user/" + USER1), userOfs.getHomeDirectory()); Review Comment: And similarly: ``` expected: but was: ``` https://github.com/SaketaChalamchala/ozone/actions/runs/9212326470/job/25344354470#step:5:2184 ## hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/AbstractOzoneFileSystemTest.java: ## @@ -258,6 +267,11 @@ public BucketLayout getBucketLayout() { return bucketLayout; } + @Test + void testUserHomeDirectory() { +assertEquals(new Path("/user/" + USER1), userO3fs.getHomeDirectory()); Review Comment: Looks like these are failing with: ``` expected: but was: ``` https://github.com/SaketaChalamchala/ozone/actions/runs/9212326470/job/25344354470#step:5:2216
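The assertions quoted above expect Hadoop's conventional home directory, `new Path("/user/" + USER1)`. A minimal sketch of that convention; `homeDirectoryFor` is a hypothetical helper, not the PR's actual code:

```java
public class HomeDirSketch {
    // Hadoop FileSystem convention: the home directory is "/user/" plus
    // the short user name. An OzoneFileSystem override would wrap this
    // in a Path and qualify it against the filesystem's URI.
    static String homeDirectoryFor(String shortUserName) {
        return "/user/" + shortUserName;
    }
}
```

The failures above suggest the returned value differs from this expected form for the test user, though the expected/actual values were stripped from the quoted output.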
Re: [PR] HDDS-10634. Recon - listKeys API for listing keys with optional filters [ozone]
devmadhuu commented on code in PR #6658: URL: https://github.com/apache/ozone/pull/6658#discussion_r1612146507 ## hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java: ## @@ -674,6 +685,720 @@ public Response getDeletedDirectorySummary() { return Response.ok(dirSummary).build(); } + /** + * This API will list out limited 'count' number of keys after applying below filters in API parameters: + * Default Values of API param filters: + *-- replicationType - empty string and filter will not be applied, so list out all keys irrespective of + * replication type. + *-- creationTime - empty string and filter will not be applied, so list out keys irrespective of age. + *-- keySize - 0 bytes, which means all keys greater than zero bytes will be listed, effectively all. + *-- startPrefix - /, API assumes that startPrefix path always starts with /. E.g. /volume/bucket + *-- prevKey - "" + *-- limit - 1000 + * + * @param replicationType Filter for RATIS or EC replication keys + * @param creationDate Filter for keys created after creationDate in "MM-dd- HH:mm:ss" string format. + * @param keySize Filter for Keys greater than keySize in bytes. + * @param startPrefix Filter for startPrefix path. + * @param prevKey rocksDB last key of page requested. + * @param limit Filter for limited count of keys. + * + * @return the list of keys in JSON structured format as per respective bucket layout. 
Re: [PR] HDDS-10634. Recon - listKeys API for listing keys with optional filters [ozone]
devmadhuu commented on code in PR #6658: URL: https://github.com/apache/ozone/pull/6658#discussion_r1612144375 ## hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java: ## @@ -674,6 +686,625 @@ public Response getDeletedDirectorySummary() { return Response.ok(dirSummary).build(); } + /** + * This API will list out a limited 'count' number of keys after applying the below filters given as API parameters: + * Default Values of API param filters: + *-- replicationType - empty string, and the filter will not be applied, so list out all keys irrespective of + * replication type. + *-- creationTime - empty string, and the filter will not be applied, so list out keys irrespective of age. + *-- keySize - 0 bytes, which means all keys greater than zero bytes will be listed, effectively all. + *-- startPrefix - /, API assumes that the startPrefix path always starts with /. E.g. /volume/bucket + *-- prevKey - "" + *-- limit - 1000 + * + * @param replicationType Filter for RATIS or EC replication keys. + * @param creationDate Filter for keys created after creationDate in "MM-dd- HH:mm:ss" string format. + * @param keySize Filter for keys greater than keySize in bytes. + * @param startPrefix Filter for the startPrefix path. + * @param prevKey The RocksDB key where the previous page ended. + * @param limit Filter for a limited count of keys. + * + * @return the list of keys in JSON-structured format as per the respective bucket layout. 
+ * + * Now lets consider, we have following OBS, LEGACY and FSO bucket key/files namespace tree structure + * + * For OBS Bucket + * + * /volume1/obs-bucket/key1 + * /volume1/obs-bucket/key1/key2 + * /volume1/obs-bucket/key1/key2/key3 + * /volume1/obs-bucket/key4 + * /volume1/obs-bucket/key5 + * /volume1/obs-bucket/key6 + * For LEGACY Bucket + * + * /volume1/legacy-bucket/key + * /volume1/legacy-bucket/key1/key2 + * /volume1/legacy-bucket/key1/key2/key3 + * /volume1/legacy-bucket/key4 + * /volume1/legacy-bucket/key5 + * /volume1/legacy-bucket/key6 + * For FSO Bucket + * + * /volume1/fso-bucket/dir1/dir2/dir3 + * /volume1/fso-bucket/dir1/testfile + * /volume1/fso-bucket/dir1/file1 + * /volume1/fso-bucket/dir1/dir2/testfile + * /volume1/fso-bucket/dir1/dir2/file1 + * /volume1/fso-bucket/dir1/dir2/dir3/testfile + * /volume1/fso-bucket/dir1/dir2/dir3/file1 + * Input Request for OBS bucket: + * + * `api/v1/keys/listKeys?startPrefix=/volume1/obs-bucket&limit=2&replicationType=RATIS` + * Output Response: + * + * { + * "status": "OK", + * "path": "/volume1/obs-bucket", + * "replicatedDataSize": 62914560, + * "unReplicatedDataSize": 62914560, + * "lastKey": "/volume1/obs-bucket/key6", + * "keys": [ + * { + * "key": "/volume1/obs-bucket/key1", + * "path": "volume1/obs-bucket/key1", + * "size": 10485760, + * "replicatedSize": 10485760, + * "replicationInfo": { + * "replicationFactor": "ONE", + * "requiredNodes": 1, + * "replicationType": "RATIS" + * }, + * "creationTime": 1715781418742, + * "modificationTime": 1715781419762, + * "isKey": true + * }, + * { + * "key": "/volume1/obs-bucket/key1/key2", + * "path": "volume1/obs-bucket/key1/key2", + * "size": 10485760, + * "replicatedSize": 10485760, + * "replicationInfo": { + * "replicationFactor": "ONE", + * "requiredNodes": 1, + * "replicationType": "RATIS" + * }, + * "creationTime": 1715781421716, + * "modificationTime": 1715781422723, + * "isKey": true + * }, + * { + * "key": "/volume1/obs-bucket/key1/key2/key3", + * "path": 
"volume1/obs-bucket/key1/key2/key3", + * "size": 10485760, + * "replicatedSize": 10485760, + * "replicationInfo": { + * "replicationFactor": "ONE", + * "requiredNodes": 1, + * "replicationType": "RATIS" + * }, + * "creationTime": 1715781424718, + * "modificationTime": 1715781425598, + * "isKey": true + * }, + * { + * "key": "/volume1/obs-bucket/key4", + * "path": "volume1/obs-bucket/key4", + * "size": 10485760, + * "replicatedSize": 10485760, + * "replicationInfo": { + * "replicationFactor": "ONE", + * "requiredNodes": 1, + * "replicationType": "RATIS" + * }, + * "creationTime":
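The filters and defaults quoted in the Javadoc above can be exercised with a small client-side sketch. The helper below is hypothetical (the Recon base URL and function name are assumptions); only the parameter names and defaults come from the quoted doc.

```python
from urllib.parse import urlencode

# Hypothetical helper for assembling the listKeys request described above.
# Parameter names/defaults come from the quoted Javadoc; the base URL is assumed.
def build_list_keys_url(base, start_prefix="/", prev_key="", limit=1000,
                        replication_type="", key_size=0, creation_date=""):
    params = {
        "startPrefix": start_prefix,
        "prevKey": prev_key,
        "limit": limit,
        "replicationType": replication_type,
        "keySize": key_size,
        "creationDate": creation_date,
    }
    # Leave out unset filters so the server falls back to its documented defaults.
    params = {k: v for k, v in params.items() if v not in ("", 0)}
    return base + "/api/v1/keys/listKeys?" + urlencode(params)

url = build_list_keys_url("http://recon-host", start_prefix="/volume1/obs-bucket",
                          limit=2, replication_type="RATIS")
print(url)
# http://recon-host/api/v1/keys/listKeys?startPrefix=%2Fvolume1%2Fobs-bucket&limit=2&replicationType=RATIS
```

This mirrors the example request in the thread: only startPrefix, limit, and replicationType are sent, so the remaining filters take their server-side defaults.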
[jira] [Resolved] (HDDS-8371) A keyName field in the keyTable might contain a full path for the key instead of the file name
[ https://issues.apache.org/jira/browse/HDDS-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDDS-8371. --- Resolution: Duplicate I'll go ahead and resolve this jira because it was fixed by HDDS-8292. > A keyName field in the keyTable might contain a full path for the key instead > of the file name > -- > > Key: HDDS-8371 > URL: https://issues.apache.org/jira/browse/HDDS-8371 > Project: Apache Ozone > Issue Type: Bug > Components: OM >Affects Versions: 1.3.0 >Reporter: Kohei Sugihara >Priority: Major > Labels: pull-request-available > > The listStatus API serves a repeated path in the list when a path for the key > is deep. We noticed the listStatus API serves a corrupt result against some > specific keys in a bucket. The corruption is that a requested key > prefix is repeated in the final list of the listStatus result, like the following: > {code:java} > # expected case > % aws s3 ls s3://bucket/a/b/c/d/e/f/g/file.zip > a/b/c/d/e/f/g/file.zip > ... > # actual: "a/b/c/d/e/f/g" is duplicated > % aws s3 ls s3://bucket/a/b/c/d/e/f/g/file.zip > a/b/c/d/e/f/g/a/b/c/d/e/f/g/file.zip > ... {code} > Environment: > * Ozone 1.3 > [compatible|https://github.com/apache/ozone/commit/9c61a8aa497ab96c014ad3bb7b1ee4f731ebfaf8] > version (same environment as HDDS-7701, HDDS-7925) > * Several large files, all of them uploaded by multipart using AWS-CLI, > divided into 8 MB chunks > * An FSO-enabled bucket > * OM HA > Problem Details: > I've dug into the OM DB and found that metadata in the keyTable has the full path for > the key, so the redundant prefix finally appears twice in the result of the > listStatus API. 
> {code:java} > # dump keyTable entries > while (keyIter.hasNext()) { > Table.KeyValue kv = keyIter.next(); > OmKeyInfo v = kv.getValue(); > LOG.info("v/b={}/{} parent={} Key={} size={} time={} checksum={} id={} > keyName={}", > kv.getValue().getVolumeName(), kv.getValue().getBucketName(), > nodeId, kv.getValue().getFileName(), > kv.getValue().getDataSize(), kv.getValue().getCreationTime(), > kv.getValue().getFileChecksum(), kv.getValue().getObjectID(), > kv.getValue().getKeyName()); > } > # keyName has a full path for the key > - v/b=volume/bucket parent=-9223371931475161596 Key=0g0pustv.tar.gz > size=2276021708 time=1672052153340 checksum=null id=-9223371931457566207 > keyName=a/b/c/d/e/0g0pustv.tar.gz > - v/b=volume/bucket parent=-9223371931475161596 Key=0g0pustv.zip > size=2333733892 time=167205395 checksum=null id=-9223371931408929023 > keyName=a/b/c/d/e/0g0pustv.zip > - v/b=volume/bucket parent=-9223371931475161596 Key=0nh5ww00.tar.gz > size=249321741 time=1672052233487 checksum=null id=-9223371931393057791 > keyName=a/b/c/d/e/0nh5ww00.tar.gz > - v/b=volume/bucket parent=-9223371931475161596 Key=0nh5ww00.zip > size=255764877 time=1672052242830 checksum=null id=-9223371931388326655 > keyName=a/b/c/d/e/0nh5ww00.zip > - v/b=volume/bucket parent=-9223371931475161596 Key=5b2uha1h.tar.gz > size=2276346612 time=1672052331175 checksum=null id=-9223371931348859135 > keyName=a/b/c/d/e/5b2uha1h.tar.gz > ... 
> # other keys which have the same parent do not have their prefix in the key > - v/b=volume/bucket parent=-9223371931475161596 Key=kh7vbwlh.zip > size=573797127 time=1672052273970 checksum=null id=-9223371931375503871 > keyName=kh7vbwlh.zip > - v/b=volume/bucket parent=-9223371931475161596 Key=ngaxsd8c.tar.gz > size=380094900 time=1672052284433 checksum=null id=-9223371931368669695 > keyName=ngaxsd8c.tar.gz > - v/b=volume/bucket parent=-9223371931475161596 Key=ngaxsd8c.zip > size=393085953 time=1672052099618 checksum=null id=-9223371931473057023 > keyName=ngaxsd8c.zip > - v/b=volume/bucket parent=-9223371931475161596 Key=nrou31c3.tar.gz > size=568466718 time=1672052124043 checksum=null id=-9223371931461502975 > keyName=nrou31c3.tar.gz > - v/b=volume/bucket parent=-9223371931475161596 Key=nrou31c3.zip > size=574807485 time=1672052149947 checksum=null id=-9223371931446918911 > keyName=nrou31c3.zip > - v/b=volume/bucket parent=-9223371931475161596 Key=ol8dhbqo.tar.gz > size=555722830 time=1672052168904 checksum=null id=-9223371931435349759 > keyName=ol8dhbqo.tar.gz {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
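The effect described in the keyTable dump above can be reproduced in miniature: when the stored keyName already carries the full path, joining it back onto the requested prefix repeats the prefix. The sketch below is purely illustrative (function names are hypothetical, not the OM code paths).

```python
# Illustrative sketch of the prefix duplication described above: a listing
# joins the requested prefix with the stored name. With keyName equal to the
# file name the result is correct; with keyName holding a full path, the
# requested prefix appears twice.
def resolve_listing_entry(requested_prefix, stored_name):
    return requested_prefix.rstrip("/") + "/" + stored_name

ok = resolve_listing_entry("a/b/c/d/e/f/g", "file.zip")
dup = resolve_listing_entry("a/b/c/d/e/f/g", "a/b/c/d/e/f/g/file.zip")
print(ok)   # a/b/c/d/e/f/g/file.zip
print(dup)  # a/b/c/d/e/f/g/a/b/c/d/e/f/g/file.zip  (the corrupt form)
```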
Re: [PR] HDDS-10488. Datanode OOO due to run out of mmap handler [ozone]
duongkame commented on code in PR #6690: URL: https://github.com/apache/ozone/pull/6690#discussion_r1612123943 ## hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBufferImplWithByteBuffer.java: ## @@ -163,4 +175,12 @@ public String toString() { return getClass().getSimpleName() + ":limit=" + buffer.limit() + "@" + Integer.toHexString(hashCode()); } + + @Override + protected void finalize() throws Throwable { Review Comment: GRPC is known to have some limits on the outbound path: 1. It doesn't provide any callbacks or events to notify that it successfully sent a response (unlike when sending a request, where you can listen to the response observer). 2. Even if the buffer being sent is a direct (or mapped) buffer, grpc will not do zero-copy when sending it as a response. It will do another copy to re-frame the original buffer; the copy is done through a reused heap buffer. This may be out of scope for this PR discussion, but GRPC is not the right tool for the job, and we will keep making unnatural efforts to cope with it (like we're doing). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10488. Datanode OOO due to run out of mmap handler [ozone]
szetszwo commented on PR #6690: URL: https://github.com/apache/ozone/pull/6690#issuecomment-2127749473 @ChenSammi , please don't implement our own `MappedBufferManager`. Netty already has a very good buffer management. Let's simply use it; see [HDDS-7188](https://issues.apache.org/jira/browse/HDDS-7188) ( https://github.com/apache/ozone/pull/3730 ). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
[jira] [Updated] (HDDS-10908) Increase DataNode XceiverServerGrpc event loop group size
[ https://issues.apache.org/jira/browse/HDDS-10908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDDS-10908: --- Description: The current configuration has the XceiverServerGrpc boss and worker event loop group share the same thread pool whose size is number of volumes * hdds.datanode.read.chunk.threads.per.volume / 10, and executor thread pool size number of volumes * hdds.datanode.read.chunk.threads.per.volume. The event loop group thread pool size is too small. Assuming single volume that implies just one thread shared between boss/worker. Using freon DN Echo tool I found increasing the pool size slightly significantly increases throughput: {noformat} sudo -u hdfs ozone freon dne --clients=32 --container-id=1001 -t 32 -n 1000 --sleep-time-ms=0 --read-only hdds.datanode.read.chunk.threads.per.volume = 10 (default): mean rate = 44125.45 calls/second hdds.datanode.read.chunk.threads.per.volume = 20: mean rate = 61322.60 calls/second hdds.datanode.read.chunk.threads.per.volume = 40: mean rate = 77951.91 calls/second hdds.datanode.read.chunk.threads.per.volume = 100: mean rate = 65573.07 calls/second hdds.datanode.read.chunk.threads.per.volume = 1000: mean rate = 25079.32 calls/second {noformat} So it appears that increasing the default value to 40 has positive impact. Or we should consider don't associate the thread pool size with number of volumes. Otherwise the number becomes too big for say 48 disks. Note: DN echo in Ratis read only mode is about 83k requests per second on the same host. OM echo in read only mode is about 38k requests per second. was: The current configuration has the XceiverServerGrpc boss and worker event loop group share the same thread pool whose size is number of volumes * hdds.datanode.read.chunk.threads.per.volume / 10, and executor thread pool size number of volumes * hdds.datanode.read.chunk.threads.per.volume. The event loop group thread pool size is too small. 
Assuming single volume that implies just one thread shared between boss/worker. Using freon DN Echo tool I found increasing the pool size slightly significantly increases throughput: {noformat} sudo -u hdfs ozone freon dne --clients=32 --container-id=1001 -t 32 -n 1000 --sleep-time-ms=0 --read-only hdds.datanode.read.chunk.threads.per.volume = 10 (default): mean rate = 44125.45 calls/second hdds.datanode.read.chunk.threads.per.volume = 20: mean rate = 61322.60 calls/second hdds.datanode.read.chunk.threads.per.volume = 40: mean rate = 77951.91 calls/second hdds.datanode.read.chunk.threads.per.volume = 100: mean rate = 65573.07 calls/second hdds.datanode.read.chunk.threads.per.volume = 1000: mean rate = 25079.32 calls/second {noformat} So it appears that increasing the default value to 40 has positive impact. Or we should consider don't associate the thread pool size with number of volumes. Note: DN echo in Ratis read only mode is about 83k requests per second on the same host. OM echo in read only mode is about 38k requests per second. > Increase DataNode XceiverServerGrpc event loop group size > - > > Key: HDDS-10908 > URL: https://issues.apache.org/jira/browse/HDDS-10908 > Project: Apache Ozone > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > > The current configuration has the XceiverServerGrpc boss and worker event > loop group share the same thread pool whose size is number of volumes * > hdds.datanode.read.chunk.threads.per.volume / 10, and executor thread pool > size number of volumes * hdds.datanode.read.chunk.threads.per.volume. > The event loop group thread pool size is too small. Assuming single volume > that implies just one thread shared between boss/worker. 
> Using freon DN Echo tool I found increasing the pool size slightly > significantly increases throughput: > {noformat} > sudo -u hdfs ozone freon dne --clients=32 --container-id=1001 -t 32 -n > 1000 --sleep-time-ms=0 --read-only > hdds.datanode.read.chunk.threads.per.volume = 10 (default): > mean rate = 44125.45 calls/second > hdds.datanode.read.chunk.threads.per.volume = 20: > mean rate = 61322.60 calls/second > hdds.datanode.read.chunk.threads.per.volume = 40: > mean rate = 77951.91 calls/second > hdds.datanode.read.chunk.threads.per.volume = 100: > mean rate = 65573.07 calls/second > hdds.datanode.read.chunk.threads.per.volume = 1000: > mean rate = 25079.32 calls/second > {noformat} > So it appears that increasing the default value to 40 has positive impact. Or > we should consider don't associate the thread pool size with number of > volumes. Ot
[jira] [Assigned] (HDDS-10908) Increase DataNode XceiverServerGrpc event loop group size
[ https://issues.apache.org/jira/browse/HDDS-10908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDDS-10908: -- Assignee: Wei-Chiu Chuang > Increase DataNode XceiverServerGrpc event loop group size > - > > Key: HDDS-10908 > URL: https://issues.apache.org/jira/browse/HDDS-10908 > Project: Apache Ozone > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > > The current configuration has the XceiverServerGrpc boss and worker event > loop group share the same thread pool whose size is number of volumes * > hdds.datanode.read.chunk.threads.per.volume / 10, and executor thread pool > size number of volumes * hdds.datanode.read.chunk.threads.per.volume. > The event loop group thread pool size is too small. Assuming single volume > that implies just one thread shared between boss/worker. > Using freon DN Echo tool I found increasing the pool size slightly > significantly increases throughput: > {noformat} > sudo -u hdfs ozone freon dne --clients=32 --container-id=1001 -t 32 -n > 1000 --sleep-time-ms=0 --read-only > hdds.datanode.read.chunk.threads.per.volume = 10 (default): > mean rate = 44125.45 calls/second > hdds.datanode.read.chunk.threads.per.volume = 20: > mean rate = 61322.60 calls/second > hdds.datanode.read.chunk.threads.per.volume = 40: > mean rate = 77951.91 calls/second > hdds.datanode.read.chunk.threads.per.volume = 100: > mean rate = 65573.07 calls/second > hdds.datanode.read.chunk.threads.per.volume = 1000: > mean rate = 25079.32 calls/second > {noformat} > So it appears that increasing the default value to 40 has positive impact. Or > we should consider don't associate the thread pool size with number of > volumes. > Note: > DN echo in Ratis read only mode is about 83k requests per second on the same > host. > OM echo in read only mode is about 38k requests per second. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
[jira] [Created] (HDDS-10908) Increase DataNode XceiverServerGrpc event loop group size
Wei-Chiu Chuang created HDDS-10908: -- Summary: Increase DataNode XceiverServerGrpc event loop group size Key: HDDS-10908 URL: https://issues.apache.org/jira/browse/HDDS-10908 Project: Apache Ozone Issue Type: Improvement Components: Ozone Datanode Reporter: Wei-Chiu Chuang The current configuration has the XceiverServerGrpc boss and worker event loop groups share the same thread pool, whose size is number of volumes * hdds.datanode.read.chunk.threads.per.volume / 10, and an executor thread pool whose size is number of volumes * hdds.datanode.read.chunk.threads.per.volume. The event loop group thread pool size is too small. Assuming a single volume, that implies just one thread shared between boss/worker. Using the freon DN Echo tool I found that even slightly increasing the pool size significantly increases throughput: {noformat} sudo -u hdfs ozone freon dne --clients=32 --container-id=1001 -t 32 -n 1000 --sleep-time-ms=0 --read-only hdds.datanode.read.chunk.threads.per.volume = 10 (default): mean rate = 44125.45 calls/second hdds.datanode.read.chunk.threads.per.volume = 20: mean rate = 61322.60 calls/second hdds.datanode.read.chunk.threads.per.volume = 40: mean rate = 77951.91 calls/second hdds.datanode.read.chunk.threads.per.volume = 100: mean rate = 65573.07 calls/second hdds.datanode.read.chunk.threads.per.volume = 1000: mean rate = 25079.32 calls/second {noformat} So it appears that increasing the default value to 40 has a positive impact. Alternatively, we should consider not associating the thread pool size with the number of volumes. Note: DN echo in Ratis read-only mode is about 83k requests per second on the same host. OM echo in read-only mode is about 38k requests per second. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
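The sizing arithmetic described in the issue can be sketched directly. This is a back-of-the-envelope model of the quoted formulas, not the Ozone code; the max(1, ...) floor is an assumption about how a fractional result degenerates to one thread.

```python
# Model of the quoted sizing formulas:
#   event loop group threads = volumes * threads.per.volume / 10
#   executor threads         = volumes * threads.per.volume
def event_loop_group_threads(num_volumes, threads_per_volume=10):
    return max(1, num_volumes * threads_per_volume // 10)

def executor_threads(num_volumes, threads_per_volume=10):
    return num_volumes * threads_per_volume

# A single-volume datanode at the default of 10 threads per volume gets just
# one event-loop thread shared between the boss and worker groups.
print(event_loop_group_threads(1, 10))   # 1
# Raising the per-volume value to 40 (the best-measured setting) gives 4.
print(event_loop_group_threads(1, 40))   # 4
# Coupling the pool size to volume count grows quickly on a 48-disk node.
print(executor_threads(48, 40))          # 1920
```

This also illustrates why the issue suggests decoupling the pool size from the number of volumes: the same per-volume setting that helps a one-disk node inflates the pools on a many-disk node.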
Re: [PR] HDDS-10488. Datanode OOO due to run out of mmap handler [ozone]
duongkame commented on code in PR #6690: URL: https://github.com/apache/ozone/pull/6690#discussion_r1612104239 ## hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBufferImplWithByteBuffer.java: ## @@ -163,4 +175,12 @@ public String toString() { return getClass().getSimpleName() + ":limit=" + buffer.limit() + "@" + Integer.toHexString(hashCode()); } + + @Override + protected void finalize() throws Throwable { Review Comment: Please see LeakDetector (HDDS-9528) @ChenSammi -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
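The LeakDetector idea referenced here (flag an unreleased buffer at collection time, without overriding finalize() on the buffer class itself) can be sketched as an analogy using Python's weakref machinery. All names below are illustrative, not the HDDS-9528 API.

```python
import gc
import weakref

# Analogy for the leak-detector pattern: register a finalizer that fires when
# a tracked buffer is collected, and report it only if it was never released.
class LeakTracker:
    def __init__(self):
        self.leaks = []

    def track(self, obj, name):
        state = {"released": False}
        # The callback must not capture obj itself, or obj would never die.
        weakref.finalize(obj, self._on_collect, name, state)
        return state

    def _on_collect(self, name, state):
        if not state["released"]:
            self.leaks.append(name)

class TrackedBuffer:
    def __init__(self, size, tracker, name):
        self.data = bytearray(size)
        self._state = tracker.track(self, name)

    def release(self):
        self._state["released"] = True

tracker = LeakTracker()
good = TrackedBuffer(16, tracker, "good")
good.release()
bad = TrackedBuffer(16, tracker, "bad")   # never released
del good, bad
gc.collect()
print(tracker.leaks)  # ['bad']
```

The design point matches the review thread: the detection hook lives outside the buffer object, so the buffer class itself needs no finalize() override.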
[jira] [Updated] (HDDS-10896) Refactor PerformanceMetrics creation
[ https://issues.apache.org/jira/browse/HDDS-10896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-10896: Fix Version/s: 1.5.0 Resolution: Implemented Status: Resolved (was: Patch Available) > Refactor PerformanceMetrics creation > > > Key: HDDS-10896 > URL: https://issues.apache.org/jira/browse/HDDS-10896 > Project: Apache Ozone > Issue Type: Sub-task > Components: common >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available > Fix For: 1.5.0 > > > Move creation of mutable metrics objects to {{PerformanceMetrics}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10896. Refactor PerformanceMetrics creation [ozone]
adoroszlai commented on PR #6712: URL: https://github.com/apache/ozone/pull/6712#issuecomment-2127686976 Thanks @xichen01 for the review. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10896. Refactor PerformanceMetrics creation [ozone]
adoroszlai merged PR #6712: URL: https://github.com/apache/ozone/pull/6712 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
[jira] [Resolved] (HDDS-10831) Customize grpc control flow window size
[ https://issues.apache.org/jira/browse/HDDS-10831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDDS-10831. Resolution: Won't Fix control flow window is set to 5MB on ratis server side. As far as I can tell there's no difference between 1MB and 5MB window size, so I'll resolve this one as won't fix for now. > Customize grpc control flow window size > --- > > Key: HDDS-10831 > URL: https://issues.apache.org/jira/browse/HDDS-10831 > Project: Apache Ozone > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > > HDDS-2990 discovered that having a larger flow control windows (grpc default > 1MB) improves throughput performance, and suggested a window size of 5MB. > However, our grpc/ratis code never used the configuration value implemented > by HDDS-2990 so we are still relying on the default value, which is not > optimal. > Open this Jira to use the customized control flow window size. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10239. Storage Container Reconciliation. [ozone]
kerneltime merged PR #6121: URL: https://github.com/apache/ozone/pull/6121 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10239. [Reconciliation] Initial proto file for Merkle Tree [ozone]
kerneltime closed pull request #6302: HDDS-10239. [Reconciliation] Initial proto file for Merkle Tree URL: https://github.com/apache/ozone/pull/6302 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10239. [Reconciliation] Initial proto file for Merkle Tree [ozone]
kerneltime commented on PR #6302: URL: https://github.com/apache/ozone/pull/6302#issuecomment-2127609298 The proto changes are being merged in https://github.com/apache/ozone/pull/6708 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10906. [hsync] 6th merge from master [ozone]
jojochuang commented on PR #6720: URL: https://github.com/apache/ozone/pull/6720#issuecomment-2127377315 wrong base -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10906. [hsync] 6th merge from master [ozone]
jojochuang commented on PR #6720: URL: https://github.com/apache/ozone/pull/6720#issuecomment-2127375682 cc @chungen0126 because both conflicts are associated with your PRs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
[jira] [Updated] (HDDS-10906) [hsync] 6th merge from master
[ https://issues.apache.org/jira/browse/HDDS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-10906: -- Labels: pull-request-available (was: ) > [hsync] 6th merge from master > - > > Key: HDDS-10906 > URL: https://issues.apache.org/jira/browse/HDDS-10906 > Project: Apache Ozone > Issue Type: Sub-task >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-9626. [Recon] Disk Usage page with high number of key/bucket/volume [ozone]
smitajoshi12 commented on code in PR #6535: URL: https://github.com/apache/ozone/pull/6535#discussion_r1608515878 ## hadoop-ozone/recon/src/main/resources/webapps/recon/ozone-recon-web/src/views/diskUsage/diskUsage.tsx: ## @@ -144,20 +145,19 @@ export class DiskUsage extends React.Component, IDUState> const dataSize = duResponse.size; let subpaths: IDUSubpath[] = duResponse.subPaths; - subpaths.sort((a, b) => (a.size < b.size) ? 1 : -1); - // Only show top n blocks with the most DU, // other blocks are merged as a single block - if (subpaths.length > limit) { + if (subpaths.length > limit || (subpaths.length > 0 && limit === MAX_DISPLAY_LIMIT)) { subpaths = subpaths.slice(0, limit); let topSize = 0; -for (let i = 0; i < limit; ++i) { +for (let i = 0; limit === MAX_DISPLAY_LIMIT ? i < subpaths.length : i < limit; ++i) { Review Comment: @dombizita Done testing with for (let i = 0; i < subpaths.length; ++i); it is working for all test cases. Changed the comment section for the other-size object in the latest commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-9626. [Recon] Disk Usage page with high number of key/bucket/volume [ozone]
smitajoshi12 commented on code in PR #6535: URL: https://github.com/apache/ozone/pull/6535#discussion_r1608489889 ## hadoop-ozone/recon/src/main/resources/webapps/recon/ozone-recon-web/src/views/diskUsage/diskUsage.tsx: ## @@ -265,8 +265,8 @@ export class DiskUsage extends React.Component, IDUState> updateDisplayLimit(e): void { let res = -1; -if (e.key === 'all') { - res = Number.MAX_VALUE; +if (e.key === '30') { + res = Number.parseInt(e.key, 10); } else { res = Number.parseInt(e.key, 10); } Review Comment: @dombizita Removed duplicate Number.parseInt(e.key, 10) in latest commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
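The display rule under review in these two comments — sort subpaths by size, show the largest `limit` entries, and merge the rest into one block — can be sketched outside the TypeScript component. The function and field names below are illustrative, not the diskUsage.tsx code.

```python
# Sketch of the disk-usage display rule being reviewed: keep the top `limit`
# subpaths by size and merge everything else into a single "Other" entry.
def top_n_with_other(subpaths, limit):
    ranked = sorted(subpaths, key=lambda s: s["size"], reverse=True)
    if len(ranked) <= limit:
        return ranked
    top = ranked[:limit]
    other_size = sum(s["size"] for s in ranked[limit:])
    return top + [{"path": "Other", "size": other_size}]

paths = [
    {"path": "/v/b/a", "size": 50},
    {"path": "/v/b/b", "size": 30},
    {"path": "/v/b/c", "size": 15},
    {"path": "/v/b/d", "size": 5},
]
print(top_n_with_other(paths, 2))
# top two entries plus an "Other" block holding the remaining 20 bytes
```

Iterating over the sliced list (the equivalent of `i < subpaths.length` after the slice) avoids the off-by-range issue discussed in the review when the display limit exceeds the number of subpaths.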
Re: [PR] HDDS-10862. Reduce time spent on initializing metrics during OM start [ozone]
mango-li commented on PR #6682: URL: https://github.com/apache/ozone/pull/6682#issuecomment-2127188647 > The volume table and bucket table use FullTableCache, CountEstimatedRowsInTable is actually the exact number of rows in the table. Let's keep it synchronized. > > ``` > Metrics.setNumVolumes(metadataManager.countEstimatedRowsInTable(metadataManager.getvolumeTable())); > Metrics.setNumBuckets(metadataManager.countEstimatedRowsInTable(metadataManager.getBucketTable())); > ``` Thank you for the review. I have updated the code. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org
Re: [PR] HDDS-10862. Reduce time spent on initializing metrics during OM start [ozone]
mango-li commented on PR #6682:
URL: https://github.com/apache/ozone/pull/6682#issuecomment-2127181738

> > The time saved is mainly from `metadataManager.countRowsInTable(metadataManager.getVolumeTable())` and `metadataManager.countRowsInTable(metadataManager.getBucketTable())`
>
> Volume and bucket tables are fully cached, so we could get the row count from the cache size by calling:
> https://github.com/apache/ozone/blob/6f30f2fc2214744fa481f3bf3f96bc301557ff15/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/TypedTable.java#L453-L456
>
> This could save time by avoiding iteration and key/value conversion.

Good idea! I have changed `metrics.numVolumes` and `metrics.numBuckets` to use `metadataManager.countEstimatedRowsInTable`, and still kept the OM metrics update synchronized.
Re: [PR] HDDS-10862. Reduce time spent on initializing metrics during OM start [ozone]
guohao-rosicky commented on PR #6682:
URL: https://github.com/apache/ozone/pull/6682#issuecomment-2127042051

The volume table and bucket table use FullTableCache, so countEstimatedRowsInTable is actually the exact number of rows in the table. Let's keep it synchronized.

```
Metrics.setNumVolumes(metadataManager.countEstimatedRowsInTable(metadataManager.getVolumeTable()));
Metrics.setNumBuckets(metadataManager.countEstimatedRowsInTable(metadataManager.getBucketTable()));
```
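The suggestion above — getting the row count of a fully cached table from the in-memory cache instead of scanning RocksDB — can be sketched as follows. This is only an illustration of the idea; the class and method names (`FullTableCacheSketch`, `countEstimatedRows`) are hypothetical, not Ozone's actual TypedTable/FullTableCache API.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a fully cached table keeps every row in memory,
// so its row count is simply the cache size -- no table scan and no
// key/value deserialization is needed.
public class FullTableCacheSketch<K, V> {
  private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

  public void put(K key, V value) {
    cache.put(key, value);
  }

  // For a full cache this "estimate" is exact, as the comment above notes.
  public long countEstimatedRows() {
    return cache.size();
  }
}
```

For volume and bucket tables, which are small and fully cached, this turns metric initialization from an O(n) iteration into a constant-time lookup.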
Re: [PR] HDDS-10634. Recon - listKeys API for listing keys with optional filters [ozone]
devmadhuu commented on code in PR #6658:
URL: https://github.com/apache/ozone/pull/6658#discussion_r1611521963

hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java:

```
@@ -674,6 +685,624 @@ public Response getDeletedDirectorySummary() {
     return Response.ok(dirSummary).build();
   }

+  /**
+   * This API will list out limited 'count' number of keys after applying below filters in API parameters:
+   * Default Values of API param filters:
+   *   -- replicationType - empty string and filter will not be applied, so list out all keys irrespective of
+   *      replication type.
+   *   -- creationTime - empty string and filter will not be applied, so list out keys irrespective of age.
+   *   -- keySize - 0 bytes, which means all keys greater than zero bytes will be listed, effectively all.
+   *   -- startPrefix - /, API assumes that startPrefix path always starts with /. E.g. /volume/bucket
+   *   -- prevKey - ""
+   *   -- limit - 1000
+   *
+   * @param replicationType Filter for RATIS or EC replication keys
+   * @param creationDate Filter for keys created after creationDate in "MM-dd- HH:mm:ss" string format.
+   * @param keySize Filter for keys greater than keySize in bytes.
+   * @param startPrefix Filter for startPrefix path.
+   * @param prevKey rocksDB last key of page requested.
+   * @param limit Filter for limited count of keys.
+   *
+   * @return the list of keys in JSON structured format as per respective bucket layout.
+   *
+   * Now let's consider we have the following OBS, LEGACY and FSO bucket key/file namespace tree structure:
+   *
+   * For OBS Bucket
+   *   /volume1/obs-bucket/key1
+   *   /volume1/obs-bucket/key1/key2
+   *   /volume1/obs-bucket/key1/key2/key3
+   *   /volume1/obs-bucket/key4
+   *   /volume1/obs-bucket/key5
+   *   /volume1/obs-bucket/key6
+   *
+   * For LEGACY Bucket
+   *   /volume1/legacy-bucket/key
+   *   /volume1/legacy-bucket/key1/key2
+   *   /volume1/legacy-bucket/key1/key2/key3
+   *   /volume1/legacy-bucket/key4
+   *   /volume1/legacy-bucket/key5
+   *   /volume1/legacy-bucket/key6
+   *
+   * For FSO Bucket
+   *   /volume1/fso-bucket/dir1/dir2/dir3
+   *   /volume1/fso-bucket/dir1/testfile
+   *   /volume1/fso-bucket/dir1/file1
+   *   /volume1/fso-bucket/dir1/dir2/testfile
+   *   /volume1/fso-bucket/dir1/dir2/file1
+   *   /volume1/fso-bucket/dir1/dir2/dir3/testfile
+   *   /volume1/fso-bucket/dir1/dir2/dir3/file1
+   *
+   * Input Request for OBS bucket:
+   *   `api/v1/keys/listKeys?startPrefix=/volume1/obs-bucket&limit=2&replicationType=RATIS`
+   *
+   * Output Response:
+   * {
+   *   "status": "OK",
+   *   "path": "/volume1/obs-bucket",
+   *   "replicatedDataSize": 62914560,
+   *   "unReplicatedDataSize": 62914560,
+   *   "lastKey": "/volume1/obs-bucket/key6",
+   *   "keys": [
+   *     {
+   *       "key": "/volume1/obs-bucket/key1",
+   *       "path": "volume1/obs-bucket/key1",
+   *       "size": 10485760,
+   *       "replicatedSize": 10485760,
+   *       "replicationInfo": {
+   *         "replicationFactor": "ONE",
+   *         "requiredNodes": 1,
+   *         "replicationType": "RATIS"
+   *       },
+   *       "creationTime": 1715781418742,
+   *       "modificationTime": 1715781419762,
+   *       "isKey": true
+   *     },
+   *     {
+   *       "key": "/volume1/obs-bucket/key1/key2",
+   *       "path": "volume1/obs-bucket/key1/key2",
+   *       "size": 10485760,
+   *       "replicatedSize": 10485760,
+   *       "replicationInfo": {
+   *         "replicationFactor": "ONE",
+   *         "requiredNodes": 1,
+   *         "replicationType": "RATIS"
+   *       },
+   *       "creationTime": 1715781421716,
+   *       "modificationTime": 1715781422723,
+   *       "isKey": true
+   *     },
+   *     {
+   *       "key": "/volume1/obs-bucket/key1/key2/key3",
+   *       "path": "volume1/obs-bucket/key1/key2/key3",
+   *       "size": 10485760,
+   *       "replicatedSize": 10485760,
+   *       "replicationInfo": {
+   *         "replicationFactor": "ONE",
+   *         "requiredNodes": 1,
+   *         "replicationType": "RATIS"
+   *       },
+   *       "creationTime": 1715781424718,
+   *       "modificationTime": 1715781425598,
+   *       "isKey": true
+   *     },
+   *     {
+   *       "key": "/volume1/obs-bucket/key4",
+   *       "path": "volume1/obs-bucket/key4",
+   *       "size": 10485760,
+   *       "replicatedSize": 10485760,
+   *       "replicationInfo": {
+   *         "replicationFactor": "ONE",
+   *         "requiredNodes": 1,
+   *         "replicationType": "RATIS"
+   *       },
+   *       "creationTime":
```
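The prevKey/limit pagination that the quoted Javadoc describes — list keys under a startPrefix, return at most `limit` of them, and resume strictly after `prevKey` on the next page — can be sketched over a sorted key table. This is a hedged illustration of the pagination scheme only, under the assumption that keys are stored in lexicographic order; `PrefixLister` and its method are hypothetical names, not Recon's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical sketch of prefix listing with cursor-based pagination.
public class PrefixLister {
  public static List<String> listKeys(NavigableMap<String, Long> table,
                                      String startPrefix, String prevKey, int limit) {
    List<String> out = new ArrayList<>();
    // An empty prevKey means "first page": start at the prefix itself (inclusive).
    // Otherwise resume strictly after the last key of the previous page.
    String from = prevKey.isEmpty() ? startPrefix : prevKey;
    boolean inclusive = prevKey.isEmpty();
    for (String key : table.tailMap(from, inclusive).keySet()) {
      if (!key.startsWith(startPrefix)) {
        break;  // sorted order: once past the prefix range, nothing more matches
      }
      out.add(key);
      if (out.size() >= limit) {
        break;  // caller passes the last returned key as prevKey for the next page
      }
    }
    return out;
  }
}
```

The caller feeds the last key of each response back as `prevKey`, which is exactly how the `lastKey` field in the example response above is meant to be used.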
Re: [PR] HDDS-10634. Recon - listKeys API for listing keys with optional filters [ozone]
devmadhuu commented on code in PR #6658:
URL: https://github.com/apache/ozone/pull/6658#discussion_r1611497309

hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Re: [PR] HDDS-10634. Recon - listKeys API for listing keys with optional filters [ozone]
sumitagrawl commented on code in PR #6658:
URL: https://github.com/apache/ozone/pull/6658#discussion_r1611434486

hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Re: [PR] HDDS-10634. Recon - listKeys API for listing keys with optional filters [ozone]
sumitagrawl commented on code in PR #6658:
URL: https://github.com/apache/ozone/pull/6658#discussion_r1611285969

hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
[jira] [Updated] (HDDS-10900) Add some health state metrics for EC container
[ https://issues.apache.org/jira/browse/HDDS-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDDS-10900:
----------------------------------
    Labels: pull-request-available  (was: )

> Add some health state metrics for EC container
> ----------------------------------------------
>
>          Key: HDDS-10900
>          URL: https://issues.apache.org/jira/browse/HDDS-10900
>      Project: Apache Ozone
>   Issue Type: Improvement
>   Components: EC, SCM
>     Reporter: WangYuanben
>     Assignee: WangYuanben
>     Priority: Major
>       Labels: pull-request-available
>
> Add some health state metrics for EC container.
[PR] HDDS-10900. Add some health state metrics for EC container [ozone]
YuanbenWang opened a new pull request, #6719:
URL: https://github.com/apache/ozone/pull/6719

## What changes were proposed in this pull request?

This PR aims to add some health state metrics for EC container.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10900

## How was this patch tested?

Unit Test.
[jira] [Assigned] (HDDS-10900) Add some health state metrics for EC container
[ https://issues.apache.org/jira/browse/HDDS-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangYuanben reassigned HDDS-10900:
----------------------------------
    Assignee: WangYuanben

> Add some health state metrics for EC container
> ----------------------------------------------
>
>          Key: HDDS-10900
>          URL: https://issues.apache.org/jira/browse/HDDS-10900
>      Project: Apache Ozone
>   Issue Type: Improvement
>   Components: EC, SCM
>     Reporter: WangYuanben
>     Assignee: WangYuanben
>     Priority: Major
>
> Add some health state metrics for EC container.
Re: [PR] HDDS-10488. Datanode OOO due to run out of mmap handler [ozone]
ChenSammi commented on code in PR #6690:
URL: https://github.com/apache/ozone/pull/6690#discussion_r1611135403

hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBufferImplWithByteBuffer.java:

```
@@ -163,4 +175,12 @@ public String toString() {
     return getClass().getSimpleName() + ":limit=" + buffer.limit()
         + "@" + Integer.toHexString(hashCode());
   }
+
+  @Override
+  protected void finalize() throws Throwable {
```

Review Comment:
A ChunkBuffer is converted into one ByteString (the data is either copied or zero-copied, depending on whether unsafe conversion is enabled). The multiple ByteString pieces for a readChunk request may be concatenated into one whole ByteString or kept as a list of ByteStrings, depending on request options. The whole response, with its data, is passed to the gRPC call, which is async. I could be wrong, but when I last investigated whether adopting a buffer pool for data reads on the datanode was feasible, I didn't find any gRPC notification or callback that signals a response has been sent out successfully, so that the buffer associated with the response could be safely released. When the response is finally sent out, the buffer will be released automatically by GC some time later. GC will not call ChunkBuffer#close(), but it will call finalize().
Re: [PR] HDDS-10488. Datanode OOO due to run out of mmap handler [ozone]
ChenSammi commented on code in PR #6690:
URL: https://github.com/apache/ozone/pull/6690#discussion_r1611137340

hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChunkBufferImplWithByteBuffer.java

Review Comment:
I'm trying to use a WeakReference solution. Will update the patch later.
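The reachability-based release direction discussed above can be sketched with `java.lang.ref.Cleaner`, the modern replacement for `finalize()`: a release action runs either on an explicit `close()` or, as a safety net, when the wrapper becomes unreachable. This is only a sketch of the pattern; `PooledChunkBuffer` and `ReleaseAction` are hypothetical names, not the actual ChunkBuffer API in this PR.

```java
import java.lang.ref.Cleaner;
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: a buffer wrapper whose backing buffer is released
// at most once, on close() or when the wrapper is garbage collected.
public class PooledChunkBuffer implements AutoCloseable {
  private static final Cleaner CLEANER = Cleaner.create();

  // The cleanup action must NOT hold a reference to the wrapper itself,
  // or the wrapper would never become unreachable.
  static final class ReleaseAction implements Runnable {
    private final AtomicBoolean released;
    ReleaseAction(AtomicBoolean released) {
      this.released = released;
    }
    @Override
    public void run() {
      // Idempotent: return the buffer to the pool only on the first call.
      released.compareAndSet(false, true);
    }
  }

  private final ByteBuffer buffer;
  private final AtomicBoolean released = new AtomicBoolean(false);
  private final Cleaner.Cleanable cleanable;

  public PooledChunkBuffer(int capacity) {
    this.buffer = ByteBuffer.allocateDirect(capacity);
    this.cleanable = CLEANER.register(this, new ReleaseAction(released));
  }

  public boolean isReleased() {
    return released.get();
  }

  @Override
  public void close() {
    cleanable.clean();  // runs ReleaseAction at most once, then unregisters
  }
}
```

Unlike `finalize()`, the Cleaner action is unregistered after an explicit `close()`, so well-behaved callers pay no GC-time cost and only leaked buffers fall back to reachability-triggered release.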