[ https://issues.apache.org/jira/browse/HDDS-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836252#comment-16836252 ]
Mukul Kumar Singh commented on HDDS-1485:
-----------------------------------------

The problem here is caused by the open file limit being reached on the datanode. Container creation (create directory) fails because the maximum number of open files has been reached, and that in turn fails the writeChunk with ContainerNotFoundException. We should improve the error logging for this error, though; illustrative sketches of the diagnosis and of the logging improvement are appended after the quoted issue below.

> Ozone writes fail when single threaded client writes 100MB files repeatedly.
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-1485
>                 URL: https://issues.apache.org/jira/browse/HDDS-1485
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Aravindan Vijayan
>            Assignee: Shashikant Banerjee
>            Priority: Blocker
>
> *Environment*
> 26 node physical cluster.
> All Datanodes are up and running.
>
> Client attempting to write 1600 x 100MB files using the FsStress utility (https://github.com/arp7/FsPerfTest) fails with the following error.
> {code}
> 19/05/02 09:58:49 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 424 does not exist
>         at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:573)
>         at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:539)
>         at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$2(BlockOutputStream.java:616)
>         at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>         at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>         at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> It looks like corruption in the container metadata.
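To confirm the diagnosis above, one way to see how close the datanode JVM is to the OS open-file-descriptor limit is the standard com.sun.management.UnixOperatingSystemMXBean counters. This is a minimal standalone sketch, not Ozone code; the class name FdLimitCheck and the 90% warning threshold are invented for illustration.

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdLimitCheck {
  public static void main(String[] args) {
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    // On Linux/Unix JVMs the platform bean exposes file-descriptor counters.
    if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
      com.sun.management.UnixOperatingSystemMXBean unixOs =
          (com.sun.management.UnixOperatingSystemMXBean) os;
      long open = unixOs.getOpenFileDescriptorCount();
      long max = unixOs.getMaxFileDescriptorCount();
      System.out.printf("open fds: %d / %d (%.1f%% used)%n",
          open, max, 100.0 * open / max);
      // Hypothetical threshold: warn when the process is near the limit,
      // since directory/file creation then starts failing with
      // "Too many open files" (the condition described in the comment above).
      if (open > 0.9 * max) {
        System.err.println("WARNING: close to the fd limit; container "
            + "directory creation may start failing");
      }
    } else {
      System.out.println("File-descriptor counters not available on this platform.");
    }
  }
}
{code}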
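On the logging improvement: the sketch below shows one possible shape, where a failure to create the container directory logs the underlying IOException (whose message would be "Too many open files" here) before rethrowing, so the write path does not surface only a "container does not exist" error. This is a hypothetical, self-contained example; ContainerDirCreator, createContainerDir, and the directory layout are invented for illustration and are not the actual Ozone datanode code.

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ContainerDirCreator {

  private static final Logger LOG =
      Logger.getLogger(ContainerDirCreator.class.getName());

  /**
   * Creates the on-disk directory for a container, logging the underlying
   * IOException (e.g. "Too many open files") before rethrowing, so a later
   * write-path failure is not reported only as "container not found".
   */
  public static Path createContainerDir(Path volumeRoot, long containerId)
      throws IOException {
    Path containerDir =
        volumeRoot.resolve("containers").resolve(Long.toString(containerId));
    try {
      return Files.createDirectories(containerDir);
    } catch (IOException e) {
      // Log the real cause (EMFILE shows up as the IOException message)
      // instead of letting the caller surface only a generic not-found error.
      LOG.log(Level.SEVERE,
          "Failed to create directory for container " + containerId
              + " at " + containerDir + ": " + e.getMessage(), e);
      throw e;
    }
  }

  public static void main(String[] args) throws IOException {
    // Usage example against a temporary directory standing in for a volume.
    Path root = Files.createTempDirectory("datanode-volume");
    System.out.println("Created: " + createContainerDir(root, 424L));
  }
}
{code}

The same pattern (catch, log root cause with container id and path, rethrow) could be applied wherever the datanode creates container metadata on disk, so the client-side ContainerNotFoundException can be correlated with the real resource failure in the datanode log.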