[ 
https://issues.apache.org/jira/browse/HDDS-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955757#comment-16955757
 ] 

Lokesh Jain commented on HDDS-2332:
-----------------------------------

[~cxorm] It is difficult to reproduce the issue. I saw it in one of the runs. 
It is happening because of RATIS-718. Once it is fixed it should not appear in 
the runs. But we might need to support request timeouts in ozone as well.

> BlockOutputStream#waitOnFlushFutures blocks on putBlock combined future
> -----------------------------------------------------------------------
>
>                 Key: HDDS-2332
>                 URL: https://issues.apache.org/jira/browse/HDDS-2332
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client
>            Reporter: Lokesh Jain
>            Priority: Major
>
> BlockOutputStream blocks on waitOnFlushFutures call. Two jstacks show that 
> the thread is blocked on the same condition.
> {code:java}
> 2019-10-18 06:30:38
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.141-b15 mixed mode):
> "main" #1 prio=5 os_prio=0 tid=0x00007fbea001a800 nid=0x2a56 waiting on 
> condition [0x00007fbea96d6000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000000e4739888> (a 
> java.util.concurrent.CompletableFuture$Signaller)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>       at 
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
>       at 
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>       at 
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
>       at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
>       at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.waitOnFlushFutures(BlockOutputStream.java:518)
>       at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:481)
>       at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:496)
>       at 
> org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:143)
>       at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:439)
>       at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:232)
>       at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:190)
>       at 
> org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:46)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
>       at java.io.DataOutputStream.write(DataOutputStream.java:107)
>       - locked <0x00000000a6a75930> (a 
> org.apache.hadoop.fs.FSDataOutputStream)
>       at 
> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:77)
>       - locked <0x00000000a6a75918> (a 
> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter)
>       at 
> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:64)
>       at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:670)
>       at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>       at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>       at 
> org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:230)
>       at 
> org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:203)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 2019-10-18 07:02:50
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.141-b15 mixed mode):
> "main" #1 prio=5 os_prio=0 tid=0x00007fbea001a800 nid=0x2a56 waiting on 
> condition [0x00007fbea96d6000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000000e4739888> (a 
> java.util.concurrent.CompletableFuture$Signaller)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>       at 
> java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
>       at 
> java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>       at 
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
>       at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
>       at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.waitOnFlushFutures(BlockOutputStream.java:518)
>       at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:481)
>       at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:496)
>       at 
> org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:143)
>       at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:439)
>       at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:232)
>       at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:190)
>       at 
> org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:46)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
>       at java.io.DataOutputStream.write(DataOutputStream.java:107)
>       - locked <0x00000000a6a75930> (a 
> org.apache.hadoop.fs.FSDataOutputStream)
>       at 
> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:77)
>       - locked <0x00000000a6a75918> (a 
> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter)
>       at 
> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:64)
>       at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:670)
>       at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>       at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>       at 
> org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:230)
>       at 
> org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:203)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to