[ 
https://issues.apache.org/jira/browse/HDFS-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149554#comment-16149554
 ] 

Nandakumar commented on HDFS-12367:
-----------------------------------

If corona was run on one of the datanodes, HDFS-12382 would have caused the 
"Too many open files" error. I was getting the same error; after applying the 
patch for HDFS-12382, the issue was resolved in my local setup.

I also noticed the following in the output of the {{lsof}} command for the 
corona process:

{code}
java      9876 nvadivelu  357u     IPv4 0xc4e9fc0d68262f8d        0t0      TCP 10.200.4.230:52234->10.200.4.230:50011 (ESTABLISHED)
java      9876 nvadivelu  358      PIPE 0xc4e9fc0d5e18843d      16384          ->0xc4e9fc0d5e18afbd
java      9876 nvadivelu  359      PIPE 0xc4e9fc0d5e18afbd      16384          ->0xc4e9fc0d5e18843d
java      9876 nvadivelu  360u   KQUEUE                                        count=0, state=0xa
java      9876 nvadivelu  361      PIPE 0xc4e9fc0d5e18837d      16384          ->0xc4e9fc0d5e18a3bd
java      9876 nvadivelu  362      PIPE 0xc4e9fc0d5e1882bd      16384          ->0xc4e9fc0d5e18837d
java      9876 nvadivelu  363u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  364      PIPE 0xc4e9fc0d5e1881fd      16384          ->0xc4e9fc0d5e18813d
java      9876 nvadivelu  365      PIPE 0xc4e9fc0d5e18813d      16384          ->0xc4e9fc0d5e1881fd
java      9876 nvadivelu  366u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  367      PIPE 0xc4e9fc0d5e18807d      16384          ->0xc4e9fc0d70ba59fd
java      9876 nvadivelu  368      PIPE 0xc4e9fc0d70ba59fd      16384          ->0xc4e9fc0d5e18807d
java      9876 nvadivelu  369u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  370      PIPE 0xc4e9fc0d70ba5ffd      16384          ->0xc4e9fc0d70ba56fd
java      9876 nvadivelu  371      PIPE 0xc4e9fc0d70ba56fd      16384          ->0xc4e9fc0d70ba5ffd
java      9876 nvadivelu  372u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  373      PIPE 0xc4e9fc0d70ba5f3d      16384          ->0xc4e9fc0d70ba563d
java      9876 nvadivelu  374      PIPE 0xc4e9fc0d70ba563d      16384          ->0xc4e9fc0d70ba5f3d
java      9876 nvadivelu  375u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  376      PIPE 0xc4e9fc0d70ba4c7d      16384          ->0xc4e9fc0d70ba69bd
java      9876 nvadivelu  377      PIPE 0xc4e9fc0d70ba69bd      16384          ->0xc4e9fc0d70ba4c7d
java      9876 nvadivelu  378u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  379      PIPE 0xc4e9fc0d70ba6e3d      16384          ->0xc4e9fc0d70ba497d
java      9876 nvadivelu  380      PIPE 0xc4e9fc0d70ba497d      16384          ->0xc4e9fc0d70ba6e3d
java      9876 nvadivelu  381u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  382      PIPE 0xc4e9fc0d70ba57bd      16384          ->0xc4e9fc0d6fdd9b3d
java      9876 nvadivelu  383      PIPE 0xc4e9fc0d6fdd9b3d      16384          ->0xc4e9fc0d70ba57bd
java      9876 nvadivelu  384u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  385      PIPE 0xc4e9fc0d6fdd9d7d      16384          ->0xc4e9fc0d6fdd9efd
java      9876 nvadivelu  386      PIPE 0xc4e9fc0d6fdd9efd      16384          ->0xc4e9fc0d6fdd9d7d
java      9876 nvadivelu  387u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  388      PIPE 0xc4e9fc0d6fdd9e3d      16384          ->0xc4e9fc0d658559fd
java      9876 nvadivelu  389      PIPE 0xc4e9fc0d658559fd      16384          ->0xc4e9fc0d6fdd9e3d
java      9876 nvadivelu  390u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  391      PIPE 0xc4e9fc0d65855abd      16384          ->0xc4e9fc0d6585593d
java      9876 nvadivelu  392      PIPE 0xc4e9fc0d6585593d      16384          ->0xc4e9fc0d65855abd
java      9876 nvadivelu  393u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  394      PIPE 0xc4e9fc0d6585587d      16384          ->0xc4e9fc0d65855b7d
java      9876 nvadivelu  395      PIPE 0xc4e9fc0d65855b7d      16384          ->0xc4e9fc0d6585587d
java      9876 nvadivelu  396u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  397      PIPE 0xc4e9fc0d658557bd      16384          ->0xc4e9fc0d65855c3d
java      9876 nvadivelu  398      PIPE 0xc4e9fc0d65855c3d      16384          ->0xc4e9fc0d658557bd
java      9876 nvadivelu  399u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  400      PIPE 0xc4e9fc0d65855cfd      16384          ->0xc4e9fc0d658556fd
java      9876 nvadivelu  401      PIPE 0xc4e9fc0d658556fd      16384          ->0xc4e9fc0d65855cfd
java      9876 nvadivelu  402u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  403      PIPE 0xc4e9fc0d6585563d      16384          ->0xc4e9fc0d65855dbd
java      9876 nvadivelu  404      PIPE 0xc4e9fc0d65855dbd      16384          ->0xc4e9fc0d6585563d
java      9876 nvadivelu  405u   KQUEUE                                        count=0, state=0x8
....... truncated
{code}
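
In the listing above, the entries repeat as two PIPE descriptors followed by 
one KQUEUE descriptor. To summarize a long {{lsof}} dump like this, the 
records can be tallied by the TYPE column. The following is only a minimal 
sketch, assuming the usual lsof column order (COMMAND PID USER FD TYPE ...); 
the class and method names are hypothetical and are not part of corona or 
Hadoop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of corona or Hadoop.
class LsofTypeCount {

    // Tally lsof records by the TYPE column (5th whitespace-separated field).
    static Map<String, Integer> countByType(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            // Wrapped continuation lines such as "->0x..." or
            // "count=0, state=0x8" have fewer than 5 fields; skip them.
            if (fields.length < 5) {
                continue;
            }
            // Skip the lsof header row if it is present.
            if (fields[0].equals("COMMAND")) {
                continue;
            }
            counts.merge(fields[4], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Four records copied from the lsof output above, still line-wrapped.
        List<String> sample = Arrays.asList(
            "java      9876 nvadivelu  357u     IPv4 0xc4e9fc0d68262f8d        0t0      TCP ",
            "10.200.4.230:52234->10.200.4.230:50011 (ESTABLISHED)",
            "java      9876 nvadivelu  358      PIPE 0xc4e9fc0d5e18843d      16384          ",
            "->0xc4e9fc0d5e18afbd",
            "java      9876 nvadivelu  359      PIPE 0xc4e9fc0d5e18afbd      16384          ",
            "->0xc4e9fc0d5e18843d",
            "java      9876 nvadivelu  360u   KQUEUE                                        ",
            "count=0, state=0xa");
        System.out.println(countByType(sample)); // prints {IPv4=1, KQUEUE=1, PIPE=2}
    }
}
```

Running the same tally before and after applying a patch would show whether 
the PIPE and KQUEUE counts keep growing over the lifetime of the process.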


> Ozone: Too many open files error while running corona
> -----------------------------------------------------
>
>                 Key: HDFS-12367
>                 URL: https://issues.apache.org/jira/browse/HDFS-12367
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone, tools
>            Reporter: Weiwei Yang
>            Assignee: Mukul Kumar Singh
>
> The "Too many open files" error keeps happening to me while using corona. I 
> have simply set up a single-node cluster and run corona to generate 1000 
> keys, but I keep getting the following error:
> {noformat}
> ./bin/hdfs corona -numOfThreads 1 -numOfVolumes 1 -numOfBuckets 1 -numOfKeys 
> 1000
> 17/08/28 00:47:42 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 17/08/28 00:47:42 INFO tools.Corona: Number of Threads: 1
> 17/08/28 00:47:42 INFO tools.Corona: Mode: offline
> 17/08/28 00:47:42 INFO tools.Corona: Number of Volumes: 1.
> 17/08/28 00:47:42 INFO tools.Corona: Number of Buckets per Volume: 1.
> 17/08/28 00:47:42 INFO tools.Corona: Number of Keys per Bucket: 1000.
> 17/08/28 00:47:42 INFO rpc.OzoneRpcClient: Creating Volume: vol-0-05000, with 
> wwei as owner and quota set to 1152921504606846976 bytes.
> 17/08/28 00:47:42 INFO tools.Corona: Starting progress bar Thread.
> ...
> ERROR tools.Corona: Exception while adding key: key-251-19293 in bucket: 
> bucket-0-34960 of volume: vol-0-05000.
> java.io.IOException: Exception getting XceiverClient.
>       at 
> org.apache.hadoop.scm.XceiverClientManager.getClient(XceiverClientManager.java:156)
>       at 
> org.apache.hadoop.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:122)
>       at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.getFromKsmKeyInfo(ChunkGroupOutputStream.java:289)
>       at 
> org.apache.hadoop.ozone.client.rpc.OzoneRpcClient.createKey(OzoneRpcClient.java:487)
>       at 
> org.apache.hadoop.ozone.tools.Corona$OfflineProcessor.run(Corona.java:352)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.IllegalStateException: failed to create a child event loop
>       at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2234)
>       at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
>       at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
>       at 
> org.apache.hadoop.scm.XceiverClientManager.getClient(XceiverClientManager.java:144)
>       ... 9 more
> Caused by: java.lang.IllegalStateException: failed to create a child event 
> loop
>       at 
> io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68)
>       at 
> io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49)
>       at 
> io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:61)
>       at 
> io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:52)
>       at 
> io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:44)
>       at 
> io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:36)
>       at org.apache.hadoop.scm.XceiverClient.connect(XceiverClient.java:76)
>       at 
> org.apache.hadoop.scm.XceiverClientManager$2.call(XceiverClientManager.java:151)
>       at 
> org.apache.hadoop.scm.XceiverClientManager$2.call(XceiverClientManager.java:145)
>       at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
>       at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
>       at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
>       at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
>       at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
>       ... 12 more
> Caused by: io.netty.channel.ChannelException: failed to open a new selector
>       at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:128)
>       at io.netty.channel.nio.NioEventLoop.<init>(NioEventLoop.java:120)
>       at 
> io.netty.channel.nio.NioEventLoopGroup.newChild(NioEventLoopGroup.java:87)
>       at 
> io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64)
>       ... 25 more
> Caused by: java.io.IOException: Too many open files
>       at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
>       at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:130)
>       at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:69)
>       at 
> sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
>       at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:126)
>       ... 28 more
> {noformat}


