[jira] [Updated] (HDDS-3498) Address already in use Should shutdown the datanode with FATAL log and point out the port and configure key

2020-06-09 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-3498:
--
Target Version/s: 0.7.0  (was: 0.6.0)

> Address already in use Should shutdown the datanode with FATAL log and point 
> out the port and configure key
> ---
>
>                 Key: HDDS-3498
>                 URL: https://issues.apache.org/jira/browse/HDDS-3498
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.6.0
>            Reporter: maobaolong
>            Priority: Minor
>              Labels: Triaged
>
> Currently, the datanode cannot work when one of its ports is already in use, 
> but the process stays alive.
> Furthermore, I first guessed that the port in use was 9861, but it is not. 
> After looking at the source code, I found that it is the `dfs.container.ipc` 
> port, whose default is 9859. This port should be reported together with the 
> following exception. I think this error should be logged at FATAL level, and 
> the datanode process should then be terminated.
> {code:java}
> 2020-04-21 15:53:05,436 [Datanode State Machine Thread - 0] WARN org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine: Unable to communicate to SCM server at 127.0.0.1:9861 for past 300 seconds.
> java.io.IOException: Failed to bind
>     at org.apache.ratis.thirdparty.io.grpc.netty.NettyServer.start(NettyServer.java:246)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.start(ServerImpl.java:184)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.start(ServerImpl.java:90)
>     at org.apache.hadoop.ozone.container.common.transport.server.XceiverServerGrpc.start(XceiverServerGrpc.java:141)
>     at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:235)
>     at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:113)
>     at org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:433)
>     at sun.nio.ch.Net.bind(Net.java:425)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>     at org.apache.ratis.thirdparty.io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:132)
>     at org.apache.ratis.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:551)
>     at org.apache.ratis.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1345)
>     at org.apache.ratis.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:503)
>     at org.apache.ratis.thirdparty.io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:488)
>     at org.apache.ratis.thirdparty.io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:984)
>     at org.apache.ratis.thirdparty.io.netty.channel.AbstractChannel.bind(AbstractChannel.java:247)
>     at org.apache.ratis.thirdparty.io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:355)
>     at org.apache.ratis.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>     at org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
>     at org.apache.ratis.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:515)
>     at org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
>     at org.apache.ratis.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>     at org.apache.ratis.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     ... 1 more
> {code}
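
The behaviour requested in the report could look roughly like the sketch below. It is not Ozone code: the class `BindFailFast` and its methods are hypothetical names chosen for illustration, and only the key `dfs.container.ipc` and the default port 9859 come from the report. Because the trace shows the `BindException` wrapped inside an `IOException: Failed to bind`, the sketch walks the cause chain before deciding that the failure is a port conflict.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of Ozone: shows one way to fail fast on a bind error.
final class BindFailFast {

  /** Returns true if the throwable or any of its causes is a BindException. */
  static boolean isBindFailure(Throwable t) {
    for (Throwable c = t; c != null; c = c.getCause()) {
      if (c instanceof BindException) {
        return true;
      }
    }
    return false;
  }

  /** Builds a message that names both the port and the configuration key. */
  static String describeBindFailure(String configKey, int port, Throwable cause) {
    return String.format(
        "FATAL: cannot bind port %d (configuration key '%s'): %s",
        port, configKey, cause.getMessage());
  }

  /** Binds a server socket, rethrowing bind failures with the port and key attached. */
  static ServerSocket bindOrFail(String configKey, int port) throws IOException {
    ServerSocket socket = new ServerSocket();
    try {
      socket.bind(new InetSocketAddress("127.0.0.1", port));
      return socket;
    } catch (IOException e) {
      socket.close();
      if (isBindFailure(e)) {
        throw new IOException(describeBindFailure(configKey, port, e), e);
      }
      throw e;
    }
  }

  public static void main(String[] args) throws IOException {
    // Occupy an ephemeral port first so that the second bind is expected to fail.
    try (ServerSocket first = new ServerSocket()) {
      first.bind(new InetSocketAddress("127.0.0.1", 0));
      int port = first.getLocalPort();
      try (ServerSocket second = bindOrFail("dfs.container.ipc", port)) {
        System.out.println("Unexpectedly bound port " + second.getLocalPort());
      } catch (IOException e) {
        System.err.println(e.getMessage());
        // A real service would terminate here, for example with System.exit(1).
      }
    }
  }
}
```

With a message of this shape an operator can see which port is occupied and which key to change without reading the source code.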



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HDDS-3498) Address already in use Should shutdown the datanode with FATAL log and point out the port and configure key

2020-06-01 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-3498:

Target Version/s: 0.6.0
  Labels: Triaged  (was: )

> Address already in use Should shutdown the datanode with FATAL log and point 
> out the port and configure key
> ---
>
>                 Key: HDDS-3498
>                 URL: https://issues.apache.org/jira/browse/HDDS-3498
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.6.0
>            Reporter: maobaolong
>            Priority: Minor
>              Labels: Triaged
>





[jira] [Updated] (HDDS-3498) Address already in use Should shutdown the datanode with FATAL log and point out the port and configure key

2020-05-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3498:
-
Component/s: Ozone Datanode

> Address already in use Should shutdown the datanode with FATAL log and point 
> out the port and configure key
> ---
>
>                 Key: HDDS-3498
>                 URL: https://issues.apache.org/jira/browse/HDDS-3498
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.6.0
>            Reporter: maobaolong
>            Priority: Minor
>


