Klaus! Good luck with your attempt to run HDFS inside Kubernetes! Please keep us posted.
For creating a new file, a DFSClient:

1. First calls addBlock on the NameNode:
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java#L842
This returns a list of LocatedBlocks, which is essentially the list of
datanode storages the client should write to.

2. The DFSClient then creates a pipeline out of those (usually 3)
datanodes and streams the data to it.

In your case, when you go to the NameNode Web UI
(http://<namenode>:50070/dfshealth.html#tab-datanode), what is the
datanode's ID? You should debug the client and the NameNode to see what
list of LocatedBlocks the addBlock call returns.

HTH
Ravi

On Sat, Jun 4, 2016 at 7:22 AM, Klaus Ma <[email protected]> wrote:

> Hi team,
>
> I'm working to run HDFS in Kubernetes; all configuration is ready:
> kube-dns, hdfs-site.xml and ssh. But when I create files in HDFS, I get
> the following exception. In the exception, "10.0.1.126:50010" is the
> host's IP & port instead of the container's; is there any configuration
> to ask the DFSClient to use the container's IP instead of the host IP?
>
> 16/06/04 14:06:23 INFO hdfs.DFSClient: Exception in createBlockOutputStream
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
>     at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1537)
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1313)
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
>
> 16/06/04 14:06:23 INFO hdfs.DFSClient: Abandoning BP-223491250-172.1.78.2-1465048638628:blk_1073741825_1001
>
> 16/06/04 14:06:23 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[10.0.1.126:50010,DS-a2c2d3db-790c-4b76-81f6-856c809b01e2,DISK]
>
> 16/06/04 14:06:23 WARN hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/QuasiMonteCarlo_1465049182505_2071986941/in/part0 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1547)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:724)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>
> ----
> Da (Klaus), Ma (马达) | PMP | Advisory Software Engineer
> Platform OpenSource Technology, STG, IBM GCG
> +86-10-8245 4084 | [email protected] | http://k82.me
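
For reference on Klaus's closing question (whether the DFSClient can be told to connect by something other than the datanode IP the NameNode registered): stock Hadoop ships hostname-resolution switches for exactly this kind of NAT/overlay-network setup. A sketch of the relevant hdfs-site.xml entries — the property names are from hadoop's hdfs-default.xml, but whether this actually resolves the Kubernetes case depends on the pod hostnames being resolvable (e.g. via kube-dns) from wherever the client runs:

```xml
<!-- On the client side: connect to datanodes using the hostname the
     NameNode reports, instead of the registered IP address. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>

<!-- On the datanodes: have datanodes address each other by hostname
     as well, so the replication pipeline works too. -->
<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```

To see which addresses the NameNode is actually handing out per block without attaching a debugger, `hdfs fsck <path> -files -blocks -locations` prints the block locations (the same information carried in the LocatedBlocks that addBlock returns).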
