Hi Eric! Did you follow https://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/SingleCluster.html to set up your single-node cluster? Did you set dfs.replication in hdfs-site.xml? The logs you posted don't have enough information to debug the issue.
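For reference, a single-node setup typically pins replication to 1 in hdfs-site.xml. A minimal sketch (the property name is the standard HDFS one; adjust the rest of the file to your setup):

```xml
<!-- hdfs-site.xml: minimal single-node replication setting (sketch) -->
<configuration>
  <property>
    <!-- One replica is enough when there is only one datanode -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```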
*IF* everything has been set up correctly, your understanding is correct: the block would be written to the single datanode. *IF* replication were set to >1 and the block didn't have enough replicas when it was written, a "source" replica would be chosen to copy to "target" datanodes, so that sufficient replicas existed for your block. If there were no datanodes available, the number of "targets" would be 0, and so HDFS wouldn't be able to achieve the replication you requested. Your configuration would have to be a bit messed up for HDFS to even allow you to write a file with less than the minimum replication and then try to replicate after you close. I suggest you follow the SingleCluster.html doc assiduously.

HTH
Ravi

On Tue, Oct 4, 2016 at 11:58 AM, Eric Swenson <e...@swenson.org> wrote:
> I have set up a single-node cluster (initially) and am attempting to
> write a file from a client outside the cluster. I’m using the Java
> org.apache.hadoop.fs.FileSystem interface to write the file.
>
> While the write call returns, the close call hangs for a very long time,
> eventually returns, but the resulting file in HDFS is 0 bytes in length.
> The namenode log says:
>
> 2016-10-03 22:01:41,367 INFO BlockStateChange: chooseUnderReplicatedBlocks
> selected 1 blocks at priority level 0; Total=1 Reset bookmarks? true
> 2016-10-03 22:01:41,367 INFO BlockStateChange: BLOCK* neededReplications =
> 1, pendingReplications = 0.
> 2016-10-03 22:01:41,367 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
> Blocks chosen but could not be replicated = 1; of which 1 have no target, 0
> have no source, 0 are UC, 0 are abandoned, 0 already have enough replicas.
>
> Why is the block not written to the single datanode (same as
> namenode)? What does it mean to "have no target"? The replication
> count is 1 and I would have thought that a single copy of the file
> would be stored on the single cluster node.
>
> I decided to see what happened if I added a second node to the cluster.
> Essentially the same thing happens. The file (in HDFS) ends up being
> zero-length, and I get similar messages from the NameNode telling me that
> there are additional neededReplications and that none of the blocks could
> be replicated because they “have no target”.
>
> If I SSH into the combined Name/Data node instance and use the “hdfs dfs
> -put” command, I have no trouble storing files. I’m using the same user
> regardless of whether I’m using a remote fs.write operation or whether I’m
> using the “hdfs dfs -put” command while logged into the NameNode.
>
> What am I doing wrong? — Eric
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: user-h...@hadoop.apache.org
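For completeness, the client-side pattern Eric describes looks roughly like this. This is a sketch, not a verified reproduction of his code: the fs.defaultFS URI, host name, path, and class name are all placeholders, and it needs a live cluster (and the Hadoop client jars) to actually run. It does show why the symptom appears at close(): write() only buffers data on the client, while close() is the call that must flush the block through the datanode pipeline.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder URI -- point this at your namenode
        conf.set("fs.defaultFS", "hdfs://namenode-host:9000");

        try (FileSystem fs = FileSystem.get(conf);
             // Illustrative path; fs.create() opens a stream, it does not
             // guarantee any block has reached a datanode yet
             FSDataOutputStream out = fs.create(new Path("/tmp/hello.txt"))) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
            // Closing the try-with-resources block calls out.close(), which
            // flushes the block to the datanode pipeline -- this is the step
            // that hangs when no datanode is reachable as a "target"
        }
    }
}
```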