The actual check is whether the datanode still has at least 5 blocks' worth of space remaining.
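As a rough, illustrative sketch (not the actual NameNode code), the arithmetic behind that check looks like this with the default 64 MB block size and the 101 MB tmpfs that backs /tmp on these nodes; the variable names are mine, not Hadoop's:

    # Illustrative arithmetic only; not the real placement code.
    BLOCK_SIZE=$((64 * 1024 * 1024))    # default dfs.block.size: 64 MB
    REQUIRED=$((5 * BLOCK_SIZE))        # five blocks' worth of headroom, ~320 MB
    AVAILABLE=$((101 * 1024 * 1024))    # the 101 MB tmpfs mounted on /tmp

    if [ "$AVAILABLE" -lt "$REQUIRED" ]; then
        # Every DN fails this test, so the NN reports
        # "could only be replicated to 0 nodes, instead of 1"
        # even for a file smaller than 1 KB.
        echo "DN rejected: $AVAILABLE bytes free < $REQUIRED bytes required"
    fi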
On Sat, Jul 16, 2011 at 1:52 PM, Thomas Anderson <t.dt.aander...@gmail.com> wrote:
> Harsh,
>
> Thanks, you are right. The problem stems from the tmp directory not being
> large enough. After changing the tmp dir to another place, the problem goes away.
>
> But I remember the default block size in HDFS is 64M, so shouldn't it at
> least allow one file, whose actual size on the local disk is smaller than
> 1K, to be uploaded?
>
> Thanks again for the advice.
>
> On Fri, Jul 15, 2011 at 7:49 PM, Harsh J <ha...@cloudera.com> wrote:
>> Thomas,
>>
>> Your problem might lie simply with the virtual nodes' DNs using /tmp, with
>> tmpfs backing it -- which somehow is causing the reported free space to go
>> to 0 in reports to the NN (master).
>>
>> tmpfs                 101M   44K  101M   1% /tmp
>>
>> This causes your trouble: the NN can't choose a suitable DN to write to,
>> because it determines that none has at least a block size worth of space
>> (64MB default) available for writes.
>>
>> You can resolve it as follows:
>>
>> 1. Stop DFS completely.
>>
>> 2. Create a directory under root somewhere (I use Cloudera's distro, and
>> its default configured location for data files comes along as
>> /var/lib/hadoop-0.20/cache/, if you need an idea for a location) and set
>> it as your hadoop.tmp.dir in core-site.xml on all the nodes.
>>
>> 3. Reformat your NameNode (hadoop namenode -format, say Y) and restart
>> DFS. Things _should_ be OK now.
>>
>> Config example (core-site.xml):
>>
>> <property>
>>   <name>hadoop.tmp.dir</name>
>>   <value>/var/lib/hadoop-0.20/cache</value>
>> </property>
>>
>> Let us know if this still doesn't get your dev cluster up and running
>> for action :)
>>
>> On Fri, Jul 15, 2011 at 4:40 PM, Thomas Anderson
>> <t.dt.aander...@gmail.com> wrote:
>>> When doing the partitioning, I remember only / and swap were specified
>>> for all nodes during creation. So I think /tmp is also mounted under /,
>>> which should have a size of around 9G. The total size of the hard disk
>>> specified is 10G.
>>>
>>> The output of df -kh shows:
>>>
>>> server01:
>>> /dev/sda1             9.4G  2.3G  6.7G  25% /
>>> tmpfs                 5.0M  4.0K  5.0M   1% /lib/init/rw
>>> tmpfs                 5.0M     0  5.0M   0% /var/run/lock
>>> tmpfs                 101M  132K  101M   1% /tmp
>>> udev                  247M     0  247M   0% /dev
>>> tmpfs                 101M     0  101M   0% /var/run/shm
>>> tmpfs                  51M  176K   51M   1% /var/run
>>>
>>> server02:
>>> /dev/sda1             9.4G  2.2G  6.8G  25% /
>>> tmpfs                 5.0M  4.0K  5.0M   1% /lib/init/rw
>>> tmpfs                 5.0M     0  5.0M   0% /var/run/lock
>>> tmpfs                 101M   44K  101M   1% /tmp
>>> udev                  247M     0  247M   0% /dev
>>> tmpfs                 101M     0  101M   0% /var/run/shm
>>> tmpfs                  51M  176K   51M   1% /var/run
>>>
>>> server03:
>>> /dev/sda1             9.4G  2.2G  6.8G  25% /
>>> tmpfs                 5.0M  4.0K  5.0M   1% /lib/init/rw
>>> tmpfs                 5.0M     0  5.0M   0% /var/run/lock
>>> tmpfs                 101M   44K  101M   1% /tmp
>>> udev                  247M     0  247M   0% /dev
>>> tmpfs                 101M     0  101M   0% /var/run/shm
>>> tmpfs                  51M  176K   51M   1% /var/run
>>>
>>> server04:
>>> /dev/sda1             9.4G  2.2G  6.8G  25% /
>>> tmpfs                 5.0M  4.0K  5.0M   1% /lib/init/rw
>>> tmpfs                 5.0M     0  5.0M   0% /var/run/lock
>>> tmpfs                 101M   44K  101M   1% /tmp
>>> udev                  247M     0  247M   0% /dev
>>> tmpfs                 101M     0  101M   0% /var/run/shm
>>> tmpfs                  51M  176K   51M   1% /var/run
>>>
>>> server05:
>>> /dev/sda1             9.4G  2.2G  6.8G  25% /
>>> tmpfs                 5.0M  4.0K  5.0M   1% /lib/init/rw
>>> tmpfs                 5.0M     0  5.0M   0% /var/run/lock
>>> tmpfs                 101M   44K  101M   1% /tmp
>>> udev                  247M     0  247M   0% /dev
>>> tmpfs                 101M     0  101M   0% /var/run/shm
>>> tmpfs                  51M  176K   51M   1% /var/run
>>>
>>> In addition, the output of du -sk /tmp/hadoop-user/dfs is:
>>>
>>> server02:
>>> 8       /tmp/hadoop-user/dfs/
>>>
>>> server03:
>>> 8       /tmp/hadoop-user/dfs/
>>>
>>> server04:
>>> 8       /tmp/hadoop-user/dfs/
>>>
>>> server05:
>>> 8       /tmp/hadoop-user/dfs/
>>>
>>> On Fri, Jul 15, 2011 at 7:01 PM, Harsh J <ha...@cloudera.com> wrote:
>>>> (P.S. I asked that because, if you look at your NN's live nodes table,
>>>> the reported space is all 0.)
>>>>
>>>> What's the output of:
>>>>
>>>> du -sk /tmp/hadoop-user/dfs on all your DNs?
>>>>
>>>> On Fri, Jul 15, 2011 at 4:01 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>> Thomas,
>>>>>
>>>>> Is your /tmp mount point also under / or is it separate? Your
>>>>> dfs.data.dir is /tmp/hadoop-user/dfs/data on all DNs, and if it is
>>>>> separately mounted, then what's the available space on that?
>>>>>
>>>>> (It's a bad idea in production to keep things like dfs.name.dir and
>>>>> dfs.data.dir at their defaults under /tmp, though -- reconfigure and
>>>>> restart as necessary.)
>>>>>
>>>>> On Fri, Jul 15, 2011 at 3:47 PM, Thomas Anderson
>>>>> <t.dt.aander...@gmail.com> wrote:
>>>>>> 1.) The disk usage (with df -kh) on the namenode (server01):
>>>>>>
>>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>>> /dev/sda1             9.4G  2.3G  6.7G  25% /
>>>>>>
>>>>>> and on the datanodes (server02 ~ server05):
>>>>>> /dev/sda1             9.4G  2.2G  6.8G  25% /
>>>>>> /dev/sda1             9.4G  2.2G  6.8G  25% /
>>>>>> /dev/sda1             9.4G  2.2G  6.8G  25% /
>>>>>> /dev/sda1             9.4G  2.2G  6.8G  25% /
>>>>>>
>>>>>> 2.) How can I make sure whether a datanode is busy? The environment is
>>>>>> only for testing, so no other user processes are running at that
>>>>>> moment. It is also a fresh installation, so only the packages Hadoop
>>>>>> requires are installed, such as hadoop and the jdk.
>>>>>>
>>>>>> 3.) fs.block.size is not set in hdfs-site.xml on either the datanodes
>>>>>> or the namenode, because the cluster is only for testing. I thought it
>>>>>> would use the default value, which should be 512?
>>>>>>
>>>>>> 4.) What might be a good way to quickly check whether the network is
>>>>>> unstable? I check the health page, e.g. server01:50070/dfshealth.jsp,
>>>>>> where the live nodes are up and the last contact varies when checking
>>>>>> the page.
>>>>>>
>>>>>> Node       Last Contact  Admin State  Configured Capacity (GB)  Used (GB)  Non DFS Used (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks
>>>>>> server02   2             In Service   0.1                       0          0                  0.1             0.01      99.96          0
>>>>>> server03   0             In Service   0.1                       0          0                  0.1             0.01      99.96          0
>>>>>> server04   1             In Service   0.1                       0          0                  0.1             0.01      99.96          0
>>>>>> server05   2             In Service   0.1                       0          0                  0.1             0.01      99.96          0
>>>>>>
>>>>>> 5.) Only the command `hadoop fs -put /tmp/testfile test` is issued, as
>>>>>> it is just to test whether the installation is working. So the file,
>>>>>> e.g. testfile, is removed first (hadoop fs -rm test/testfile) and then
>>>>>> uploaded again with the hadoop put command.
>>>>>>
>>>>>> The logs are listed below:
>>>>>>
>>>>>> namenode:
>>>>>> server01: http://pastebin.com/TLpDmmPx
>>>>>>
>>>>>> datanodes:
>>>>>> server02: http://pastebin.com/pdE5XKfi
>>>>>> server03: http://pastebin.com/4aV7ECCV
>>>>>> server04: http://pastebin.com/tF7HiRZj
>>>>>> server05: http://pastebin.com/5qwSPrvU
>>>>>>
>>>>>> Please let me know if more information needs to be provided.
>>>>>>
>>>>>> I really appreciate your suggestions.
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 15, 2011 at 4:54 PM, Brahma Reddy <brahmared...@huawei.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> This exception (could only be replicated to 0 nodes, instead of 1)
>>>>>>> means no datanode is available to the NameNode.
>>>>>>>
>>>>>>> These are the cases in which a DataNode may be unavailable to the NameNode:
>>>>>>>
>>>>>>> 1) The DataNode's disk is full.
>>>>>>>
>>>>>>> 2) The DataNode is busy with a block report or block scanning.
>>>>>>>
>>>>>>> 3) The block size is a negative value (dfs.block.size in hdfs-site.xml).
>>>>>>>
>>>>>>> 4) While a write is in progress, the primary datanode goes down (any
>>>>>>> network fluctuations between the NameNode and DataNode machines).
>>>>>>>
>>>>>>> 5) Whenever we append a partial chunk and call sync, the client should
>>>>>>> keep the previous data in its buffer for subsequent partial-chunk
>>>>>>> appends. For example, after appending "a" I call sync, and when I then
>>>>>>> try to append again, the buffer should contain "ab". On the server
>>>>>>> side, when the chunk is not a multiple of 512, it will try to compare
>>>>>>> the CRC of the data present in the block file with the CRC present in
>>>>>>> the meta file. But while constructing the CRC for the data present in
>>>>>>> the block, it always compares only up to the initial offset.
>>>>>>>
>>>>>>> Or, for more analysis, please check the datanode logs.
>>>>>>>
>>>>>>> Warm Regards
>>>>>>>
>>>>>>> Brahma Reddy
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Thomas Anderson [mailto:t.dt.aander...@gmail.com]
>>>>>>> Sent: Friday, July 15, 2011 9:09 AM
>>>>>>> To: hdfs-user@hadoop.apache.org
>>>>>>> Subject: could only be replicated to 0 nodes, instead of 1
>>>>>>>
>>>>>>> I have a fresh Hadoop 0.20.2 installed on VirtualBox 4.0.8 with JDK
>>>>>>> 1.6.0_26. The problem is that when trying to put a file to HDFS, it
>>>>>>> throws the error `org.apache.hadoop.ipc.RemoteException:
>>>>>>> java.io.IOException: File /path/to/file could only be replicated to 0
>>>>>>> nodes, instead of 1'; however, there is no problem creating a folder,
>>>>>>> as the ls command prints the result
>>>>>>>
>>>>>>> Found 1 items
>>>>>>> drwxr-xr-x   - user supergroup          0 2011-07-15 11:09 /user/user/test
>>>>>>>
>>>>>>> I also tried flushing the firewall (removing all iptables restrictions),
>>>>>>> but the error message is still thrown when uploading (hadoop fs -put
>>>>>>> /tmp/x test) a file from the local fs.
>>>>>>>
>>>>>>> The name node log shows
>>>>>>>
>>>>>>> 2011-07-15 10:42:43,491 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from aaa.bbb.ccc.22:50010 storage DS-929017105-aaa.bbb.ccc.22-50010-1310697763488
>>>>>>> 2011-07-15 10:42:43,495 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/aaa.bbb.ccc.22:50010
>>>>>>> 2011-07-15 10:42:44,169 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from aaa.bbb.ccc.35:50010 storage DS-884574392-aaa.bbb.ccc.35-50010-1310697764164
>>>>>>> 2011-07-15 10:42:44,170 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/aaa.bbb.ccc.35:50010
>>>>>>> 2011-07-15 10:42:44,507 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from aaa.bbb.ccc.11:50010 storage DS-1537583073-aaa.bbb.ccc.11-50010-1310697764488
>>>>>>> 2011-07-15 10:42:44,507 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/aaa.bbb.ccc.11:50010
>>>>>>> 2011-07-15 10:42:45,796 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 140.127.220.25:50010 storage DS-1500589162-aaa.bbb.ccc.25-50010-1310697765386
>>>>>>> 2011-07-15 10:42:45,797 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/aaa.bbb.ccc.25:50010
>>>>>>>
>>>>>>> And all datanodes have messages similar to the following:
>>>>>>>
>>>>>>> 2011-07-15 10:42:46,562 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
>>>>>>> 2011-07-15 10:42:47,163 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 3 msecs
>>>>>>> 2011-07-15 10:42:47,187 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
>>>>>>> 2011-07-15 11:19:42,931 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 1 msecs
>>>>>>>
>>>>>>> The command `hadoop fsck /` displays
>>>>>>>
>>>>>>> Status: HEALTHY
>>>>>>>  Total size:                    0 B
>>>>>>>  Total dirs:                    3
>>>>>>>  Total files:                   0 (Files currently being written: 1)
>>>>>>>  Total blocks (validated):      0
>>>>>>>  Minimally replicated blocks:   0
>>>>>>>  Over-replicated blocks:        0
>>>>>>>  Under-replicated blocks:       0
>>>>>>>  Mis-replicated blocks:         0
>>>>>>>  Default replication factor:    3
>>>>>>>  Average block replication:     0.0
>>>>>>>  Corrupt blocks:                0
>>>>>>>  Missing replicas:              0
>>>>>>>  Number of data-nodes:          4
>>>>>>>
>>>>>>> The settings in conf include:
>>>>>>>
>>>>>>> - Master node:
>>>>>>> core-site.xml
>>>>>>> <property>
>>>>>>>   <name>fs.default.name</name>
>>>>>>>   <value>hdfs://lab01:9000/</value>
>>>>>>> </property>
>>>>>>>
>>>>>>> hdfs-site.xml
>>>>>>> <property>
>>>>>>>   <name>dfs.replication</name>
>>>>>>>   <value>3</value>
>>>>>>> </property>
>>>>>>>
>>>>>>> - Slave nodes:
>>>>>>> core-site.xml
>>>>>>> <property>
>>>>>>>   <name>fs.default.name</name>
>>>>>>>   <value>hdfs://lab01:9000/</value>
>>>>>>> </property>
>>>>>>>
>>>>>>> hdfs-site.xml
>>>>>>> <property>
>>>>>>>   <name>dfs.replication</name>
>>>>>>>   <value>3</value>
>>>>>>> </property>
>>>>>>>
>>>>>>> Am I missing any configuration? Or is there any other place I can check?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Harsh J
>>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>
>>
>> --
>> Harsh J
>>
>

--
Harsh J
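For reference, a consolidated, minimal sketch of the fix Harsh describes earlier in the thread (relocating hadoop.tmp.dir off the 101 MB tmpfs). The /var/lib/hadoop-0.20/cache path and the `user` account are only the examples used in this thread, the commands assume they are run from the Hadoop 0.20 install directory, and reformatting the NameNode erases any data already in HDFS:

    # On every node: stop DFS, move hadoop.tmp.dir off tmpfs, reformat, restart.
    bin/stop-dfs.sh

    # Create a location on the real disk (example path from the thread).
    sudo mkdir -p /var/lib/hadoop-0.20/cache
    sudo chown -R user:user /var/lib/hadoop-0.20/cache

    # Point hadoop.tmp.dir at it in conf/core-site.xml on ALL nodes:
    #   <property>
    #     <name>hadoop.tmp.dir</name>
    #     <value>/var/lib/hadoop-0.20/cache</value>
    #   </property>

    # Reformat the NameNode (answer Y) -- this wipes existing HDFS metadata.
    bin/hadoop namenode -format

    bin/start-dfs.sh

    # Verify that the DNs now report real capacity, then retry the upload.
    bin/hadoop dfsadmin -report
    bin/hadoop fs -put /tmp/testfile test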