Thomas,

Is your /tmp mount point also under / or is it mounted separately? Your dfs.data.dir is /tmp/hadoop-user/dfs/data on all DNs, and if /tmp is a separate mount, what is the available space on it?

(It is a bad idea in production to keep things at their defaults under /tmp, though, like dfs.name.dir and dfs.data.dir; reconfigure and restart as necessary.)
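For example, a minimal hdfs-site.xml sketch that moves both off /tmp (the /var/lib/hadoop paths below are only placeholders; point them at whatever persistent disks you actually have):

    <property>
      <name>dfs.name.dir</name>
      <value>/var/lib/hadoop/dfs/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/var/lib/hadoop/dfs/data</value>
    </property>

On a fresh test cluster the simplest route is to stop the daemons, change this on all nodes, re-run `hadoop namenode -format`, and start again; on a cluster that already holds data you would copy the old name/data directories over instead of reformatting.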
On Fri, Jul 15, 2011 at 3:47 PM, Thomas Anderson <t.dt.aander...@gmail.com> wrote:
> 1.) The disk usage (with df -kh) on the namenode (server01):
>
> Filesystem  Size  Used  Avail  Use%  Mounted on
> /dev/sda1   9.4G  2.3G  6.7G   25%   /
>
> and on the datanodes (server02 ~ server05):
> /dev/sda1   9.4G  2.2G  6.8G   25%   /
> /dev/sda1   9.4G  2.2G  6.8G   25%   /
> /dev/sda1   9.4G  2.2G  6.8G   25%   /
> /dev/sda1   9.4G  2.2G  6.8G   25%   /
>
> 2.) How can I make sure whether a datanode is busy? The environment is only
> for testing, so no other user processes are running at that moment. It is
> also a fresh installation; only the packages Hadoop requires are installed,
> such as hadoop and the jdk.
>
> 3.) fs.block.size is not set in hdfs-site.xml on either the datanodes or
> the namenode, because this is just for testing. I thought it would use the
> default value, which should be 512?
>
> 4.) What might be a good way to quickly check whether the network is
> unstable? I check the health page, e.g. server01:50070/dfshealth.jsp, where
> the live nodes are up and "Last Contact" varies between checks.
>
> Node      Last Contact  Admin State  Configured Capacity (GB)  Used (GB)  Non DFS Used (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks
> server02  2             In Service   0.1                       0          0                  0.1             0.01      99.96          0
> server03  0             In Service   0.1                       0          0                  0.1             0.01      99.96          0
> server04  1             In Service   0.1                       0          0                  0.1             0.01      99.96          0
> server05  2             In Service   0.1                       0          0                  0.1             0.01      99.96          0
>
> 5.) Only the command `hadoop fs -put /tmp/testfile test` is issued, as it
> is just to test whether the installation works. The file, e.g. testfile,
> is removed first (hadoop fs -rm test/testfile) and then uploaded again with
> the hadoop put command.
>
> The logs are listed below:
>
> namenode:
> server01: http://pastebin.com/TLpDmmPx
>
> datanodes:
> server02: http://pastebin.com/pdE5XKfi
> server03: http://pastebin.com/4aV7ECCV
> server04: http://pastebin.com/tF7HiRZj
> server05: http://pastebin.com/5qwSPrvU
>
> Please let me know if more information needs to be provided.
>
> I really appreciate your suggestions.
>
> Thank you.
>
>
> On Fri, Jul 15, 2011 at 4:54 PM, Brahma Reddy <brahmared...@huawei.com> wrote:
>> Hi,
>>
>> Judging by this exception (could only be replicated to 0 nodes, instead
>> of 1), no datanode is available to the Name Node.
>>
>> These are the cases in which a Data Node may be unavailable to the Name Node:
>>
>> 1) The Data Node disk is full.
>>
>> 2) The Data Node is busy with its block report and block scanning.
>>
>> 3) The block size is a negative value (dfs.block.size in hdfs-site.xml).
>>
>> 4) The primary datanode goes down while a write is in progress (any
>> network fluctuation between the Name Node and Data Node machines).
>>
>> 5) Whenever we append a partial chunk and call sync, for subsequent
>> partial chunk appends the client should keep the previous data in its
>> buffer. For example, after appending "a" I call sync, and when I then
>> append more data the buffer should hold "ab". On the server side, when
>> the chunk is not a multiple of 512, it tries to compare the CRC of the
>> data present in the block file against the CRC present in the meta file,
>> but while constructing the CRC for the data in the block it always
>> compares only up to the initial offset.
>>
>> For more analysis, please check the data node logs.
>>
>> Warm Regards,
>>
>> Brahma Reddy
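(On Brahma's causes 1 and 2 above: a quick way to check them from the NameNode side, sketched for 0.20.x; the exact report format differs across versions:

    hadoop dfsadmin -report              # per-DN configured/remaining capacity and last contact, as the NN sees them
    df -h /tmp/hadoop-user/dfs/data      # on each DN: space on the filesystem backing dfs.data.dir

If -report shows 0 GB remaining on every node, or a stale last-contact time, that points at cause 1 or 4.)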
>> -----Original Message-----
>> From: Thomas Anderson [mailto:t.dt.aander...@gmail.com]
>> Sent: Friday, July 15, 2011 9:09 AM
>> To: hdfs-user@hadoop.apache.org
>> Subject: could only be replicated to 0 nodes, instead of 1
>>
>> I have a fresh hadoop 0.20.2 installation on virtualbox 4.0.8 with jdk
>> 1.6.0_26. The problem is that when trying to put a file to hdfs, it throws
>> the error `org.apache.hadoop.ipc.RemoteException: java.io.IOException:
>> File /path/to/file could only be replicated to 0 nodes, instead of 1';
>> however, there is no problem creating a folder, as the ls command prints:
>>
>> Found 1 items
>> drwxr-xr-x   - user supergroup          0 2011-07-15 11:09 /user/user/test
>>
>> I also tried flushing the firewall (removing all iptables restrictions),
>> but the error is still thrown when uploading a file from the local fs
>> (hadoop fs -put /tmp/x test).
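(Since iptables was already flushed: to rule out plain reachability problems, note that the DataNode data-transfer port defaults to 50010 and the NameNode RPC port here is 9000, so a rough check, assuming nc is installed, would be:

    nc -z -v server02 50010    # repeat for server03..server05, from the client machine
    nc -z -v lab01 9000        # from each datanode, toward the NameNode

Both should report the port as open.)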
>> The name node log shows:
>>
>> 2011-07-15 10:42:43,491 INFO org.apache.hadoop.hdfs.StateChange:
>> BLOCK* NameSystem.registerDatanode: node registration from
>> aaa.bbb.ccc.22:50010 storage DS-929017105-aaa.bbb.ccc.22-50010-1310697763488
>> 2011-07-15 10:42:43,495 INFO org.apache.hadoop.net.NetworkTopology:
>> Adding a new node: /default-rack/aaa.bbb.ccc.22:50010
>> 2011-07-15 10:42:44,169 INFO org.apache.hadoop.hdfs.StateChange:
>> BLOCK* NameSystem.registerDatanode: node registration from
>> aaa.bbb.ccc.35:50010 storage DS-884574392-aaa.bbb.ccc.35-50010-1310697764164
>> 2011-07-15 10:42:44,170 INFO org.apache.hadoop.net.NetworkTopology:
>> Adding a new node: /default-rack/aaa.bbb.ccc.35:50010
>> 2011-07-15 10:42:44,507 INFO org.apache.hadoop.hdfs.StateChange:
>> BLOCK* NameSystem.registerDatanode: node registration from
>> aaa.bbb.ccc.11:50010 storage DS-1537583073-aaa.bbb.ccc.11-50010-1310697764488
>> 2011-07-15 10:42:44,507 INFO org.apache.hadoop.net.NetworkTopology:
>> Adding a new node: /default-rack/aaa.bbb.ccc.11:50010
>> 2011-07-15 10:42:45,796 INFO org.apache.hadoop.hdfs.StateChange:
>> BLOCK* NameSystem.registerDatanode: node registration from
>> aaa.bbb.ccc.25:50010 storage DS-1500589162-aaa.bbb.ccc.25-50010-1310697765386
>> 2011-07-15 10:42:45,797 INFO org.apache.hadoop.net.NetworkTopology:
>> Adding a new node: /default-rack/aaa.bbb.ccc.25:50010
>>
>> And all datanodes have similar messages, as below:
>>
>> 2011-07-15 10:42:46,562 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>> using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
>> 2011-07-15 10:42:47,163 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>> BlockReport of 0 blocks got processed in 3 msecs
>> 2011-07-15 10:42:47,187 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>> Starting Periodic block scanner.
>> 2011-07-15 11:19:42,931 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>> BlockReport of 0 blocks got processed in 1 msecs
>>
>> Command `hadoop fsck /` displays:
>>
>> Status: HEALTHY
>>  Total size:                  0 B
>>  Total dirs:                  3
>>  Total files:                 0 (Files currently being written: 1)
>>  Total blocks (validated):    0
>>  Minimally replicated blocks: 0
>>  Over-replicated blocks:      0
>>  Under-replicated blocks:     0
>>  Mis-replicated blocks:       0
>>  Default replication factor:  3
>>  Average block replication:   0.0
>>  Corrupt blocks:              0
>>  Missing replicas:            0
>>  Number of data-nodes:        4
>>
>> The settings in conf include:
>>
>> - Master node:
>> core-site.xml
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://lab01:9000/</value>
>> </property>
>>
>> hdfs-site.xml
>> <property>
>>   <name>dfs.replication</name>
>>   <value>3</value>
>> </property>
>>
>> - Slave nodes:
>> core-site.xml
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://lab01:9000/</value>
>> </property>
>>
>> hdfs-site.xml
>> <property>
>>   <name>dfs.replication</name>
>>   <value>3</value>
>> </property>
>>
>> Am I missing any configuration? Or is there anywhere else I can check?
>>
>> Thanks.

--
Harsh J