Hi Artem,

Yes, that is what most people do, and it should work fine in production
environments. If you're worried about the NFS mount going up and down
often, use a release that includes both
https://issues.apache.org/jira/browse/HADOOP-4885
(the dfs.name.dir.restore feature, set to true at the NN) and
https://issues.apache.org/jira/browse/HDFS-3652 (a fix for a possible
edge case your config may be exposing when the NN ejects bad name-dirs);
that will help further.
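
As a rough sketch of the redundant-write setup, the Python snippet below renders the two hdfs-site.xml properties involved. The directory paths are hypothetical placeholders (one local disk, one NFS mount). The property names are the Hadoop 1.x ones, so check them against the hdfs-default.xml of your release before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical paths: one local disk directory and one NFS mount point.
NAME_DIRS = ["/data/1/dfs/nn", "/mnt/nfs/dfs/nn"]


def build_properties(name_dirs):
    """Return a <configuration> element holding the NN storage settings."""
    conf = ET.Element("configuration")
    settings = {
        # Comma-separated list: the NN writes fsimage and edits to every entry.
        "dfs.name.dir": ",".join(name_dirs),
        # HADOOP-4885: let the NN try to restore a previously failed name dir.
        "dfs.name.dir.restore": "true",
    }
    for name, value in settings.items():
        prop = ET.SubElement(conf, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return conf


def render(name_dirs):
    """Serialize the settings as an XML string for hdfs-site.xml."""
    return ET.tostring(build_properties(name_dirs), encoding="unicode")


if __name__ == "__main__":
    print(render(NAME_DIRS))
```

The generated property elements go into hdfs-site.xml on the NameNode host; the NN then writes its metadata to both directories on every update.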

On Tue, Sep 18, 2012 at 10:03 PM, Artem Ervits <[email protected]> wrote:
> Thanks Harsh,
>
> I'm aware of the implications of copying periodically. This is just a test 
> until I get an NFS share to play with. Do you just let Hadoop write to two 
> directories where one is an NFS share or is there another way?
>
> -----Original Message-----
> From: Harsh J [mailto:[email protected]]
> Sent: Monday, September 17, 2012 10:44 PM
> To: [email protected]
> Subject: Re: Hadoop recovery test
>
> Hi Artem,
>
> From what I see, you are running one DN in this cluster, so only one replica 
> of each block can exist. You can therefore ignore reports such as: Under 
> replicated blk_7701720691642589882_1086. Target Replicas is 3 but found 1 
> replica(s).
>
> The two truly missing blocks are:
>
> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: MISSING 1 blocks
> /user/hduser/teragen-out/part-00000: MISSING 1 blocks
>
> These may be missing because the files were being written at the time you 
> copied the fsimage and edits. (That is the wrong way to go about it, by the 
> way: configure redundant writes so that you also sustain failures, rather 
> than copying periodically. A periodic copy is not a consistent backup; if you 
> want one, use the dfsadmin fetchImage method instead.) Does that sound likely?
>
> On Tue, Sep 18, 2012 at 3:08 AM, Artem Ervits <[email protected]> wrote:
>> Hello all,
>>
>>
>>
>> I am testing the Hadoop recovery as per
>> http://wiki.apache.org/hadoop/NameNode document. But instead of using
>> an NFS share, I am copying to another directory. Then when I shut down
>> the cluster, I scp that directory to another server and start Hadoop
>> cluster using that machine as the namenode. I see in the log that some
>> blocks are corrupt and/or missing. Do I have to wait for replication
>> to recover all blocks or am I doing something else altogether? I am
>> using Hadoop 1.0.3. Can someone point me to a more detailed document
>> than the wiki in case I'm doing something wrong.
>>
>>
>>
>> p.s. if I restart the cluster using the original namenode, filesystem
>> reports as healthy.
>>
>>
>>
>> Thank you.
>>
>>
>>
>> .
>>
>> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: CORRUPT block
>> blk_9043419219670949307
>>
>>
>>
>> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: MISSING 1 blocks of
>> total size 4 B...
>>
>> /user/hduser/teragen/_logs/history/job_201209120941_0002_1347458152167_hduser_TeraGen:
>> Under replicated blk_-976282286234272458_1079. Target Replicas is 3
>> but found 1 replica(s).
>>
>> .
>>
>> /user/hduser/teragen/_logs/history/job_201209120941_0002_conf.xml:
>> Under replicated blk_137658109390447967_1075. Target Replicas is 3 but
>> found 1 replica(s).
>>
>> .
>>
>> /user/hduser/teragen/_partition.lst:  Under replicated
>> blk_-3005280481530403302_1080. Target Replicas is 3 but found 1 replica(s).
>>
>> .
>>
>> /user/hduser/teragen/part-00000:  Under replicated
>> blk_-7008813028808832816_1077. Target Replicas is 3 but found 1 replica(s).
>>
>> .
>>
>> /user/hduser/teragen/part-00001:  Under replicated
>> blk_-5256967771026054061_1078. Target Replicas is 3 but found 1 replica(s).
>>
>> ..
>>
>> /user/hduser/teragen-out/_logs/history/job_201209120941_0003_1347458249920_hduser_TeraSort:
>> Under replicated blk_1137779303840586677_1089. Target Replicas is 3
>> but found 1 replica(s).
>>
>> .
>>
>> /user/hduser/teragen-out/_logs/history/job_201209120941_0003_conf.xml:
>> Under replicated blk_7701720691642589882_1086. Target Replicas is 3
>> but found 1 replica(s).
>>
>> .
>>
>> /user/hduser/teragen-out/part-00000: CORRUPT block
>> blk_8059469267617478950
>>
>>
>>
>> /user/hduser/teragen-out/part-00000: MISSING 1 blocks of total size
>> 1000000 B...
>>
>> /user/hduser/teragen-validate/_logs/history/job_201209120941_0004_1347458495941_hduser_TeraValidate:
>> Under replicated blk_5680565744062298575_1098. Target Replicas is 3
>> but found 1 replica(s).
>>
>> .
>>
>> /user/hduser/teragen-validate/_logs/history/job_201209120941_0004_conf.xml:
>> Under replicated blk_1566253937037013126_1095. Target Replicas is 3
>> but found 1 replica(s).
>>
>> .Status: CORRUPT
>> Total size:    1050720258 B
>> Total dirs:    39
>> Total files:   32
>> Total blocks (validated):      42 (avg. block size 25017149 B)
>>   ********************************
>>   CORRUPT FILES:        2
>>   MISSING BLOCKS:       2
>>   MISSING SIZE:         1000004 B
>>   CORRUPT BLOCKS:       2
>>   ********************************
>> Minimally replicated blocks:   40 (95.2381 %)
>> Over-replicated blocks:        0 (0.0 %)
>> Under-replicated blocks:       40 (95.2381 %)
>> Mis-replicated blocks:         0 (0.0 %)
>> Default replication factor:    3
>> Average block replication:     0.95238096
>> Corrupt blocks:                2
>> Missing replicas:              80 (200.0 %)
>> Number of data-nodes:          1
>> Number of racks:               1
>> FSCK ended at Mon Sep 17 17:29:08 EDT 2012 in 21 milliseconds
>>
>> The filesystem under path '/' is CORRUPT
>>
>> Artem Ervits
>>
>> Data Analyst
>>
>> New York Presbyterian Hospital
>>
>>
>>
>>
>> ________________________________
>> This electronic message is intended to be for the use only of the
>> named recipient, and may contain information that is confidential or 
>> privileged.
>> If you are not the intended recipient, you are hereby notified that
>> any disclosure, copying, distribution or use of the contents of this
>> message is strictly prohibited. If you have received this message in
>> error or are not the named recipient, please notify us immediately by
>> contacting the sender at the electronic mail address noted above, and
>> delete and destroy all copies of this message. Thank you.
>>
>
>
>
> --
> Harsh J
>



-- 
Harsh J
