Hi Ravi,
We have 10 data nodes. Each data node has 12 disks mounted, and each data
node stores nearly 20 TB of data:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/md0 113183272 12830044 94597128 12% /
tmpfs 66061772 0 66061772 0% /dev/shm
/dev/sdc1 3905108984 1847318400 2057790584 48% /data/sda
/dev/sdd1 3905108984 1766808072 2138300912 46% /data/sdb
/dev/sde1 3905108984 1762628972 2142480012 46% /data/sdc
/dev/sdf1 3905108984 1762803256 2142305728 46% /data/sdd
/dev/sdg1 3905108984 1757301724 2147807260 46% /data/sde
/dev/sdh1 3905108984 1764210768 2140898216 46% /data/sdf
/dev/sdi1 3905108984 1754803788 2150305196 45% /data/sdg
/dev/sdj1 3905108984 1753740904 2151368080 45% /data/sdh
/dev/sdk1 3905108984 1758186416 2146922568 46% /data/sdi
/dev/sdl1 3905108984 1757352332 2147756652 46% /data/sdj
/dev/sdm1 3905108984 1759121952 2145987032 46% /data/sdk
/dev/sdn1 3905108984 2991279120 913829864 77% /data/sdl
10.10.2.54:/home 113183744 55836672 51591168 52% /home
10.10.2.54:/vol/home1 976283648 93448192 882835456 10% /vol/home1
10.10.2.54:/vol/home2 976284672 412706816 563577856 43% /vol/home2
10.10.2.54:/vol/home3 976284672 51256320 925028352 6% /vol/home3
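For reference, here is a rough back-of-the-envelope sketch of what the numbers above imply per datanode (the arithmetic is mine, and it assumes the default 128 MB dfs.blocksize and that all the used space under /data/* is HDFS data, which may not hold on our cluster):

```python
# Sum of the "Used" column (1K blocks) for the twelve /data/* mounts above.
used_kb = [
    1847318400, 1766808072, 1762628972, 1762803256,
    1757301724, 1764210768, 1754803788, 1753740904,
    1758186416, 1757352332, 1759121952, 2991279120,
]

block_size_kb = 128 * 1024  # assumed default dfs.blocksize: 128 MB

total_used_kb = sum(used_kb)
total_used_tb = total_used_kb / 1024 ** 3

# Upper bound: if every block were full-sized, this many block replicas
# live on this one datanode. Real counts are higher if files are small.
approx_blocks = total_used_kb // block_size_kb

print(f"used on this node: ~{total_used_tb:.1f} TB")
print(f"estimated block replicas on this node: ~{approx_blocks:,}")
```

So each datanode is carrying on the order of 20 TB and a few hundred thousand block replicas, which is the number Ravi says drives the upgrade time.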
Thanks,
Chathuri
On Thu, Mar 24, 2016 at 2:45 PM, Ravi Prakash <[email protected]> wrote:
> Hi Chathuri!
>
> You're welcome! We did not have an HBase instance to upgrade. It depends
> on how many blocks your datanodes are storing (== how big your disks are *
> how many disks you have * how full your disks are). What are those numbers
> for you? We experienced anywhere from 1-3 hours for the upgrade.
>
> HTH
> Ravi
>
> On Thu, Mar 24, 2016 at 1:16 AM, Chathuri Wimalasena <[email protected]
> > wrote:
>
>> Hi Ravi,
>>
>> Thank you for all the information. Our application indexes Twitter
>> data into HBase and then does some data analytics on top of that. That's
>> why the HDFS data is very important to us. We cannot tolerate any data
>> loss during the upgrade. Do you remember how long it took you to upgrade
>> from 2.4.1 to 2.7.1?
>>
>> Thanks,
>> Chathuri
>>
>> On Wed, Mar 23, 2016 at 7:09 PM, Ravi Prakash <[email protected]>
>> wrote:
>>
>>> Hi Chathuri!
>>>
>>> Technically there is a rollback option during the upgrade. I don't know how
>>> well it has been tested, but the idea is that the old metadata is not deleted
>>> until the cluster administrator runs $ hdfs dfsadmin -finalizeUpgrade . I'm
>>> fairly confident that the HDFS upgrade will work smoothly. We have upgraded
>>> quite a few Hadoop-2.4.1 clusters to Hadoop-2.7.1 successfully (never
>>> having to roll back). It's your applications that run on top of HDFS and
>>> YARN that I'd be concerned about.
>>>
>>> HTH
>>> Ravi
>>>
>>> On Wed, Mar 23, 2016 at 2:22 PM, Chathuri Wimalasena <
>>> [email protected]> wrote:
>>>
>>>> Thanks for the information, Ravi. Is there a way that I can back up the
>>>> data before the upgrade? I was thinking about this approach:
>>>>
>>>> Copy the current Hadoop directories to a new set of directories.
>>>> Point Hadoop to the new set.
>>>> Start the upgrade on the backup set.
>>>>
>>>> Please let me know if people have done this upgrade successfully. I
>>>> believe many things can go wrong in a lengthy upgrade like this, and the
>>>> data in the cluster is very important.
>>>> Thanks,
>>>> Chathuri
>>>>
>>>> On Wed, Mar 23, 2016 at 4:37 PM, Ravi Prakash <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Chathuri!
>>>>>
>>>>> - When we upgrade, does it change the namenode data structures and
>>>>> data nodes? I assume it only changes the name node...
>>>>>
>>>>> It changes the NN as well as DN layout. As a matter of fact, this
>>>>> upgrade will take a long time on Datanodes as well because of
>>>>> https://issues.apache.org/jira/browse/HDFS-6482
>>>>>
>>>>> - What are the risks with this upgrade ?
>>>>>
>>>>> What Hadoop applications do you run on top of your cluster? The hope
>>>>> is that everything continues working smoothly for the most part, but
>>>>> inevitably some backward-incompatible changes creep in.
>>>>>
>>>>> - Is there a place where I can review the changes made to file
>>>>> system from 2.5.1 to 2.7.2?
>>>>>
>>>>> The release notes: http://hadoop.apache.org/releases.html . You'd have
>>>>> to accumulate the changes across all the intervening versions.
>>>>>
>>>>> Practically, I'd try running your application on an upgraded test
>>>>> cluster first.
>>>>>
>>>>> HTH
>>>>>
>>>>> Ravi
>>>>>
>>>>> On Wed, Mar 23, 2016 at 12:17 PM, Chathuri Wimalasena <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We have a Hadoop production deployment with 1 namenode and 10 data
>>>>>> nodes, which holds more than 20 TB of data in HDFS. We are currently
>>>>>> running Hadoop 2.5.1, and we want to upgrade to the latest Hadoop
>>>>>> version, 2.7.2.
>>>>>>
>>>>>> I followed this link (
>>>>>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html)
>>>>>> and upgraded a single-node system running in pseudo-distributed mode,
>>>>>> and it went without any issues. But that system did not have nearly as
>>>>>> much data as the production system.
>>>>>>
>>>>>> Since this is a production system, I'm reluctant to do this upgrade. I
>>>>>> would like to hear what other people have done in these cases and about
>>>>>> their experiences. Here are a few questions I have:
>>>>>>
>>>>>> - When we upgrade, does it change the namenode data structures
>>>>>> and data nodes? I assume it only changes the name node...
>>>>>> - What are the risks with this upgrade ?
>>>>>> - Is there a place where I can review the changes made to file
>>>>>> system from 2.5.1 to 2.7.2?
>>>>>>
>>>>>> I would really appreciate it if you could share your experiences.
>>>>>>
>>>>>> Thanks in advance,
>>>>>> Chathuri
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>