Thanks for the reply. How can I assign a new value to the transfer speed for the balancer? Is this the parameter, dfs.balance.bandwidthPerSec? And where should it go: conf/hdfs-site.xml or conf/core-site.xml?
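My guess from skimming hdfs-default.xml is conf/hdfs-site.xml on the datanodes, with something like the block below. The value is in bytes per second; the 10 MB/s here is just my guess at a sensible raise over the low default Harsh mentioned, so please correct me if I have the property or the file wrong:

  <property>
    <name>dfs.balance.bandwidthPerSec</name>
    <!-- bytes per second; 10485760 = 10 MB/s -->
    <value>10485760</value>
  </property>

I assume the datanodes need a restart to pick this up?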
-Tapas

On Mar 19, 2013, at 11:05 PM, Harsh J <[email protected]> wrote:

> If your balancer does not exit, then it means it is working heavily in
> iterations trying to balance your cluster. The default bandwidth allows
> only a limited transfer speed (10 Mbps) so as not to affect the
> cluster's RW performance while moving blocks between DNs for balancing,
> so the operation may be slow unless you raise the allowed bandwidth.
>
> On Wed, Mar 20, 2013 at 7:37 AM, Tapas Sarangi <[email protected]> wrote:
>> Any more follow-ups?
>>
>> Thanks
>> -Tapas
>>
>> On Mar 19, 2013, at 9:55 AM, Tapas Sarangi <[email protected]> wrote:
>>
>>> On Mar 18, 2013, at 11:50 PM, Harsh J <[email protected]> wrote:
>>>
>>>> What do you mean that the balancer is always active?
>>>
>>> Meaning, the same process is active for a long time; the process that
>>> starts may not be exiting at all. We have a cron job set to run it
>>> every 10 minutes, but that is not in effect because the process may
>>> never exit.
>>>
>>>> It is to be used as a tool and it exits once it balances in a
>>>> specific run (loops until it does, but always exits at the end). The
>>>> balancer does balance based on usage percentage, so that is what
>>>> you're probably looking for/missing.
>>>
>>> Maybe. How does the balancer look at the usage percentage?
>>>
>>> -Tapas
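(Answering my own quoted question above, after reading the r1.0.4 commands manual that Bertrand linked below: as far as I can tell, the balancer compares each datanode's used-space percentage with the cluster-wide average and moves blocks until every node is within a tolerance of that average, given as a percentage on the command line. So our cron job should presumably be invoking something like:

  hadoop balancer -threshold 5

with the default being 10 if no threshold is given. That is just my reading of the docs, so corrections are welcome.)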
>>>> On Tue, Mar 19, 2013 at 6:56 AM, Tapas Sarangi <[email protected]> wrote:
>>>>> Hi,
>>>>>
>>>>> On Mar 18, 2013, at 8:21 PM, 李洪忠 <[email protected]> wrote:
>>>>>
>>>>> Maybe you need to modify the rack awareness script to balance the
>>>>> racks, i.e., make all the racks the same size: one rack of 6 small
>>>>> nodes, one rack of 1 large node.
>>>>> P.S. You need to restart the cluster for the rack awareness script
>>>>> change to take effect.
>>>>>
>>>>> Like I mentioned earlier in my reply to Bertrand, we haven't
>>>>> considered rack awareness for the cluster; currently it is treated as
>>>>> just one rack. Can that be the problem? I don't know…
>>>>>
>>>>> -Tapas
>>>>>
>>>>> On 2013/3/19 7:17, Bertrand Dechoux wrote:
>>>>>
>>>>> And by active, do you mean that it actually stops by itself? Otherwise
>>>>> it might mean that the throttling/limit is an issue with regard to
>>>>> the data volume or velocity.
>>>>>
>>>>> What threshold is used?
>>>>>
>>>>> About the small and big datanodes, how are they distributed with
>>>>> regard to racks?
>>>>> About files, what replication factor(s) and block size(s) are used?
>>>>>
>>>>> Surely trivial questions again.
>>>>>
>>>>> Bertrand
>>>>>
>>>>> On Mon, Mar 18, 2013 at 10:46 PM, Tapas Sarangi <[email protected]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Sorry about that, I had it written but thought it was obvious.
>>>>>> Yes, the balancer is active and running on the namenode.
>>>>>>
>>>>>> -Tapas
>>>>>>
>>>>>> On Mar 18, 2013, at 4:43 PM, Bertrand Dechoux <[email protected]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> It is not explicitly said, but did you use the balancer?
>>>>>> http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#balancer
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Bertrand
>>>>>>
>>>>>> On Mon, Mar 18, 2013 at 10:01 PM, Tapas Sarangi <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am using an old legacy version (0.20) of Hadoop for our cluster.
>>>>>>> We have scheduled an upgrade to a newer version within a couple of
>>>>>>> months, but I would like to understand a couple of things before
>>>>>>> moving towards the upgrade plan.
>>>>>>>
>>>>>>> We have about 200 datanodes and some of them have larger storage
>>>>>>> than others. The storage for the datanodes varies between 12 TB and
>>>>>>> 72 TB.
>>>>>>>
>>>>>>> We found that the disk-used percentage is not symmetric across all
>>>>>>> the datanodes. For nodes with larger storage the percentage of disk
>>>>>>> space used is much lower than for nodes with smaller storage. On
>>>>>>> the larger storage nodes the percentage of used disk space varies,
>>>>>>> but is on average about 30-50%. For the smaller storage nodes this
>>>>>>> number is as high as 99.9%. Is this expected? If so, then we are
>>>>>>> not using a lot of the disk space effectively. Is this solved in a
>>>>>>> future release?
>>>>>>>
>>>>>>> If not, I would like to know whether there are any checks/debugging
>>>>>>> steps one can do to find an improvement with the current version,
>>>>>>> or whether upgrading Hadoop should solve this problem.
>>>>>>>
>>>>>>> I am happy to provide additional information if needed.
>>>>>>>
>>>>>>> Thanks for any help.
>>>>>>>
>>>>>>> -Tapas
>>>>>>
>>>>>
>>>>> --
>>>>> Bertrand Dechoux
>>>>
>>>> --
>>>> Harsh J
>
> --
> Harsh J
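P.S. On 李洪忠's rack awareness suggestion quoted above: if we ever go that route, my understanding is that we would point topology.script.file.name in conf/core-site.xml at a script that prints one rack path per datanode IP or hostname it is passed. A trivial sketch of the idea (the subnets and rack names here are hypothetical, not our real layout):

  #!/bin/bash
  # Print one rack path per argument (IP or hostname), as Hadoop expects.
  for node in "$@"; do
    case "$node" in
      192.168.1.*) echo "/rack-small-1" ;;  # e.g. the small nodes
      192.168.2.*) echo "/rack-large-1" ;;  # e.g. a large node
      *)           echo "/default-rack" ;;
    esac
  done

Since our cluster is currently treated as a single rack, though, I am not sure this is what is behind the imbalance.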
