Hello All,
   We have been receiving many requests from users for a "rebalance
completion time estimate". This email is to gather ideas and feedback from
the community on this. We have one proposal, but nothing is concrete yet.
Please feel free to share your input on this problem.

A brief on the rebalance operation:
- The rebalance process redistributes data across the cluster, most likely
after an add-brick or remove-brick. A rebalance process is spawned on each
node. Its job is to read each directory and fix its layout to include the
newly added brick, then read the directory's child files (only those that
reside on local bricks) and migrate them if the new layout requires it.


Here is one solution, pitched by Manoj Pillai.

Assumptions for this idea:
 - Files are of similar size.
 - At most 40% of the total files will be migrated.

1- Do a statfs on the local bricks. Say the total size is St.
2- Based on the size of the first file, say Sf, assume the number of files on
the local system to be Nt = St / Sf.
3- So the time estimate would be: (Nt * migration time for one file) * 40%.
4- Rebalance will keep updating this estimate as more files are crawled and
will try to give a fair estimate.
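
To make the four steps concrete, here is a rough Python sketch. It is only
an illustration, not code from rebalance itself. The names used_bytes,
estimate_seconds and RunningEstimate are made up for this mail, and it
assumes that St means the space in use on the brick filesystem and that
step 4 works by averaging the sizes and migration times seen so far.

```python
import os

MIGRATION_FRACTION = 0.40  # assumption from the proposal: at most 40% of files move


def used_bytes(brick_path):
    """St: bytes in use on the filesystem holding brick_path (via statvfs)."""
    st = os.statvfs(brick_path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def estimate_seconds(total_bytes, avg_file_bytes, secs_per_file,
                     fraction=MIGRATION_FRACTION):
    """Step 3: (Nt * migration time for one file) * fraction."""
    if avg_file_bytes <= 0:
        raise ValueError("average file size must be positive")
    n_files = total_bytes / avg_file_bytes  # step 2: Nt = St / Sf
    return n_files * secs_per_file * fraction


class RunningEstimate:
    """Step 4: refine Sf and the per-file time as more files are crawled."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.files_seen = 0
        self.bytes_seen = 0
        self.migrations = 0
        self.migration_secs = 0.0

    def file_crawled(self, size_bytes):
        self.files_seen += 1
        self.bytes_seen += size_bytes

    def file_migrated(self, secs):
        self.migrations += 1
        self.migration_secs += secs

    def seconds(self):
        """Return the current estimate, or None until there is data."""
        if self.bytes_seen == 0 or self.migrations == 0:
            return None
        avg_size = self.bytes_seen / self.files_seen
        avg_secs = self.migration_secs / self.migrations
        return estimate_seconds(self.total_bytes, avg_size, avg_secs)
```

A caller would create RunningEstimate(used_bytes(brick)) once per node,
feed it every crawled file size and every measured migration time, and
report seconds() whenever status is requested. Note this gives the estimate
for the whole run; subtracting the time already elapsed would be a separate
step.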

Problem with this approach: it assumes that file sizes are roughly similar.
For a cluster with widely varying file sizes, the estimate will go wrong.
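
As a small illustration of how far off the file count can be, the snippet
below uses a made-up brick holding 999 files of 4 KiB and one file of 1 GiB.
The numbers and the helper name estimated_file_count are hypothetical and
chosen only to show the effect.

```python
def estimated_file_count(total_bytes, sample_file_bytes):
    """Nt = St / Sf, using a single sampled file size as Sf."""
    return total_bytes / sample_file_bytes


KIB = 1024
GIB = 1024 ** 3

# Hypothetical brick contents: many small files plus one large file.
sizes = [4 * KIB] * 999 + [1 * GIB]
total = sum(sizes)

# If the first file crawled happens to be a small one, Nt is far too high.
count_if_small_first = estimated_file_count(total, 4 * KIB)

# If the first file crawled is the large one, Nt is far too low.
count_if_large_first = estimated_file_count(total, 1 * GIB)

print(len(sizes), count_if_small_first, count_if_large_first)
```

With 1000 actual files, sampling a small file first overestimates the count
by a factor of a few hundred, while sampling the large file first puts it
at roughly one file. The time estimate scales with Nt, so it is off by the
same factor until more files have been crawled.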

So this is one initial idea. Please give your suggestions/ideas/feedback on 
this.


Thanks,
Susant

_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
