Hi guys, I am quite new to HBase and am trying to figure out the maximum additional disk space required for compactions.
Suppose the small HFiles add up to a total size of U before a major compaction, and the 'behemoth' HFile has size M. Assuming the HFile produced by the compaction has size U+M (the worst case, with only insertions) and the replication factor is r, the peak disk space taken by the HFiles during the compaction is 2r(U+M): r(U+M) for the existing files plus r(U+M) for the new one. Is this estimate reasonable? (It is also based on my understanding that compactions happen on HDFS and not on the local file system. Am I correct?)

Thank you,
Vidhya
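
P.S. To make the arithmetic concrete, here is a minimal Python sketch of the estimate. It assumes the old HFiles stay on disk until the compacted file is fully written, and that every file is replicated r times. The function name and the sample sizes are made up for illustration; this is not an HBase API.

```python
def compaction_disk_usage(u, m, r):
    """Estimate raw disk usage around a major compaction.

    u: total size of the small HFiles
    m: size of the large ("behemoth") HFile
    r: replication factor
    Returns (before, peak, additional) in the same unit as u and m.
    """
    before = r * (u + m)      # existing HFiles, each block stored r times
    additional = r * (u + m)  # worst case: output is as large as all inputs (inserts only)
    peak = before + additional  # assumption: old and new files coexist until the compaction ends
    return before, peak, additional


# Hypothetical sizes in GB: U = 10, M = 90, replication factor r = 3
before, peak, additional = compaction_disk_usage(u=10, m=90, r=3)
print(before, peak, additional)  # 300 600 300
```

So under these assumptions the peak is 2r(U+M) and the extra headroom needed is r(U+M).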