My use case: I have several tables whose keys start with a timestamp. These tables also have data retention set to 30 days. Each table is around 1 TB (3 TB replicated), and data is inserted regularly (every 5 minutes, ~200 MB is inserted). File size is set to 1 GB. I have been using these tables for almost half a year, and one table now has around 6k partitions, 40% of which are empty. The problem: the number of regions per region server is now pretty high.
Questions: Which approach is better?
- merging adjacent empty partitions into a bigger one?
- merging empty partitions into non-empty partitions?
Also, I'm wondering why region merging is not part of major compactions, and why it's necessary to stop the entire fleet to solve this problem.
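
To make the first option concrete, here is a minimal Python sketch of how groups of adjacent empty partitions could be picked as merge candidates. The region names, the (name, size in MB) input format, and the `max_run` cap are assumptions for illustration only; the sketch does not call any HBase API, and the merge itself would still have to be issued separately.

```python
from typing import List, Tuple

# Hypothetical input: (region_name, size_mb) pairs, listed in start-key order.
Region = Tuple[str, int]


def adjacent_empty_runs(regions: List[Region], max_run: int = 10) -> List[List[str]]:
    """Group consecutive empty regions into runs of 2..max_run names."""
    if max_run < 2:
        raise ValueError("max_run must be at least 2")
    runs: List[List[str]] = []
    current: List[str] = []
    for name, size_mb in regions:
        if size_mb == 0:
            current.append(name)
            # Cap the run so a single merged region does not span too many keys.
            if len(current) == max_run:
                runs.append(current)
                current = []
        else:
            # A non-empty region ends the run; a single empty region is skipped.
            if len(current) >= 2:
                runs.append(current)
            current = []
    if len(current) >= 2:
        runs.append(current)
    return runs


sample = [("a", 0), ("b", 0), ("c", 512), ("d", 0),
          ("e", 0), ("f", 0), ("g", 100), ("h", 0)]
print(adjacent_empty_runs(sample))
```

Isolated empty partitions (like `h` in the sample) are left out here; they are the ones that would need the second option, merging into a non-empty neighbour.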
