Hello all,
As I understand it, a common performance tweak is to disable automatic
major compactions so that you don't end up with compaction storms taking
things out at inconvenient times. I'm thinking that I should just write
a quick script to rotate through all of our regions, one at a time, and
major compact each of them. If I'm understanding this correctly, we
should not end up with storms, as the compactions will only happen one
at a time and each one doesn't run for long. Does that seem reasonable,
or am I missing something? My hope is to run the script regularly.
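
To make the idea concrete, here is a rough sketch of the rotation loop I have in mind. It is only an outline under some assumptions: the function names are made up for illustration, and the two callables (how a compaction is requested and how "still compacting" is detected) are placeholders that would have to be wired to the HBase shell or client API. As far as I know, the shell's major_compact only queues the request and returns right away, so the loop waits for each region to finish before moving on; otherwise everything would be queued at once and the storm would be back.

```python
import time
from typing import Callable, Iterable, List


def shell_command(region_name: str) -> str:
    """Build the HBase shell line that requests a major compaction.

    Assumes the shell's major_compact command is given a region name.
    """
    return "major_compact '{}'".format(region_name)


def rolling_major_compact(
    regions: Iterable[str],
    request_compaction: Callable[[str], None],
    is_compacting: Callable[[str], bool],
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Major compact regions strictly one at a time.

    request_compaction and is_compacting are hypothetical hooks supplied
    by the caller; this function only provides the sequencing.
    Returns the regions in the order they were finished.
    """
    finished = []
    for region in regions:
        # Ask for the compaction of this one region only.
        request_compaction(region)
        # Wait until it is done before touching the next region.
        # Caveat: if the request is queued asynchronously, a real
        # is_compacting hook may need to tolerate a short delay
        # before the compaction shows up as running.
        while is_compacting(region):
            sleep(poll_seconds)
        finished.append(region)
    return finished
```

The hooks are passed in rather than hard-coded so the loop can be dry-run (for example, by having request_compaction just print shell_command(region)) before it is pointed at a live cluster.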
Corollary question... I recently added drives to our nodes. I did this
while the nodes were all still running, basically just restarting the
datanode underneath to pick up the new spindles, so I'm fairly sure
I've thrown data locality out the window, based on the changed pattern
of network traffic. If I'm right, manually running major compactions
against all of the regions should resolve that, as the underlying data
would all get rewritten locally. Again, does that make sense?
Thanks!
--Brennon