Hello,

Every now and then we need to flatten our cluster and re-import all data
from log files (after changes in data format, etc.). Afterwards we notice a
significant increase in scan performance. As data is added and shuffled
around between region servers, performance goes down again over time (say a
couple of weeks). Are there any routine operations that one should run
manually, or settings to activate in the HBase configuration to keep the
data well distributed? We use HBase 0.92 as part of a Cloudera CDH4 cluster.

Thank you,

/David
