remove or provide config to completely disable all aspects of 'table 
fragmentation' 
------------------------------------------------------------------------------------

                 Key: HBASE-2181
                 URL: https://issues.apache.org/jira/browse/HBASE-2181
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: ryan rawson
             Fix For: 0.21.0


Given the potentially low and misleading value of the metric, and how much 
effort must be expended to collect them, I would argue at least we should allow 
users to disable the feature completely.

The first problem is the data the metric delivers is not very useful. On any 
given busy system, this value is often 100%. On a sample system here, 12% of 
the  tables were at either 0 or 100%.  Furthermore the 100% metric is not 
particularly informative. If a table has 100% 'fragmentation' it does not 
necessarily imply that this table is in dire need of compaction.  The HBase 
compaction code will generally keep at least 2 store files around - it refuses 
to minor compact older and larger files, preferring to merge small files.  Thus 
on a table taking writes on all regions, the expected value of fragmentation is 
in fact 100%. And this is not a bad thing either.  Considering that compacting 
a 500GB table will take an hour and hammer a cluster, misleading users into 
striving to get to 0% is non ideal.

The other major problem of this feature is collecting the data is non-trivial 
on larger clusters.  I did a test where I did a lsr on a hadoop cluster, and to 
generate 15k lines of output, it pegged the namenode at over 100% cpu for a few 
seconds. On a cluster with 7000 regions, we can clearly easily have 14,000 (2 
store files per region is typical) files thus causing spikes against the 
namenode to generate this statistic.

I would propose 3 courses of actions:
- allow complete disablement of the feature, including the background thread 
and the UI display
- change the metric to mean '# of regions with > 5 store files'
- replacing the metric with a completely different one that attempts to capture 
the spirit of the intent but with less load.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to