In case you didn't get what was just said: when a regionserver crashes, HBase needs to replay the contents of its outstanding WALs (Write-Ahead Logs, a.k.a. HLogs) before it can bring the regions the crashed server was hosting back online. Replaying involves reading the WALs and then 'splitting' the edits by region, so that when a region is opened in a new location it has a nice neat file containing only its own edits to replay before it starts serving.
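
If it helps to picture the 'splitting' step, here is a toy sketch in Python. It is not HBase's code: the (region, sequence id, edit) record layout and the function name are made up for illustration, and the real splitter has to deal with files on HDFS, failures and a lot more besides.

```python
from collections import defaultdict

def split_edits_by_region(wal_files):
    """Group the edits found in several WALs by the region they belong to.

    Each WAL is modelled as a list of (region, seq_id, edit) tuples.
    This layout is an assumption for the sketch, not HBase's on-disk format.
    """
    per_region = defaultdict(list)
    for wal in wal_files:
        for region, seq_id, edit in wal:
            per_region[region].append((seq_id, edit))
    # Each region gets only its own edits, ordered by sequence id,
    # ready to be replayed when the region is opened elsewhere.
    return {
        region: sorted(edits, key=lambda e: e[0])
        for region, edits in per_region.items()
    }

# Two WALs left behind by a crashed server, with edits for two regions mixed together.
wals = [
    [("region-a", 1, "put row1"), ("region-b", 2, "put row9")],
    [("region-a", 3, "delete row1")],
]

for region, edits in split_edits_by_region(wals).items():
    print(region, edits)
```

The more WALs there are to read, the longer this takes, and the regions stay offline until it is done.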
The two metrics you cite are the time and size histograms for this WAL split process. The size histogram is updated with the size of all the WALs each time the splitting process is run. The time histogram is updated with how long the split process took. These metrics are fairly important. You want the time to be small, since it is time during which some of your data is offline. Smaller log splitting sizes will usually mean less time, so it pays to try and keep the number of outstanding WALs low.

Our metrics are better now, rationalized (though the num_ops suffix here seems 'off'). They could do with a bit of doc'ing. What is in the refguide is a bit stale.

Yours,
St.Ack

On Thu, Feb 6, 2014 at 5:31 PM, Ted Yu <[email protected]> wrote:

> These two metrics are the counter portion of MutableHistogram's for HLogs
> split which give you histogram using hadoop2's metrics2 system.
>
> See
> hbase-hadoop2-compat//src/main/java/org/apache/hadoop/hbase/master/MetricsMasterFilesystemSourceImpl.java
>
> Cheers
>
>
> On Thu, Feb 6, 2014 at 4:59 PM, Sreepathi <[email protected]> wrote:
>
> > Hello,
> >
> > Could somebody explain what the following metrics mean and how they can be
> > used ?
> >
> > HlogSplitTime_num_ops
> > HlogSplitSize_num_ops
> >
> > --
> > *Regards,*
> > --- *Sreepathi *
