[
https://issues.apache.org/jira/browse/HBASE-29399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Junegunn Choi updated HBASE-29399:
----------------------------------
Description:
h2. Problem
The
[hadoop-metrics2-hbase.properties|https://github.com/apache/hbase/blob/master/conf/hadoop-metrics2-hbase.properties]
configuration template has a few errors, and it doesn't provide enough
examples to show how to configure the metrics.
h2. Issues and suggestions
h3. Invalid wildcard pattern
{code}
*.sink.file*.class=org.apache.hadoop.metrics2.sink.FileSink
{code}
It seems the original intention of {{file\*}} was to match all instances with
the {{file}} prefix. However, this does not work: such partial wildcard
patterns are not allowed by [the regular
expression|https://github.com/apache/hadoop/blob/8378ab9f92c72dc6164b62f7be71826fd750dba4/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsConfig.java#L88]
in the Hadoop codebase.
If we change it to {{\*.sink.\*.class}}, then it does match all instances. But
that would be overly broad and likely unintended, so I don't think we should do
that. Instead, I suggest just removing this line because it's pointless and
misleading. We're already repeating
{{org.apache.hadoop.metrics2.sink.FileSink}} on every sink anyway.
h3. Incorrect context
{code}
# hbase.sink.file0.context=hmaster
{code}
The context should be {{master}}, not {{hmaster}}.
h3. Incorrect path
{code}
# hbase.sink.file1.class=org.apache.hadoop.metrics2.sink.FileSink
# hbase.sink.file1.context=thrift-one
# hbase.sink.file1.filename=thrift-one.metrics
# hbase.sink.file2.class=org.apache.hadoop.metrics2.sink.FileSink
# hbase.sink.file2.context=thrift-two
# hbase.sink.file2.filename=thrift-one.metrics
{code}
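Both sinks write to {{thrift-one.metrics}}. The {{file2}} sink has the
{{thrift-two}} context, so its filename should presumably be
{{thrift-two.metrics}}.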
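As a sketch of the fix, assuming the second sink was simply meant to mirror
the first one with its own file name, the {{file2}} entries would read:
{code}
# hbase.sink.file2.class=org.apache.hadoop.metrics2.sink.FileSink
# hbase.sink.file2.context=thrift-two
# Assumed intended file name; the template currently repeats thrift-one.metrics
# hbase.sink.file2.filename=thrift-two.metrics
{code}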
h3. Missing regionserver context example
I think it's strange that we don't show a sink example for {{regionserver}},
when we have others like thrift1, thrift2, and rest.
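A regionserver entry could look something like the following. This is only a
sketch: the instance name {{file-regionserver}} and the file name
{{regionserver.metrics}} are illustrative choices, not taken from the current
template.
{code}
# Hypothetical instance name and file name, shown for illustration only
# hbase.sink.file-regionserver.class=org.apache.hadoop.metrics2.sink.FileSink
# hbase.sink.file-regionserver.context=regionserver
# hbase.sink.file-regionserver.filename=regionserver.metrics
{code}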
h3. Better naming of instances
The instances are named {{file-all}}, then {{file0}}, {{file1}}, {{file2}}, and
so on. More descriptive names like {{file-master}} or {{file-thrift1}} would
make the configuration easier to understand and maintain.
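For example, the master sink that is currently called {{file0}} might be
renamed as below. The file name {{master.metrics}} is an assumption made for
this sketch.
{code}
# Was hbase.sink.file0.*; the instance name now says which context it serves
# hbase.sink.file-master.class=org.apache.hadoop.metrics2.sink.FileSink
# hbase.sink.file-master.context=master
# Assumed file name, for illustration
# hbase.sink.file-master.filename=master.metrics
{code}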
h3. Provide a filtering example
HBase emits a large number of metrics, and operators often want to reduce the
volume by filtering out the ones they are not interested in. But the template
doesn't provide a good example of how to do that.
Here is a sample configuration that we actually use. It excludes metrics whose
names end in {{_min}}, {{_mean}}, or one of the listed percentile suffixes
(for example {{_95th_percentile}}).
{code}
hbase.source.metric.filter.class=org.apache.hadoop.metrics2.filter.RegexFilter
hbase.source.metric.filter.exclude=.*_((25|75|90|95|98)th_percentile|min|mean)
{code}
> Update hadoop-metrics2-hbase.properties template
> ------------------------------------------------
>
> Key: HBASE-29399
> URL: https://issues.apache.org/jira/browse/HBASE-29399
> Project: HBase
> Issue Type: Improvement
> Components: conf, documentation
> Reporter: Junegunn Choi
> Assignee: Junegunn Choi
> Priority: Major