[
https://issues.apache.org/jira/browse/HBASE-21150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603769#comment-16603769
]
Duo Zhang commented on HBASE-21150:
-----------------------------------
First, please use ConcurrentMap instead of Map MetricsTableAggregateSourceImpl,
since you removed the computeIfAbsent, it is not necessary to be a
ConcurrentHashMap but it still need to be a ConcurrentMap, as for different
tables, we may do get and put at the same time. HashMap is not thread safe.
Second, you use a thread pool to execute the registration, then it could still
happen that we want to get a metrics but it is not registered yet, and cause
another NPE...
> Avoid delay in first flushes due to overheads in table metrics registration
> ---------------------------------------------------------------------------
>
> Key: HBASE-21150
> URL: https://issues.apache.org/jira/browse/HBASE-21150
> Project: HBase
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Major
> Attachments: 21150.v1.txt, 21150.v2.txt, 21150.v3.txt
>
>
> After HBASE-15728 is integrated, the lazy table metrics registration results
> in penalty for the first flushes.
> Excerpt from log shows delay (note the same timestamp 08:18:23,234) :
> {code:java}
> 2018-09-02 08:18:23,232 DEBUG
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2]
> regionserver.MetricsTableSourceImpl(124): Creating new
> MetricsTableSourceImpl for table 'testtb-1535901500805'
> 2018-09-02 08:18:23,233 DEBUG
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2]
> regionserver.MetricsTableSourceImpl(137): registering metrics for testtb-
> 1535901500805
> 2018-09-02 08:18:23,234 INFO
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1]
> regionserver.HRegion(2822): Finished flush of dataSize ~2.29 KB/2343,
> heapSize ~5.16 KB/5280, currentSize=0 B/0 for
> fa403f6a4fb8dbc1a1c389744fce2d58 in 280ms, sequenceid=5, compaction
> requested=false
> 2018-09-02 08:18:23,234 DEBUG
> [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1]
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register
> testtb-1535901500805
> Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1]
> regionserver.MetricsTableAggregateSourceImpl(84): it took 0 ms to register
> testtb-1535901500805
> Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG
> [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1]
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register
> testtb-1535901500805
> Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-1,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG
> [rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2]
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register
> testtb-1535901500805
> Thread[rs(hw13463.attlocal.net,52762,1535901497314)-snapshot-pool9-thread-2,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG
> [rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2]
> regionserver.MetricsTableAggregateSourceImpl(84): it took 5 ms to register
> testtb-1535901500805
> Thread[rs(hw13463.attlocal.net,52758,1535901497238)-snapshot-pool11-thread-2,5,FailOnTimeoutGroup]
> 2018-09-02 08:18:23,234 DEBUG
> [rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2]
> regionserver.MetricsTableAggregateSourceImpl(84): it took 6 ms to register
> testtb-1535901500805
> Thread[rs(hw13463.attlocal.net,52760,1535901497280)-snapshot-pool10-thread-2,5,FailOnTimeoutGroup]
> {code}
> This is a regression.
> When first region of the table is opened on region server, we can proactively
> register table metrics.
> This would avoid the penalty on first flushes for the table.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)