[
https://issues.apache.org/jira/browse/HBASE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674687#comment-13674687
]
Jeffrey Zhong commented on HBASE-8680:
--------------------------------------
{quote}
The returned connection is not being used/referenced anywhere.
{quote}
That line creates a connection when the SplitLogWorker starts up. The
connection is cached for later use because creating a new connection instance
takes 4+ seconds.
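To illustrate the idea (a hypothetical sketch only, not the code in the patch): the expensive connection is created once at worker start-up and handed out again for later tasks. The class name CachedConnectionHolder and the Supplier-based factory are assumptions made for the example, not HBase APIs.

```java
import java.util.function.Supplier;

/** Hypothetical sketch, not HBase code: create an expensive object once and reuse it. */
final class CachedConnectionHolder<C> {
    private final Supplier<C> factory; // stands in for the slow (4+ second) connection setup
    private C cached;                  // created on first use, then reused
    private int creations;             // how many times the factory actually ran

    CachedConnectionHolder(Supplier<C> factory) {
        this.factory = factory;
    }

    /** Returns the cached instance, creating it only on the first call. */
    synchronized C get() {
        if (cached == null) {
            cached = factory.get();
            creations++;
        }
        return cached;
    }

    synchronized int creations() {
        return creations;
    }
}
```

Calling get() once during start-up pays the setup cost up front, so later log splitting tasks reuse the same instance instead of each paying it again.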
{quote}
When this can happen? Isn't the split/merge/compaction stuff disabled while
replay is going on? This method is called while it is still replaying,
{quote}
Merge/split won't happen during replay, but imagine the following timeline:
1) a region accepts some writes and has some WAL entries
2) the region splits/merges
3) the split/merged region(s) receive more writes
4) the region server hosting the region crashes
Therefore, the old WAL entries from before the split/merge still carry the old
encoded region name, while META returns a different region.
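The timeline above can be simulated with a toy model (a hypothetical sketch; StaleRegionNameDemo and its methods are made up for illustration and are not HBase classes). META is modelled as a sorted map from region start key to encoded region name, and a WAL entry is considered stale when the name it was logged under no longer matches what META returns for the same row.

```java
import java.util.TreeMap;

/** Hypothetical sketch, not HBase code: detect WAL entries logged under a pre-split region name. */
final class StaleRegionNameDemo {
    // Toy stand-in for META: region start key -> encoded name of the region starting at that key.
    private final TreeMap<String, String> meta = new TreeMap<>();

    void putRegion(String startKey, String encodedName) {
        meta.put(startKey, encodedName);
    }

    /** Finds the region whose start key is the greatest one <= row (assumes a region with start key "" exists). */
    String locate(String row) {
        return meta.floorEntry(row).getValue();
    }

    /** True when the name recorded in the WAL differs from the region META returns now. */
    boolean isStale(String walEncodedName, String row) {
        return !walEncodedName.equals(locate(row));
    }
}
```

For example, if row "m" is logged while a single region "parent" covers everything, and that region is then split at "k" into "daughterA" and "daughterB", a later lookup of "m" returns "daughterB" and the old entry is flagged as stale.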
{quote}
Is this call to clearRegionCache needed before closing?
{quote}
Since the connection instance is a "managed" one, its life cycle is controlled
by a reference count, so hconn.close won't really close the connection. The
call to clearRegionCache prevents the next log splitting task from using
(most likely) stale region locations, and frees the related cached memory.
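A minimal sketch of why both calls matter (hypothetical; RefCountedConnection is an illustrative class and does not reproduce the real managed-connection implementation): close only decrements the reference count, so the cached locations survive unless they are cleared explicitly.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not HBase code: a connection whose close() is governed by a reference count. */
final class RefCountedConnection {
    private final Map<String, String> regionCache = new HashMap<>(); // row -> cached server location
    private int refCount;
    private boolean closed;

    /** Each user takes a reference before using the shared connection. */
    synchronized RefCountedConnection acquire() {
        if (closed) {
            throw new IllegalStateException("connection already closed");
        }
        refCount++;
        return this;
    }

    /** Drops one reference; the connection is really closed only when the last one goes away. */
    synchronized void close() {
        if (refCount > 0 && --refCount == 0) {
            closed = true;
            regionCache.clear();
        }
    }

    synchronized void cacheLocation(String row, String server) {
        regionCache.put(row, server);
    }

    /** Drops cached locations so the next user does not see stale entries. */
    synchronized void clearRegionCache() {
        regionCache.clear();
    }

    synchronized int cachedLocations() {
        return regionCache.size();
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

While the worker still holds its own reference, a task calling close() leaves the connection open and the cache populated, which is why the cache is cleared before closing.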
Thanks for the review.
> distributedLogReplay performance regression
> -------------------------------------------
>
> Key: HBASE-8680
> URL: https://issues.apache.org/jira/browse/HBASE-8680
> Project: HBase
> Issue Type: Bug
> Components: MTTR
> Reporter: Jeffrey Zhong
> Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.95.2
>
> Attachments: 8680-v2.patch, hbase-8680.patch, hbase-8680-v3.patch
>
>
> This JIRA is to check in changes that address performance issues found during my
> performance testing, as follows:
> 1) When WALEdits belong to a region which was split/merged later, replay incurs
> an extra waitUntilRegionOnline RPC call
> 2) Using a single shared connection for replaying achieves better
> performance. Every time a new connection is created, it takes 4+ seconds to
> establish a connection to ZK.
> 3) Other small modifications to mitigate excessive logging