[ 
https://issues.apache.org/jira/browse/HBASE-26878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-26878.
-----------------------------------------
    Fix Version/s: 2.5.0
                   2.6.0
                   3.0.0-alpha-3
                   2.4.12
     Hadoop Flags: Reviewed
       Resolution: Fixed

> TableInputFormatBase should cache RegionSizeCalculator
> ------------------------------------------------------
>
>                 Key: HBASE-26878
>                 URL: https://issues.apache.org/jira/browse/HBASE-26878
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Minor
>             Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3, 2.4.12
>
>
> TableInputFormatBase's getSplits() method instantiates a new 
> RegionSizeCalculator on every call. Instantiating a RegionSizeCalculator 
> involves scanning meta for all region locations of the given table. This can 
> be costly for large tables, and we don't know how often a subclass will call 
> getSplits().
> When initializeTable is called, we already cache the RegionLocator and Admin 
> that are passed into the RegionSizeCalculator. We should similarly 
> cache the RegionSizeCalculator itself at that same time to avoid unnecessary 
> meta scans on repeated getSplits() calls.
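
The caching idea described above can be sketched roughly as follows. This is only an illustration and not the actual HBase patch: SizeCalculator and CachingInputFormat are hypothetical stand-ins for RegionSizeCalculator and TableInputFormatBase, and the constructor counter merely simulates the cost of a meta scan.

```java
// Hypothetical stand-in for an expensive-to-build helper such as
// RegionSizeCalculator. This is NOT the real HBase class.
final class SizeCalculator {
    // Counts how many times the simulated "meta scan" has run.
    static int constructions = 0;

    SizeCalculator() {
        constructions++; // in the real class, construction scans meta
    }

    // Placeholder size lookup: uses the name length as a fake region size.
    long sizeOf(String regionName) {
        return regionName.length();
    }
}

// Hypothetical stand-in for TableInputFormatBase.
class CachingInputFormat {
    // Cached next to the other per-table state (assumption: one table per instance).
    private SizeCalculator calculator;

    // Builds the calculator once, at the point where the table is initialized.
    void initializeTable() {
        calculator = new SizeCalculator();
    }

    // Reuses the cached calculator instead of constructing a new one per call.
    long[] getSplits(String... regionNames) {
        if (calculator == null) {
            throw new IllegalStateException("initializeTable() must be called first");
        }
        long[] sizes = new long[regionNames.length];
        for (int i = 0; i < regionNames.length; i++) {
            sizes[i] = calculator.sizeOf(regionNames[i]);
        }
        return sizes;
    }
}
```

With this shape, repeated getSplits() calls on the same instance share a single calculator. The trade-off is that cached sizes can go stale if regions change between calls.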



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
