[ https://issues.apache.org/jira/browse/PHOENIX-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor resolved PHOENIX-2189.
-----------------------------------
    Resolution: Not A Problem

> Starting from HBase 1.x, Phoenix probably shouldn't override the hbase.client.scanner.caching attribute
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2189
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2189
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>
> After PHOENIX-2188 is fixed, we need to think about whether it makes sense to 
> override the scanner cache size in Phoenix for the HBase 1.x branches. For 
> example, in HBase 1.1 the default value of hbase.client.scanner.caching is now 
> Integer.MAX_VALUE.
> {code:xml}
> <property>
>     <name>hbase.client.scanner.caching</name>
>     <value>2147483647</value>
>     <description>Number of rows that we try to fetch when calling next
>     on a scanner if it is not served from (local, client) memory. This configuration
>     works together with hbase.client.scanner.max.result.size to try and use the
>     network efficiently. The default value is Integer.MAX_VALUE by default so that
>     the network will fill the chunk size defined by hbase.client.scanner.max.result.size
>     rather than be limited by a particular number of rows since the size of rows varies
>     table to table. If you know ahead of time that you will not require more than a certain
>     number of rows from a scan, this configuration should be set to that row limit via
>     Scan#setCaching. Higher caching values will enable faster scanners but will eat up more
>     memory and some calls of next may take longer and longer times when the cache is empty.
>     Do not set this value such that the time between invocations is greater than the scanner
>     timeout; i.e. hbase.client.scanner.timeout.period</description>
>   </property>
> {code}
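> A minimal sketch of how the row cap and the byte cap interact on a single Scan, 
> assuming the stock HBase 1.x client API (the class name, column family, and the 
> concrete limits below are illustrative only):
> {code:java}
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.util.Bytes;
> 
> public class ScannerCachingSketch {
>     // Build a scan that is known to need at most 100 rows: the row-based cap
>     // (Scan#setCaching) is set to that limit, while the byte-based cap
>     // (Scan#setMaxResultSize) still bounds how much each next() RPC returns.
>     public static Scan buildScan() {
>         Scan scan = new Scan();
>         scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"));
>         scan.setCaching(100);                    // rows per client round trip
>         scan.setMaxResultSize(2L * 1024 * 1024); // ~2 MB per RPC chunk
>         return scan;
>     }
> }
> {code}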
> From the description it sounds like, by default, HBase now bounds the scanner 
> cache by size in bytes rather than by number of records. If we keep overriding 
> hbase.client.scanner.caching to 1000, then for narrow rows we will likely fetch 
> too few rows per round trip, while for wide rows the byte limit should still 
> kick in so we don't end up caching too much on the client, as sketched below.
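> A minimal sketch of what that override amounts to on the client side, assuming 
> it is applied through a plain HBase Configuration (the class name and the value 
> 1000 are just the hypothetical override discussed above, not Phoenix code):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> 
> public class CachingOverrideSketch {
>     public static Configuration overriddenConf() {
>         // Start from the stock HBase client configuration (where 1.x defaults
>         // hbase.client.scanner.caching to Integer.MAX_VALUE) and apply a fixed
>         // row-count override. For narrow rows this caps each round trip at
>         // 1000 rows even though hbase.client.scanner.max.result.size would
>         // have allowed far more data per RPC.
>         Configuration conf = HBaseConfiguration.create();
>         conf.setInt("hbase.client.scanner.caching", 1000);
>         return conf;
>     }
> }
> {code}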
> Maybe we shouldn't be using the scanner caching override at all? Thoughts? 
> [~jamestaylor], [~lhofhansl]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
