[ 
https://issues.apache.org/jira/browse/PHOENIX-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223553#comment-14223553
 ] 

James Taylor commented on PHOENIX-1475:
---------------------------------------

Talked with [~mujtaba] about this offline a bit. The use he has in mind is 
related mostly to performance testing, where you'd like to be able to update 
the stats on demand. The solution here is to set the 
phoenix.stats.minUpdateFrequency property to 0.

For other use cases, I think we can improve this by having 
phoenix.stats.minUpdateFrequency throttle only explicit calls to UPDATE 
STATISTICS. For example, if a major compaction runs, you should still be able 
to run UPDATE STATISTICS immediately afterwards. We'd only prevent repeated 
calls to UPDATE STATISTICS within that interval.
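
To make the proposed rule concrete, here is a minimal sketch of how such a 
throttle could behave. The class and method names (StatsUpdateGate, 
onCompactionUpdate, tryExplicitUpdate) are hypothetical and are not taken from 
the Phoenix code base. The sketch only assumes that the interval is expressed 
in milliseconds and that a compaction-driven stats refresh does not reset the 
timer.

```java
/**
 * Hypothetical sketch of the proposed rule, not Phoenix's implementation:
 * only explicit UPDATE STATISTICS calls are throttled by the minimum
 * update frequency; stats refreshed by a major compaction are not counted.
 */
public class StatsUpdateGate {
    private final long minUpdateFrequencyMs;
    // Time of the last explicit UPDATE STATISTICS that was allowed; -1 = never.
    private long lastExplicitUpdateMs = -1L;

    public StatsUpdateGate(long minUpdateFrequencyMs) {
        this.minUpdateFrequencyMs = minUpdateFrequencyMs;
    }

    /** A compaction refreshed the stats; deliberately leaves the timer alone. */
    public synchronized void onCompactionUpdate(long nowMs) {
        // Intentionally empty: compaction must not block a following explicit call.
    }

    /** Returns true if an explicit UPDATE STATISTICS may run at nowMs. */
    public synchronized boolean tryExplicitUpdate(long nowMs) {
        boolean allowed = lastExplicitUpdateMs < 0
                || nowMs - lastExplicitUpdateMs >= minUpdateFrequencyMs;
        if (allowed) {
            lastExplicitUpdateMs = nowMs;
        }
        return allowed;
    }
}
```

With an interval of 0, every explicit call is allowed, which matches the 
performance-testing case above.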

I'd rather keep the hints we add to a minimum, as otherwise we're deviating too 
much from standard SQL.

Thoughts?

> Statistics are not immediately used after data load
> ---------------------------------------------------
>
>                 Key: PHOENIX-1475
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1475
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.2
>            Reporter: Mujtaba Chohan
>            Assignee: James Taylor
>            Priority: Minor
>
> Although statistics are available in the stats table, they are not used when 
> a query is executed immediately after data is loaded into a table. Also, 
> _update statistics_ cannot be executed since 
> _phoenix.stats.minUpdateFrequency_ has not elapsed, nor are the stats used 
> immediately after running a _major compaction_. 
> Stats are used after _phoenix.stats.updateFrequency_ has elapsed, but there 
> is no way to force immediate use for use cases such as performance test 
> utilities and real-time systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
