[ 
https://issues.apache.org/jira/browse/PHOENIX-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344175#comment-14344175
 ] 

James Taylor commented on PHOENIX-1693:
---------------------------------------

[~mujtabachohan] - is the slow case only when a major compaction has never 
occurred on the table? Are statistics being populated prior to a major 
compaction occurring (as from the code it seems like they wouldn't be, unless 
an UPDATE STATISTICS is manually run)?

We should first rule in/out whether it's a data locality issue. If the actual 
data isn't co-located with the region servers yet, but an UPDATE STATISTICS was 
done, that may explain the slowness, as you'd be pulling a bunch of data (per 
guidepost) over the wire between the RS and the HDFS server to execute the query.
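
One way to check locality is to read each region server's locality metric 
before and after the major compaction. The sketch below is only an 
illustration: it assumes the region server's /jmx JSON dump contains a bean 
named "Hadoop:service=HBase,name=RegionServer,sub=Server" with a 
percentFilesLocal attribute, which should be verified against the HBase 
version in use. The helper names and the 90% threshold are made up for the 
example.

```python
import json

# Assumed bean name and metric key; verify both against your HBase version.
SERVER_BEAN = "Hadoop:service=HBase,name=RegionServer,sub=Server"
LOCALITY_KEY = "percentFilesLocal"


def locality_percent(jmx_json_text):
    """Return the locality percentage from a /jmx JSON dump, or None if absent."""
    payload = json.loads(jmx_json_text)
    for bean in payload.get("beans", []):
        if bean.get("name") == SERVER_BEAN and LOCALITY_KEY in bean:
            return float(bean[LOCALITY_KEY])
    return None


def low_locality_servers(dumps, threshold=90.0):
    """Given a mapping of host -> JMX JSON text, return the sorted hosts whose
    reported locality is below the threshold (an arbitrary cut-off)."""
    flagged = []
    for host, text in dumps.items():
        pct = locality_percent(text)
        if pct is not None and pct < threshold:
            flagged.append(host)
    return sorted(flagged)


def make_dump(pct):
    """Build a made-up JMX payload, used here only to exercise the helpers."""
    return json.dumps({"beans": [{"name": SERVER_BEAN, LOCALITY_KEY: pct}]})


if __name__ == "__main__":
    dumps = {"rs1": make_dump(42.5), "rs2": make_dump(99.0)}
    print(low_locality_servers(dumps))
```

If the slow queries line up with servers flagged here, and the flag clears 
after a major compaction, that would point toward locality rather than the 
stats themselves.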



> Stats sometimes cause query to execute more slowly
> --------------------------------------------------
>
>                 Key: PHOENIX-1693
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1693
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Mujtaba Chohan
>
> After initial data load with Phoenix 4.3.0 with stats enabled for 500M FHA 
> table using Pherf, aggregate count queries over 5% or 15% of total rows are 
> sometimes 2-5X slower than queries without stats. Need to investigate 
> the root cause. 
> Observations:
> * After major compaction, new stats are re-generated and queries become 
> fast compared to tables without stats.
> * Truncating the stats table after initial data load also makes query perf. 
> comparable to the previous Phoenix release with stats disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
