[jira] [Commented] (PHOENIX-3560) Aggregate query performance is worse with encoded columns for schema with large number of columns

James Taylor (JIRA) Tue, 03 Jan 2017 17:48:28 -0800

    [ https://issues.apache.org/jira/browse/PHOENIX-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796855#comment-15796855 ]


James Taylor commented on PHOENIX-3560:
---------------------------------------

This is a tricky one: the scan will be forced to load the KeyValue that holds 
all of the row's data instead of just the single empty key value. You could 
potentially work around it by adding another column family (as the first one) 
with a dummy column.
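
A rough, untested sketch of what that workaround might look like follows. Since nobody will type 5000 column definitions by hand, it generates the DDL from the issue description and optionally declares a one-column family first. The table name `T`, the family names `F0` and `F1`, and the helper `build_ddl` are all made up for illustration. The column definitions and table options are copied from the issue description as-is. Whether Phoenix then actually places the empty key value in the dummy family would still need to be verified.

```python
def build_ddl(table, num_columns, dummy_family=None, data_family="F1"):
    """Build a CREATE TABLE statement mirroring the schema in the issue.

    If dummy_family is given, a family holding one dummy column is declared
    first and the data columns are placed in data_family. All names here are
    hypothetical; only the column types and table options come from the issue.
    """
    cols = ["k1 integer", "k2 integer"]
    if dummy_family is not None:
        # Declare the dummy family before any data column so it comes first.
        cols.append(f"{dummy_family}.dummy varchar")
        prefix = f"{data_family}."
    else:
        prefix = ""
    cols.extend(f"{prefix}c{i} varchar" for i in range(1, num_columns + 1))
    column_list = ", ".join(cols)
    return (
        f"create table {table} ({column_list} "
        "CONSTRAINT PK PRIMARY KEY (K1, K2)) "
        "VERSIONS=1, MULTI_TENANT=true, IMMUTABLE_ROWS=true"
    )


if __name__ == "__main__":
    # Small column count so the output stays readable.
    print(build_ddl("T", 3, dummy_family="F0"))
```

Calling `build_ddl("T", 5000)` without a dummy family should reproduce the original schema, which gives a baseline to compare against.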

> Aggregate query performance is worse with encoded columns for schema with 
> large number of columns
> -------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3560
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3560
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Mujtaba Chohan
>            Assignee: Samarth Jain
>             Fix For: 4.10.0
>
>
> Schema with 5K columns
> {noformat}
> create table (k1 integer, k2 integer, c1 varchar ... c5000 varchar CONSTRAINT 
> PK PRIMARY KEY (K1, K2)) 
> VERSIONS=1, MULTI_TENANT=true, IMMUTABLE_ROWS=true
> {noformat}
> In this test, there are no null columns and each column contains 200 chars, 
> i.e. about 1MB of data per row.
> Count * aggregation is about 5X slower with encoded columns than with 
> non-encoded columns on a table using the same schema.
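
The per-row size quoted above can be checked with a quick calculation. This is only a sketch: it assumes one byte per character, which the description does not state, and it ignores row keys and per-KeyValue overhead.

```python
NUM_COLUMNS = 5000        # c1 .. c5000, from the schema in the description
CHARS_PER_COLUMN = 200    # from the description
BYTES_PER_CHAR = 1        # assumption: single-byte characters

# Payload bytes per row, excluding row keys and KeyValue overhead.
row_bytes = NUM_COLUMNS * CHARS_PER_COLUMN * BYTES_PER_CHAR

print(row_bytes)          # 1000000
print(row_bytes / 10**6)  # 1.0 (MB, using 10**6 bytes per MB)
```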



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
