[
https://issues.apache.org/jira/browse/PHOENIX-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816901#comment-15816901
]
Samarth Jain commented on PHOENIX-3560:
---------------------------------------
We use a SINGLE_KEYVALUE_COLUMN_QUALIFIER "1" which is sorted after our empty
key value column 0 (I should probably change it to use the Integer
representation of 1).
[~mujtabachohan] and I tested this out offline. It turned out that increasing
the block cache size helped speed up the query. It now runs 2x faster than it
does against the non-encoded immutable table.
[~lhofhansl] pointed out that HBase automatically increases the block size to
fit a key value that exceeds the default block size of 64K. He mentioned that
what is likely happening in this case is that the "empty" key value and the
packed key value both end up in the same block, whose size is much larger than
64K. As a result, we cannot really take advantage of the first key only filter,
since we always have to read this entire large block before we can skip to the
next row.
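To put rough numbers on this, here is a back-of-the-envelope sketch for the schema in the issue description (5000 columns of 200 chars each). It is only an illustration: the class and method names are hypothetical and not part of Phoenix or HBase, it assumes one byte per char, and it ignores row key, qualifier, and encoding overhead.

```java
// Rough arithmetic only; the names here are hypothetical, not Phoenix or HBase APIs.
public class BlockSizeSketch {
    // Default HBase data block size (64K), as mentioned above.
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024;

    // Approximate value size of the packed key value when every column of a
    // row is stored in a single cell. Assumes one byte per char and ignores
    // key, qualifier, and encoding overhead.
    static long packedValueBytes(int columns, int bytesPerColumn) {
        return (long) columns * bytesPerColumn;
    }

    // Number of default-sized blocks' worth of data in one cell, rounded up.
    static long defaultBlocksSpanned(long cellBytes) {
        return (cellBytes + DEFAULT_BLOCK_SIZE - 1) / DEFAULT_BLOCK_SIZE;
    }

    public static void main(String[] args) {
        long cell = packedValueBytes(5000, 200);
        System.out.println("packed value bytes: " + cell);
        System.out.println("default blocks' worth: " + defaultBlocksSpanned(cell));
    }
}
```

Under these assumptions the packed key value is about 1,000,000 bytes, roughly 16 default-sized blocks' worth of data, which all has to be read even when only the first key of the row is needed.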
> Aggregate query performance is worse with encoded columns for schema with
> large number of columns
> -------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-3560
> URL: https://issues.apache.org/jira/browse/PHOENIX-3560
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: Mujtaba Chohan
> Assignee: Thomas D'Silva
> Fix For: 4.10.0
>
> Attachments: DataGenerator.java, PHOENIX-3565.patch
>
>
> Schema with 5K columns
> {noformat}
> create table (k1 integer, k2 integer, c1 varchar ... c5000 varchar CONSTRAINT
> PK PRIMARY KEY (K1, K2))
> VERSIONS=1, MULTI_TENANT=true, IMMUTABLE_ROWS=true
> {noformat}
> In this test, there are no null columns and each column contains 200 chars
> i.e. 1MB of data per row.
> Count * aggregation is about 5X slower with encoded columns when compared to
> a table with non-encoded columns using the same schema.