[jira] [Issue Comment Deleted] (HBASE-18586) Multiple column families - scan performance

Gavin (JIRA) Tue, 31 Jul 2018 23:26:39 -0700


     [ 
https://issues.apache.org/jira/browse/HBASE-18586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Gavin updated HBASE-18586:
--------------------------
    Comment: was deleted

(was: A comment with security level 'jira-users' was removed.)

> Multiple column families - scan performance
> -------------------------------------------
>
>                 Key: HBASE-18586
>                 URL: https://issues.apache.org/jira/browse/HBASE-18586
>             Project: HBase
>          Issue Type: Bug
>          Components: scan
>            Reporter: PS0618
>            Priority: Major
>
> I have 2 HBase tables: one with a single column family, and the other with 4 
> column families. Both tables are keyed by the same rowkey, and each column 
> family has a single column qualifier, with a JSON string as its value 
> (each JSON payload is about 10-20K in size). All column families use 
> fast-diff encoding and gzip compression.
> After loading about 60 million rows into each table, a scan of any single 
> column family in the 2nd table takes 4x as long as a scan of the single 
> column family in the 1st table. In both cases, the scanner is bounded by a 
> start and stop key to scan 1 million rows. Performance did not change much 
> even after running a major compaction on both tables.
> Though the HBase docs and other tech forums recommend not using more than 1 
> column family per table, nothing I have read so far suggests scan performance 
> will degrade linearly with the number of column families. Has anyone else 
> experienced this, and is there a simple explanation for it?
> Note that the second table has 4 column families because, even though I only 
> scan one column family at a time now, there are requirements to scan multiple 
> column families from that table given a set of rowkeys.
> Thanks for any insight into the performance question.
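
For a sense of scale, the bounded scan described above can be sized with rough arithmetic. This is only a back-of-envelope sketch: the row count and payload range come from the report, while the function name, the 1K = 1024 bytes convention, and the choice to ignore key overhead and compression are assumptions made here for illustration.

```python
# Rough sizing of the bounded scan described in the issue.
# From the report: 1 million rows per scan, 10-20K JSON payload per cell.
# Assumptions (not from the report): 1K = 1024 bytes; row key, qualifier and
# timestamp overhead, and any encoding/compression savings, are ignored.

ROWS_PER_SCAN = 1_000_000
PAYLOAD_KIB_RANGE = (10, 20)


def scan_volume_gib(rows, payload_kib):
    """Uncompressed value bytes read for one column family, in GiB."""
    return rows * payload_kib * 1024 / 1024**3


low, high = (scan_volume_gib(ROWS_PER_SCAN, k) for k in PAYLOAD_KIB_RANGE)
print(f"one column family, 1 million rows: {low:.1f} to {high:.1f} GiB uncompressed")
```

The figure covers cell values for a single column family only; it says nothing about how much data the region server actually reads from disk.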



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
