Daniel:
For the underlying column family, do you use any data block encoding or
compression?

Which HBase release do you use?

Thanks

On Thu, Jan 26, 2017 at 2:12 PM, Dave Birdsall <[email protected]>
wrote:

> My guess (and it is only a guess) is that you are traversing much less of
> the call stack when you fetch one row of 20 columns than when you fetch 20
> rows each with one column.
>
> -----Original Message-----
> From: Daniel Połaczański [mailto:[email protected]]
> Sent: Thursday, January 26, 2017 1:57 PM
> To: [email protected]
> Subject: table schema - row with many column vs many rows
>
> Hi,
> in the work we were testing the following scenarios regarding scan
> performance. We stored 2500 domain rows containing 20 attributes.And after
> that read one random row with all attributes couple times
>
> Scenario A
> Every attribute is stored in a dedicated column, so there is one HBase
> row with 20 columns.
>
> Scenario B
> Every attribute is stored as a separate row under a key like
> RowKey:AttributeKey, so we have 20 rows for one domain row.
>
> As we know, in HBase everything is stored as an entry of the form
> RowKey:ColumnKey:Value.
>
> Theoretically we have the same number of entries in HBase (2500*20) for
> both scenarios, so there shouldn't be any difference in performance. But it
> looks like scanning in Scenario A is much faster (something like 10
> times).
>
> Do you maybe have an idea why Scenario A is better?
>
> Regards
>
