My guess (and it is only a guess) is that you are traversing much less of the 
call stack when you fetch one row of 20 columns than when you fetch 20 rows 
each with one column.

-----Original Message-----
From: Daniel Połaczański [mailto:[email protected]] 
Sent: Thursday, January 26, 2017 1:57 PM
To: [email protected]
Subject: table schema - row with many column vs many rows

Hi,
at work we tested the following scenarios for scan performance. We stored 
2500 domain rows, each containing 20 attributes, and then read one random 
row with all of its attributes a couple of times.

Scenario A
Every attribute is stored in a dedicated column: one HBase row with 20 
columns.

Scenario B
Every attribute is stored as a separate row under a key like 
RowKey:AttributeKey, so we have 20 HBase rows for one domain row.
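
To make the two layouts concrete, here is a rough sketch in plain Python (no 
HBase client involved). The names "domain42", "attr00".."attr19" and the 
qualifier "v" are made up for illustration; the post does not say which 
column holds the value in Scenario B.

```python
# Logical cells for ONE domain row, as (row key, column qualifier, value).
# All names below are hypothetical placeholders.
ATTRIBUTES = [f"attr{i:02d}" for i in range(20)]


def scenario_a_cells(domain_key):
    """Scenario A: one row, each attribute is its own column."""
    return [(domain_key, attr, f"value-of-{attr}") for attr in ATTRIBUTES]


def scenario_b_cells(domain_key):
    """Scenario B: twenty rows, attribute name folded into the row key."""
    # "v" is an assumed qualifier, not something stated in the post.
    return [(f"{domain_key}:{attr}", "v", f"value-of-{attr}")
            for attr in ATTRIBUTES]


cells_a = scenario_a_cells("domain42")
cells_b = scenario_b_cells("domain42")
rows_a = {row for row, _, _ in cells_a}
rows_b = {row for row, _, _ in cells_b}

print(len(cells_a), len(rows_a))  # 20 cells in 1 row
print(len(cells_b), len(rows_b))  # 20 cells in 20 rows
```

Both layouts hold 20 cells for the domain row; they differ only in how many 
distinct row keys those cells are spread over.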

As we know, in HBase everything is stored as entries of the form 
RowKey:ColumnKey:Value

Theoretically we have the same number of entries in HBase (2500*20) for both 
scenarios, so there shouldn't be any difference in performance. But it looks 
like scanning in Scenario A is much faster (something like 10 times).
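
The entry-count argument can be checked with a small model in plain Python. 
This is only a sketch of the logical, sorted key space, using made-up key 
formats ("d0000".."d2499", "a00".."a19", qualifier "v"); it does not talk to 
HBase and says nothing about where the measured time actually goes.

```python
import bisect

N_DOMAIN_ROWS, N_ATTRS = 2500, 20


def build(scenario):
    """Return sorted (row key, qualifier) pairs for the whole data set."""
    cells = []
    for d in range(N_DOMAIN_ROWS):
        for a in range(N_ATTRS):
            if scenario == "A":
                cells.append((f"d{d:04d}", f"a{a:02d}"))
            else:
                # "v" is an assumed qualifier for Scenario B.
                cells.append((f"d{d:04d}:a{a:02d}", "v"))
    cells.sort()
    return cells


def read_domain_row(cells, scenario, domain_key):
    """Fetch every cell of one domain row from the sorted key space."""
    if scenario == "A":
        # Exact row: everything whose row key equals domain_key.
        start, stop = domain_key, domain_key + "\x00"
    else:
        # Prefix range: everything whose row key starts with "domain_key:".
        start, stop = domain_key + ":", domain_key + ":\uffff"
    lo = bisect.bisect_left(cells, (start, ""))
    hi = bisect.bisect_left(cells, (stop, ""))
    return cells[lo:hi]


all_a, all_b = build("A"), build("B")
hit_a = read_domain_row(all_a, "A", "d0042")
hit_b = read_domain_row(all_b, "B", "d0042")

print(len(all_a), len(all_b))                       # same entry count
print(len({r for r, _ in all_a}), len({r for r, _ in all_b}))  # row counts
print(len({r for r, _ in hit_a}), len({r for r, _ in hit_b}))  # rows per read
```

In this model both scenarios hold 50000 entries and each read returns 20 
cells, but Scenario A touches one row per read while Scenario B touches 20. 
Whether that row count explains the measured gap is not something the model 
can show.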

Do you have any idea why Scenario A is better?

Regards
