Daniel: For the underlying column family, do you use any data block encoding / compression?
Which HBase release do you use? Thanks

On Thu, Jan 26, 2017 at 2:12 PM, Dave Birdsall <[email protected]> wrote:

> My guess (and it is only a guess) is that you are traversing much less of
> the call stack when you fetch one row of 20 columns than when you fetch 20
> rows each with one column.
>
> -----Original Message-----
> From: Daniel Połaczański [mailto:[email protected]]
> Sent: Thursday, January 26, 2017 1:57 PM
> To: [email protected]
> Subject: table schema - row with many columns vs many rows
>
> Hi,
> At work we were testing the following scenarios regarding scan
> performance. We stored 2500 domain rows, each containing 20 attributes,
> and after that read one random row with all of its attributes a couple
> of times.
>
> Scenario A
> Every attribute is stored in a dedicated column: one HBase row with 20
> columns.
>
> Scenario B
> Every attribute is stored as a separate row under a key like
> RowKey:AttributeKey, so we have 20 rows for one domain row.
>
> As we know, in HBase everything is stored as an entry of the form
> RowKey:ColumnKey:Value
>
> Theoretically we have the same number of entries (2500*20) in HBase for
> both scenarios, so there shouldn't be any difference in performance. But
> it looks like scanning in Scenario A is much faster (roughly 10 times).
>
> Do you maybe have an idea why Scenario A is better?
>
> Regards
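The two schemas in the quoted mail can be sketched as follows. This is a minimal plain-Python model of the flat cell space (no real HBase client calls; all row/attribute names are illustrative), showing that both layouts produce the same number of stored cells, while a single domain row maps to one row key in Scenario A but to a 20-row key range in Scenario B:

```python
def scenario_a_cells(num_rows, num_attrs):
    """Scenario A: one HBase row per domain row; each attribute in its own column."""
    return [(f"row{r}", f"attr{a}", f"val{r}-{a}")
            for r in range(num_rows) for a in range(num_attrs)]

def scenario_b_cells(num_rows, num_attrs):
    """Scenario B: one HBase row per attribute; row key is RowKey:AttributeKey."""
    return [(f"row{r}:attr{a}", "value", f"val{r}-{a}")
            for r in range(num_rows) for a in range(num_attrs)]

a = scenario_a_cells(2500, 20)
b = scenario_b_cells(2500, 20)
# Same total number of (rowkey, column, value) entries in both layouts.
assert len(a) == len(b) == 2500 * 20

# Fetching one domain row differs, though: Scenario A is a single-row lookup,
# while Scenario B must cover the key range row42:attr0 .. row42:attr19.
one_row_a = [c for c in a if c[0] == "row42"]
one_row_b = [c for c in b if c[0].startswith("row42:")]
assert len(one_row_a) == len(one_row_b) == 20
```

This is only a model of the key layout, not of HBase internals, but it makes the asymmetry visible: equal cell counts on disk do not imply equal read paths, since Scenario B turns every domain-row read into a multi-row operation.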
