Prefix compression would lower the cost of storing values in the rowkey. It was inspired by schema designs with long rowkeys and short values.
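
A toy sketch of the idea, in case it helps. It is not HBase's actual encoder or on-disk format: the key layout, the sample rowkey, the column names and the 2-byte cost per stored prefix length are all made up for illustration. It builds the 26 keys of a single put (one KV per column, as described in the quoted thread below) and stores each key as "bytes shared with the previous key" plus the remaining suffix.

```python
def kv_key(row: bytes, family: bytes, qualifier: bytes, ts: int) -> bytes:
    """Toy flat key: row + family + qualifier + 8-byte timestamp.

    Simplified on purpose; the real KeyValue layout has extra length fields.
    """
    return row + family + qualifier + ts.to_bytes(8, "big")


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_encode(keys):
    """Encode each key as (shared_len, suffix) relative to the previous key."""
    encoded = []
    prev = b""
    for key in keys:
        shared = common_prefix_len(prev, key)
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the original keys from (shared_len, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


# Hypothetical long rowkey with 26 short column qualifiers, all in one put.
row = b"customer-000123|2013-03-25|order-42"
keys = [kv_key(row, b"d", f"c{i:02d}".encode(), 1364200000000) for i in range(26)]

raw_bytes = sum(len(k) for k in keys)
encoded = prefix_encode(keys)
# Assumption: 2 bytes to record each shared-prefix length.
encoded_bytes = sum(2 + len(suffix) for _, suffix in encoded)

print("raw key bytes:    ", raw_bytes)
print("encoded key bytes:", encoded_bytes)
```

The long rowkey is written once and only the differing tail of each later key is kept, which is why the long rowkey, short value case benefits most. On a real table the encoding is set per column family through the DATA_BLOCK_ENCODING attribute.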
PREFIX and FAST_DIFF encodings are most often used.

Cheers

On Thu, Mar 28, 2013 at 7:26 AM, Pankaj Gupta <[email protected]> wrote:

> Would prefix compression (https://issues.apache.org/jira/browse/HBASE-4676)
> improve this?
>
> This is an important question in terms of schema design. Given the choice
> of storing a value in column vs rowkey, I would many times want to store a
> value in a rowkey if I foresee it being used for constraining lookups, even
> if that it is only a weak use case at the time of schema design. But, if
> there is substantial overhead in keeping values in row vs column then I
> would want to keep only the absolutely essential identifier in row. The
> overhead of storing values in rowkey influences the choice of what to store
> in rowkey.
>
> On Mar 25, 2013, at 11:28 PM, Anoop Sam John <[email protected]> wrote:
>
> > When the number of columns (qualifiers) are more yes it can impact the
> > performance. In HBase every where the storage will be in terms of KVs. The
> > key will be some thing like rowkey+cfname+columnname+TS...
> >
> > So when u have 26 cells in a put then there will be repetition of many
> > bytes in the key.(One KV per column) So u will end up in transferring more
> > data. Within memstore more data(actual KV data size) getting written and so
> > more frequent flushes.. etc..
> >
> > Have a look at Intel Panthera Document Store impl.
> >
> > -Anoop-
> > ________________________________________
> > From: Ankit Jain [[email protected]]
> > Sent: Monday, March 25, 2013 10:19 PM
> > To: [email protected]
> > Subject: Getting less write throughput due to more number of columns
> >
> > Hi All,
> >
> > I am writing a records into HBase. I ran the performance test on following
> > two cases:
> >
> > Set1: Input record contains 26 columns and record size is 2Kb.
> >
> > Set2: Input record contain 1 column and record size is 2Kb.
> >
> > In second case I am getting 8MBps more performance than step.
> >
> > are the large number of columns have any impact on write performance and If
> > yes, how we can overcome it.
> >
> > --
> > Thanks,
> > Ankit Jain
