Anoop Sharma created TRAFODION-2234:

             Summary: turn aligned row format for tables to ON by default
                 Key: TRAFODION-2234
             Project: Apache Trafodion
          Issue Type: Improvement
            Reporter: Anoop Sharma
            Assignee: Anoop Sharma
            Priority: Minor

Columns in Trafodion tables are stored in one of 2 formats:
-- regular hbase format where each column is stored as one cell
-- aligned format where the whole row is packed and stored in one cell

Aligned row format provides a performance boost during inserts and selects
by writing or retrieving one cell in hbase instead of multiple cells. As the
number of columns in a table increases, the perf advantage of aligned format
grows.
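
A minimal sketch of the difference in cell count between the two formats.
The three-column row, the column names and the byte encoding are
assumptions made for illustration; this is not Trafodion's actual storage
encoding.

```python
import struct

# Hypothetical row layout: (id INT, qty INT, price DOUBLE), little-endian,
# packed without padding. Chosen only for illustration.
COLUMN_NAMES = ("id", "qty", "price")
COLUMN_FORMATS = ("<i", "<i", "<d")
ROW_LAYOUT = struct.Struct("<iid")


def hbase_format_cells(row_key, values):
    """Return one (row key, qualifier, value) cell per column."""
    return [
        (row_key, name, struct.pack(fmt, value))
        for name, fmt, value in zip(COLUMN_NAMES, COLUMN_FORMATS, values)
    ]


def aligned_format_cells(row_key, values):
    """Return a single cell holding the whole packed row."""
    return [(row_key, "row", ROW_LAYOUT.pack(*values))]


def unpack_aligned(cell):
    """Recover all column values from the single packed cell."""
    return ROW_LAYOUT.unpack(cell[2])
```

With N columns the first function yields N cells per row and the second
always yields one, which is why the gap widens as columns are added.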

There are some limitations with aligned format:
-- selection predicates cannot be pushed down to hbase region server
-- all columns need to be retrieved and updated as packed row in a cell
-- columns cannot be dropped without reloading the table
Over time, these limitations will be removed by use of user defined filters
and coprocessors to select/project rows at hbase region level.

During perf runs, the pros for aligned format outweigh the cons.

This jira is being filed to change the default from hbase format row
to aligned format row. Code for both aligned and hbase format already
exists and is being used.

A table can always be created in either of these 2 formats by explicitly
specifying the format during create time.
The default can also be changed to off or on by inserting the appropriate
value in the system defaults table.

Turning on aligned format as default will be done in 2 phases:
-- in phase 1, the aligned default will be turned on during dev regression
  runs until it has stabilized.
-- in phase 2, the system default will be changed to aligned. All tables
  created without an explicit format specification will be created in
  aligned format.

Metadata, repository, privilege and histogram tables will always be 
created in hbase format. This is needed for backward compatibility.

Any component or application that doesn't want to depend on the system
default must explicitly specify the row format in its create ddl.
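
The rules above for choosing a table's row format can be sketched as
follows. The function name, the enum and the table-kind strings are
hypothetical and only model the behavior described in this issue; they are
not Trafodion code.

```python
from enum import Enum


class RowFormat(Enum):
    HBASE = "hbase"
    ALIGNED = "aligned"


# Table kinds that this issue says are always created in hbase format.
ALWAYS_HBASE_KINDS = {"metadata", "repository", "privilege", "histogram"}


def resolve_row_format(table_kind, explicit_format, system_default):
    """Pick the row format for a new table.

    table_kind:      hypothetical label such as "user" or "metadata"
    explicit_format: RowFormat given in the create ddl, or None
    system_default:  RowFormat taken from the system defaults table
    """
    if table_kind in ALWAYS_HBASE_KINDS:
        # Kept in hbase format for backward compatibility.
        return RowFormat.HBASE
    if explicit_format is not None:
        # An explicit specification always wins over the default.
        return explicit_format
    return system_default
```

A table created with an explicit format gets the same result whether the
system default is hbase (before phase 2) or aligned (after phase 2).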
