Create new row format for derby to optimize access to columns within a row
--------------------------------------------------------------------------

                 Key: DERBY-2168
                 URL: http://issues.apache.org/jira/browse/DERBY-2168
             Project: Derby
          Issue Type: Improvement
          Components: Store
    Affects Versions: 10.3.0.0
            Reporter: Mike Matrigali
            Priority: Minor


The current (and only) low-level row format for Derby was chosen at the 
beginning of the project to be the most flexible, so it treats every
column as variable length.  The row is simply a sequence of 
columns, each with a header indicating how long it
is.  There is therefore no way to locate the N'th column in the row 
without first traversing the N-1 columns before
it.  Queries that might benefit from a different row format include:
1) non-covered queries which don't require all columns of data
2) non-index scans which disqualify a number of rows based on a subset of 
columns that don't happen to be the first N columns of the row.
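
To make the traversal cost concrete, here is a minimal sketch of a
length-prefixed row.  It is an illustration only: the class name SequentialRow
and the fixed 2-byte length header are assumptions made for this example, not
Derby's actual on-disk field header format.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Illustrative only: a simplified length-prefixed row, NOT Derby's real format. */
final class SequentialRow {

    /** Encodes each column as [2-byte length][data bytes], one after another. */
    static byte[] encode(byte[][] columns) {
        int size = 0;
        for (byte[] c : columns) {
            size += 2 + c.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] c : columns) {
            buf.putShort((short) c.length); // assumed 2-byte header
            buf.put(c);
        }
        return buf.array();
    }

    /**
     * Returns the offset of the data of column n (0-based).
     * Every preceding column header must be read to get there.
     */
    static int offsetOf(byte[] row, int n) {
        ByteBuffer buf = ByteBuffer.wrap(row);
        int pos = 0;
        for (int i = 0; i < n; i++) {
            int len = buf.getShort(pos) & 0xFFFF;
            pos += 2 + len; // skip header and data of column i
        }
        return pos + 2; // skip column n's own header
    }

    /** Copies out the bytes of column n (0-based). */
    static byte[] read(byte[] row, int n) {
        int start = offsetOf(row, n);
        int len = ByteBuffer.wrap(row).getShort(start - 2) & 0xFFFF;
        return Arrays.copyOfRange(row, start, start + len);
    }
}
```

In this sketch the loop in offsetOf runs once per preceding column, so reading
only the last column of a wide row still touches every header before it.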

A pretty standard row format would have some sort of table at the beginning 
which would allow one to jump to a given offset of the row without
going through all the other columns.  Building up this table would likely 
increase the insert cost slightly, and would increase the disk space required
to store rows.
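
A rough sketch of such an offset table follows.  The layout (one 2-byte offset
per column plus an end marker, stored ahead of the column data) and the class
name OffsetTableRow are assumptions chosen for illustration; a real design
would have to pick entry widths and handle long or overflow columns.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Illustrative only: a row with an offset table in front of the column data. */
final class OffsetTableRow {

    /** Layout: (n + 1) 2-byte offsets, then the raw bytes of the n columns. */
    static byte[] encode(byte[][] columns) {
        int n = columns.length;
        int headerSize = 2 * (n + 1);
        int dataSize = 0;
        for (byte[] c : columns) {
            dataSize += c.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(headerSize + dataSize);
        int offset = headerSize;
        for (byte[] c : columns) {
            buf.putShort((short) offset); // where this column's data starts
            offset += c.length;
        }
        buf.putShort((short) offset);     // end marker: one past the last column
        for (byte[] c : columns) {
            buf.put(c);
        }
        return buf.array();
    }

    /** Reads column n (0-based) with two table lookups, no traversal. */
    static byte[] read(byte[] row, int n) {
        ByteBuffer buf = ByteBuffer.wrap(row);
        int start = buf.getShort(2 * n) & 0xFFFF;
        int end = buf.getShort(2 * (n + 1)) & 0xFFFF;
        return Arrays.copyOfRange(row, start, end);
    }
}
```

Here read does the same amount of work for any column, at the price of
computing all offsets at insert time and storing the table with every row.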

Another standard kind of row format would optimize the storage of fixed 
length fields.  Currently the store does not know anything about fixed
length fields, as each datatype controls its own storage.  New interfaces could 
be added either at create time or maybe in the datatypes themselves
to export the knowledge that datatypes are fixed length.  
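
As a sketch of what that knowledge would buy: if the store were told the byte
width of every column, offsets could be computed once per table rather than
read from each row.  The class FixedWidthLayout below is hypothetical and
assumes all columns are fixed length; it is not an existing Derby interface.

```java
/** Illustrative only: column offsets derived from declared fixed widths. */
final class FixedWidthLayout {
    private final int[] offsets;
    private final int rowSize;

    /** widths[i] is the assumed fixed byte width of column i. */
    FixedWidthLayout(int[] widths) {
        offsets = new int[widths.length];
        int pos = 0;
        for (int i = 0; i < widths.length; i++) {
            offsets[i] = pos; // column i starts where the previous ones end
            pos += widths[i];
        }
        rowSize = pos;
    }

    /** Offset of column n (0-based) within any row of this layout. */
    int offsetOf(int n) {
        return offsets[n];
    }

    /** Total bytes per row; no per-column headers are stored. */
    int rowSize() {
        return rowSize;
    }
}
```

For example, a layout built from widths {4, 8, 2} places the third column at
offset 12 in every row, with no per-row header or table needed.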

This is a big project.  Note that a lot of performance work in StoredPage has 
made it "know" about the current record and field formats, as it was 
a big performance hit to make class calls for every field traversal.  This 
means that adding a new record and/or field format is not as isolated as
one might hope.  Also we are likely to need to support both the old and new 
formats.  To anyone considering this work, I would suggest a very rough
prototype with performance measurement first, to make sure you are getting the 
expected performance before doing a lot of work.


        
