[ http://issues.apache.org/jira/browse/DERBY-1506?page=all ]

Mike Matrigali updated DERBY-1506:
----------------------------------


I haven't thought about this much, but the following approaches would all solve 
the problem, some more easily than others. These address the long column issue 
(I don't think the long row issue is as important):

1) Provide an alternate heap and/or container implementation where the 
     overflow pointer of an overflow column points to a page in another 
     container, thus effectively moving "blob" space out of the current 
     container.  Need to decide how many blob spaces per table.  Some 
     options are: 1 per table, N per table (each growing to X bytes, where 
     X may be the max file size on the device), 1 per blob column, ...  
     (See the first sketch after this list.)

     I lean toward moving the blob space out of the current space rather 
     than segmenting the current space into blob and non-blob space.  This 
     would allow a possibly easier path in the future to features like 
     non-logged blobs.

2) Provide an alternate (upgraded) container implementation where the page 
     map tracks page type in addition to allocation state, or keep separate 
     page maps per page type.  Then the scan of "main" pages could be 
     optimized to use the page maps to get to the "next" main page 
     efficiently.  Should be careful not to make these new page maps a 
     concurrency problem where multiple scans block each other on access to 
     the page maps.  (See the second sketch after this list.)

3) For already-indexed tables, figure out a way for the optimizer to use 
     the index for the scan (I am likely to report this as a separate JIRA 
     issue). 

4) For unindexed tables, assuming the fix for #3 is implemented, we could 
     create an internal index on the table that would use existing 
     technology and basically provide the functionality of #2 - at the cost 
     of maintaining the index.  My initial take is that it is reasonable to 
     assume some sort of index (or primary key) on large tables in 
     applications that care about performance.
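
For illustration, here is a minimal sketch of the overflow pointer shape 
option 1 implies, with made-up names; this is not Derby's actual 
long-column format, just one way the idea could look:

    // Hypothetical sketch only: an overflow pointer that names the
    // container holding the blob pieces, so blob pages can live outside
    // the main container.  A heap scan of the main container then never
    // touches them.
    public final class ExternalOverflowPointer {
        private final long blobContainerId; // 1 per table, N per table, or 1 per column
        private final long pageNumber;      // first overflow page in that container
        private final int  recordId;        // slot of the first piece on that page

        public ExternalOverflowPointer(long blobContainerId,
                                       long pageNumber,
                                       int recordId) {
            this.blobContainerId = blobContainerId;
            this.pageNumber = pageNumber;
            this.recordId = recordId;
        }

        public long getBlobContainerId() { return blobContainerId; }
        public long getPageNumber()      { return pageNumber; }
        public int  getRecordId()        { return recordId; }
    }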
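
And a second sketch, for the page map idea in option 2, again with made-up 
names rather than Derby's actual allocation-map code.  The point is that 
the scan finds the next main page from the map alone, without faulting 
overflow pages into the cache.  The coarse synchronized methods here are 
exactly the naive locking the concurrency caveat above warns about; a real 
version would want short latches or a copy-on-write map:

    import java.util.BitSet;

    // Hypothetical sketch only: an allocation map that records page type
    // alongside the allocated bit.
    final class TypedPageMap {
        private final BitSet allocated = new BitSet();
        private final BitSet mainPage  = new BitSet(); // clear bit => overflow page

        synchronized void allocate(int pageNum, boolean isMainPage) {
            allocated.set(pageNum);
            mainPage.set(pageNum, isMainPage);
        }

        synchronized void free(int pageNum) {
            allocated.clear(pageNum);
            mainPage.clear(pageNum);
        }

        // Next allocated "main" page at or after pageNum, or -1 if none.
        // A scan driven by this never reads overflow pages just to skip them.
        synchronized int nextMainPage(int pageNum) {
            for (int p = allocated.nextSetBit(pageNum); p >= 0;
                     p = allocated.nextSetBit(p + 1)) {
                if (mainPage.get(p)) {
                    return p;
                }
            }
            return -1;
        }
    }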


> full table scans of tables which don't use indexes, which have blobs, but 
> don't reference blob data still read all pages of the table
> -------------------------------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-1506
>          URL: http://issues.apache.org/jira/browse/DERBY-1506
>      Project: Derby
>         Type: Improvement

>   Components: Store
>     Versions: 10.1.3.1
>     Reporter: Mike Matrigali
>     Priority: Minor

>
> A full table scan that does not use an index always reads every allocated 
> page of the table through the cache.  In two cases this means it logically 
> reads more data than necessary: long rows (where the head columns span 
> multiple pages) and long columns (columns where the data in a single column 
> spans multiple pages).  In both these cases the space for the "overflow" 
> portion of the row or column is currently stored in the same space as the 
> regular "main" pages.  The current implementation of a table scan of a heap 
> container calls raw store for a linear list of main pages with rows; the 
> raw store conglomerate implementation steps through each allocated page in 
> the container and returns the "main" pages (reading the overflow pages into 
> the cache, identifying them, and skipping them) to the access layer, which 
> then returns rows as requested to the query processing layer.
> If a table contains rows with very long columns (i.e. 2 gig blobs), and the 
> tablescan does not request the blob data, then a lot of data is read from 
> disk but not required by the query. 
> A more unusual case is a table scan requiring a few columns from a table 
> made up of 2 gig rows composed entirely of columns smaller than 32k; in 
> this case Derby will also read all pages as part of a tablescan, even if 
> the first column is the only required column of the chain.
> Note that this is only a problem in tablescans of heap tables.  In both 
> cases, if an index is used to get the row, then ONLY the required data is 
> read from disk.  In the long column case the main row has only a pointer 
> to the overflow chain for the blob, and it will not be read unless the 
> blob data is required.  In the long row case, columns appear in the 
> container in the order they were declared in the original "create table" 
> statement.  Data is read from disk into cache for all columns from the 1st 
> up to the "last" one referenced in the query.  Data objects are only 
> instantiated from the cached data for the columns referenced in the query.
> I have marked this low in priority as I believe that full, unindexed table 
> scans of tables with gigabyte blobs are not the normal case.  Seems like 
> most applications would do keyed lookups of the table.  But there may be 
> apps that need to do full table reporting on the non-blob data in such a 
> table.
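
To make the scan behavior described above concrete, here is a minimal 
sketch, assuming hypothetical raw store interfaces rather than Derby's 
actual API.  Every allocated page, overflow pages included, is pulled 
through the cache just to be classified; only main pages produce rows:

    import java.util.function.Consumer;

    final class HeapScanSketch {
        static final long INVALID_PAGE = -1;

        interface Page {
            boolean isOverflowPage();
            void unlatch();
        }

        interface ContainerHandle {
            Page getPage(long pageNumber);  // disk read on a cache miss
            long getFirstAllocatedPageNumber();
            long getNextAllocatedPageNumber(long pageNumber); // INVALID_PAGE at end
        }

        // Visit rows on "main" pages only, at the cost of reading every page.
        static void scan(ContainerHandle container, Consumer<Page> rowConsumer) {
            for (long pageNum = container.getFirstAllocatedPageNumber();
                 pageNum != INVALID_PAGE;
                 pageNum = container.getNextAllocatedPageNumber(pageNum)) {
                Page page = container.getPage(pageNum); // I/O even for blob pages
                if (!page.isOverflowPage()) {
                    rowConsumer.accept(page); // hand main-page rows to the access layer
                }
                page.unlatch();
            }
        }
    }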

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
