On Fri, 10 Mar 2006, James Austin wrote:

>I've been doing a fair bit of research into optimising larger tables and
>just want a bit of feedback (most of my lecturers are giving conflicting
>answers).
>
>Just say I have a table with 200+ fields (all relate uniquly to the primary
>key), when querying the first 10 fields of a table I get just as fast a
>response (about the same as if I were querying a table with the same number
>of rows but only 10 fields) however as I get further into the table (i.e.
>selecting fields 180..190 there seems to be a noticable decrease in speed
>(by noticable I mean around 750 benchmark queries per second slower)).


Have a read of:
http://www.sqlite.org/php2004/page-001.html
http://www.sqlite.org/php2004/slides-all.html

Tables are implemented in a Btree, and described from slide 34 onwards.

The first portion of a row is stored along with other rows in a btree leaf
page. If the row doesn't fit, it spills into overflow pages, that are
chained using a singly linked list of page pointers (overflow pages store
data from a single row.) Thus, to find column 180..190 in the above
example, first the row must be located, then the linked list traversed to
find the pages with the desired columns.


>
>As a measure to increase the speed I was thinking that a viable option would
>be to break up the table based on how it is queried, such that sets of
>fields will be divided into new tables that contain fields relating to the
>query (so long as any fields that may be used in multiple queries (and
>require rejoining later) are not separated as this is much slower than
>keeping all fields in one table to begin with).
>
>The main question is, is the method listed above the best way to improve the
>speed of a large table or should they all remain in the same table as
>splitting may cause other problems later on.


My question is, why do you need so many columns in a row? Is this a
hypothetical example, or a real world application?


>
>One method suggested by one of my lectures was to leave the table as one and
>use views, however after testing this out I found views slower than querying
>the table large table directly (and seeing that they don't increase the
>table size it leads me to believe that a view is simply an alias to a
>query).


Views won't help if your data is so inefficiently laid out. Views are
indeed an alias to a query, and very useful for common queries and
transforming data.

You'd be better breaking up your rows into more managable tables, and
using views to create a compatible view of your old data. If this database
is already normalised, however, god help you I say! However, if this is
purely research, it may be better directed elsewhere, as 200+ column rows
are rare even in the most badly run government agencies.


>
>Any feedback is appreciated,


My 2c (hmm, I'm spending a lot recently)


>James Austin.
>

Christian


-- 
    /"\
    \ /    ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
     X                           - AGAINST MS ATTACHMENTS
    / \

Reply via email to