I think Drizzle should consider the ability to push down OFFSET to the
storage engine. LIMIT/OFFSET is very common in most Web applications.
Where there are lists, there are LIMIT/OFFSETs. For example, each user
in Blog has a list of his articles. For this list could be long, we
display it in pages, using the following SQL:

SELECT <needed attributes> FROM Article WHERE UserID = ? ORDER BY
PublishTime DESC LIMIT <page size> OFFSET <page size * page index>;

Of course we will create an index on UserID+PublishTime, so that there
will be no sorting. However, this method doesn't scale well. If some
attributes needed are not in the index, this method generally doesn't
scale beyond a OFFSET more than 100, cause there will be to much
random reads.

At the present day, we have two choice to improve the performce.
1. Add all needed attributes to the index. Generally this will lead to
too fat index, so we don't use this choice too often.
2. Fetch a list of keys first and them fetch the needed attributes
using the keys. That is, using the following code:
<key list> = SELECT KeyAttribute  FROM Article WHERE UserID = ? ORDER
BY PublishTime DESC LIMIT <page size> OFFSET <page size * page index>;
SELECT ... FROM Article WHERE KeyAttribute IN (<key list>);

We of course add KeyAttribute to the index.

However even the second choice is far from ideal, for several reasons:
1. At the present day, Drizzle will call index_next for <page size *
page index> times but throw away their results. This is a big waste.
2. Some row could have been deleted after you fetch the key list, you
have to deal with it, or you have to use a transaction (yet another
overhead).
3. "SELECT ... FROM Article WHERE KeyAttribute IN (<key list>);" could
be a long SQL, thus parsing overhead is considerable.

Due to the performance issue, the second choice generally doesn't
scale beyond a OFFSET more than 1000. We measured this in a real Web
2.0 application.

However, if we can push down OFFSET to the storage engine, the
performance could be dramatically improved. To brief, and optimized
storage engine could skip lots of index entries at a time. For
example, If the minimal key and maximal key in an index leaf page both
match the scan condition, the whole page could be skipped, and the
number of keys in this page could be added to current OFFSET. Because
an index page generally contains hundreds of keys, this will leads to
100x speedup.

And more, we can use the simple select method again.

We are developping a storage enging and are considering this
optimization. However the storage engine doesn't know the OFFSET and
whether OFFSET could be pushed down. So we send this feature request.

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

Reply via email to