Toru,

Not much in terms of internals..

On 23 Mar 2010, at 21:36, Toru Maesaka wrote:

Hi Tom,

I am having trouble locating the specific docs at this time, I will look
deeper tonight.

Cool! Thanks for your help.

I believe that both the MySQL memory storage engine , and the ndbcluster storage engine perform some tricks to force the optimizer to favor one type
of access over another.

I don't know the NDB internals but I've been looking at MEMORY/HEAP's
source quite a bit lately and the code for records_in_range() seems to
only search through the Red-Black Tree. In brief,

 (1) If min_key exists, get the position of it in the tree.
 (2) If max_key exists, get the position of it in the tree.
 (3) Subtract the position of max_key with min_key.

So, what it's doing seems pretty darn simple (although I'll recheck
this). What I've started studying though is InnoDB's approach. There's
a pretty good documentation on it too =)


Looking at what cluster does it is even simpler,

I believe it does something like.. tell the optimizer that the table has 100 rows, and using an index will return 10 rows. This is sufficient to force an index use. Rather simple but it does get the index used.

It does have side effect and pollutes explain. See ndb_use_exact_count and ndb_index_stat_enable.

So not much help for a clean solution, but in general I recommend a view at the cluster source for examples as well because what cluster does to in terms of dealing with the optimizer is quite different than what the other engines might do.

For example see engine condition pushdown, and batching.

Best Regards,

Tom

http://www.innodb.com/doc/innodb_plugin-1.0/innodb-other-changes.html#innodb-other-changes-statistics-estimation

Cheers,
Toru


On Wed, Mar 24, 2010 at 3:38 AM, Tom Hanlon <[email protected]> wrote:
Toru,

I am having trouble locating the specific docs at this time, I will look
deeper tonight.

Wether this applies to your specific range optimization.. I am not sure.

But...
I believe that both the MySQL memory storage engine , and the ndbcluster storage engine perform some tricks to force the optimizer to favor one type
of access over another.

memory, because the index and the data are in memory vs the usual analysis
of index in memory, data on disk.

Cluster..
For similar reasons.. plus additional details.

I will see if I can find the docs.

--
Tom

On 22 Mar 2010, at 13:05, Toru Maesaka wrote:

Hi!

Would it make sense to have an engine-level option to tell the
optimizer to prioritize index use at all times? Something along the
line of HA_PRIORITIZE_INDEX.

I'm saying this because the relationship between
Cursor::records_in_range() and HA_READ_RANGE seems important to
effectively perform range scans at the moment. I tried purposely
returning a constant small cost factor in records_in_range() but it
seems that the range search will fail if the number of estimated rows
(in my case a total lie) is significantly smaller than the number of
actual rows in the table. If I don't set, HA_READ_RANGE then it seems
Drizzle will resort to a table scan (despite knowing that
'possible_keys' exist in EXPLAIN).

Preferably, I don't want to count through the number of rows in a
range for the optimizer on every simple query. Although... a
completely dishonest implementation of records_in_range() would dirty
the EXPLAIN statement result. So there is definitely a downside to
what I was experimenting with.

It would be great to hear what you all think :)

- Toru

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

Tom Hanlon
[email protected]
Cloudera Certified Hadoop Developer
Certified MySQL DBA





--
Toru Maesaka <[email protected]>

Tom Hanlon
[email protected]
Cloudera Certified Hadoop Developer
Certified MySQL DBA


_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

Reply via email to