At 08:29 AM 10/11/05, Richard Huxton wrote:
Frank Bax wrote:
I have a table with only 434 rows in it. Two important columns are
"itemid" and "locn". Each item must be in one of seven locations. We
need to create a "combo" by selecting one item from each of seven
locations; then determine which "combo" is the "best" according to our
analysis (see below).
A subselect for items in a location looks something like:
(select * from suit_item where locn='Head' AND username='Walter' ORDER BY
itemid LIMIT 10) as Head
One subselect for each location, cross join them all and the query
generates 10,000,000 combinations! Without the "LIMIT 10", there are
78 * 37 * 91 * 81 * 99 * 47 * 1 = 98,981,901,018 results returned for
username='Walter' (the only user at the moment). The large volume is
causing a problem for my systems! The "ORDER BY itemid" was added only
so that same 10 items were processed on different computer systems I
tested this query on. Only one item for 7th locn in the database at the
moment.
Frank - it might just be me, but I've read your email twice and despite
all the information I still don't have any idea what you are trying to do.
Are you saying that you WANT to generate a cross-join, score the millions
of results and then pick the best 10? It's doing what you want, but you'd
like it to be faster.
Or are you saying that you'd like to avoid the explosion in rows altogether?
In either case - I don't suppose you could provide a real example of the
query, so we can see exactly what you're trying to do.
There is no "best 10". I currently limit each subselect to 10 items so
that query will actually run without crashing. I would like to remove the
"ORDER BY itemid LIMIT 10" mentioned above. At the end of the query I have
a "LIMIT 100" clause which will stay and produces a list of "best 100" combos.
Either of your solutions would be acceptable; since avoiding the
"explosion" would also make the query faster. Current calculations
indicate that running the query without "LIMIT 10" in subselect would take
years to process.
The query is filled with expressions. I'm not sure I can shorten it
without making typos or deleting something important, so I'll make it
available on web here:
http://www.execulink.com/~fbax/JOINS/
Results of "explain analyse" is also there.
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match