On Thu, 2014-10-09 at 17:41 +0700, Dan Kennedy wrote: > I think I understand the second optimization. You're saying that given this: > > SELECT DISTINCT main.uid > FROM main LEFT JOIN email_list ON main.uid = email_list.uid > WHERE email_list.email LIKE 'al%' > > SQLite should deduce that since the LIKE implies "email_list.email IS > NOT NULL"
( Digression: SQLite *should* deduce the LIKE implies the IS NOT NULL condition. My *actual* queries all have, for some bizarre historical reason I haven't yet explored, email_list.value IS NOT NULL AND email_list.value LIKE 'foo%' You'd think that the former condition would end up optimised away, but timings show that it clearly isn't. I wasn't going to bring that up here since I think it is so clearly a stupid thing to be putting in our query in the first place. But you just made me do it anyway. ) > the LEFT JOIN is equivalent to a regular JOIN. Which would allow > SQLite to reorder the tables and perhaps come up with a more efficient > query plan. Correct? Indeed. As I said, a glib response to this suggested optimisation in *isolation* might be that the user shouldn't have asked for a LEFT JOIN at all if they didn't need it. But of course when coupled with the other suggested transforms, that glib response is a lot less valid. > > The first optimization might be trickier. ... > I guess queries that have a DISTINCT clause don't have this problem. Right. I think it only works with DISTINCT, but that's OK. > So it could in theory transform your query to: > > SELECT DISTINCT > main.uid FROM main LEFT JOIN email_list ON main.uid = email_list.uid > WHERE email_list.email LIKE 'al%' > UNION ALL > SELECT main.uid FROM main LEFT JOIN email_list ON main.uid = email_list.uid > WHERE main.first_name like 'al%' > UNION ALL > SELECT main.uid FROM main LEFT JOIN email_list ON main.uid = email_list.uid > WHERE main.last_name like 'al%'; > > The the hypothetical optimization above could figure out that the > first LEFT JOIN could just as easily be a JOIN. Right. > And that the other two LEFT JOINs are not required at all due to the > DISTINCT and the fact that the WHERE clauses on the sub-selects do not > reference the joined table at all. Yeah, that's probably true. I'd assumed that we'd do it as part of the "first optimisation" introducing the UNION but it's probably better handled as a separate transform as you have it here. > It seems like there are a few moving parts here. And none of these are > trivial changes. So, good ideas that might show up in an SQLite > release at some point, but it's not realistic to expect these > optimizations to be implemented quickly. TBH that's all I was really looking for; thanks. I *hate* it when people silently work around limitations in my software without bringing them to my attention, and I didn't want to be guilty of doing the same. I certainly don't expect an instant fix; just knowing that it's on the radar is perfectly sufficient. Now I can commit my hackish workaround to rewrite certain special-case queries to optimise the output of the current query planner for them, and still sleep at night :) Thanks. > Unless, of course, you can propose a patch and they turn out to be > simpler than they look. That would be interesting but realistically it's outside my capacity at the moment. I am already *so* far into shaving this yak that I can barely remember what I was actually intending to do in the first place :) -- David Woodhouse Open Source Technology Centre david.woodho...@intel.com Intel Corporation
_______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users