On Fri, Mar 21, 2014 at 5:59 PM, Erik van Zijst <erik.van.zi...@gmail.com>wrote:
> Hi there, > > I've got a relatively simple query that contains expensive BCRYPT > functions that gets optimized in a way that causes postgres to compute > more bcrypt hashes than necessary, thereby dramatically slowing things > down. > > In a certain part of our application we need to lookup users by their > username, email address and password. Now we don't store plaintext > passwords and so the query needs to compute bcrypt hashes on the fly: > > SELECT DISTINCT u.* > FROM auth_user u > JOIN bb_userprofile p ON p.user_id = u.id > JOIN bb_identity i ON i.profile_id = p.id > WHERE > ( > ( > u.username ILIKE 'detkin' > OR > i.email ILIKE 'f...@example.com' > ) > AND > ( > SUBSTRING(password FROM 8) = CRYPT( > 'detkin', SUBSTRING(password FROM 8)) > ) > ) > > These queries are generated by a parser that translates from an > external query language to SQL run on the database. This test db > contains 12 user records. > > With a single bcrypt hash taking ~300ms to compute, this is a recipe > for disaster and so the app only allows queries that require only a > very small number of bcrypt computation. > > E.g. the user must always "AND" the password lookup with a clause like > " username = 'foo' AND password = 'bar'" (username is unique). > > However, while the query above technically only needs to compute 1 > hash (there is a user 'detkin' and email 'f...@example.com' does not > exist), it instead creates a query plan that computes hashes *before* > filtering on username and email, leading to 12 hash computations and a > very slow query. > > The EXPLAIN (ANALYZE, BUFFERS) is here: http://explain.depesz.com/s/yhE > > The schemas for the 3 tables involved are here: > http://pgsql.privatepaste.com/f72020ad0a > > As a quick experiment I tried moving the joins and email lookup into a > nested IN query, but that still generates a plan that computes hashes > for all 12 users, before picking out the 1 whose username matches. > > Is there any way I can get postgres to perform the hash calculations > on the *result* of the other parts of the where clause, instead of the > other way around? Or else rewrite the query? > > Cheers, > Erik > (untested), but how about something like the following: WITH au AS ( SELECT DISTINCT u.* FROM auth_user u JOIN bb_userprofile p ON p.user_id = u.id JOIN bb_identity i ON i.profile_id = p.id WHERE u.username ILIKE 'detkin' OR i.email ILIKE 'f...@example.com') SELECT au.* FROM au WHERE SUBSTRING(au.password FROM 8) = CRYPT('detkin', SUBSTRING(au.password FROM 8));