Hello,
I have the following schema:
CREATE TABLE users ( id BIGSERIAL PRIMARY KEY, name TEXT
NOT NULL UNIQUE ); CREATE TABLE friends ( user_id
BIGINT NOT NULL REFERENCES users, friend_user_id BIGINT NOT NULL
REFERENCES users, UNIQUE (user_id, friend_user_id) ); CREATE
TABLE posts ( id BIGSERIAL PRIMARY KEY, user_id BIGINT
NOT NULL REFERENCES users, content TEXT NOT NULL ); CREATE
INDEX posts_user_id_id_index ON posts(user_id, id);
Each user can unilaterally follow any number of friends. The posts table has a
large number of rows and is rapidly growing.
My goal is to retrieve the 10 most recent posts of a user's friends. This query
gives the correct result, but is inefficient:
SELECT posts.id, users.name, posts.content FROM posts JOIN users ON
posts.user_id = users.id WHERE posts.user_id IN (SELECT friend_user_id FROM
friends WHERE user_id = 1) ORDER BY posts.id DESC LIMIT 10;
If the user's friends have recently posted, the query is still reasonably fast
(https://explain.depesz.com/s/6ykR). But if the user's friends haven't recently
posted or the user has no friends, it quickly deteriorates
(https://explain.depesz.com/s/OnoG).
If I match only a single post author (e.g. WHERE posts.user_id = 5), Postgres
uses the index posts_user_id_id_index. But if I use IN, the index doesn't
appear to be used at all.
How can I get these results more efficiently?
I've uploaded the schema and the queries I've tried to dbfiddle at
http://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=cf1489b7f6d53c3fe0b55ed7ccbad1f0.
The output of "SELECT version()" is "PostgreSQL 9.6.5 on x86_64-pc-linux-gnu,
compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit" for me.
Thank you in advance for any insights, pointers or suggestions you are able to
give me.
Regards,Milo