Hi Andreas,

[New to this list, forgive my ignorance.]

On 05/01/2014 01:26 PM, Andreas Joseph Krogh wrote:
I'm using PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu
My machine has PostgreSQL 9.1.13 on x86_64-unknown-linux-gnu.
I have a schema where I have lots of messages and some users who might have read some of them. When a message is read by a user I create an entry i a table message_property holding the property (is_read) for that user.
The schema is as follows:
drop table if exists message_property;
drop table if exists message;
drop table if exists person;
create table person(
    id serial primary key,
    username varchar not null unique
);
create table message(
    id serial primary key,
    subject varchar
);
create table message_property(
    message_id integer not null references message(id),
    person_id integer not null references person(id),
    is_read boolean not null default false,
    unique(message_id, person_id)
);
[snip]
So, for person 1 there are 10 unread messages, out of a total 1mill. 5 of those unread does not have an entry in message_property and 5 have an entry and is_read set to FALSE. I have the following query to list all un-read messages for person with id=1:
SELECT
    m.id                          AS message_id,
    prop.person_id,
    coalesce(prop.is_read, FALSE) AS is_read,
    m.subject
FROM message m
LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND prop.person_id = 1
WHERE 1 = 1
      AND NOT EXISTS(SELECT
                         *
                     FROM message_property pr
WHERE pr.message_id = m.id AND pr.person_id = prop.person_id AND prop.is_read = TRUE)
    ;

The problem is that it's not quite efficient and performs badly, explain analyze shows:
[snip]

Does anyone have suggestions on how to optimize the query or schema?

I'm getting better performance with:

SELECT
m.id AS message_id,
1 AS person_id,
FALSE AS is_read,
m.subject
FROM message m
WHERE 1 = 1
AND NOT EXISTS(SELECT
    *
    FROM message_property pr
    WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);

You then lose the distinction between message_property with is_read = FALSE, and nonexistent message_property for the message row.

If that is essential, I'm getting a roughly 2x speedup on my non-tuned PostgreSQL with:
 SELECT
    m.id                          AS message_id,
    prop.person_id,
    coalesce(prop.is_read, FALSE) AS is_read,
    m.subject
FROM message m
LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND prop.person_id = 1
WHERE not coalesce(prop.is_read, false);

HTH,
Jochem

--
Jochem Berndsen | joc...@functor.nl



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to