What statistics do you have on the data? I suppose most messages are read
by low number of users, mostly 0 or one.
I can see two options to consider:
1) Use arrays to store information on which users have already read the
message. You may need GIN/GIST index to search fast.
2) Introduce some kind of special column(s) for the cases when the message
is unread by everybody or was read by at most one user. E.g. read_by
columns with null value for unread, special value for read by many and real
user if read by only one.
in this case your condition would be (read_by is null or read_by not in
(current_user or special_value) or (read_by = special_value and not
exists()). Note that optimizer may have problems with such a complex
expression nd you may need to use union all instead on or. Partial
index(es) for null/special value may help.
Best regards, Vitalii Tymchyshyn
2014-05-02 10:20 GMT+03:00 Andreas Joseph Krogh andr...@visena.com:
På fredag 02. mai 2014 kl. 02:17:58, skrev Craig James
cja...@emolecules.com:
On Thu, May 1, 2014 at 4:26 AM, Andreas Joseph Krogh
andr...@visena.comwrote:
I have a schema where I have lots of messages and some users who might
have read some of them. When a message is read by a user I create an entry
i a table message_property holding the property (is_read) for that user.
The schema is as follows:
[...]
create table person(
id serial primary key,
username varchar not null unique
);
create table message(
id serial primary key,
subject varchar
);
create table message_property(
message_id integer not null references message(id),
person_id integer not null references person(id),
is_read boolean not null default false,
unique(message_id, person_id)
);
[...]
So, for person 1 there are 10 unread messages, out of a total 1mill. 5
of those unread does not have an entry in message_property and 5 have an
entry and is_read set to FALSE.
Here's a possible enhancement: add two columns, an indexed timestamp to
the message table, and a timestamp of the oldest message this user has NOT
read on the person table. If most users read messages in a timely fashion,
this would (in most cases) narrow down the portion of the messages table to
a tiny fraction of the total -- just those messages newer than the oldest
message this user has not read.
When you sign up a new user, you can set his timestamp to the time the
account was created, since presumably messages before that time don't apply.
Whether this will help depends a lot on actual use patterns, i.e. do users
typically read all messages or do they leave a bunch of unread messages
sitting around forever?
Thanks fort the suggestion. A user must be able to read arbitrary old
messages, and messages don't expire.
--
*Andreas Jospeh Krogh*
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
https://www.visena.com