Re: [PERFORM] How would you store read/unread topic status?

2009-06-24 Thread Chris St Denis

Mathieu Nebra wrote:

Alexander Staubo a écrit :
  

On Tue, Jun 23, 2009 at 1:12 PM, Mathieu Nebramate...@siteduzero.com wrote:


This flags table has more or less the following fields:

UserID - TopicID - LastReadAnswerID
  

We are doing pretty much same thing.



My problem is that everytime a user READS a topic, it UPDATES this flags
table to remember he has read it. This leads to multiple updates at the
same time on the same table, and an update can take a few seconds. This
is not acceptable for my users.
  

First of all, and I'm sure you thought of this, an update isn't needed
every time a user reads a topic; only when there are new answers that
need to be marked as read. So an update ... where last_read_answer_id
 ? should avoid the need for an update.



We don't work that way. We just remember he has read these answers and
then we can tell him there are no new messages for you to read.
So we just need to write what he has read when he reads it.

  

(That said, I believe PostgreSQL diffs tuple updates, so in practice
PostgreSQL might not be writing anything if you run an update with
the same value. I will let someone more intimate with the internal
details of updates to comment on this.)

Secondly, an update should not take a few seconds. You might want to
investigate this part before you turn to further optimizations.



Yes, I know there is a problem but I don't know if I am competent enough
to tune PostgreSQL for that. It can take a while to understand the
problem, and I'm not sure I'll have the time for that.

I am, however, opened to suggestions. Maybe I'm doing something wrong
somewhere.

  

In our application we defer the updates to a separate asynchronous
process using a simple queue mechanism, but in our case, we found that
the updates are fast enough (in the order of a few milliseconds) not
to warrant batching them into single transactions.



A few milliseconds would be cool.
In fact, defering to another process is a good idea, but I'm not sure if
it is easy to implement. It would be great to have some sort of UPDATE
... LOW PRIORITY to make the request non blocking.

Thanks.

  
I use pg_send_query() 
http://ca2.php.net/manual/en/function.pg-send-query.php in php to 
achieve this for a views counter. Script execution is not blocked while 
the queries are executing.


It looks like this may just be a direct translation of PQsendQuery() 
from libpq. Your preferred language may have a function like this.




[PERFORM] tsvector_update_trigger performance?

2009-06-24 Thread Chris St Denis
Is tsvector_update_trigger() smart enough to not bother updating a 
tsvector if the text in that column has not changed?


If not, can I make my own update trigger with something like

   if new.description != old.description
   return tsvector_update_trigger('fti_all', 'pg_catalog.english',
   'title', 'keywords', 'description');
   else
   return new;

or do I need to do it from scratch?


I'm seeing very high cpu load on my database server and my current 
theory is that some of the triggers may be causing it.


Re: [PERFORM] tsvector_update_trigger performance?

2009-06-24 Thread Chris St Denis

Dimitri Fontaine wrote:

Hi,

Le 24 juin 09 à 18:29, Alvaro Herrera a écrit :

Oleg Bartunov wrote:

On Wed, 24 Jun 2009, Chris St Denis wrote:


Is tsvector_update_trigger() smart enough to not bother updating a
tsvector if the text in that column has not changed?


no, you should do check yourself. There are several examples in 
mailing lists.


Or you could try using the supress_redundant_updates_trigger() function
that has been included in 8.4 (should be easy to backport)


  http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/backports/min_update/
  http://blog.tapoueh.org/projects.html#sec9

But it won't handle the case where some other random column has 
changed, but the UPDATE is not affecting the text indexed...
Tho this looks useful for some things, it doesn't solve my specific 
problem any. But thanks for the suggestion anyway.


This sounds like something that should just be on by default, not a 
trigger. Is there some reason it would waste the io of writing a new row 
to disk if nothing has changed? or is it just considered too much 
unnecessary overhead to compare them?


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance