Re: [PERFORM] Optimize query for listing un-read messages

2014-05-02 Thread Andreas Joseph Krogh
On Friday, 2 May 2014 at 02:17:58, Craig James cja...@emolecules.com wrote:

 On Thu, May 1, 2014 at 4:26 AM, Andreas Joseph Krogh andr...@visena.com wrote:

 I have a schema where I have lots of messages and some users who might have
 read some of them. When a message is read by a user I create an entry in a
 table message_property holding the property (is_read) for that user.

 The schema is as follows:

 [...]

 create table person(
     id serial primary key,
     username varchar not null unique
 );

 create table message(
     id serial primary key,
     subject varchar
 );

 create table message_property(
     message_id integer not null references message(id),
     person_id integer not null references person(id),
     is_read boolean not null default false,
     unique(message_id, person_id)
 );

 [...]

 So, for person 1 there are 10 unread messages, out of a total 1 mill. 5 of
 those unread do not have an entry in message_property and 5 have an entry
 with is_read set to FALSE.

 Here's a possible enhancement: add two columns, an indexed timestamp to the
 message table, and a timestamp of the oldest message this user has NOT read
 on the person table. If most users read messages in a timely fashion, this
 would (in most cases) narrow down the portion of the messages table to a
 tiny fraction of the total -- just those messages newer than the oldest
 message this user has not read.

 When you sign up a new user, you can set his timestamp to the time the
 account was created, since presumably messages before that time don't apply.

 Whether this will help depends a lot on actual use patterns, i.e. do users
 typically read all messages or do they leave a bunch of unread messages
 sitting around forever?

Thanks for the suggestion. A user must be able to read arbitrary old messages,
and messages don't expire.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-02 Thread Vitalii Tymchyshyn
What statistics do you have on the data? I suppose most messages are read
by a low number of users, mostly zero or one.
I can see two options to consider:
1) Use arrays to store information on which users have already read the
message. You may need a GIN/GiST index to search fast.
2) Introduce some kind of special column(s) for the cases when the message
is unread by everybody or was read by at most one user. E.g. a read_by
column with a null value for unread, a special value for "read by many" and
the real user id if read by only one.
In this case your condition would be (read_by is null OR read_by NOT IN
(current_user, special_value) OR (read_by = special_value AND NOT
EXISTS (...))). Note that the optimizer may have problems with such a complex
expression and you may need to use UNION ALL instead of OR. Partial
index(es) for the null/special value may help.
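
For illustration, a minimal sketch of option 2 against the schema from the
original post; the column name read_by and the sentinel -1 for "read by many"
are assumptions, not something specified in the thread:

ALTER TABLE message ADD COLUMN read_by integer;  -- NULL = unread by everybody,
                                                 -- -1 = read by many,
                                                 -- otherwise the single reader's id

-- Partial indexes for the two cheap cases:
CREATE INDEX message_unread_by_all_idx ON message (id)      WHERE read_by IS NULL;
CREATE INDEX message_read_by_one_idx   ON message (read_by) WHERE read_by <> -1;

-- "Unread by person 1", written as UNION ALL so each branch stays index-friendly:
SELECT id, subject FROM message WHERE read_by IS NULL
UNION ALL
SELECT id, subject FROM message WHERE read_by NOT IN (1, -1)
UNION ALL
SELECT m.id, m.subject
FROM message m
WHERE m.read_by = -1
  AND NOT EXISTS (SELECT 1 FROM message_property pr
                  WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);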

Best regards, Vitalii Tymchyshyn


2014-05-02 10:20 GMT+03:00 Andreas Joseph Krogh andr...@visena.com:

 On Friday, 2 May 2014 at 02:17:58, Craig James cja...@emolecules.com wrote:

 On Thu, May 1, 2014 at 4:26 AM, Andreas Joseph Krogh 
 andr...@visena.comwrote:

 I have a schema where I have lots of messages and some users who might
 have read some of them. When a message is read by a user I create an entry
 in a table message_property holding the property (is_read) for that user.

 The schema is as follows:

 [...]


 create table person(
 id serial primary key,
 username varchar not null unique
 );

 create table message(
 id serial primary key,
 subject varchar
 );

 create table message_property(
 message_id integer not null references message(id),
 person_id integer not null references person(id),
 is_read boolean not null default false,
 unique(message_id, person_id)
 );


 [...]

  So, for person 1 there are 10 unread messages, out of a total 1 mill. 5
 of those unread do not have an entry in message_property and 5 have an
 entry with is_read set to FALSE.


 Here's a possible enhancement: add two columns, an indexed timestamp to
 the message table, and a timestamp of the oldest message this user has NOT
 read on the person table. If most users read messages in a timely fashion,
 this would (in most cases) narrow down the portion of the messages table to
 a tiny fraction of the total -- just those messages newer than the oldest
 message this user has not read.

 When you sign up a new user, you can set his timestamp to the time the
 account was created, since presumably messages before that time don't apply.

 Whether this will help depends a lot on actual use patterns, i.e. do users
 typically read all messages or do they leave a bunch of unread messages
 sitting around forever?


 Thanks fort the suggestion. A user must be able to read arbitrary old
 messages, and messages don't expire.

  --
 *Andreas Joseph Krogh*
 CTO / Partner - Visena AS
 Mobile: +47 909 56 963
 andr...@visena.com
 www.visena.com




[PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
Hi all,

I'm using PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu.

I have a schema where I have lots of messages and some users who might have
read some of them. When a message is read by a user I create an entry in a
table message_property holding the property (is_read) for that user.

The schema is as follows:

drop table if exists message_property;
drop table if exists message;
drop table if exists person;

create table person(
    id serial primary key,
    username varchar not null unique
);

create table message(
    id serial primary key,
    subject varchar
);

create table message_property(
    message_id integer not null references message(id),
    person_id integer not null references person(id),
    is_read boolean not null default false,
    unique(message_id, person_id)
);

insert into person(username) values('user_' || generate_series(0, 999));
insert into message(subject) values('Subject ' || random() || generate_series(0, 99));
insert into message_property(message_id, person_id, is_read)
    select id, 1, true from message order by id limit 90;
insert into message_property(message_id, person_id, is_read)
    select id, 1, false from message order by id limit 5 offset 90;
analyze;

So, for person 1 there are 10 unread messages, out of a total 1 mill. 5 of
those unread do not have an entry in message_property and 5 have an entry
with is_read set to FALSE.

I have the following query to list all un-read messages for person with id=1:

SELECT
    m.id                          AS message_id,
    prop.person_id,
    coalesce(prop.is_read, FALSE) AS is_read,
    m.subject
FROM message m
    LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
        prop.person_id = 1
WHERE 1 = 1
  AND NOT EXISTS(SELECT
                     *
                 FROM message_property pr
                 WHERE pr.message_id = m.id AND pr.person_id =
                     prop.person_id AND prop.is_read = TRUE)
;

The problem is that it's not quite efficient and performs badly; explain
analyze shows:

                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Merge Anti Join  (cost=1.27..148784.09 rows=5 width=40) (actual time=918.906..918.913 rows=10 loops=1)
   Merge Cond: (m.id = pr.message_id)
   Join Filter: (prop.is_read AND (pr.person_id = prop.person_id))
   Rows Removed by Join Filter: 5
   ->  Merge Left Join  (cost=0.85..90300.76 rows=100 width=40) (actual time=0.040..530.748 rows=100 loops=1)
         Merge Cond: (m.id = prop.message_id)
         ->  Index Scan using message_pkey on message m  (cost=0.42..34317.43 rows=100 width=35) (actual time=0.014..115.829 rows=100 loops=1)
         ->  Index Scan using message_property_message_id_person_id_key on message_property prop  (cost=0.42..40983.40 rows=95 width=9) (actual time=0.020..130.728 rows=95 loops=1)
               Index Cond: (person_id = 1)
   ->  Index Only Scan using message_property_message_id_person_id_key on message_property pr  (cost=0.42..40983.40 rows=95 width=8) (actual time=0.024..140.349 rows=95 loops=1)
         Index Cond: (person_id = 1)
         Heap Fetches: 95
 Total runtime: 918.975 ms
(13 rows)

Does anyone have suggestions on how to optimize the query or schema? It's
important that any message not having an entry in message_property for a user
is considered un-read.

Thanks!

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Jochem Berndsen


Hi Andreas,

[New to this list, forgive my ignorance.]

On 05/01/2014 01:26 PM, Andreas Joseph Krogh wrote:
I'm using PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu 

My machine has PostgreSQL 9.1.13 on x86_64-unknown-linux-gnu.
I have a schema where I have lots of messages and some users who might 
have read some of them. When a message is read by a user I create an 
entry in a table message_property holding the property (is_read) for 
that user.

The schema is as follows:
drop table if exists message_property;
drop table if exists message;
drop table if exists person;
create table person(
id serial primary key,
username varchar not null unique
);
create table message(
id serial primary key,
subject varchar
);
create table message_property(
message_id integer not null references message(id),
person_id integer not null references person(id),
is_read boolean not null default false,
unique(message_id, person_id)
);

[snip]
So, for person 1 there are 10 unread messages, out of a total 1 mill. 5 
of those unread do not have an entry in message_property and 5 have 
an entry with is_read set to FALSE.
I have the following query to list all un-read messages for person 
with id=1:

SELECT
m.id  AS message_id,
prop.person_id,
coalesce(prop.is_read, FALSE) AS is_read,
m.subject
FROM message m
LEFT OUTER JOIN message_property prop ON prop.message_id = m.id 
AND prop.person_id = 1

WHERE 1 = 1
  AND NOT EXISTS(SELECT
 *
 FROM message_property pr
 WHERE pr.message_id = m.id AND pr.person_id = 
prop.person_id AND prop.is_read = TRUE)

;

The problem is that it's not quite efficient and performs badly, 
explain analyze shows:

[snip]


Does anyone have suggestions on how to optimize the query or schema?


I'm getting better performance with:

SELECT
m.id AS message_id,
1 AS person_id,
FALSE AS is_read,
m.subject
FROM message m
WHERE 1 = 1
AND NOT EXISTS(SELECT
*
FROM message_property pr
WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);

You then lose the distinction between message_property with is_read = 
FALSE, and nonexistent message_property for the message row.


If that is essential, I'm getting a roughly 2x speedup on my non-tuned 
PostgreSQL with:

 SELECT
m.id  AS message_id,
prop.person_id,
coalesce(prop.is_read, FALSE) AS is_read,
m.subject
FROM message m
LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND 
prop.person_id = 1

WHERE not coalesce(prop.is_read, false);

HTH,
Jochem

--
Jochem Berndsen | joc...@functor.nl





Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
On Thursday, 1 May 2014 at 20:35:07, Jochem Berndsen joc...@functor.nl wrote:
 Hi Andreas,

 [New to this list, forgive my ignorance.]
 [snip]
 I'm getting better performance with:

 SELECT
 m.id AS message_id,
 1 AS person_id,
 FALSE AS is_read,
 m.subject
 FROM message m
 WHERE 1 = 1
 AND NOT EXISTS(SELECT
      *
      FROM message_property pr
      WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);

 You then lose the distinction between message_property with is_read =
 FALSE, and nonexistent message_property for the message row.

 If that is essential, I'm getting a roughly 2x speedup on my non-tuned
 PostgreSQL with:
   SELECT
      m.id                          AS message_id,
      prop.person_id,
      coalesce(prop.is_read, FALSE) AS is_read,
      m.subject
 FROM message m
      LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
 prop.person_id = 1
 WHERE not coalesce(prop.is_read, false);

Hi Jochem,

Thanks for looking at it. I'm still seeing ~500ms being spent and I was hoping
for a way to do this using an index so one could achieve 1-10ms, but maybe
that's impossible given the schema?

Is there a way to design an equivalent schema to achieve 10ms execution-time?

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Pavel Stehule
Hello


2014-05-01 21:17 GMT+02:00 Andreas Joseph Krogh andr...@visena.com:

 On Thursday, 1 May 2014 at 20:35:07, Jochem Berndsen joc...@functor.nl wrote:


 Hi Andreas,

 [New to this list, forgive my ignorance.]
 [snip]
 I'm getting better performance with:

 SELECT
 m.id AS message_id,
 1 AS person_id,
 FALSE AS is_read,
 m.subject
 FROM message m
 WHERE 1 = 1
 AND NOT EXISTS(SELECT
  *
  FROM message_property pr
  WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);

 You then lose the distinction between message_property with is_read =
 FALSE, and nonexistent message_property for the message row.

 If that is essential, I'm getting a roughly 2x speedup on my non-tuned
 PostgreSQL with:
   SELECT
  m.id  AS message_id,
  prop.person_id,
  coalesce(prop.is_read, FALSE) AS is_read,
  m.subject
 FROM message m
  LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
 prop.person_id = 1
 WHERE not coalesce(prop.is_read, false);



 Hi Jochem,

 Thanks for looking at it. I'm still seeing ~500ms being spent and I was
 hoping for a way to do this using an index so one could achieve 1-10ms, but
 maybe that's impossible given the schema?

 Is there a way to design an equivalent  schema to achieve 10ms
 execution-time?


I had perfect success on a similar use case with a descending-ordered partial
index

http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html

Regards

Pavel



  --
 *Andreas Joseph Krogh*
 CTO / Partner - Visena AS
 Mobile: +47 909 56 963
 andr...@visena.com
 www.visena.com




Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
On Thursday, 1 May 2014 at 21:30:39, Pavel Stehule pavel.steh...@gmail.com wrote:

 Hello
 [snip]

 I had perfect success on a similar use case with a descending-ordered partial
 index

 http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html

I'm not getting good performance. Are you able to craft an example using my
schema and a partial index?

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Pavel Stehule
2014-05-01 21:39 GMT+02:00 Andreas Joseph Krogh andr...@visena.com:

 On Thursday, 1 May 2014 at 21:30:39, Pavel Stehule pavel.steh...@gmail.com wrote:

 Hello
 [snip]

 I had a perfect success on similar use case with descent ordered partial
 index

 http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html


 I'm not getting good performance. Are you able to craft an example using
 my schema and partial index?


maybe something like

CREATE INDEX ON message_property (person_id, message_id) WHERE is_read

When I am thinking about your schema, it is designed well, but it is not
index friendly, so for some fast access you should hold a cache (table)
of unread messages.
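
As an illustration only, one possible shape of such a cache; the table name
message_unread and the maintenance strategy (application code or triggers) are
assumptions, not something Pavel specifies:

create table message_unread(
    person_id  integer not null references person(id),
    message_id integer not null references message(id),
    primary key (person_id, message_id)
);

-- On arrival: insert one row per person who should see the message.
-- On read:    delete that person's row (message_property keeps is_read as before).
delete from message_unread where person_id = 1 and message_id = 42;

-- Listing unread messages then becomes a plain primary-key-driven join:
select m.id, m.subject
from message_unread u
join message m on m.id = u.message_id
where u.person_id = 1;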

Regards

Pavel



 Thanks.

  --
 *Andreas Joseph Krogh*
 CTO / Partner - Visena AS
 Mobile: +47 909 56 963
 andr...@visena.com
 www.visena.com




Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
On Thursday, 1 May 2014 at 21:53:32, Pavel Stehule pavel.steh...@gmail.com wrote:

 2014-05-01 21:39 GMT+02:00 Andreas Joseph Krogh andr...@visena.com:

  On Thursday, 1 May 2014 at 21:30:39, Pavel Stehule pavel.steh...@gmail.com wrote:

  Hello
  [snip]

  I had perfect success on a similar use case with a descending-ordered partial
  index

  http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html

  I'm not getting good performance. Are you able to craft an example using my
  schema and a partial index?

 maybe something like

 CREATE INDEX ON message_property (person_id, message_id) WHERE is_read

 When I am thinking about your schema, it is designed well, but it is not
 index friendly, so for some fast access you should hold a cache (table) of
 unread messages

Ah, that's what I was hoping not to have to do. In my system, messages arrive
all the time and having to update a cache for all new messages for all users
seems messy... Seems I could just as well create a message_property for all
users when a new message arrives, so I can INNER JOIN it and get good
performance. But that table will quickly grow *very* large...

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Pavel Stehule
2014-05-01 22:30 GMT+02:00 Andreas Joseph Krogh andr...@visena.com:

 On Thursday, 1 May 2014 at 21:53:32, Pavel Stehule pavel.steh...@gmail.com wrote:



 2014-05-01 21:39 GMT+02:00 Andreas Joseph Krogh andr...@visena.com:

 On Thursday, 1 May 2014 at 21:30:39, Pavel Stehule pavel.steh...@gmail.com wrote:

 Hello
 [snip]

 I had a perfect success on similar use case with descent ordered partial
 index

 http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html


 I'm not getting good performance. Are you able to craft an example using
 my schema and partial index?


 maybe some like

 CREATE INDEX ON message_property (person_id, message_id) WHERE pr.is_read

 When I am thinking about your schema, it is designed well, but it is not
 index friendly, so for some fast access you should to hold a cache (table)
 of unread messages


 Ah, that's what I was hoping to not having to do. In my system, messages
 arrive all the time and having to update a cache for all new messages for
 all users seems messy... Seems I could just as well create a
 message_property for all users when a new message arrives, so I can INNER
 JOIN it and get good performance. But that table will quickly grow *very*
 large...


What you need is a JOIN index, which is not possible in Postgres.

I'm afraid some ugly solution is necessary (when you require extra fast
access). You need an index (a small index) and an index requires some existing
set - you cannot build an index on the difference of two sets.

I expect some flag on the message relation - like "it should not be
not read" - could help a little bit, and could be used as a partial-index
condition. Another possibility is some variant of partitioning - you can
divide the messages and users into distinct sets and then you decrease the
number of possible combinations.

Regards

Pavel



  --
 *Andreas Joseph Krogh*
 CTO / Partner - Visena AS
 Mobile: +47 909 56 963
 andr...@visena.com
 www.visena.com




Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread David G Johnston
How does something like:

WITH unreads AS (
    SELECT id AS message_id FROM message
    EXCEPT
    SELECT message_id FROM message_property WHERE person_id = 1 AND is_read
)
SELECT ...
FROM unreads
JOIN message USING (message_id)
;

perform?

David J.








Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
On Thursday, 1 May 2014 at 23:19:55, David G Johnston david.g.johns...@gmail.com wrote:

 How does something like:

 WITH unreads AS (
     SELECT id AS message_id FROM message
     EXCEPT
     SELECT message_id FROM message_property WHERE person_id = 1 AND is_read
 )
 SELECT ...
 FROM unreads
 JOIN message USING (message_id)
 ;

 perform?

It actually performs worse.

The best query so far is:

SELECT
    m.id                          AS message_id,
    prop.person_id,
    coalesce(prop.is_read, FALSE) AS is_read,
    m.subject
FROM message m
    LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
        prop.person_id = 1
WHERE coalesce(prop.is_read, false) = false;

Giving the plan:

                                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=4.20..90300.76 rows=50 width=40) (actual time=445.021..445.025 rows=10 loops=1)
   Merge Cond: (m.id = prop.message_id)
   Filter: (NOT COALESCE(prop.is_read, false))
   Rows Removed by Filter: 90
   ->  Index Scan using message_pkey on message m  (cost=0.42..34317.43 rows=100 width=35) (actual time=0.014..113.314 rows=100 loops=1)
   ->  Index Scan using message_property_message_id_person_id_key on message_property prop  (cost=0.42..40983.40 rows=95 width=9) (actual time=0.018..115.019 rows=95 loops=1)
         Index Cond: (person_id = 1)
 Total runtime: 445.076 ms
(8 rows)

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Tomas Vondra
On 1.5.2014 23:19, Andreas Joseph Krogh wrote:
 On Thursday, 1 May 2014 at 23:02:13, Pavel Stehule pavel.steh...@gmail.com wrote:
 
  
  
 2014-05-01 22:30 GMT+02:00 Andreas Joseph Krogh andr...@visena.com
 mailto:andr...@visena.com:
 
 On Thursday, 1 May 2014 at 21:53:32, Pavel Stehule pavel.steh...@gmail.com wrote:
 
  
  
 2014-05-01 21:39 GMT+02:00 Andreas Joseph Krogh
 andr...@visena.com mailto:andr...@visena.com:
 
 On Thursday, 1 May 2014 at 21:30:39, Pavel Stehule pavel.steh...@gmail.com wrote:
 
 Hello
 [snip]
  
 I had a perfect success on similar use case with
 descent ordered partial index
 
 
 http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html
 
  
 I'm not getting good performance. Are you able to craft
 an example using my schema and partial index?
 
  
 maybe some like
  
 CREATE INDEX ON message_property (person_id, message_id)
 WHERE pr.is_read
  
 When I am thinking about your schema, it is designed well,
 but it is not index friendly, so for some fast access you
 should to hold a cache (table) of unread messages
 
  
 Ah, that's what I was hoping to not having to do. In my system,
 messages arrive all the time and having to update a cache for
 all new messages for all users seems messy... Seems I could just
 as well create a message_property for all users when a new
 message arrives, so I can INNER JOIN it and get good
 performance. But that table will quickly grow *very* large...
 
  
 What you need is a JOIN index, that is not possible in Postgres.
  
 I afraid so some ugly solutions is necessary (when you require
 extra fast access). You need a index (small index) and it require
 some existing set - you cannot do index on the difference two sets.
  
 I expect so some flag on the relation message - like it should
 not be not read can helps little bit - and can be used in partial
 index as conditions. Other possibility is some variant of
 partitioning - you can divide a messages and users to distinct sets
 and then you decrease a number of possible combinations.
 
  
 Just curious:
 Is such a JOIN index possible in other DBs, if so - which?
 Can other DBs do index on difference between two sets?
 Will PG ever have this, is it even possible?

I'm not aware of such a database, but maybe it's possible at least for
some cases. But I'd expect that to significantly depend on the schema.
And I'm not aware of any such effort in the case of PostgreSQL, so don't
hold your breath.

IMHO the problem with your schema is that while each 'read' message has
a matching row in message_property, 'unread' messages may or may not
have a matching row. Is there a particular reason for that?

If you could get rid of this, i.e. require that each pair (message,
recipient) has a row in message_property (irrespective of whether the
message was read or not), you can do this:

CREATE INDEX message_unread_idx
ON message_property(message_id, person_id) WHERE (NOT is_read)

and then simply do the query like this:

SELECT
    m.id,
    p.person_id,
    p.is_read,
    m.subject
FROM message m JOIN message_property p ON (m.id = p.message_id)
WHERE (NOT is_read) AND person_id = :pid

and I'd expect this to use the partial index, and be much faster.
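
For completeness, a sketch of the book-keeping this approach requires when a
new message arrives, assuming for illustration that every person is a
recipient (the message id 42 is a placeholder):

-- Create the per-recipient rows up front (unread by default), so the partial
-- index above always covers the whole unread set:
INSERT INTO message_property (message_id, person_id, is_read)
SELECT 42, p.id, false
FROM person p;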

regards
Tomas





Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
On Thursday, 1 May 2014 at 23:45:49, Tomas Vondra t...@fuzzy.cz wrote:

 On 1.5.2014 23:19, Andreas Joseph Krogh wrote:
  On Thursday, 1 May 2014 at 23:02:13, Pavel Stehule pavel.steh...@gmail.com wrote:
 
      
      
      2014-05-01 22:30 GMT+02:00 Andreas Joseph Krogh andr...@visena.com
      mailto:andr...@visena.com:
 
          On Thursday, 1 May 2014 at 21:53:32, Pavel Stehule pavel.steh...@gmail.com wrote:
 
              
              
              2014-05-01 21:39 GMT+02:00 Andreas Joseph Krogh
              andr...@visena.com mailto:andr...@visena.com:
 
                  On Thursday, 1 May 2014 at 21:30:39, Pavel Stehule pavel.steh...@gmail.com wrote:
 
                      Hello
                      [snip]
                      
                      I had a perfect success on similar use case with
                      descent ordered partial index
 
                     
 http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html
 
                  
                  I'm not getting good performance. Are you able to craft
                  an example using my schema and partial index?
 
              
              maybe some like
              
              CREATE INDEX ON message_property (person_id, message_id)
              WHERE pr.is_read
              
              When I am thinking about your schema, it is designed well,
              but it is not index friendly, so for some fast access you
              should to hold a cache (table) of unread messages
 
          
          Ah, that's what I was hoping to not having to do. In my system,
          messages arrive all the time and having to update a cache for
          all new messages for all users seems messy... Seems I could just
          as well create a message_property for all users when a new
          message arrives, so I can INNER JOIN it and get good
          performance. But that table will quickly grow *very* large...
 
      
      What you need is a JOIN index, that is not possible in Postgres.
      
      I afraid so some ugly solutions is necessary (when you require
      extra fast access). You need a index (small index) and it require
      some existing set - you cannot do index on the difference two sets.
      
      I expect so some flag on the relation message - like it should
      not be not read can helps little bit - and can be used in partial
      index as conditions. Other possibility is some variant of
      partitioning - you can divide a messages and users to distinct sets
      and then you decrease a number of possible combinations.
 
  
  Just curious:
  Is such a JOIN index possible in other DBs, if so - which?
  Can other DBs do index on difference between two sets?
  Will PG ever have this, is it even possible?

 I'm not aware of such database, but maybe it's possible at least for
 some cases. But I'd expect that to significantly depend on the schema.
 And I'm not aware of any such effort in the case of PostgreSQL, so don't
 hold your breath.

  IMHO the problem with your schema is that while each 'read' message has
  a matching row in message_property, 'unread' messages may or may not
  have a matching row. Is there a particular reason for that?

Yes. The point is that maintaining a message_property pair for all messages for
all users in the system imposes quite a maintenance headache. As the schema is
now, any new message is per definition un-read, and when a user reads it, it
gets an entry with is_read=true in message_property. This table holds other
properties too. This way I'm avoiding having to book-keep so much when a new
message arrives and when a new user is created. A message in my system does not
necessarily have only one recipient; it might have one, many or none, and might
be visible to many.

 If you could get rid of this, i.e. require that each pair (message,
 recipient) has a row in message_property (irrespective of whether the
 message was read or not), you can do this:

 CREATE INDEX message_unread_idx
     ON message_property(message_id, person_id) WHERE (NOT is_read)

 and then simply do the query like this:

 SELECT
     m.id,
     p.person_id,
     p.is_read,
     m.subject
 FROM message m JOIN message_property p ON (m.id = p.message_id)
 WHERE (NOT is_read) AND person_id = :pid

 and I'd expect this to use the partial index, and be much faster.

I'm aware of the performance gain using such a plain JOIN query.

Thanks for your feedback.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Tomas Vondra
On 1.5.2014 23:58, Andreas Joseph Krogh wrote:
 On Thursday, 1 May 2014 at 23:45:49, Tomas Vondra t...@fuzzy.cz wrote:
 
 On 1.5.2014 23:19, Andreas Joseph Krogh wrote:
  Just curious:
  Is such a JOIN index possible in other DBs, if so - which?
  Can other DBs do index on difference between two sets?
  Will PG ever have this, is it even possible?
 
 I'm not aware of such database, but maybe it's possible at least for
 some cases. But I'd expect that to significantly depend on the schema.
 And I'm not aware of any such effort in case of PostgreSQL, do don't
 hold your breath.
 
 IMHO the problem with your schema is that while each 'read' message has
 a matching row in message_property, 'unread' messages may or may not
 have a matching row. Is there a particular reason for that?
 
  
  
 Yes. The point is that maintaining a message_property pair for all
 messages for all users in the system imposes quite a
 maintainance-headache. As the schema is now any new message is per
 definition un-read, and when a user reads it then it gets an entry with
 is_read=true in message_property. This table holds other properties too.
 This way I'm avoiding having to book-keep so much when a new message
 arrives and when a new user is created. A message in my system does not
 necessarily have only one recipient, it might have one, many or none,
 and might be visible to many.

So how do you determine who's the recipient of a message? Or is it the
case that everyone can read everything (which is why you're displaying
them the unread messages, right)?

I understand you're trying to solve this without storing a row for each
possible message-person combination, but won't this eventually happen
anyway (with is_read=true for all rows)?

Tomas




Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
On Friday, 2 May 2014 at 00:34:34, Tomas Vondra t...@fuzzy.cz wrote:

 On 1.5.2014 23:58, Andreas Joseph Krogh wrote:
  On Thursday, 1 May 2014 at 23:45:49, Tomas Vondra t...@fuzzy.cz wrote:
 
      On 1.5.2014 23:19, Andreas Joseph Krogh wrote:
       Just curious:
       Is such a JOIN index possible in other DBs, if so - which?
       Can other DBs do index on difference between two sets?
       Will PG ever have this, is it even possible?
 
      I'm not aware of such database, but maybe it's possible at least for
      some cases. But I'd expect that to significantly depend on the schema.
      And I'm not aware of any such effort in case of PostgreSQL, do don't
      hold your breath.
 
      IMHO the problem with your schema is that while each 'read' message has
      a matching row in message_property, 'undread' messages may or may not
      have a matching row. Is there a particular reason for that?
 
  
  
  Yes. The point is that maintaining a message_property pair for all
  messages for all users in the system imposes quite a
  maintainance-headache. As the schema is now any new message is per
  definition un-read, and when a user reads it then it gets an entry with
  is_read=true in message_property. This table holds other properties too.
  This way I'm avoiding having to book-keep so much when a new message
  arrives and when a new user is created. A message in my system does not
  necessarily have only one recipient, it might have one, many or none,
  and might be visible to many.

 So how do you determine who's the recipient of a message? Or is it the
 case that everyone can read everything (which is why you're displaying
 them the unread messages, right)?

A message might have a recipient and might be read by others.

 I understand you're trying to solve this without storing a row for each
 possible message-person combination, but won't this eventually happen
 anyway (with is_read=true for all rows)?

I will end up with that only if all users read all messages, which is not
nearly the case.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread David G Johnston
Andreas Joseph Krogh-2 wrote
 I will end up with that only if 
 all users read all messages, which is not nearly the case.

These observations probably won't help but...

You have what amounts to a mathematical sparse matrix problem on your
hands...

Is there any way to expire messages so that dimension does not grow
unbounded?

Per-User caching does seem to be something that is going to be needed...

Depending on how many users are being tracked, would storing the reader_id in
an indexed array improve matters?

SELECT ... FROM message WHERE NOT (1 = ANY(reader_ids));
UPDATE message SET reader_ids = reader_ids || 1 WHERE messageid = ...

I'm not that familiar with how well indexes over arrays work or which kind is
needed (i.e. gin/gist).
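
A rough sketch of that approach, assuming a reader_ids integer[] column on
message; GIN (with its default array operator class) is the kind of index that
applies to array containment, and the names below are illustrative:

ALTER TABLE message ADD COLUMN reader_ids integer[] NOT NULL DEFAULT '{}';

-- GIN supports the array operators @>, <@, && and = :
CREATE INDEX message_reader_ids_gin ON message USING gin (reader_ids);

-- Mark message 42 as read by person 1:
UPDATE message SET reader_ids = reader_ids || 1 WHERE id = 42;

-- Indexable: messages person 1 HAS read.
SELECT id FROM message WHERE reader_ids @> ARRAY[1];

-- The unread test is a negation, which the GIN index cannot serve directly:
SELECT id FROM message WHERE NOT (1 = ANY(reader_ids));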

HTH

David J.








Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
On Friday, 2 May 2014 at 00:55:25, David G Johnston david.g.johns...@gmail.com wrote:

 Andreas Joseph Krogh-2 wrote
  I will end up with that only if
  all users read all messages, which is not nearly the case.

 These observations probably won't help but...

 You have what amounts to a mathematical sparse matrix problem on your
 hands...

 Is there any way to expire messages so that dimension does not grow
 unbounded?

No, unfortunately...

 Per-User caching does seem to be something that is going to be needed...

 Depending on how many users are being tracked, would storing the reader_id
 in an indexed array improve matters?

 SELECT ... FROM message WHERE NOT (1 = ANY(reader_ids));
 UPDATE message SET reader_ids = reader_ids || 1 WHERE messageid = ...

 I'm not that familiar with how well indexes over arrays work or which kind
 is needed (i.e. gin/gist).

is_read is one of many properties being tracked for a message...

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread David G Johnston


 Per-User caching does seem to be something that is going to be needed...

 Depending on how many users are being tracked would storing the reader_id
 in an indexed array improve matters?   SELECT ... FROM message WHERE NOT
 (1
 = ANY(reader_ids)) ; UPDATE message SET reader_ids = reader_ids || 1 WHERE
 messageid = ...  I'm not that familiar with how well indexes over arrays
 work or which kind is needed (i.e. gin/gist).



 is_read is one of many properties being tracked for a message...


But you don't have to have all of them on the same table.  Once you've
identified the messages in question performing a standard join onto a
supplemental detail table should be fairly straight-forward.

Do these other properties have values when is_read is false or only when
is_read is true?  Since you already allow for the possibility of a
missing record (giving it the meaning of not read) these other
properties cannot currently exist in that situation.

David J.





Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Andreas Joseph Krogh
On Friday, 2 May 2014 at 01:58:04, David G Johnston david.g.johns...@gmail.com wrote:

  Per-User caching does seem to be something that is going to be needed...

  Depending on how many users are being tracked, would storing the reader_id
  in an indexed array improve matters?

  SELECT ... FROM message WHERE NOT (1 = ANY(reader_ids));
  UPDATE message SET reader_ids = reader_ids || 1 WHERE messageid = ...

  I'm not that familiar with how well indexes over arrays work or which kind
  is needed (i.e. gin/gist).

  is_read is one of many properties being tracked for a message...

 But you don't have to have all of them on the same table.  Once you've
 identified the messages in question, performing a standard join onto a
 supplemental detail table should be fairly straightforward.

 Do these other properties have values when is_read is false or only when
 is_read is true?  Since you already allow for the possibility of a missing
 record (giving it the meaning of not read) these other properties cannot
 currently exist in that situation.

A message might hold a property (i.e. is_important) when is_read is FALSE (it
might be set back to is_read=FALSE after being read the first time).

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Optimize query for listing un-read messages

2014-05-01 Thread Craig James
On Thu, May 1, 2014 at 4:26 AM, Andreas Joseph Krogh andr...@visena.comwrote:

 I have a schema where I have lots of messages and some users who might
 have read some of them. When a message is read by a user I create an entry
 in a table message_property holding the property (is_read) for that user.

 The schema is as follows:

[...]


 create table person(
 id serial primary key,
 username varchar not null unique
 );

 create table message(
 id serial primary key,
 subject varchar
 );

 create table message_property(
 message_id integer not null references message(id),
 person_id integer not null references person(id),
 is_read boolean not null default false,
 unique(message_id, person_id)
 );

[...]

  So, for person 1 there are 10 unread messages, out of a total 1 mill. 5 of
 those unread do not have an entry in message_property and 5 have an entry
 with is_read set to FALSE.


Here's a possible enhancement: add two columns, an indexed timestamp to the
message table, and a timestamp of the oldest message this user has NOT
read on the person table. If most users read messages in a timely fashion,
this would (in most cases) narrow down the portion of the messages table to
a tiny fraction of the total -- just those messages newer than the oldest
message this user has not read.

When you sign up a new user, you can set his timestamp to the time the
account was created, since presumably messages before that time don't apply.

Whether this will help depends a lot on actual use patterns, i.e. do users
typically read all messages or do they leave a bunch of unread messages
sitting around forever?
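
A minimal sketch of this enhancement; the column names created_at (on message)
and oldest_unread_at (on person) are illustrative and not part of the original
schema:

ALTER TABLE message ADD COLUMN created_at timestamptz NOT NULL DEFAULT now();
ALTER TABLE person  ADD COLUMN oldest_unread_at timestamptz NOT NULL DEFAULT now();

CREATE INDEX message_created_at_idx ON message (created_at);

-- Only messages newer than the person's "oldest unread" watermark need the
-- expensive unread check:
SELECT m.id, m.subject
FROM message m
JOIN person p ON p.id = 1
WHERE m.created_at >= p.oldest_unread_at
  AND NOT EXISTS (SELECT 1 FROM message_property pr
                  WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);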

Craig


Re: [PERFORM] Optimize SELECT * from table WHERE foreign_key_id IN (key1,key2,key3,key4...)

2013-03-06 Thread Julien Cigar

On 03/06/2013 00:51, Niels Kristian Schjødt wrote:

Hi, thanks for answering. See comments inline.

On 05/03/2013 at 15.26, Julien Cigar jci...@ulb.ac.be wrote:


On 03/05/2013 15:00, Niels Kristian Schjødt wrote:

Hi,

I'm running a rails app, where I have a model called Car that has_many Images. 
Now when I tell rails to include those images, when querying say 50 cars, then 
it often decides to use a SELECT * from images WHERE car_id IN 
(id1,id2,id3,id4…) instead of doing a join.

why do you want a join here ? if you don't need any cars data there is no 
need to JOIN that table.

I need both

Now a select ... from ... where id in (id1, id2, ..., idn) isn't very scalable.

Instead of passing id1, id2, ..., idn you'be better pass the condition and do a 
where id in (select ... ), or where exists (select 1 ... where ...), or a join, 
or …


I tried this now, and it doesn't seem to do a very big difference unfortunately…


could you paste the full query, an explain analyze of it, and some 
details about your config (how much ram ? what's your: shared_buffers, 
effective_cache_size, cpu_tuple_cost, work_mem, ...) ?
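
One way to gather the settings asked about here (pg_settings is a standard
system view):

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size', 'cpu_tuple_cost', 'work_mem');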



Now either way it uses the index I
have on car_id:

Index Scan using car_id_ix on adverts  (cost=0.47..5665.34 rows=1224 width=234)
Index Cond: (car_id = ANY 
('{7097561,7253541,5159633,6674471,...}'::integer[]))

But it's slow, it's very slow. In this case it took 3,323ms

3ms isn't slow


Sorry, it's 3323ms!


Can I do anything to optimize that query or maybe the index or something?

your index is already used

Okay this leaves me with - get better hardware or?


The table has 16.000.000 rows



--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.









--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.





Re: [PERFORM] Optimize SELECT * from table WHERE foreign_key_id IN (key1,key2,key3,key4...)

2013-03-06 Thread Jeff Janes
On Tue, Mar 5, 2013 at 4:07 PM, Joshua D. Drake j...@commandprompt.comwrote:


 On 03/05/2013 03:51 PM, Niels Kristian Schjødt wrote:

  3ms isn't slow

  Sorry, it's 3323ms!

  Can I do anything to optimize that query or maybe the index or something?


 your index is already used


 Okay this leaves me with - get better hardware or?


 What does explain analyze say versus just explain.



Better yet, explain (analyze, buffers) with track_io_timing turned on.
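
A minimal example of that, with a placeholder query; track_io_timing is a
superuser-settable parameter available since 9.2:

SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM images WHERE car_id IN (7097561, 7253541, 5159633, 6674471);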

Cheers,

Jeff


[PERFORM] Optimize SELECT * from table WHERE foreign_key_id IN (key1,key2,key3,key4...)

2013-03-05 Thread Niels Kristian Schjødt
Hi,

I'm running a rails app, where I have a model called Car that has_many Images. 
Now when I tell rails to include those images, when querying say 50 cars, then 
it often decides to use a SELECT * from images WHERE car_id IN 
(id1,id2,id3,id4…) instead of doing a join. 

Now either way it uses the index I have on car_id:

Index Scan using car_id_ix on adverts  (cost=0.47..5665.34 rows=1224 width=234)
Index Cond: (car_id = ANY 
('{7097561,7253541,5159633,6674471,...}'::integer[]))

But it's slow, it's very slow. In this case it took 3,323ms

Can I do anything to optimize that query or maybe the index or something?

The table has 16.000.000 rows



Re: [PERFORM] Optimize SELECT * from table WHERE foreign_key_id IN (key1,key2,key3,key4...)

2013-03-05 Thread Julien Cigar

On 03/05/2013 15:00, Niels Kristian Schjødt wrote:

Hi,

I'm running a rails app, where I have a model called Car that has_many Images. 
Now when I tell rails to include those images, when querying say 50 cars, then 
it often decides to use a SELECT * from images WHERE car_id IN 
(id1,id2,id3,id4…) instead of doing a join.


Why do you want a join here? If you don't need any cars data there is 
no need to JOIN that table.
Now a select ... from ... where id in (id1, id2, ..., idn) isn't very 
scalable.


Instead of passing id1, id2, ..., idn you'd better pass the condition 
and do a where id in (select ... ), or where exists (select 1 ... where 
...), or a join, or ...
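
A sketch of that rewrite; the cars table and its WHERE clause are placeholders
for however the ~50 cars are currently selected:

SELECT i.*
FROM images i
JOIN (
    SELECT id FROM cars
    WHERE /* whatever currently selects the ~50 cars */ true
    LIMIT 50
) c ON i.car_id = c.id;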



Now either way it uses the index I have on car_id:

Index Scan using car_id_ix on adverts  (cost=0.47..5665.34 rows=1224 width=234)
Index Cond: (car_id = ANY 
('{7097561,7253541,5159633,6674471,...}'::integer[]))

But it's slow, it's very slow. In this case it took 3,323ms


3ms isn't slow


Can I do anything to optimize that query or maybe the index or something?


your index is already used


The table has 16.000.000 rows




--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.





Re: [PERFORM] Optimize SELECT * from table WHERE foreign_key_id IN (key1,key2,key3,key4...)

2013-03-05 Thread Niels Kristian Schjødt
Hi, thanks for answering. See comments inline.

On 05/03/2013 at 15.26, Julien Cigar jci...@ulb.ac.be wrote:

 On 03/05/2013 15:00, Niels Kristian Schjødt wrote:
 Hi,
 
 I'm running a rails app, where I have a model called Car that has_many 
 Images. Now when I tell rails to include those images, when querying say 50 
 cars, then it often decides to use a SELECT * from images WHERE car_id IN 
 (id1,id2,id3,id4…) instead of doing a join.
 
 why do you want a join here ? if you don't need any cars data there is no 
 need to JOIN that table.
I need both
 Now a select ... from ... where id in (id1, id2, ..., idn) isn't very 
 scalable.
 
 Instead of passing id1, id2, ..., idn you'be better pass the condition and do 
 a where id in (select ... ), or where exists (select 1 ... where ...), or a 
 join, or …
 
I tried this now, and it doesn't seem to make a very big difference, unfortunately…

 Now either way it uses the index I
 have on car_id:
 
 Index Scan using car_id_ix on adverts  (cost=0.47..5665.34 rows=1224 
 width=234)
  Index Cond: (car_id = ANY 
 ('{7097561,7253541,5159633,6674471,...}'::integer[]))
 
 But it's slow, it's very slow. In this case it took 3,323ms
 
 3ms isn't slow
 
Sorry, it's 3323ms!

 Can I do anything to optimize that query or maybe the index or something?
 
 your index is already used

Okay this leaves me with - get better hardware or?

 
 The table has 16.000.000 rows
 
 
 
 -- 
 No trees were killed in the creation of this message.
 However, many electrons were terribly inconvenienced.
 
 
 





Re: [PERFORM] Optimize SELECT * from table WHERE foreign_key_id IN (key1,key2,key3,key4...)

2013-03-05 Thread Joshua D. Drake


On 03/05/2013 03:51 PM, Niels Kristian Schjødt wrote:


3ms isn't slow


Sorry, it's 3323ms!


Can I do anything to optimize that query or maybe the index or something?


your index is already used


Okay this leaves me with - get better hardware or?


What does explain analyze say versus just explain.

JD
--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579




Re: [PERFORM] Optimize update query

2012-12-03 Thread Niels Kristian Schjødt

On 30/11/2012 at 17.06, Shaun Thomas stho...@optionshouse.com wrote:

 On 11/30/2012 09:44 AM, Niels Kristian Schjødt wrote:
 
 Just a note on your iostat numbers. The first reading is actually just a 
 summary. You want the subsequent readings.
 
 The pgsql_tmp dir is not changing at all it's constantly empty (a size
 of 4.0K).
 
 Good.
 
 Filesystem 1K-blocksUsed Available Use% Mounted on
 /dev/md3   230619228 5483796 213420620   3% /ssd
 
 Good.
 
 You could just be seeing lots of genuine activity. But going back on the 
 thread, I remember seeing this in your postgresql.conf:
 
 shared_buffers = 7680MB
 
 Change this to:
 
 shared_buffers = 4GB
 
 I say that because you mentioned you're using Ubuntu 12.04, and we were 
 having some problems with PG on that platform. With shared_buffers over 4GB, 
 it starts doing really weird things to the memory subsystem. Whatever it does 
 causes the kernel to purge cache rather aggressively. We saw a 60% reduction 
 in read IO by reducing shared_buffers to 4GB. Without as many reads, your 
 writes should be much less disruptive.
 
 You'll need to restart PG to adopt that change.
 
 But I encourage you to keep iostat running in a terminal window so you can 
 watch it for a while. It's very revealing.
 
 -- 
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
 312-444-8534
 stho...@optionshouse.com
 
 __
 
 See http://www.peak6.com/email_disclaimer/ for terms and conditions related 
 to this email
Couldn't this be because you haven't changed these: 
http://www.postgresql.org/docs/9.2/static/kernel-resources.html ?
I have changed the following in my configuration:

kernel.shmmax = 8589934592 #(8GB)
kernel.shmall = 17179869184 #(16GB)





Re: [PERFORM] Optimize update query

2012-12-02 Thread Vitalii Tymchyshyn
Well, it seems that my data can be outdated, sorry for that. I've just
checked performance numbers on Tom's Hardware and it seems that the best SSDs
really do 500 MB/s. Some others do 100. So, I'd say one must choose wisely
(as always :-) ).

Best regards,
Vitalii Tymchyshyn
On 1 Dec 2012 at 00:43, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:

 Hmm - not strictly true as stated: 1 SSD will typically do 500MB/s
 sequential read/write. 1 HDD will be lucky to get a 1/3 that.

 We are looking at replacing 4 to 6 disk RAID10 arrays of HDD with a RAID1
 pair of SSD, as they perform about the same for sequential work and vastly
 better at random. Plus they only use 2x 2.5 slots (or, ahem 2x PCIe
 sockets), so allow smaller form factor servers and save on power and
 cooling.

 Cheers

 Mark

 On 30/11/12 23:07, Vitalii Tymchyshyn wrote:

 Oh, yes. I don't imagine DB server without RAID+BBU :)
 When there is no BBU, SSD can be handy.
 But you know, SSD is worse in linear read/write than HDD.

 Best regards, Vitalii Tymchyshyn


 2012/11/30 Mark Kirkwood mark.kirkw...@catalyst.net.nz

 Most modern SSD are much faster for fsync type operations than a
 spinning disk - similar performance to spinning disk + writeback
 raid controller + battery.

 However as you mention, they are great at random IO too, so Niels,
 it might be worth putting your postgres logs *and* data on the SSDs
 and retesting.





Re: [PERFORM] Optimize update query

2012-12-02 Thread Mark Kirkwood

Yeah, this area is changing very fast!

I agree - choosing carefully is important, as there are still plenty of 
older models around that are substantially slower. Also the choice of 
motherboard chipset can strongly affect overall performance too. The 6 
Gbit/s ports on Sandy and Ivy bridge Mobos [1] seem to get close to that 
rated performance out of the SSD that I've tested (Crucial m4, Intel 
various).


Cheers

Mark

[1] Which I think are actually Intel or Marvell controllers.

On 03/12/12 00:14, Vitalii Tymchyshyn wrote:

Well, it seems that my data can be outdated, sorry for that. I've just
checked performance numbers on Tom's Hardware and it seems that the best SSDs
really do 500 MB/s. Some others do 100. So, I'd say one must choose wisely
(as always :-) ).

Best regards,
Vitalii Tymchyshyn
On 1 Dec 2012 at 00:43, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:


Hmm - not strictly true as stated: 1 SSD will typically do 500MB/s
sequential read/write. 1 HDD will be lucky to get a 1/3 that.

We are looking at replacing 4 to 6 disk RAID10 arrays of HDD with a RAID1
pair of SSD, as they perform about the same for sequential work and vastly
better at random. Plus they only use 2x 2.5 slots (or, ahem 2x PCIe
sockets), so allow smaller form factor servers and save on power and
cooling.

Cheers

Mark

On 30/11/12 23:07, Vitalii Tymchyshyn wrote:


Oh, yes. I don't imagine DB server without RAID+BBU :)
When there is no BBU, SSD can be handy.
But you know, SSD is worse in linear read/write than HDD.

Best regards, Vitalii Tymchyshyn


2012/11/30 Mark Kirkwood mark.kirkw...@catalyst.net.nz

 Most modern SSD are much faster for fsync type operations than a
 spinning disk - similar performance to spinning disk + writeback
 raid controller + battery.

 However as you mention, they are great at random IO too, so Niels,
 it might be worth putting your postgres logs *and* data on the SSDs
 and retesting.








Re: [PERFORM] Optimize update query

2012-11-30 Thread Vitalii Tymchyshyn
Actually, what's the point in putting logs to ssd? SSDs are good for random
access and logs are accessed sequentially. I'd put table spaces on ssd and
leave logs on hdd
On 30 Nov 2012 at 04:33, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

 Hmm, I'm getting suspicious here. Maybe my new great setup with the SSDs
 is not really working as it should, and maybe New Relic is not monitoring
 as it should.

 If I do a sudo iostat -k 1
 I get a lot of output like this:
 Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
 sda   0.00 0.00 0.00  0  0
 sdb   0.00 0.00 0.00  0  0
 sdc 546.00  2296.00  6808.00   2296   6808
 sdd 593.00  1040.00  7416.00   1040   7416
 md1   0.00 0.00 0.00  0  0
 md0   0.00 0.00 0.00  0  0
 md21398.00  3328.00 13064.00   3328  13064
 md3   0.00 0.00 0.00  0  0

 The storage thing is, that the sda and sdb is the SSD drives and the sdc
 and sdd is the HDD drives. The md0, md1 and md2 is the raid arrays on the
 HDD's and the md3 is the raid on the SSD's. Neither of the md3 or the SSD's
 are getting utilized - and I should expect that since they are serving my
 pg_xlog right? - so maybe I did something wrong in the setup. Here is the
 path I followed:

 # 1) First setup the SSD drives in a software RAID1 setup:
 #
 http://askubuntu.com/questions/223194/setup-of-two-additional-ssd-drives-in-raid-1
 #
 # 2) Then move the postgres pg_xlog dir
 #   sudo /etc/init.d/postgresql-9.2 stop
 #   sudo mkdir -p /ssd/pg_xlog
 #   sudo chown -R  postgres.postgres /ssd/pg_xlog
 #   sudo chmod 700 /ssd/pg_xlog
 #   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
 #   sudo mv /var/lib/postgresql/9.2/main/pg_xlog
 /var/lib/postgresql/9.2/main/pg_xlog_old
 #   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
 #   sudo /etc/init.d/postgresql-9.2 start

 Can you spot something wrong?



 On 30/11/2012 at 02.43, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

  On 30/11/2012 at 02.24, Kevin Grittner kgri...@mail.com wrote:
 
  Niels Kristian Schjødt wrote:
 
  Okay, now I'm done with the updating as described above. I did the
  postgres.conf changes. I did the kernel changes, I added two
  SSDs in a software RAID1 where the pg_xlog is now located -
  unfortunately the picture is still the same :-(
 
  You said before that you were seeing high disk wait numbers. Now it
  is zero according to your disk utilization graph. That sounds like
  a change to me.
 
  When the database is under heavy load, there is almost no
  improvement to see in the performance compared to before the
  changes.
 
  In client-visible response time and throughput, I assume, not
  resource usage numbers?
 
  A lot of both read and writes takes more than a 1000 times as
  long as they usually do, under lighter overall load.
 
  As an odd coincidence, you showed your max_connections setting to
  be 1000.
 
  http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
 
  -Kevin
 
  Hehe, I'm sorry if it somehow was misleading, I just wrote "a lot of
 I/O"; it was CPU I/O, as the chart in the link also states.
  However, as I'm not very familiar with these deep down database and
 server things, I had no idea whether a disk bottleneck could hide in this
 I/O, so I went along with Shaun's great help, which unfortunately didn't
 solve my issues.
  Back to the issue: Could it be the fact that I'm using
 Ubuntu's built-in software RAID to raid my disks, and that it is not at all
 capable of handling the throughput?
 



 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Optimize update query

2012-11-30 Thread Mark Kirkwood
Most modern SSD are much faster for fsync type operations than a 
spinning disk - similar performance to spinning disk + writeback raid 
controller + battery.


However as you mention, they are great at random IO too, so Niels, it 
might be worth putting your postgres logs *and* data on the SSDs and 
retesting.
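
To check the fsync behaviour of a particular volume directly, the pg_test_fsync
tool that ships with Postgres (in contrib for 9.2) gives a quick comparison of
sync methods; the paths below are only examples, not from this thread:

   pg_test_fsync -f /ssd/pg_test_fsync.out         # volume being considered for pg_xlog
   pg_test_fsync -f /data/hdd/pg_test_fsync.out    # compare against the HDD array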


Regards

Mark



On 30/11/12 21:37, Vitalii Tymchyshyn wrote:

Actually, what's the point in putting logs to ssd? SSDs are good for
random access and logs are accessed sequentially. I'd put table spaces
on ssd and leave logs on hdd

On 30 Nov 2012 04:33, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

Hmm I'm getting suspicious here. Maybe my new great setup with the
SSD's is not really working as it should., and maybe new relic is
not monitoring as It should.

If I do a sudo iostat -k 1
I get a lot of output like this:
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda   0.00 0.00 0.00  0  0
sdb   0.00 0.00 0.00  0  0
sdc 546.00  2296.00  6808.00   2296   6808
sdd 593.00  1040.00  7416.00   1040   7416
md1   0.00 0.00 0.00  0  0
md0   0.00 0.00 0.00  0  0
md21398.00  3328.00 13064.00   3328  13064
md3   0.00 0.00 0.00  0  0

The storage thing is, that the sda and sdb is the SSD drives and the
sdc and sdd is the HDD drives. The md0, md1 and md2 is the raid
arrays on the HDD's and the md3 is the raid on the SSD's. Neither of
the md3 or the SSD's are getting utilized - and I should expect that
since they are serving my pg_xlog right? - so maybe I did something
wrong in the setup. Here is the path I followed:

# 1) First setup the SSD drives in a software RAID1 setup:
#

http://askubuntu.com/questions/223194/setup-of-two-additional-ssd-drives-in-raid-1
#
# 2) Then move the postgres pg_xlog dir
#   sudo /etc/init.d/postgresql-9.2 stop
#   sudo mkdir -p /ssd/pg_xlog
#   sudo chown -R  postgres.postgres /ssd/pg_xlog
#   sudo chmod 700 /ssd/pg_xlog
#   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
#   sudo mv /var/lib/postgresql/9.2/main/pg_xlog
/var/lib/postgresql/9.2/main/pg_xlog_old
#   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
#   sudo /etc/init.d/postgresql-9.2 start

Can you spot something wrong?



On 30/11/2012 at 02.43, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

  On 30/11/2012 at 02.24, Kevin Grittner kgri...@mail.com wrote:
 
  Niels Kristian Schjødt wrote:
 
  Okay, now I'm done with the updating as described above. I did the
  postgres.conf changes. I did the kernel changes, I added two
  SSDs in a software RAID1 where the pg_xlog is now located -
  unfortunately the picture is still the same :-(
 
  You said before that you were seeing high disk wait numbers. Now it
  is zero according to your disk utilization graph. That sounds like
  a change to me.
 
  When the database is under heavy load, there is almost no
  improvement to see in the performance compared to before the
  changes.
 
  In client-visible response time and throughput, I assume, not
  resource usage numbers?
 
  A lot of both read and writes takes more than a 1000 times as
  long as they usually do, under lighter overall load.
 
  As an odd coincidence, you showed your max_connections setting to
  be 1000.
 
  http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
 
  -Kevin
 
  Hehe, I'm sorry if it somehow was misleading, I just wrote "a lot
of I/O"; it was CPU I/O, as the chart in the link also states.
  However, as I'm not very familiar with these deep down database
and server things, I had no idea whether a disk bottleneck could
hide in this I/O, so I went along with Shaun's great help, which
unfortunately didn't solve my issues.
  Back to the issue: Could it be the fact that I'm using
Ubuntu's built-in software RAID to raid my disks, and that it is not
at all capable of handling the throughput?
 



--
Sent via pgsql-performance mailing list
(pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance





--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Mark Kirkwood
When I try your command sequence I end up with the contents of the new 
pg_xlog owned by root. Postgres will not start:


PANIC:  could not open file pg_xlog/000100060080 (log file 
6, segment 128): Permission denied


While this is fixable, I suspect you have managed to leave the xlogs 
directory that postgres is actually using on the HDD drives.



When I do this I normally do:
$   service postgresql stop
$   sudo mkdir -p /ssd/pg_xlog
$   sudo chown -R  postgres.postgres /ssd/pg_xlog
$   sudo chmod 700 /ssd/pg_xlog
$   sudo su - postgres
postgres $   mv /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
postgres $   rmdir /var/lib/postgresql/9.2/main/pg_xlog
postgres $   ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
postgres $   service postgresql start
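
Afterwards it is worth double-checking that the symlink and ownership ended up
as intended; a couple of quick, generic checks (paths as in the sequence above):

$   ls -ld /var/lib/postgresql/9.2/main/pg_xlog    # should be a symlink to /ssd/pg_xlog
$   ls -l /ssd/pg_xlog | head                      # segment files should be owned by postgres
$   mount | grep ssd                               # confirm the SSD array is really mounted there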

regards

Mark

On 30/11/12 15:32, Niels Kristian Schjødt wrote:

Hmm I'm getting suspicious here. Maybe my new great setup with the SSD's is not 
really working as it should., and maybe new relic is not monitoring as It 
should.

If I do a sudo iostat -k 1
I get a lot of output like this:
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda   0.00 0.00 0.00  0  0
sdb   0.00 0.00 0.00  0  0
sdc 546.00  2296.00  6808.00   2296   6808
sdd 593.00  1040.00  7416.00   1040   7416
md1   0.00 0.00 0.00  0  0
md0   0.00 0.00 0.00  0  0
md21398.00  3328.00 13064.00   3328  13064
md3   0.00 0.00 0.00  0  0

The storage thing is, that the sda and sdb is the SSD drives and the sdc and 
sdd is the HDD drives. The md0, md1 and md2 is the raid arrays on the HDD's and 
the md3 is the raid on the SSD's. Neither of the md3 or the SSD's are getting 
utilized - and I should expect that since they are serving my pg_xlog right? - 
so maybe I did something wrong in the setup. Here is the path I followed:

# 1) First setup the SSD drives in a software RAID1 setup:
#   
http://askubuntu.com/questions/223194/setup-of-two-additional-ssd-drives-in-raid-1
#
# 2) Then move the postgres pg_xlog dir
#   sudo /etc/init.d/postgresql-9.2 stop
#   sudo mkdir -p /ssd/pg_xlog
#   sudo chown -R  postgres.postgres /ssd/pg_xlog
#   sudo chmod 700 /ssd/pg_xlog
#   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
#   sudo mv /var/lib/postgresql/9.2/main/pg_xlog 
/var/lib/postgresql/9.2/main/pg_xlog_old
#   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
#   sudo /etc/init.d/postgresql-9.2 start

Can you spot something wrong?



On 30/11/2012 at 02.43, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:


On 30/11/2012 at 02.24, Kevin Grittner kgri...@mail.com wrote:


Niels Kristian Schjødt wrote:


Okay, now I'm done with the updating as described above. I did the
postgres.conf changes. I did the kernel changes, I added two
SSDs in a software RAID1 where the pg_xlog is now located -
unfortunately the picture is still the same :-(


You said before that you were seeing high disk wait numbers. Now it
is zero according to your disk utilization graph. That sounds like
a change to me.


When the database is under heavy load, there is almost no
improvement to see in the performance compared to before the
changes.


In client-visible response time and throughput, I assume, not
resource usage numbers?


A lot of both read and writes takes more than a 1000 times as
long as they usually do, under lighter overall load.


As an odd coincidence, you showed your max_connections setting to
be 1000.

http://wiki.postgresql.org/wiki/Number_Of_Database_Connections

-Kevin


Hehe, I'm sorry if it somehow was misleading, I just wrote "a lot of I/O"; it
was CPU I/O, as the chart in the link also states.
However, as I'm not very familiar with these deep down database and server
things, I had no idea whether a disk bottleneck could hide in this I/O, so I
went along with Shaun's great help, which unfortunately didn't solve my issues.
Back to the issue: Could it be the fact that I'm using Ubuntu's built-in
software RAID to raid my disks, and that it is not at all capable of
handling the throughput?









--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Vitalii Tymchyshyn
Oh, yes. I don't imagine DB server without RAID+BBU :)
When there is no BBU, SSD can be handy.
But you know, SSD is worse in linear read/write than HDD.

Best regards, Vitalii Tymchyshyn


2012/11/30 Mark Kirkwood mark.kirkw...@catalyst.net.nz

 Most modern SSD are much faster for fsync type operations than a spinning
 disk - similar performance to spinning disk + writeback raid controller +
 battery.

 However as you mention, they are great at random IO too, so Niels, it
 might be worth putting your postgres logs *and* data on the SSDs and
 retesting.

 Regards

 Mark




 On 30/11/12 21:37, Vitalii Tymchyshyn wrote:

 Actually, what's the point in putting logs to ssd? SSDs are good for
 random access and logs are accessed sequentially. I'd put table spaces
 on ssd and leave logs on hdd

 On 30 Nov 2012 04:33, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:


 Hmm I'm getting suspicious here. Maybe my new great setup with the
 SSD's is not really working as it should., and maybe new relic is
 not monitoring as It should.

 If I do a sudo iostat -k 1
 I get a lot of output like this:
 Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
 sda   0.00 0.00 0.00  0  0
 sdb   0.00 0.00 0.00  0  0
 sdc 546.00  2296.00  6808.00   2296   6808
 sdd 593.00  1040.00  7416.00   1040   7416
 md1   0.00 0.00 0.00  0  0
 md0   0.00 0.00 0.00  0  0
 md21398.00  3328.00 13064.00   3328  13064
 md3   0.00 0.00 0.00  0  0

 The storage thing is, that the sda and sdb is the SSD drives and the
 sdc and sdd is the HDD drives. The md0, md1 and md2 is the raid
 arrays on the HDD's and the md3 is the raid on the SSD's. Neither of
 the md3 or the SSD's are getting utilized - and I should expect that
 since they are serving my pg_xlog right? - so maybe I did something
 wrong in the setup. Here is the path I followed:

 # 1) First setup the SSD drives in a software RAID1 setup:
 #
 http://askubuntu.com/questions/223194/setup-of-two-additional-ssd-drives-in-raid-1
 #
 # 2) Then move the postgres pg_xlog dir
 #   sudo /etc/init.d/postgresql-9.2 stop
 #   sudo mkdir -p /ssd/pg_xlog
 #   sudo chown -R  postgres.postgres /ssd/pg_xlog
 #   sudo chmod 700 /ssd/pg_xlog
 #   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
 #   sudo mv /var/lib/postgresql/9.2/main/pg_xlog
 /var/lib/postgresql/9.2/main/pg_xlog_old
 #   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
 #   sudo /etc/init.d/postgresql-9.2 start

 Can you spot something wrong?



 On 30/11/2012 at 02.43, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:


   On 30/11/2012 at 02.24, Kevin Grittner kgri...@mail.com wrote:

  
   Niels Kristian Schjødt wrote:
  
   Okay, now I'm done with the updating as described above. I did the
   postgres.conf changes. I did the kernel changes, I added two
   SSDs in a software RAID1 where the pg_xlog is now located -
   unfortunately the picture is still the same :-(
  
   You said before that you were seeing high disk wait numbers. Now it
   is zero according to your disk utilization graph. That sounds like
   a change to me.
  
   When the database is under heavy load, there is almost no
   improvement to see in the performance compared to before the
   changes.
  
   In client-visible response time and throughput, I assume, not
   resource usage numbers?
  
   A lot of both read and writes takes more than a 1000 times as
   long as they usually do, under lighter overall load.
  
   As an odd coincidence, you showed your max_connections setting to
   be 1000.
  
   http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
  
   -Kevin
  
   Hehe, I'm sorry if it somehow was misleading, I just wrote "a lot
 of I/O"; it was CPU I/O, as the chart in the link also states.
   However, as I'm not very familiar with these deep down database
 and server things, I had no idea whether a disk bottleneck could
 hide in this I/O, so I went along with Shaun's great help, which
 unfortunately didn't solve my issues.
   Back to the issue: Could it be that it is the fact 

Re: [PERFORM] Optimize update query

2012-11-30 Thread Willem Leenen


 Actually, what's the point in putting logs to ssd? SSDs are good for random 
access and logs are accessed sequentially. I'd put table spaces on ssd and 
leave logs on hdd

On 30 Nov 2012 04:33, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

Because SSDs are considered faster. Then you have to put the most physical-IO-
intensive operations on the SSD. For the majority of databases, these are the
logfiles. But you should investigate where the optimum is for your situation.
  

Re: [PERFORM] Optimize update query

2012-11-30 Thread Vitalii Tymchyshyn
SSDs are not faster for sequential IO as far as I know. That's why (with BBU or
synchronous_commit=off) I prefer to have logs on regular HDDs.

Best regards


2012/11/30 Willem Leenen willem_lee...@hotmail.com


  Actually, what's the point in putting logs to ssd? SSDs are good for
 random access and logs are accessed sequentially. I'd put table spaces on
 ssd and leave logs on hdd
  On 30 Nov 2012 04:33, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:
 Because SSDs are considered faster. Then you have to put the most
 physical-IO-intensive operations on the SSD. For the majority of databases,
 these are the logfiles. But you should investigate where the optimum is for
 your situation.





-- 
Best regards,
 Vitalii Tymchyshyn


Re: [PERFORM] Optimize update query

2012-11-30 Thread Kevin Grittner
Niels Kristian Schjødt wrote:

 You said before that you were seeing high disk wait numbers. Now
 it is zero according to your disk utilization graph. That
 sounds like a change to me.

 Hehe, I'm sorry if it somehow was misleading, I just wrote "a lot
 of I/O"; it was CPU I/O

 A lot of both reads and writes take more than 1000 times as
 long as they usually do, under lighter overall load.
 
 As an odd coincidence, you showed your max_connections setting
 to be 1000.
 
 http://wiki.postgresql.org/wiki/Number_Of_Database_Connections

 Back to the issue: Could it be that it is the fact that I'm using
 ubuntus built in software raid to raid my disks, and that it is
 not at all capable of handling the throughput?

For high performance situations I would always use a high quality
RAID controller with battery-backed RAM configured for write-back;
however:

The graphs you included suggest that your problem has nothing to do
with your storage system. Now maybe you didn't capture the data for
the graphs while the problem was occurring, in which case the
graphs would be absolutely useless; but based on what slim data you
have provided, you need a connection pool (like maybe pgbouncer
configured in transaction mode) to limit the number of database
connections used to something like twice the number of cores.
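
For what it's worth, a minimal pgbouncer setup along those lines might look
roughly like this (database name, auth file path and pool size are
placeholders, not taken from this thread):

   [databases]
   cardb = host=127.0.0.1 port=5432 dbname=cardb

   [pgbouncer]
   listen_addr = 127.0.0.1
   listen_port = 6432
   auth_type = md5
   auth_file = /etc/pgbouncer/userlist.txt
   pool_mode = transaction
   default_pool_size = 16
   max_client_conn = 1000

The application then connects to port 6432 instead of 5432, and max_connections
in postgresql.conf can be lowered accordingly.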

If you still have problems, pick the query which is using the most
time on your database server, and post it with the information
suggested on this page:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

-Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Niels Kristian Schjødt
Okay, so to understand this better before I go with that solution:
In theory, what difference should it make to the performance to have a pool in
front of the database that all my workers and web servers connect to, instead
of connecting directly? Where is the performance gain coming from in that
situation?

On 30/11/2012 at 13.03, Kevin Grittner kgri...@mail.com wrote:

 Niels Kristian Schjødt wrote:
 
 You said before that you were seeing high disk wait numbers. Now
  it is zero according to your disk utilization graph. That
 sounds like a change to me.
 
  Hehe, I'm sorry if it somehow was misleading, I just wrote "a lot
  of I/O"; it was CPU I/O
 
  A lot of both reads and writes take more than 1000 times as
 long as they usually do, under lighter overall load.
 
 As an odd coincidence, you showed your max_connections setting
 to be 1000.
 
 http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
 
 Back to the issue: Could it be that it is the fact that I'm using
 ubuntus built in software raid to raid my disks, and that it is
 not at all capable of handling the throughput?
 
 For high performance situations I would always use a high quality
 RAID controller with battery-backed RAM configured for write-back;
 however:
 
 The graphs you included suggest that your problem has nothing to do
 with your storage system. Now maybe you didn't capture the data for
 the graphs while the problem was occurring, in which case the
 graphs would be absolutely useless; but based on what slim data you
 have provided, you need a connection pool (like maybe pgbouncer
 configured in transaction mode) to limit the number of database
 connections used to something like twice the number of cores.
 
 If you still have problems, pick the query which is using the most
 time on your database server, and post it with the information
 suggested on this page:
 
 http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
 -Kevin



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Shaun Thomas

On 11/30/2012 07:31 AM, Niels Kristian Schjødt wrote:


In theory what difference should it make to the performance, to have
a pool in front of the database, that all my workers and web servers
connect to instead of connecting directly? Where is the performance
gain coming from in that situation?


If you have several more connections than you have processors, the
database does a *lot* more context switching, and among other things,
that drastically reduces PG performance. On a testbed, I can get over
150k transactions per second on PG 9.1 with a 1-1 relationship between
CPU and client. Increase that to a few hundred, and my TPS drops down to
30k. Simply having the clients there kills performance.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Shaun Thomas

On 11/29/2012 08:32 PM, Niels Kristian Schjødt wrote:


If I do a sudo iostat -k 1
I get a lot of output like this:
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda   0.00 0.00 0.00  0  0
sdb   0.00 0.00 0.00  0  0
sdc 546.00  2296.00  6808.00   2296   6808
sdd 593.00  1040.00  7416.00   1040   7416
md1   0.00 0.00 0.00  0  0
md0   0.00 0.00 0.00  0  0
md21398.00  3328.00 13064.00   3328  13064
md3   0.00 0.00 0.00  0  0




The storage thing is, that the sda and sdb is the SSD drives and the
sdc and sdd is the HDD drives. The md0, md1 and md2 is the raid
arrays on the HDD's and the md3 is the raid on the SSD's. Neither of
the md3 or the SSD's are getting utilized - and I should expect that
since they are serving my pg_xlog right?


No, that's right. They are, but it would appear that the majority of
your traffic actually isn't due to transaction logs like I'd suspected.
If you get a chance, could you monitor the contents of:

/var/lib/postgresql/9.2/main/base/pgsql_tmp

Your main drives are getting way, way more writes than they should. 13MB
per second is ridiculous even under heavy write loads. Based on the TPS
count, you're basically saturating the ability of those two 3TB drives.
Those writes have to be coming from somewhere.


#   sudo mkdir -p /ssd/pg_xlog


This is going to sound stupid, but are you *sure* the SSD is mounted at
/ssd ?
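
A quick, generic way to check:

   df -h /ssd          # should show the md3 device, not the root filesystem
   mount | grep /ssd   # should list an actual mount on /ssd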


#   sudo chown -R  postgres.postgres /ssd/pg_xlog
#   sudo chmod 700 /ssd/pg_xlog
#   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
#   sudo mv /var/lib/postgresql/9.2/main/pg_xlog 
/var/lib/postgresql/9.2/main/pg_xlog_old
#   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
#   sudo /etc/init.d/postgresql-9.2 start


The rest of this is fine, except that you probably should have added:

sudo chown -R postgres:postgres /ssd/pg_xlog/*


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Shaun Thomas

On 11/30/2012 02:37 AM, Vitalii Tymchyshyn wrote:


Actually, what's the point in putting logs to ssd? SSDs are good for
random access and logs are accessed sequentially.


While this is true, Niels' problem is that his regular HDs are getting 
saturated. In that case, moving any activity off of them is an improvement.


Why not move the data to the SSDs, you ask? Because he bought two 3TB 
drives. The assumption here is that a 256GB SSD will not have enough 
space for the long-term lifespan of this database.


Either way, based on the iostat activity he posted, clearly there's some 
other write stream happening we're not privy to.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Niels Kristian Schjødt
On 30/11/2012 at 15.02, Shaun Thomas stho...@optionshouse.com wrote:

 On 11/29/2012 08:32 PM, Niels Kristian Schjødt wrote:
 
 If I do a sudo iostat -k 1
 I get a lot of output like this:
 Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
 sda   0.00 0.00 0.00  0  0
 sdb   0.00 0.00 0.00  0  0
 sdc 546.00  2296.00  6808.00   2296   6808
 sdd 593.00  1040.00  7416.00   1040   7416
 md1   0.00 0.00 0.00  0  0
 md0   0.00 0.00 0.00  0  0
 md21398.00  3328.00 13064.00   3328  13064
 md3   0.00 0.00 0.00  0  0
 
 
 The storage thing is, that the sda and sdb is the SSD drives and the
 sdc and sdd is the HDD drives. The md0, md1 and md2 is the raid
 arrays on the HDD's and the md3 is the raid on the SSD's. Neither of
 the md3 or the SSD's are getting utilized - and I should expect that
 since they are serving my pg_xlog right?
 
 No, that's right. They are, but it would appear that the majority of your 
 traffic actually isn't due to transaction logs like I'd suspected. If you get 
 a chance, could you monitor the contents of:
 
 /var/lib/postgresql/9.2/main/base/pgsql_tmp
 
 Your main drives are getting way, way more writes than they should. 13MB per 
 second is ridiculous even under heavy write loads. Based on the TPS count, 
 you're basically saturating the ability of those two 3TB drives. Those writes 
 have to be coming from somewhere.
 
 #   sudo mkdir -p /ssd/pg_xlog
 
 This is going to sound stupid, but are you *sure* the SSD is mounted at /ssd ?
 
 #   sudo chown -R  postgres.postgres /ssd/pg_xlog
 #   sudo chmod 700 /ssd/pg_xlog
 #   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog
 #   sudo mv /var/lib/postgresql/9.2/main/pg_xlog 
 /var/lib/postgresql/9.2/main/pg_xlog_old
 #   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog
 #   sudo /etc/init.d/postgresql-9.2 start
 
 The rest of this is fine, except that you probably should have added:
 
 sudo chown -R postgres:postgres /ssd/pg_xlog/*
 
 
 -- 
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
 312-444-8534
 stho...@optionshouse.com
 
 __
 
 See http://www.peak6.com/email_disclaimer/ for terms and conditions related 
 to this email

Oh my, Shaun once again you nailed it! That's what you get from working too
late in the night - I forgot to run 'sudo mount -a'. I feel so embarrassed now
:-( - In other words, no, the drive was not mounted to the /ssd dir.
So now it is, and this has gained me a performance increase of roughly
20% - a little less than what I would have hoped for, but still better - but
anyway, yes, that's right.
I still see a lot of CPU I/O when doing a lot of writes, so the question is,
what's next. Should I try and go for the connection pooling thing, or monitor
that /var/lib/postgresql/9.2/main/base/pgsql_tmp dir (and what exactly do you
mean by monitor - size?)

PS. Commenting on the "Why not move the data to the SSDs" point: you are exactly
right. I don't think the SSDs will be big enough for the data within a not too
long timeframe, so that is exactly why I want to keep my data on the big drives.
PPS. I talked with New Relic and it turns out there is something wrong with the
disk monitoring tool, so that's why there was nothing in the disk charts, but
iostat showed a lot of activity.




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Shaun Thomas

On 11/30/2012 08:48 AM, Niels Kristian Schjødt wrote:


I forgot to run 'sudo mount -a' I feel so embarrassed now :-( - In
other words no the drive was not mounted to the /ssd dir.


Yeah, that'll get ya.


I still see a lot of CPU I/O when doing a lot of writes, so the
question is, what's next. Should I try and go' for the connection
pooling thing or monitor that
/var/lib/postgresql/9.2/main/base/pgsql_tmp dir (and what exactly do
you mean by monitor - size?)


Well, like Kevin said, if you have more than a couple dozen connections
on your hardware, you're losing TPS. It's probably a good idea to
install pgbouncer or pgpool and let your clients connect to those
instead. You should see a good performance boost from that.

But what concerns me is that your previous CPU charts showed a lot of
iowait. Even with the SSD taking some of the load off your write stream,
something else is going on, there. That's why you need to monitor the
size in MB, or number of files, for the pgsql_tmp directory. That's
where PG puts temp files when sorts are too big for your work_mem. If
that's getting a ton of activity, that would explain some of your write
overhead.
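
A simple, generic way to keep an eye on it (path as above; the interval is
arbitrary):

   watch -n 10 'du -sh /var/lib/postgresql/9.2/main/base/pgsql_tmp; ls /var/lib/postgresql/9.2/main/base/pgsql_tmp | wc -l'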


PPS. I talked with New Relic and it turns out there is something
wrong with the disk monitoring tool, so that's why there was nothing
in the disk charts but iostat showed a lot of activity.


Yeah. Next time you need to check IO, use iostat. It's not as pretty,
but it tells everything. ;) Just to help out with that, use:

iostat -dmx

That will give you extended information, including the % utilization of
your drives. TPS stats are nice, but I was just guessing your drives
were stalling out based on experience. Getting an outright percentage is
better. You should also use sar. Just a plain:

sar 1 100

Will give you a lot of info on what the CPU is doing. You want that
%iowait column to be as low as possible.

Keep us updated.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Niels Kristian Schjødt
Hmm very very interesting. Currently I run at medium load compared to the 
very high loads in the night.
This is what the CPU I/O on new relic show: 
https://rpm.newrelic.com/public/charts/8RnSOlWjfBy
And this is what iostat shows:

Linux 3.2.0-33-generic (master-db)  11/30/2012  _x86_64_(8 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     3.46   26.62   57.06     1.66     0.68    57.41     0.04    0.43    0.77    0.28   0.09   0.73
sdb               0.03    16.85    0.01   70.26     0.00     2.35    68.36     0.06    0.81    0.21    0.81   0.10   0.73
sdc               1.96    56.37   25.45  172.56     0.53     3.72    43.98    30.83  155.70   25.15  174.96   1.74  34.46
sdd               1.83    56.52   25.48  172.42     0.52     3.72    43.90    30.50  154.11   25.66  173.09   1.74  34.37
md1               0.00     0.00    0.00    0.00     0.00     0.00     3.02     0.00    0.00    0.00    0.00   0.00   0.00
md0               0.00     0.00    0.57    0.59     0.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
md2               0.00     0.00   54.14  227.94     1.05     3.72    34.61     0.00    0.00    0.00    0.00   0.00   0.00
md3               0.00     0.00    0.01   60.46     0.00     0.68    23.12     0.00    0.00    0.00    0.00   0.00   0.00

A little reminder: md3 is the RAID array of the SSD drives sda and sdb, and
md0-2 are the arrays of the regular HDD drives sdc and sdd.

The pgsql_tmp dir is not changing at all; it's constantly empty (a size of 4.0K).

So it doesn't seem like the SSD drives are utilized at all, but the regular
drives certainly are. But now I know for sure that /ssd is mounted correctly:

sudo df /ssd
Filesystem 1K-blocksUsed Available Use% Mounted on
/dev/md3   230619228 5483796 213420620   3% /ssd



 

On 30/11/2012 at 16.00, Shaun Thomas stho...@optionshouse.com wrote:

 On 11/30/2012 08:48 AM, Niels Kristian Schjødt wrote:
 
 I forgot to run 'sudo mount -a' I feel so embarrassed now :-( - In
 other words no the drive was not mounted to the /ssd dir.
 
 Yeah, that'll get ya.
 
 I still see a lot of CPU I/O when doing a lot of writes, so the
 question is, what's next. Should I try and go' for the connection
 pooling thing or monitor that
 /var/lib/postgresql/9.2/main/base/pgsql_tmp dir (and what exactly do
 you mean by monitor - size?)
 
 Well, like Kevin said, if you have more than a couple dozen connections on 
 your hardware, you're losing TPS. It's probably a good idea to install 
 pgbouncer or pgpool and let your clients connect to those instead. You should 
 see a good performance boost from that.
 
 But what concerns me is that your previous CPU charts showed a lot of iowait. 
 Even with the SSD taking some of the load off your write stream, something 
 else is going on, there. That's why you need to monitor the size in MB, or 
 number of files, for the pgsql_tmp directory. That's where PG puts temp files 
 when sorts are too big for your work_mem. If that's getting a ton of 
 activity, that would explain some of your write overhead.
 
 PPS. I talked with New Relic and it turns out there is something
 wrong with the disk monitoring tool, so that's why there was nothing
 in the disk charts but iostat showed a lot of activity.
 
 Yeah. Next time you need to check IO, use iostat. It's not as pretty, but it 
 tells everything. ;) Just to help out with that, use:
 
 iostat -dmx
 
 That will give you extended information, including the % utilization of your 
 drives. TPS stats are nice, but I was just guessing your drives were stalling 
 out based on experience. Getting an outright percentage is better. You should 
 also use sar. Just a plain:
 
 sar 1 100
 
 Will give you a lot of info on what the CPU is doing. You want that %iowait 
 column to be as low as possible.
 
 Keep us updated.
 
 -- 
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
 312-444-8534
 stho...@optionshouse.com
 
 __
 
 See http://www.peak6.com/email_disclaimer/ for terms and conditions related 
 to this email



Re: [PERFORM] Optimize update query

2012-11-30 Thread Shaun Thomas

On 11/30/2012 09:44 AM, Niels Kristian Schjødt wrote:

Just a note on your iostat numbers. The first reading is actually just
a summary. You want the subsequent readings.


The pgsql_tmp dir is not changing at all it's constantly empty (a size
of 4.0K).


Good.


Filesystem 1K-blocksUsed Available Use% Mounted on
/dev/md3   230619228 5483796 213420620   3% /ssd


Good.

You could just be seeing lots of genuine activity. But going back on the
thread, I remember seeing this in your postgresql.conf:

shared_buffers = 7680MB

Change this to:

shared_buffers = 4GB

I say that because you mentioned you're using Ubuntu 12.04, and we were
having some problems with PG on that platform. With shared_buffers over
4GB, it starts doing really weird things to the memory subsystem.
Whatever it does causes the kernel to purge cache rather aggressively.
We saw a 60% reduction in read IO by reducing shared_buffers to 4GB.
Without as many reads, your writes should be much less disruptive.

You'll need to restart PG to adopt that change.
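
In other words, roughly (the restart command assumes the init script used
earlier in this thread):

   # postgresql.conf
   shared_buffers = 4GB

   sudo /etc/init.d/postgresql-9.2 restart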

But I encourage you to keep iostat running in a terminal window so you
can watch it for a while. It's very revealing.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-30 Thread Mark Kirkwood
Hmm - not strictly true as stated: 1 SSD will typically do 500MB/s 
sequential read/write. 1 HDD will be lucky to get 1/3 of that.


We are looking at replacing 4 to 6 disk RAID10 arrays of HDD with a 
RAID1 pair of SSD, as they perform about the same for sequential work 
and vastly better at random. Plus they only use 2x 2.5 slots (or, ahem 
2x PCIe sockets), so allow smaller form factor servers and save on power 
and cooling.


Cheers

Mark

On 30/11/12 23:07, Vitalii Tymchyshyn wrote:

Oh, yes. I don't imagine DB server without RAID+BBU :)
When there is no BBU, SSD can be handy.
But you know, SSD is worse in linear read/write than HDD.

Best regards, Vitalii Tymchyshyn


2012/11/30 Mark Kirkwood mark.kirkw...@catalyst.net.nz

Most modern SSD are much faster for fsync type operations than a
spinning disk - similar performance to spinning disk + writeback
raid controller + battery.

However as you mention, they are great at random IO too, so Niels,
it might be worth putting your postgres logs *and* data on the SSDs
and retesting.





--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-29 Thread Niels Kristian Schjødt

On 28/11/2012 at 17.54, Shaun Thomas stho...@optionshouse.com wrote:

 On 11/28/2012 10:19 AM, Niels Kristian Schjødt wrote:
 
 https://rpm.newrelic.com/public/charts/h2dtedghfsv
 
 Doesn't this answer your question?
 
 That iowait is crushing your server into the ground. It's no surprise updates 
 are taking several seconds. That update you sent us *should* execute on the 
 order of only a few milliseconds.
 
 So I'll reiterate that you *must* move your pg_xlog location elsewhere. 
 You've got row lookup bandwidth conflicting with writes. There are a couple 
 other changes you should probably make to your config:
 
 checkpoint_segments = 16
 
 This is not enough for the workload you describe. Every time the database 
 checkpoints, all of those changes in pg_xlog are applied to the backend data 
 files. You should set these values:
 
 checkpoint_segments = 100
 checkpoint_timeout = 10m
 checkpoint_completion_target = 0.9
 
 This will reduce your overall write workload, and make it less active. Too 
 many checkpoints massively reduce write throughput. With the settings you 
 have, it's probably checkpointing constantly while your load runs. Start with 
 this, but experiment with increasing checkpoint_segments further.
 
 If you check your logs now, you probably see a ton of checkpoint starting: 
 xlog in there. That's very bad. It should say checkpoint starting: time 
 meaning it's keeping up with your writes naturally.
 
 work_mem = 160MB
 
 This is probably way too high. work_mem is used for every sort operation in a
 query. So each connection could have several of these allocated, thus
 starving your system of memory, which will reduce what's available for the page
 cache. Change it to 8MB, and increase it in small increments if necessary.
 
 So correct me if I'm wrong here: my theory is, that I have too many
 too slow update queries, that then often end up in a situation, where
 they wait for each other to finish, hence the sometimes VERY long
 execution times.
 
 Sometimes this is the case, but for you, you're running into IO contention, 
 not lock contention. Your 3TB RAID-1 is simply insufficient for this workload.
 
 If you check your logs after making the changes I've suggested, take a look 
 at your checkpoint sync times. That will tell you how long it took the kernel 
 to physically commit those blocks to disk and get a confirmation back from 
 the controller. If those take longer than a second or two, you're probably 
 running into controller buffer overflows. You have a large amount of RAM, so 
 you should also make these two kernel changes to sysctl.conf:
 
 vm.dirty_ratio = 10
 vm.dirty_background_ratio = 1
 
 Then run this:
 
 sysctl -p
 
 This will help prevent large IO write spikes caused when the kernel decides 
 to write out dirty memory. That can make checkpoints take minutes to commit 
 in some cases, which basically stops all write traffic to your database 
 entirely.
 
 That should get you going, anyway. You still need more/better disks so you 
 can move your pg_xlog directory. With your write load, that will make a huge 
 difference.
 
 -- 
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
 312-444-8534
 stho...@optionshouse.com
 
 __
 
 See http://www.peak6.com/email_disclaimer/ for terms and conditions related 
 to this email

Okay, now I'm done with the updating as described above. I did the postgres.conf
changes. I did the kernel changes, I added two SSDs in a software RAID1 where
the pg_xlog is now located - unfortunately the picture is still the same
:-(
When the database is under heavy load, there is almost no improvement to see
in the performance compared to before the changes. A lot of both reads and
writes take more than 1000 times as long as they usually do under lighter
overall load.

I added all the overview charts I can get hold of from New Relic beneath. What
am I overlooking? There must be an obvious bottleneck somewhere. Where should I dive in?

Database server CPU usage
https://rpm.newrelic.com/public/charts/cEdIvvoQZCr

Database server load average
https://rpm.newrelic.com/public/charts/cMNdrYW51QJ

Database server physical memory
https://rpm.newrelic.com/public/charts/c3dZBntNpa1

Database server disk I/O utulization
https://rpm.newrelic.com/public/charts/9YEVw6RekFG

Database server network I/O (Mb/s)
https://rpm.newrelic.com/public/charts/lKiZ0Szmwe7

Top 5 database operations by wall clock time
https://rpm.newrelic.com/public/charts/dCt45YH12FK

Database throughput
https://rpm.newrelic.com/public/charts/bIbtQ1mDzMI

Database response time
https://rpm.newrelic.com/public/charts/fPcNL8WA6xx

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-29 Thread Kevin Grittner
Niels Kristian Schjødt wrote:

 Okay, now I'm done with the updating as described above. I did the
 postgres.conf changes. I did the kernel changes, I added two
 SSDs in a software RAID1 where the pg_xlog is now located -
 unfortunately the picture is still the same :-(

You said before that you were seeing high disk wait numbers. Now it
is zero according to your disk utilization graph. That sounds like
a change to me.

 When the database is under heavy load, there is almost no
 improvement to see in the performance compared to before the
 changes.

In client-visible response time and throughput, I assume, not
resource usage numbers?

 A lot of both reads and writes take more than 1000 times as
 long as they usually do, under lighter overall load.

As an odd coincidence, you showed your max_connections setting to
be 1000.

http://wiki.postgresql.org/wiki/Number_Of_Database_Connections

-Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-29 Thread Niels Kristian Schjødt
On 30/11/2012 at 02.24, Kevin Grittner kgri...@mail.com wrote:

 Niels Kristian Schjødt wrote:
 
 Okay, now I'm done with the updating as described above. I did the
 postgres.conf changes. I did the kernel changes, I added two
 SSDs in a software RAID1 where the pg_xlog is now located -
 unfortunately the picture is still the same :-(
 
 You said before that you were seeing high disk wait numbers. Now it
 is zero according to your disk utilization graph. That sounds like
 a change to me.
 
 When the database is under heavy load, there is almost no
 improvement to see in the performance compared to before the
 changes.
 
 In client-visible response time and throughput, I assume, not
 resource usage numbers?
 
 A lot of both reads and writes take more than 1000 times as
 long as they usually do, under lighter overall load.
 
 As an odd coincidence, you showed your max_connections setting to
 be 1000.
 
 http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
 
 -Kevin

Hehe, I'm sorry if it somehow was misleading, I just wrote "a lot of I/O"; it
was CPU I/O, as the chart in the link also states.
However, as I'm not very familiar with these deep down database and server
things, I had no idea whether a disk bottleneck could hide in this I/O, so I
went along with Shaun's great help, which unfortunately didn't solve my issues.
Back to the issue: Could it be the fact that I'm using Ubuntu's built-in
software RAID to raid my disks, and that it is not at all capable of
handling the throughput?



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-29 Thread Niels Kristian Schjødt
Hmm, I'm getting suspicious here. Maybe my new great setup with the SSDs is not
really working as it should, and maybe New Relic is not monitoring as it
should.

If I do a sudo iostat -k 1
I get a lot of output like this:
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda   0.00 0.00 0.00  0  0
sdb   0.00 0.00 0.00  0  0
sdc 546.00  2296.00  6808.00   2296   6808
sdd 593.00  1040.00  7416.00   1040   7416
md1   0.00 0.00 0.00  0  0
md0   0.00 0.00 0.00  0  0
md21398.00  3328.00 13064.00   3328  13064
md3   0.00 0.00 0.00  0  0

The storage setup is: sda and sdb are the SSD drives, and sdc and sdd are the
HDD drives. md0, md1 and md2 are the RAID arrays on the HDDs, and md3 is the
RAID on the SSDs. Neither md3 nor the SSDs are getting utilized - and I should
expect that, since they are serving my pg_xlog, right? - so maybe I did
something wrong in the setup. Here is the path I followed:

# 1) First setup the SSD drives in a software RAID1 setup:
#   
http://askubuntu.com/questions/223194/setup-of-two-additional-ssd-drives-in-raid-1
#
# 2) Then move the postgres pg_xlog dir
#   sudo /etc/init.d/postgresql-9.2 stop 
#   sudo mkdir -p /ssd/pg_xlog 
#   sudo chown -R  postgres.postgres /ssd/pg_xlog 
#   sudo chmod 700 /ssd/pg_xlog 
#   sudo cp -rf /var/lib/postgresql/9.2/main/pg_xlog/* /ssd/pg_xlog 
#   sudo mv /var/lib/postgresql/9.2/main/pg_xlog 
/var/lib/postgresql/9.2/main/pg_xlog_old 
#   sudo ln -s /ssd/pg_xlog /var/lib/postgresql/9.2/main/pg_xlog 
#   sudo /etc/init.d/postgresql-9.2 start

Can you spot something wrong?


 
On 30/11/2012 at 02.43, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

 On 30/11/2012 at 02.24, Kevin Grittner kgri...@mail.com wrote:
 
 Niels Kristian Schjødt wrote:
 
 Okay, now I'm done with the updating as described above. I did the
 postgres.conf changes. I did the kernel changes, I added two
 SSDs in a software RAID1 where the pg_xlog is now located -
 unfortunately the picture is still the same :-(
 
 You said before that you were seeing high disk wait numbers. Now it
 is zero according to your disk utilization graph. That sounds like
 a change to me.
 
 When the database is under heavy load, there is almost no
 improvement to see in the performance compared to before the
 changes.
 
 In client-visible response time and throughput, I assume, not
 resource usage numbers?
 
 A lot of both reads and writes take more than 1000 times as
 long as they usually do, under lighter overall load.
 
 As an odd coincidence, you showed your max_connections setting to
 be 1000.
 
 http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
 
 -Kevin
 
 Hehe, I'm sorry if it somehow was misleading, I just wrote "a lot of I/O"; it
 was CPU I/O, as the chart in the link also states.
 However, as I'm not very familiar with these deep down database and server
 things, I had no idea whether a disk bottleneck could hide in this I/O, so I
 went along with Shaun's great help, which unfortunately didn't solve my issues.
 Back to the issue: Could it be the fact that I'm using Ubuntu's built-in
 software RAID to raid my disks, and that it is not at all capable of
 handling the throughput?
 



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Optimize update query

2012-11-28 Thread Niels Kristian Schjødt
Hi, I have these update queries that run very often and take too long for us
to reach the throughput we are aiming at. However, the update
query is very simple, and I can't figure out any way to improve the situation.
The query looks like this:

UPDATE adverts SET last_observed_at = '2012-11-28 00:02:30.265154',
data_source_id = '83d024a57bc2958940f3ca281bddcbf4' WHERE adverts.id IN (
1602382, 4916432, 3221246, 4741057, 3853335, 571429, 3222740, 571736, 3544903, 
325378,5774338, 5921451, 4295768, 3223170, 5687001, 4741966, 325519, 580867, 
325721, 4412200, 4139598, 325567, 1616653,1616664, 6202007, 3223748, 325613, 
3223764, 325615, 4296536, 3854595, 4971428, 3224146, 5150522, 4412617, 
5073048,325747, 325771, 1622154, 5794384, 5736581, 1623767, 5686945, 3224627, 
5073009, 3224747, 3224749, 325809, 5687051,3224811, 5687052, 4917824, 5073013, 
3224816, 3224834, 4297331, 1623907, 325864, 1623947, 6169706, 325869, 
325877,3225074, 3225112, 325893, 325912, 3225151, 3225184, 3225175, 1624659, 
325901, 4033926, 325904, 325911, 4412835,1624737, 5073004, 5921434, 325915, 
3225285, 3225452, 4917672, 1624984, 3225472, 325940, 5380611, 325957, 
5073258,3225500, 1625002, 5923489, 4413009, 325952, 3961122, 363 ) ;

An explain outputs me the following:

Update on adverts  (cost=0.12..734.27 rows=95 width=168)
  ->  Index Scan using adverts_pkey on adverts  (cost=0.12..734.27 rows=95 
width=168)
Index Cond: (id = ANY 
('{1602382,4916432,3221246,4741057,3853335,571429,3222740,571736,3544903,325378,5774338,5921451,4295768,3223170,5687001,4741966,325519,580867,325721,4412200,4139598,325567,1616653,1616664,6202007,3223748,325613,3223764,325615,4296536,3854595,4971428,3224146,5150522,4412617,5073048,325747,325771,1622154,5794384,5736581,1623767,5686945,3224627,5073009,3224747,3224749,325809,5687051,3224811,5687052,4917824,5073013,3224816,3224834,4297331,1623907,325864,1623947,6169706,325869,325877,3225074,3225112,325893,325912,3225151,3225184,3225175,1624659,325901,4033926,325904,325911,4412835,1624737,5073004,5921434,325915,3225285,3225452,4917672,1624984,3225472,325940,5380611,325957,5073258,3225500,1625002,5923489,4413009,325952,3961122,363}'::integer[]))

So as you can see, it's already pretty optimized; it's just not enough :-) So
what can I do? The two columns last_observed_at and data_source_id each have an
index, and they are needed elsewhere, so I can't delete those.

PS. I'm on Postgres 9.2 on a server with 32GB RAM, 8 cores and two 3TB disks in
a software RAID 1 setup.

Is the only way out of this really an SSD disk?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-28 Thread Shaun Thomas

On 11/28/2012 06:57 AM, Niels Kristian Schjødt wrote:

Before I go crazy here... you really need to tell us what "not enough"
means. You didn't provide an EXPLAIN ANALYZE, so we don't know what your
actual performance is. But I have my suspicions.


So as you can see, it's already pretty optimized, it's just not
enough :-) So what can I do? the two columns last_observed_at and
data_source_id has an index, and it is needed elsewhere, so I can't
delete those.


Ok, so part of your problem is that you're tying an advertising system
directly to the database for direct updates. That's a big no-no. Any
time you got a huge influx of views, there would be a logjam. You need
to decouple this so you can use a second tool to load the database in
larger batches. You'll get much higher throughput this way.
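
As a sketch of what that decoupling could look like (table and column names are
taken from your UPDATE; the staging table and file path are made up purely for
illustration), observations can be accumulated outside the database and applied
in one set-based statement per batch:

   -- staging table filled by the collector, e.g. via COPY (server-side)
   -- or \copy (client-side)
   CREATE TEMP TABLE observed_adverts (
       advert_id        integer PRIMARY KEY,
       last_observed_at timestamp NOT NULL,
       data_source_id   varchar NOT NULL
   );

   COPY observed_adverts FROM '/tmp/observations.csv' WITH (FORMAT csv);

   -- one UPDATE per batch instead of many small ones
   UPDATE adverts a
      SET last_observed_at = o.last_observed_at,
          data_source_id   = o.data_source_id
     FROM observed_adverts o
    WHERE a.id = o.advert_id;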

If you absolutely must use this approach, you're going to have to beef
up your hardware.


PS. I'm on postgres 9.2 on a server with 32gb ram, 8 cores and two 3T
disks in a software raid 1 setup.


This is not sufficient for a high-bandwidth stream of updates. Not even
close. Even if those 3T disks are 7200 RPM, and even in RAID-1, you're
going to have major problems with concurrent reads and writes. You need
to do several things:

1. Move your transaction logs (pg_xlog) to another pair of disks
entirely. Do not put these on the same disks as your data if you need
high write throughput.
2. Get a better disk architecture. You need 10k, or 15k RPM disks.
Starting with 6 or more of them in a RAID-10 would be a good beginning.

You never told us your postgresql.conf settings, so I'm just going with
very generic advice. Essentially, you're expecting too much for too
little. That machine would have been low-spec three years ago, and
unsuited to database use simply due to the 2-disk RAID.


Is the only way out of this really a SSD disk?


No. There are many, many steps you can and should take before going this
route. You need to know the problem you're solving before making
potentially expensive hardware decisions.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize update query

2012-11-28 Thread Marcin Mirosław
On 28.11.2012 15:07, Shaun Thomas wrote:
 On 11/28/2012 06:57 AM, Niels Kristian Schjødt wrote:
 
 Before I go crazy, here... you really need to tell us what not enough
 means. You didn't provide an explain analyze, so we don't know what your
 actual performance is. But I have my suspicions.
 
 So as you can see, it's already pretty optimized, it's just not
 enough :-) So what can I do? the two columns last_observed_at and
 data_source_id has an index, and it is needed elsewhere, so I can't
 delete those.
 
 Ok, so part of your problem is that you're tying an advertising system
 directly to the database for direct updates. That's a big no-no. Any
 time you got a huge influx of views, there would be a logjam. You need
 to decouple this so you can use a second tool to load the database in
 larger batches. You'll get much higher throughput this way.

+1, SQL databases have a limited number of inserts/updates per second. Even
with high-end hardware you won't have more than XXX operations per
second. As Thomas said, you should feed something like a NoSQL database
from the www server and use another tool to do aggregation and batch inserts
to PostgreSQL. It will scale much better.

Marcin




Re: [PERFORM] Optimize update query

2012-11-28 Thread Willem Leenen


I assume that SQL databases (banks? telecom?) can handle a used car shop. No 
need for an unstructured data tool.



 +1, SQL databases have a limited number of inserts/updates per second. Even
 with high-end hardware you won't have more than XXX operations per
 second. As Thomas said, you should feed something like a nosql database
 from the www server and use another tool to do aggregation and batch inserts
 to postgresql. It will scale much better.
 
 Marcin

  

Re: [PERFORM] Optimize update query

2012-11-28 Thread Niels Kristian Schjødt
Okay guys,

Thanks for all the great help and advice already! Let me just clear some 
things, to make my question a little easier to answer :-)
Now my site is a search engine for used cars - not just a car shop with a few 
hundred cars.
The update query you are looking at is executed once a day, in chunks, for all 
active adverts, so we know they are still for sale (one car can be advertised 
in several places, hence several adverts). So it's not a constant stream, but 
it has a fairly high volume, especially at night time.

A compressed version of my .conf looks like this (note: there are some tweaks at 
the end of the file):
  data_directory = '/var/lib/postgresql/9.2/main'
  hba_file = '/etc/postgresql/9.2/main/pg_hba.conf' 
  ident_file = '/etc/postgresql/9.2/main/pg_ident.conf'
  external_pid_file = '/var/run/postgresql/9.2-main.pid' 
  listen_addresses = '192.168.0.2, localhost'
  port = 5432
  max_connections = 1000 
  unix_socket_directory = '/var/run/postgresql'
  wal_level = hot_standby
  synchronous_commit = off
  archive_mode = on
  archive_command = 'rsync -a %p postgres@192.168.0.4:/var/lib/postgresql/9.2/wals/%f /dev/null'
  max_wal_senders = 1 
  wal_keep_segments = 32
  logging_collector = on 
  log_min_messages = debug1 
  log_min_error_statement = debug1
  log_min_duration_statement = 0
  log_checkpoints = on
  log_connections = on
  log_disconnections = on
  log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
  log_lock_waits = on
  log_temp_files = 0
  datestyle = 'iso, mdy' 
  lc_messages = 'C'
  lc_monetary = 'en_US.UTF-8'
  lc_numeric = 'en_US.UTF-8' 
  lc_time = 'en_US.UTF-8' 
  default_text_search_config = 'pg_catalog.english' 
  default_statistics_target = 100
  maintenance_work_mem = 1GB
  checkpoint_completion_target = 0.7
  effective_cache_size = 22GB
  work_mem = 160MB
  wal_buffers = 4MB
  checkpoint_segments = 16
  shared_buffers = 7680MB

# All the log stuff is mainly temporary requirement for pgBadger
# The database has been tuned with pgtuner

You might be familiar with new relic, and I use that for quite a lot of 
monitoring. So, this is what I see at night time (a lot of I/O). So I went to 
play around with pgBadger to get some insights at database level.
https://rpm.newrelic.com/public/charts/h2dtedghfsv

This shows me that by far the most time-consuming queries are updates (in 
general). On avg. a query like the one I showed you takes 1.3 sec (but often it 
takes several minutes - which makes me wonder). So correct me if I'm wrong 
here: my theory is that I have too many too-slow update queries, which then 
often end up in a situation where they wait for each other to finish, hence 
the sometimes VERY long execution times. So my basic idea here is that if I 
could reduce the cost of the updates, then I could get a higher throughput 
overall.

Here is a sample of the pgBadger analysis:

Queries that took up the most time (N)

Rank 1 - total duration 1d15h28m38.71s, 948,711 executions, avg 0.15s:
    COMMIT;

Rank 2 - total duration 1d2h17m55.43s, 401,002 executions, avg 0.24s:
    INSERT INTO car_images ( car_id, created_at, image, updated_at )
    VALUES ( '', '', '', '' ) returning id;

Rank 3 - total duration 23h18m33.68s, 195,093 executions, avg 0.43s:
    SELECT DISTINCT cars.id FROM cars LEFT OUTER JOIN adverts ON
    adverts.car_id = cars.id LEFT OUTER JOIN sellers ON sellers.id =
    adverts.seller_id WHERE cars.sales_state = '' AND cars.year = 0
    AND cars.engine_size = 0.0 AND ( ( cars.id IS NOT NULL AND
    cars.brand = '' AND cars.model_name = '' AND cars.fuel = '' AND
    cars.km = 0 AND cars.price = 0 AND sellers.kind = '' ) ) LIMIT 0;

Rank 4 - total duration 22h45m26.52s, 3,374,133 executions, avg 0.02s:
    SELECT adverts.* FROM adverts WHERE ( source_name = '' AND md5 ( url ) =
    md5 ( '' ) ) LIMIT 0;

Rank 5 - total duration 10h31m37.18s, 29,671 executions, avg 1.28s:
    UPDATE adverts SET last_observed_at = '', data_source_id = '' WHERE
    adverts.id IN ( ... ) ;

Rank 6 - total duration 7h18m40.65s, 396,393 executions, avg 0.07s:
    UPDATE cars SET updated_at = '' WHERE cars.id = 0;

Rank 7 - total duration 7h6m7.87s, 241,294 executions, avg 0.11s:
    UPDATE cars SET images_count = COALESCE ( images_count, 0 ) + 0 WHERE
    cars.id = 0;

Rank 8 - total duration 6h56m11.78s, 84,571 executions, avg 0.30s:
    INSERT INTO failed_adverts ( active_record_object_class,
    advert_candidate, created_at, exception_class, exception_message,
    from_rescraper, last_retried_at, retry_count, source_name,
    stack_trace, updated_at, url ) VALUES ( NULL, '', '', '', '', NULL, NULL,
    '', '', '', '', '' ) returning id;

Rank 9 - total duration 5h47m25.45s, 188,402 executions, avg 0.11s:
    INSERT INTO adverts ( availability_state, car_id, created_at,
    data_source_id, deactivated_at, first_extraction, last_observed_at,
    price, seller_id, source_id, source_name, updated_at, url ) VALUES (
    '', '', '', '', NULL, '', '', '', '', '', '', '', '' ) returning id;

Rank 10 - total duration 3h4m26.86s, 166,235 executions, avg 0.07s:
    UPDATE adverts SET deactivated_at = '', availability_state = '',
    updated_at = '' WHERE adverts.id = 0;

(Yes I'm already 

Re: [PERFORM] Optimize update query

2012-11-28 Thread Bèrto ëd Sèra
max_connections = 1000 looks bad... why not a pooler in place?
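(As an aside - not from the original mail - a standard catalog query to see how many of those connections are actually doing work at any moment; on 9.2 pg_stat_activity has a state column:)

SELECT state, count(*)
  FROM pg_stat_activity
 GROUP BY state
 ORDER BY count(*) DESC;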
Cheers
Bèrto

On 28 November 2012 16:19, Niels Kristian Schjødt
nielskrist...@autouncle.com wrote:
 max_connections = 1000



-- 
==
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.




Re: [PERFORM] Optimize update query

2012-11-28 Thread Shaun Thomas

On 11/28/2012 10:19 AM, Niels Kristian Schjødt wrote:


https://rpm.newrelic.com/public/charts/h2dtedghfsv


Doesn't this answer your question?

That iowait is crushing your server into the ground. It's no surprise
updates are taking several seconds. That update you sent us *should*
execute on the order of only a few milliseconds.

So I'll reiterate that you *must* move your pg_xlog location elsewhere.
You've got row lookup bandwidth conflicting with writes. There are a
couple other changes you should probably make to your config:


checkpoint_segments = 16


This is not enough for the workload you describe. Every time the
database checkpoints, all of those changes in pg_xlog are applied to the
backend data files. You should set these values:

checkpoint_segments = 100
checkpoint_timeout = 10m
checkpoint_completion_target = 0.9

This will reduce your overall write workload and make checkpointing less aggressive.
Too many checkpoints massively reduce write throughput. With the
settings you have, it's probably checkpointing constantly while your
load runs. Start with this, but experiment with increasing
checkpoint_segments further.
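(A quick way to verify this - not part of the original advice - is the bgwriter statistics view: checkpoints_req counts checkpoints forced by running out of checkpoint_segments, checkpoints_timed those triggered by checkpoint_timeout. A high req-to-timed ratio means the segments setting is still too low:)

SELECT checkpoints_timed, checkpoints_req
  FROM pg_stat_bgwriter;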

If you check your logs now, you probably see a ton of checkpoint
starting: xlog in there. That's very bad. It should say checkpoint
starting: time meaning it's keeping up with your writes naturally.


work_mem = 160MB


This is probably way too high. work_mem is used for every sort operation in
a query. So each connection could have several of these allocated, thus
starving your system of memory, which will reduce what's available for the page
cache. Change it to 8MB, and increase it in small increments if necessary.
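(Illustrative only, not from the original mail: the global setting can stay modest while individual heavy queries get a bigger allowance for the duration of one transaction:)

BEGIN;
SET LOCAL work_mem = '128MB';
-- ... run the one query that needs the big sort here ...
COMMIT;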


So correct me if I'm wrong here: my theory is, that I have too many
too slow update queries, that then often end up in a situation, where
they wait for each other to finish, hence the sometimes VERY long
execution times.


Sometimes this is the case, but for you, you're running into IO
contention, not lock contention. Your 3TB RAID-1 is simply insufficient
for this workload.

If you check your logs after making the changes I've suggested, take a
look at your checkpoint sync times. That will tell you how long it took
the kernel to physically commit those blocks to disk and get a
confirmation back from the controller. If those take longer than a
second or two, you're probably running into controller buffer overflows.
You have a large amount of RAM, so you should also make these two kernel
changes to sysctl.conf:

vm.dirty_ratio = 10
vm.dirty_writeback_ratio = 1

Then run this:

sysctl -p

This will help prevent large IO write spikes caused when the kernel
decides to write out dirty memory. That can make checkpoints take
minutes to commit in some cases, which basically stops all write traffic
to your database entirely.

That should get you going, anyway. You still need more/better disks so
you can move your pg_xlog directory. With your write load, that will
make a huge difference.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com





Re: [PERFORM] Optimize update query

2012-11-28 Thread Shaun Thomas

On 11/28/2012 11:44 AM, Niels Kristian Schjødt wrote:


Thanks a lot - on the server I already have one additional SSD 250gb
disk, that I don't use for anything at the moment.


God. An SSD would actually be better for your data, as it follows
more random access patterns, and xlogs are more sequential. But it's
better than nothing.

And yes, you'd be better off with a RAID-1 of two of these SSDs, because
the xlogs are critical to database health. You have your archived copy
due to the rsync, which helps. But if you had a crash, there could
potentially be a need to replay unarchived transaction logs, and you'd
end up with some data loss.


BTW. as you might have seen from the .conf I have a second slave
server with the exact same setup, which currently runs as a hot
streaming replication slave. I might ask a stupid question here, but
this does not affect the performance of the master, does it?


Only if you're using synchronous replication. From what I saw in the
config, that isn't the case.
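(For the record - not from the original mail - this can be confirmed on the master; an asynchronous standby shows sync_state = 'async':)

SELECT application_name, state, sync_state
  FROM pg_stat_replication;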


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com





Re: [PERFORM] Optimize update query

2012-11-28 Thread Niels Kristian Schjødt
Hi, I have started to implement your suggestions. I have a small error so far, 
though. The vm.dirty_writeback_ratio = 1 command returns:
  error: vm.dirty_writeback_ratio is an unknown key
I'm on ubuntu 12.04


Den 28/11/2012 kl. 17.54 skrev Shaun Thomas stho...@optionshouse.com:

 On 11/28/2012 10:19 AM, Niels Kristian Schjødt wrote:
 
 https://rpm.newrelic.com/public/charts/h2dtedghfsv
 
 Doesn't this answer your question?
 
 That iowait is crushing your server into the ground. It's no surprise updates 
 are taking several seconds. That update you sent us *should* execute on the 
 order of only a few milliseconds.
 
 So I'll reiterate that you *must* move your pg_xlog location elsewhere. 
 You've got row lookup bandwidth conflicting with writes. There are a couple 
 other changes you should probably make to your config:
 
 checkpoint_segments = 16
 
 This is not enough for the workload you describe. Every time the database 
 checkpoints, all of those changes in pg_xlog are applied to the backend data 
 files. You should set these values:
 
 checkpoint_segments = 100
 checkpoint_timeout = 10m
 checkpoint_completion_target = 0.9
 
 This will reduce your overall write workload, and make it less active. Too 
 many checkpoints massively reduce write throughput. With the settings you 
 have, it's probably checkpointing constantly while your load runs. Start with 
 this, but experiment with increasing checkpoint_segments further.
 
 If you check your logs now, you probably see a ton of checkpoint starting: 
 xlog in there. That's very bad. It should say checkpoint starting: time 
 meaning it's keeping up with your writes naturally.
 
 work_mem = 160MB
 
 This is probably way too high. work_mem is used for every sort operation in a 
 query. So each connection could have several of these allocated, thus 
 starving your system of memory, which will reduce what's available for the page 
 cache. Change it to 8MB, and increase it in small increments if necessary.
 
 So correct me if I'm wrong here: my theory is, that I have too many
 too slow update queries, that then often end up in a situation, where
 they wait for each other to finish, hence the sometimes VERY long
 execution times.
 
 Sometimes this is the case, but for you, you're running into IO contention, 
 not lock contention. Your 3TB RAID-1 is simply insufficient for this workload.
 
 If you check your logs after making the changes I've suggested, take a look 
 at your checkpoint sync times. That will tell you how long it took the kernel 
 to physically commit those blocks to disk and get a confirmation back from 
 the controller. If those take longer than a second or two, you're probably 
 running into controller buffer overflows. You have a large amount of RAM, so 
 you should also make these two kernel changes to sysctl.conf:
 
 vm.dirty_ratio = 10
 vm.dirty_writeback_ratio = 1
 
 Then run this:
 
 sysctl -p
 
 This will help prevent large IO write spikes caused when the kernel decides 
 to write out dirty memory. That can make checkpoints take minutes to commit 
 in some cases, which basically stops all write traffic to your database 
 entirely.
 
 That should get you going, anyway. You still need more/better disks so you 
 can move your pg_xlog directory. With your write load, that will make a huge 
 difference.
 
 -- 
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
 312-444-8534
 stho...@optionshouse.com
 





Re: [PERFORM] Optimize update query

2012-11-28 Thread Mark Kirkwood

In later kernels these have been renamed:


Welcome to Ubuntu 12.04.1 LTS (GNU/Linux 3.2.0-32-generic x86_64)

$ sysctl -a|grep dirty
vm.dirty_background_ratio = 5
vm.dirty_background_bytes = 0
vm.dirty_ratio = 10
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000

You have the option of specifying either a ratio or - more usefully for 
machines with a lot of RAM - bytes.


Regards

Mark

P.S.: People on this list usually prefer it if you *bottom* post (i.e. 
reply underneath the original).



On 29/11/12 16:32, Niels Kristian Schjødt wrote:

Hi, I have started to implement your suggestions. I have a small error so far, though. 
The vm.dirty_writeback_ratio = 1 command returns:
   error: vm.dirty_writeback_ratio is an unknown key
I'm on ubuntu 12.04








[PERFORM] Optimize the database performance

2011-10-17 Thread Micka
Hi,

I've a postgres 9.1 database used for map generating ( tiles ).
The system has 24Go RAM and 5 processors.
I'm using geoserver to generate the tiles.

My data uses 8486 MB, per: psql -d gis -c "SELECT
pg_size_pretty(pg_database_size('gis'))"

I've carefully indexed the table on the the_geom column.

Here is my database config :

-- change :
-- listen_addresses = '*'
-- max_connections = 50
-- tcp_keepalives_idle = 60    # TCP_KEEPIDLE, in seconds
-- shared_buffers = 1024MB    # 10% of available RAM
-- work_mem = 256MB    # min 64kB
-- maintenance_work_mem = 256MB    # min 1MB
-- effective_cache_size = 5120MB
-- autovacuum = off

sudo nano /etc/sysctl.conf
--   kernel.shmmax=5368709120
--   kernel.shmall=5368709120

I wanted to have your opinion about this config ?  What can I do to optimize
the performance ?

Thank you,


Re: [PERFORM] Optimize the database performance

2011-10-17 Thread MirrorX
Hello Micka,

I think that no one can tell you much without more information about your
system. Roughly, I would say that you could change the following parameters:
shared_buffers: 1024MB -> 6GB
work_mem: 256MB -> 30-50MB
effective_cache_size: 5120MB -> 16GB (depends on whether it's a dedicated db
server or not)
kernel.shmmax = 5368709120: now it's 5GB, you probably need more here; I would
put 50% of RAM
kernel.shmall = 5368709120: you need less here. Check the shmsetup.sh script for
more info
autovacuum: off -> on


--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Optimize-the-database-performance-tp4909314p4909422.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.



Re: [PERFORM] Optimize the database performance

2011-10-17 Thread Andy Colson

On 10/17/2011 04:48 AM, Micka wrote:

Hi,

I've a postgres 9.1 database used for map generating ( tiles ).
The system has 24Go RAM and 5 processors.
I'm using geoserver to generate the tiles.

My data used 8486 MB  = psql -d gis -c SELECT 
pg_size_pretty(pg_database_size('gis'))

I've carefully indexes the table by the the_geom column.

Here is my database config :

-- change :
-- listen_addresses = '*'
-- max_connections = 50
-- tcp_keepalives_idle = 60# TCP_KEEPIDLE, in seconds;
-- shared_buffers = 1024MB # 10% of available RAM
-- work_mem = 256MB# min 64kB
-- maintenance_work_mem = 256MB# min 1MB
-- effective_cache_size = 5120MB
-- autovacuum = off

sudo nano /etc/sysctl.conf
--   kernel.shmmax=5368709120
--   kernel.shmall=5368709120

I wanted to have your opinion about this config ?  What can I do to optimize 
the performance ?

Thank you,



Yeah... We're gonna need some more details.  What's slow?  Are you CPU bound or 
IO bound?  How many concurrent db connections?  What does vmstat look like?  
And 10% of 24 gig is 2.4 gig, not 1 gig.

Or is this box doing something else?  I notice effective_cache_size is only 5 
gig, so you must be doing other things on this box.


-- autovacuum = off


Are you vacuuming by hand!?  If not, this is a really bad idea (tm)(c)(r)

-Andy



Re: [PERFORM] Optimize the database performance

2011-10-17 Thread Cédric Villemain
2011/10/17 Micka mickamus...@gmail.com:
 Hi,

 I've a postgres 9.1 database used for map generating ( tiles ).
 The system has 24Go RAM and 5 processors.
 I'm using geoserver to generate the tiles.

 My data used 8486 MB  = psql -d gis -c SELECT
 pg_size_pretty(pg_database_size('gis'))

 I've carefully indexes the table by the the_geom column.

 Here is my database config :

 -- change :
 -- listen_addresses = '*'
 -- max_connections = 50
 -- tcp_keepalives_idle = 60    # TCP_KEEPIDLE, in seconds;
 -- shared_buffers = 1024MB # 10% of available RAM
 -- work_mem = 256MB    # min 64kB
 -- maintenance_work_mem = 256MB    # min 1MB
 -- effective_cache_size = 5120MB
 -- autovacuum = off

 sudo nano /etc/sysctl.conf
 --   kernel.shmmax=5368709120
 --   kernel.shmall=5368709120

 I wanted to have your opinion about this config ?  What can I do to optimize
 the performance ?


As other people said, you need to give more information on your
hardware and usage of it to get more accurate answers.

Assuming that all your db can stay in RAM, I would start with
random_page_cost = 1 and seq_page_cost = 1.
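(One way to try that without editing postgresql.conf - not from the original mail; values are illustrative and the database name is taken from the post:)

ALTER DATABASE gis SET random_page_cost = 1;
ALTER DATABASE gis SET seq_page_cost = 1;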

effective_cache_size should be the sum of all cache space (linux and
postgresql), any number larger than 10GB should be fine, there is no
risk other than bad planning to set it too large (and it won't affect
you here I think)

You have memory available? You can increase maintenance_work_mem
(and you probably want to do that if you have a maintenance window
when you do the vacuum manually - why not autovacuum?)

For shared_buffers, you should use pg_buffercache to see what's
happening and maybe change the value to something higher (2GB, 4GB,
...) . You can also just test and find the best size for your
application workload.
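(For example - not part of the original mail - once the pg_buffercache contrib module is installed, a query along these lines shows which relations are occupying shared_buffers:)

CREATE EXTENSION pg_buffercache;

SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
  JOIN pg_class c
    ON b.relfilenode = pg_relation_filenode(c.oid)
 WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                              WHERE datname = current_database()))
 GROUP BY c.relname
 ORDER BY buffers DESC
 LIMIT 10;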

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-06-02 Thread David Jarvis
Sorry, Alvaro.

I was contemplating using a GIN or GiST index as a way of optimizing the
query.

Instead, I found that re-inserting the data in order of station ID (the
primary look-up column) and then CLUSTER'ing on the station ID, taken date,
and category index increased the speed by an order of magnitude.
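(Roughly what that looks like in SQL - a sketch, not David's exact commands; the index name is made up, and the child-table name is one of those mentioned elsewhere in the thread:)

CREATE INDEX measurement_01_001_stc_idx
    ON climate.measurement_01_001 (station_id, taken, category_id);

CLUSTER climate.measurement_01_001 USING measurement_01_001_stc_idx;

ANALYZE climate.measurement_01_001;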

I might be able to drop the station/taken/category index in favour of the
simple station index and CLUSTER on that, instead (saving plenty of disk
space). Either way, it's fast right now so I'm not keen to try and make it
much faster.

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-06-02 Thread David Jarvis
Hi,

Hmm, that's nice, though I cannot but wonder whether the exclusive lock
 required by CLUSTER is going to be a problem in the long run.


Not an issue; the inserts are one-time (or very rare; at most: once a year).


  Hm, keep in mind that if the station clause alone is not selective
 enough, scanning it may be too expensive.  The current three column


The seven child tables (split on category ID) have the following indexes:

   1. Primary key (unique ID, sequence)
   2. Station ID (table data is physically inserted by station ID order)
   3. Station ID, Date, and Category ID (this index is CLUSTER'ed)

I agree that the last index is probably all that is necessary. 99% of the
searches use the station ID, date, and category. I don't think PostgreSQL
necessarily uses that last index, though.

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-06-01 Thread Matthew Wakeling

On Sun, 23 May 2010, David Jarvis wrote:

The measurement table indexes (on date and weather station) were not being
used because the only given date ranges (e.g., 1900 - 2009) were causing the
planner to do a full table scan, which is correct.


I wonder if you might see some benefit from CLUSTERing the tables on the 
index.


Matthew

--
And the lexer will say Oh look, there's a null string. Oooh, there's 
another. And another., and will fall over spectacularly when it realises

there are actually rather a lot.
- Computer Science Lecturer (edited)



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-06-01 Thread Alvaro Herrera
Excerpts from Matthew Wakeling's message of mar jun 01 05:55:35 -0400 2010:
 On Sun, 23 May 2010, David Jarvis wrote:
  The measurement table indexes (on date and weather station) were not being
  used because the only given date ranges (e.g., 1900 - 2009) were causing the
  planner to do a full table scan, which is correct.
 
 I wonder if you might see some benefit from CLUSTERing the tables on the 
 index.

Eh, isn't this a GIN or GiST index?  I don't think you can cluster on
those, can you?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-06-01 Thread Alvaro Herrera
Excerpts from David Jarvis's message of mar jun 01 14:01:22 -0400 2010:
 Sorry, Alvaro.
 
 I was contemplating using a GIN or GiST index as a way of optimizing the
 query.

My fault -- I didn't read the whole thread.

 Instead, I found that re-inserting the data in order of station ID (the
 primary look-up column) and then CLUSTER'ing on the station ID, taken date,
 and category index increased the speed by an order of magnitude.

Hmm, that's nice, though I cannot but wonder whether the exclusive lock
required by CLUSTER is going to be a problem in the long run.

 I might be able to drop the station/taken/category index in favour of the
 simple station index and CLUSTER on that, instead (saving plenty of disk
 space). Either way, it's fast right now so I'm not keen to try and make it
 much faster.

Hm, keep in mind that if the station clause alone is not selective
enough, scanning it may be too expensive.  The current three column
index is probably a lot faster to search (though of course it's causing
more work to be kept up to date on insertions).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-23 Thread David Jarvis
Hi,

The problem is now solved (in theory).

Well, it's not the functions per se that's the problem, it's the lack of
 a useful index on the expression.


The measurement table indexes (on date and weather station) were not being
used because the only given date ranges (e.g., 1900 - 2009) were causing the
planner to do a full table scan, which is correct. What I had to do was find
a way to reduce the dates so that the planner would actually use the index,
rather than doing a full table scan on 43 million records. By passing in
1955 - 1960 the full table scan went away in favour of an index scan, as
expected.

Each weather station has a known lifespan (per climate category). That is,
not all weather stations between 1880 and 2009 collected data.  For example,
one weather station monitored the maximum daily temperature between
2006-11-29 and 2009-12-31. Some stations span more than 30 years, but I
believe those are in the minority (e.g., 1896-12-01 to 1959-01-31). (I'll be
able to verify once the analysis is finished.)

I will add another table that maps the stations to category and min/max
dates. I can then use this reference table which should (theory part here)
tell the planner to use the index.
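(A sketch of what such a reference table might look like - the name and exact columns are assumptions, not from the original mail:)

CREATE TABLE climate.station_category_span (
    station_id  integer  NOT NULL REFERENCES climate.station (id),
    category_id smallint NOT NULL,
    taken_min   date     NOT NULL,
    taken_max   date     NOT NULL,
    PRIMARY KEY (station_id, category_id)
);

INSERT INTO climate.station_category_span
SELECT station_id, category_id, min(taken), max(taken)
  FROM climate.measurement
 GROUP BY station_id, category_id;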

What is *really impressive*, though... If my understanding is correct...

PostgreSQL scanned 43 million rows 78 times, returning results in ~90 sec.

Thanks again for all your help, everybody. I sincerely appreciate your
patience, comments, and ideas.

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-22 Thread David Jarvis
Hi,

CREATE INDEX measurement_01_001_y_idx
  ON climate.measurement_01_001
  USING btree
  (date_part('year'::text, taken));

 Is that equivalent to what you suggest?


 No. It is not the same function, so Postgres has no way to know it produces
 the same results (if it does).


This is what I ran:

CREATE INDEX
  measurement_013_taken_year_idx
ON
  climate.measurement_013
  (EXTRACT( YEAR FROM taken ));

This is what pgadmin3 shows me:

CREATE INDEX measurement_013_taken_year_idx
  ON climate.measurement_013
  USING btree
  (date_part('year'::text, taken));

As far as I can tell, it appears they are equivalent?

Either way, the cost for performing a GROUP BY is high (I ran once with
extract and once with date_part). The date_part EXPLAIN ANALYSE resulted in:

Limit  (cost=1748024.65..1748028.65 rows=200 width=12) (actual
time=65471.448..65471.542 rows=101 loops=1)

The EXTRACT EXPLAIN ANALYSE came to:

Limit  (cost=1748024.65..1748028.65 rows=200 width=12) (actual
time=44913.263..44913.330 rows=101 loops=1)

If PG treats them differently, I'd like to know how so that I can do the
right thing. As it is, I cannot see the difference in performance between
date_part and EXTRACT.

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-22 Thread David Jarvis
Hi,

certainly understand that you wouldn't want to partition by year.  It


Definitely not.


 does strike me that perhaps you could partition by day ranges, but you'd


I don't think that will work; users can choose any day range, with the most
common as Jan 1 - Dec 31, followed by seasonal ranges, followed by arbitrary
ranges.


 some of this data..  If users are going to typically use 1900-2009 for
 years, then could the information about all of those years be aggregated
 apriori to make those queries faster?


I'm not sure what you mean. I could create a separate table that lumps the
aggregated averages per year per station per category, but that will only
help in the one case. There are five different reporting pages (Basic
through Guru). On three of those pages the user must select arbitrary day
ranges. On one of those pages, the user can select a season, which then maps
to, for all intents and purposes, an arbitrary day range.

Only the most basic page does not offer the user a day range selection.


 Do not get hung up on having to have a separate table for every unique
 value in the column- you don't need that.  constraint_exclusion will


That's good advice. I have repartitioned the data into seven tables: one per
category.


 I agee with Matthew Wakeling in a different post: its probably wise to
 I would agree with this too- get it working first, then look at
 partitioning.  Even more so- work on a smaller data set to begin with


The query speed has now much improved thanks to everybody's advice.

From a cost of 10006220141 down to 704924. Here is the query:

SELECT
  avg(m.amount),
  extract(YEAR FROM m.taken) AS year_taken
FROM
  climate.city c,
  climate.station s,
  climate.measurement m
WHERE
  c.id = 5182 AND
  6371.009 * SQRT(
POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) +
(COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) *
 POW(RADIANS(c.longitude_decimal - s.longitude_decimal), 2))
  ) <= 25 AND
  s.elevation BETWEEN 0 AND 3000 AND
  m.category_id = 7 AND
  m.station_id = s.id AND
  extract(YEAR FROM m.taken) BETWEEN 1900 AND 2000
GROUP BY
  extract(YEAR FROM m.taken)
ORDER BY
  extract(YEAR FROM m.taken)

(Note that *extract(YEAR FROM m.taken)* is much faster than
*date_part('year'::text,
m.taken)*.)

The query plan for the above SQL reveals:

Sort  (cost=704924.25..704924.75 rows=200 width=9) (actual time=9476.518..9476.521 rows=46 loops=1)
  Sort Key: (date_part('year'::text, (m.taken)::timestamp without time zone))
  Sort Method:  quicksort  Memory: 28kB
  ->  HashAggregate  (cost=704913.10..704916.60 rows=200 width=9) (actual time=9476.465..9476.489 rows=46 loops=1)
        ->  Hash Join  (cost=1043.52..679956.79 rows=4991262 width=9) (actual time=46.399..9344.537 rows=120678 loops=1)
              Hash Cond: (m.station_id = s.id)
              ->  Append  (cost=0.00..529175.42 rows=14973786 width=13) (actual time=0.076..7739.647 rows=14874909 loops=1)
                    ->  Seq Scan on measurement m  (cost=0.00..43.00 rows=1 width=20) (actual time=0.000..0.000 rows=0 loops=1)
                          Filter: ((category_id = 7) AND (date_part('year'::text, (taken)::timestamp without time zone) >= 1900::double precision) AND (date_part('year'::text, (taken)::timestamp without time zone) <= 2000::double precision))
                    ->  Index Scan using measurement_013_taken_year_idx on measurement_013 m  (cost=0.01..529132.42 rows=14973785 width=13) (actual time=0.075..6266.385 rows=14874909 loops=1)
                          Index Cond: ((date_part('year'::text, (taken)::timestamp without time zone) >= 1900::double precision) AND (date_part('year'::text, (taken)::timestamp without time zone) <= 2000::double precision))
                          Filter: (category_id = 7)
              ->  Hash  (cost=992.94..992.94 rows=4046 width=4) (actual time=43.420..43.420 rows=78 loops=1)
                    ->  Nested Loop  (cost=0.00..992.94 rows=4046 width=4) (actual time=0.053..43.390 rows=78 loops=1)
                          Join Filter: ((6371.009::double precision * sqrt((pow(radians(((c.latitude_decimal - s.latitude_decimal))::double precision), 2::double precision) + (cos((radians(((c.latitude_decimal + s.latitude_decimal))::double precision) / 2::double precision)) * pow(radians(((c.longitude_decimal - s.longitude_decimal))::double precision), 2::double precision) <= 25::double precision)
                          ->  Index Scan using city_pkey1 on city c  (cost=0.00..4.27 rows=1 width=16) (actual time=0.021..0.022 rows=1 loops=1)
                                Index Cond: (id = 5182)
                          ->  Seq Scan on station s  (cost=0.00..321.08 rows=12138 width=20) (actual time=0.008..5.457 rows=12139 loops=1)
                                Filter: ((s.elevation >= 0) AND (s.elevation <= 3000))
Total runtime: 9476.626 ms

That's about 10 seconds using the category with the smallest table. The
largest table takes 17 seconds (fantastic!) after a 

Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Yeb Havinga
David Jarvis wrote: 


Also, you're trying to do constraint_exclusion, but have you made sure
that it's turned on?  And have you made sure that those
constraints are
really the right ones and that they make sense?  You're using a
bunch of
extract()'s there too, why not just specify a CHECK constraint on the
date ranges which are allowed in the table..?


I don't know what the date ranges are? So I can't partition them by year?

Right now I created 72 child tables by using the category and month. 
This may have been a bad choice. But at least all the data is in the 
system now so dissecting or integrating it back in different ways 
shouldn't take days.


Thanks everyone for all your help, I really appreciate the time you've 
taken to guide me in the right direction to make the system as fast as 
it can be.


My $0.02 - it's hard to comment inline due to the number of responses, 
but: the partitioning is only useful for speed if it matches how your 
queries select data. For time based data I would for sure go for year 
based indexing. If you want a fixed number of partitions, you could 
perhaps do something like year % 64. I did a test to see if 
constraint exclusion could work with extract, but that failed:

test=# create table parent(t timestamptz);
test=# create table child1(check ((extract(year from t)::int % 2)=0)) 
inherits( parent);
test=# create table child2(check ((extract(year from t)::int % 2)=1)) 
inherits(parent);
test=# explain select * from parent where (extract(year from t)::int % 
2) = 0;
                              QUERY PLAN
------------------------------------------------------------------------
 Result  (cost=0.00..158.40 rows=33 width=8)
   ->  Append  (cost=0.00..158.40 rows=33 width=8)
         ->  Seq Scan on parent  (cost=0.00..52.80 rows=11 width=8)
               Filter: (((date_part('year'::text, t))::integer % 2) = 0)
         ->  Seq Scan on child1 parent  (cost=0.00..52.80 rows=11 width=8)
               Filter: (((date_part('year'::text, t))::integer % 2) = 0)
         ->  Seq Scan on child2 parent  (cost=0.00..52.80 rows=11 width=8)
               Filter: (((date_part('year'::text, t))::integer % 2) = 0)

It hits all partitions even when I requested for a single year.

So an extra column would be needed, attempt 2 with added year smallint.

test=# create table parent(t timestamptz, y smallint);
test=# create table child1(check ((y % 2)=0)) inherits( parent);
test=# create table child2(check ((y % 2)=1)) inherits( parent);
test=# explain select * from parent where (y % 2) between 0 and 0;
                              QUERY PLAN
------------------------------------------------------------------------
 Result  (cost=0.00..122.00 rows=20 width=10)
   ->  Append  (cost=0.00..122.00 rows=20 width=10)
         ->  Seq Scan on parent  (cost=0.00..61.00 rows=10 width=10)
               Filter: ((((y)::integer % 2) >= 0) AND (((y)::integer % 2) <= 0))
         ->  Seq Scan on child1 parent  (cost=0.00..61.00 rows=10 width=10)
               Filter: ((((y)::integer % 2) >= 0) AND (((y)::integer % 2) <= 0))


This works: only one child table hit.

That made me think: if you'd scan two consecutive years, you'd always 
hit two different partitions. For your use case it'd be nice if some 
year wraparounds would fall in the same partition. The following query 
shows partition numbers for 1900 - 2010 with 4 consecutive years in the 
same partition. It also shows that in this case 32 partitions is enough:


test=# select x, (x / 4) % 32 from generate_series(1900,2010) as x(x);
x   | ?column?
--+--
1900 |   27
1901 |   27
1902 |   27
1903 |   27
1904 |   28
1905 |   28
etc
1918 |   31
1919 |   31
1920 |0
1921 |0
etc
2005 |   21
2006 |   21
2007 |   21
2008 |   22
2009 |   22
2010 |   22
(111 rows)

This would mean that an extra smallint column is needed, which would 
inflate the 300M relation by almost a GB, but I still think it'd be 
a good idea.


create or replace function yearmod(int) RETURNS int
as 'select (($1 >> 2) % 32);'
language sql
immutable
strict;

create table parent(t timestamptz, y smallint);

select 'create table child'||x||'(check (yearmod(y)='||x-1||')) 
inherits(parent);' from generate_series(1,32) as x(x);
  ?column?   
---

create table child1(check (yearmod(y)=0)) inherits(parent);
create table child2(check (yearmod(y)=1)) inherits(parent);
create table child3(check (yearmod(y)=2)) inherits(parent);
etc
create table child30(check (yearmod(y)=29)) inherits(parent);
create table child31(check (yearmod(y)=30)) inherits(parent);
create table child32(check (yearmod(y)=31)) inherits(parent);
(32 rows)

Copy and paste output of this query in psql to create child tables.

Example with period 

Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Yeb Havinga

There is a thing that might lead to confusion in the previous post:

create or replace function yearmod(int) RETURNS int
as 'select (($1 >> 2) % 32);'
language sql
immutable
strict;

is equivalent with

create or replace function yearmod(int) RETURNS int
as 'select (($1 / 4) %32);'
language sql
immutable
strict;

and that is the function that was used with all the other output (it can 
be seen inlined in the explain output). I did not catch this until after 
the post.


regards,
Yeb Havinga





Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Matthew Wakeling

On Fri, 21 May 2010, Yeb Havinga wrote:

For time based data I would for sure go for year based indexing.


On the contrary, most of the queries seem to be over many years, but 
rather restricting on the time of year. Therefore, partitioning by month 
or some other per-year method would seem sensible.


Regarding the leap year problem, you might consider creating a modified 
day of year field, which always assumes that the year contains a leap day. 
Then a given number always resolves to a given date, regardless of year. 
If you then partition (or index) on that field, then you may get a 
benefit.
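(A sketch of that idea - the function name is made up and this is not from the original mail: every month/day pair is mapped onto the day-of-year it has in a reference leap year (2000), so the same calendar date always yields the same number regardless of year. An expression index on it could then serve the day-range filter.)

create or replace function doy_leap(d date) RETURNS integer
as $$
    select extract(doy from
                   ('2000-01-01'::date
                    + (extract(month from d)::int - 1) * interval '1 month'
                    + (extract(day from d)::int - 1) * interval '1 day'))::int;
$$
language sql
immutable
strict;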


In this case, partitioning is only really useful when you are going to be 
forced to do seq scans. If you can get a suitably selective index, in the 
case where you are selecting a small proportion of the data, then I would 
concentrate on getting the index right, rather than the partition, and 
maybe even not do partitioning.


Matthew

--
Trying to write a program that can't be written is... well, it can be an
enormous amount of fun! -- Computer Science Lecturer



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread David Jarvis
Hi, Yeb.

This is starting to go back to the design I used with MySQL:

   - YEAR_REF - Has year and station
   - MONTH_REF - Has month, category, and year reference
   - MEASUREMENT - Has month reference, amount, and day

Normalizing by date parts was fast. Partitioning the tables by year won't do
much good -- users will probably choose 1900 to 2009, predominately.

I thought about splitting the data by station by category, but that's ~73000
tables. My understanding is that PostgreSQL uses files per index, which
would be messy at the OS level (Linux 2.6.31). Even by station alone is
12139 tables, which might be tolerable for now, but with an order of
magnitude more stations on the distant horizon, it will not scale.

I also thought about splitting the data by station district by category --
there are 79 districts, yielding 474 child tables, which is ~575000 rows per
child table. Most of the time I'd imagine only one or two districts would be
selected. (Again, hard to know exactly.)

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Yeb Havinga

Matthew Wakeling wrote:

On Fri, 21 May 2010, Yeb Havinga wrote:

For time based data I would for sure go for year based indexing.


On the contrary, most of the queries seem to be over many years, but 
rather restricting on the time of year. Therefore, partitioning by 
month or some other per-year method would seem sensible.
The fact is that at the time I wrote my mail, I had not read a specification 
of the distribution of parameters (or I missed it). That's why the sentence 
of my mail before the one you quoted said: the partitioning is only 
useful for speed, if it matches how your queries select data. In most 
of the databases I've worked with, the recent data was queried most 
(accounting, medical) but I can see that for climate analysis this might 
be different.
Regarding the leap year problem, you might consider creating a 
modified day of year field, which always assumes that the year 
contains a leap day. Then a given number always resolves to a given 
date, regardless of year. If you then partition (or index) on that 
field, then you may get a benefit.
Shouldn't it be just the other way around - assume all years are non 
leap years for the doy part field to be indexed.


regards,
Yeb Havinga




Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Yeb Havinga

David Jarvis wrote:

Hi, Yeb.

This is starting to go back to the design I used with MySQL:

* YEAR_REF - Has year and station
* MONTH_REF - Has month, category, and year reference
* MEASUREMENT - Has month reference, amount, and day

Normalizing by date parts was fast. Partitioning the tables by year 
won't do much good -- users will probably choose 1900 to 2009, 
predominately.

Ok, in that case it is a bad idea.
I thought about splitting the data by station by category, but that's 
~73000 tables. My understanding is that PostgreSQL uses files per 
index, which would be messy at the OS level (Linux 2.6.31). Even by 
station alone is 12139 tables, which might be tolerable for now, but 
with an order of magnitude more stations on the distant horizon, it 
will not scale.
Yes, I've read a few times now that PG's partitioning doesn't scale 
beyond a few 100 partitions.
I also thought about splitting the data by station district by 
category -- there are 79 districts, yielding 474 child tables, which 
is ~575000 rows per child table. Most of the time I'd imagine only one 
or two districts would be selected. (Again, hard to know exactly.)
I agree with Matthew Wakeling in a different post: it's probably wise to 
first see how fast things can get by using indexes. Only if that fails 
to be fast might partitioning be an option. (Though sequentially 
scanning 0.5M rows is not cheap.)


I experimented a bit with a doy and year function.

-- note: leap year fix must still be added
create or replace function doy(timestamptz) RETURNS float8
as 'select extract(doy from $1);'
language sql
immutable
strict;
create or replace function year(timestamptz) RETURNS float8
as 'select extract(year from $1);'
language sql
immutable
strict;

\d parent
         Table "public.parent"
 Column |           Type           | Modifiers
--------+--------------------------+-----------
 t      | timestamp with time zone |
 y      | smallint                 |
Indexes:
    "doy_i" btree (doy(t))
    "year_i" btree (year(t))

A plan like the following is probably what you want

test=# explain select * from parent where doy(t) between 10 and 20 and 
year(t) between 1900 and 2009;
  
                                QUERY PLAN
---------------------------------------------------------------------------
 Bitmap Heap Scan on parent  (cost=9.95..14.97 rows=1 width=10)
   Recheck Cond: ((year(t) >= 1900::double precision) AND (year(t) <= 2009::double precision) AND (doy(t) >= 10::double precision) AND (doy(t) <= 20::double precision))
   ->  BitmapAnd  (cost=9.95..9.95 rows=1 width=0)
         ->  Bitmap Index Scan on year_i  (cost=0.00..4.85 rows=10 width=0)
               Index Cond: ((year(t) >= 1900::double precision) AND (year(t) <= 2009::double precision))
         ->  Bitmap Index Scan on doy_i  (cost=0.00..4.85 rows=10 width=0)
               Index Cond: ((doy(t) >= 10::double precision) AND (doy(t) <= 20::double precision))
(7 rows)

regards,
Yeb Havinga






Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Matthew Wakeling
Regarding the leap year problem, you might consider creating a modified day 
of year field, which always assumes that the year contains a leap day. Then 
a given number always resolves to a given date, regardless of year. If you 
then partition (or index) on that field, then you may get a benefit.

On Fri, 21 May 2010, Yeb Havinga wrote:
Shouldn't it be just the other way around - assume all years are non leap 
years for the doy part field to be indexed.


The mapping doesn't matter massively, as long as all days of the year can 
be mapped uniquely onto a number, and the numbers are sequential. Your 
suggestion does not satisfy the first of those two requirements.


If you assume that all yeasr are leap years, then you merely skip a number 
in the middle of the year, which isn't a problem when you want to check 
for days between two bounds. However, if you assume non leap year, then 
there is no representation for the 29th of February, so not all data 
points will have a representative number to insert into the database.


Matthew

--
No, C++ isn't equal to D. 'C' is undeclared, so we assume it's an int,
with a default value of zero.  Hence, C++ should really be called 1.
-- met24, commenting on the quote C++ -- shouldn't it be called D?



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-21 Thread Stephen Frost
* Yeb Havinga (yebhavi...@gmail.com) wrote:
 Normalizing by date parts was fast. Partitioning the tables by year  
 won't do much good -- users will probably choose 1900 to 2009,  
 predominately.
 Ok, in that case it is a bad idea.

Yeah, now that I understand what the user actually wants, I can
certainly understand that you wouldn't want to partition by year.  It
does strike me that perhaps you could partition by day ranges, but you'd
have to store them as something other than the 'date' type, which is
certainly frustrating, but you're not really operating on these in a
'normal' fashion as you would with a date.

The next question I would have, however, is whether you could pre-aggregate
some of this data.  If users are typically going to use 1900-2009 for
years, then could the information about all of those years be aggregated
a priori to make those queries faster?
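(A sketch of that kind of pre-aggregation as a plain summary table - names and columns are assumptions, not from the original mail:)

CREATE TABLE climate.yearly_station_avg AS
SELECT station_id,
       category_id,
       extract(year FROM taken)::int AS year_taken,
       avg(amount) AS avg_amount,
       count(*)    AS n
  FROM climate.measurement
 GROUP BY station_id, category_id, extract(year FROM taken)::int;

CREATE INDEX yearly_station_avg_idx
    ON climate.yearly_station_avg (station_id, category_id, year_taken);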

 I thought about splitting the data by station by category, but that's  
 ~73000 tables.

Do not get hung up on having to have a separate table for every unique
value in the column- you don't need that.  constraint_exclusion will
work just fine with ranges too- the problem is that you need to have
ranges that make sense with the data type you're using and with the
queries you're running.  That doesn't really work here with the
measurement_date, but it might work just fine with your station_id
field.
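(Illustrative only - not from the original mail - of what range CHECK constraints on station_id could look like, so constraint_exclusion can skip whole children when a query pins down the stations:)

CREATE TABLE climate.measurement_s0000 (
    CHECK (station_id >= 0 AND station_id < 1000)
) INHERITS (climate.measurement);

CREATE TABLE climate.measurement_s1000 (
    CHECK (station_id >= 1000 AND station_id < 2000)
) INHERITS (climate.measurement);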

 I also thought about splitting the data by station district by  
 category -- there are 79 districts, yielding 474 child tables, which  
 is ~575000 rows per child table. Most of the time I'd imagine only one  
 or two districts would be selected. (Again, hard to know exactly.)

Also realize that PG will use multiple files for a single table once the
size of that table goes beyond 1G.

 I agee with Matthew Wakeling in a different post: its probably wise to  
 first see how fast things can get by using indexes. Only if that fails  
 to be fast, partitioning might be an option. (Though sequentially  
 scanning 0.5M rows is not cheap).

I would agree with this too- get it working first, then look at
partitioning.  Even more so- work on a smaller data set to begin with
while you're figuring out how to get the right answer in a generally
efficient way (not doing seq. scans through everything because you're
operating on every row for something).  It needs to be a couple
hundred-thousand rows, but it doesn't need to be the full data set, imv.

Thanks,

Stephen




Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Yeb Havinga

Hello David,
The table aggregates 237 million rows from its child tables. The 
sluggishness comes from this part of the query:


  m.taken BETWEEN
/* Start date. */
  (extract( YEAR FROM m.taken )||'-01-01')::date AND
/* End date. Calculated by checking to see if the end date wraps
  into the next year. If it does, then add 1 to the current year.
*/
(cast(extract( YEAR FROM m.taken ) + greatest( -1 *
  sign(
(extract( YEAR FROM m.taken )||'-12-31')::date -
(extract( YEAR FROM m.taken )||'-01-01')::date ), 0
) AS text)||'-12-31')::date
Either I've had too little coffee and completely misunderstand this 
expression, or it is always true and can be omitted. Could you explain a 
bit what this part tries to do, and maybe also show its original 
counterpart in the source database?


regards,
Yeb Havinga


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Thom Brown
On 20 May 2010 06:06, David Jarvis thanga...@gmail.com wrote:
 Hi,

 I recently switched to PostgreSQL from MySQL so that I can use PL/R for data
 analysis. The query in MySQL form (against a more complex table structure)
 takes ~5 seconds to run. The query in PostgreSQL I have yet to let finish,
 as it takes over a minute. I think I have the correct table structure in
 place (it is much simpler than the former structure in MySQL), however the
 query executes a full table scan against the parent table's 273 million
 rows.

 Questions

 What is the proper way to index the dates to avoid full table scans?

 Options I have considered:

 GIN
 GiST
 Rewrite the WHERE clause
 Separate year_taken, month_taken, and day_taken columns to the tables

 Details

 The HashAggregate from the plan shows a cost of 10006220141.11, which is, I
 suspect, on the astronomically huge side. There is a full table scan on the
 measurement table (itself having neither data nor indexes) being performed.
 The table aggregates 237 million rows from its child tables. The
 sluggishness comes from this part of the query:

   m.taken BETWEEN
     /* Start date. */
   (extract( YEAR FROM m.taken )||'-01-01')::date AND
     /* End date. Calculated by checking to see if the end date wraps
   into the next year. If it does, then add 1 to the current year.
     */
     (cast(extract( YEAR FROM m.taken ) + greatest( -1 *
   sign(
     (extract( YEAR FROM m.taken )||'-12-31')::date -
     (extract( YEAR FROM m.taken )||'-01-01')::date ), 0
     ) AS text)||'-12-31')::date

 There are 72 child tables, each having a year index and a station index,
 which are defined as follows:

     CREATE TABLE climate.measurement_12_013 (
     -- Inherited from table climate.measurement_12_013:  id bigint NOT NULL
 DEFAULT nextval('climate.measurement_id_seq'::regclass),
     -- Inherited from table climate.measurement_12_013:  station_id integer
 NOT NULL,
     -- Inherited from table climate.measurement_12_013:  taken date NOT
 NULL,
     -- Inherited from table climate.measurement_12_013:  amount numeric(8,2)
 NOT NULL,
     -- Inherited from table climate.measurement_12_013:  category_id
 smallint NOT NULL,
     -- Inherited from table climate.measurement_12_013:  flag character
 varying(1) NOT NULL DEFAULT ' '::character varying,
   CONSTRAINT measurement_12_013_category_id_check CHECK (category_id =
 7),
   CONSTRAINT measurement_12_013_taken_check CHECK
 (date_part('month'::text, taken)::integer = 12)
     )
     INHERITS (climate.measurement)

     CREATE INDEX measurement_12_013_s_idx
   ON climate.measurement_12_013
   USING btree
   (station_id);
     CREATE INDEX measurement_12_013_y_idx
   ON climate.measurement_12_013
   USING btree
   (date_part('year'::text, taken));

 (Foreign key constraints to be added later.)

 The following query runs abysmally slow due to a full table scan:

     SELECT
   count(1) AS measurements,
   avg(m.amount) AS amount
     FROM
   climate.measurement m
     WHERE
   m.station_id IN (
     SELECT
   s.id
     FROM
   climate.station s,
   climate.city c
     WHERE
     /* For one city... */
     c.id = 5182 AND

     /* Where stations are within an elevation range... */
     s.elevation BETWEEN 0 AND 3000 AND

     /* and within a specific radius... */
     6371.009 * SQRT(
   POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) +
     (COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) *
   POW(RADIANS(c.longitude_decimal - s.longitude_decimal),
 2))
      ) <= 50
     ) AND

   /* Data before 1900 is shaky; insufficient after 2009. */
   extract( YEAR FROM m.taken ) BETWEEN 1900 AND 2009 AND

   /* Whittled down by category... */
   m.category_id = 1 AND

   /* Between the selected days and years... */
   m.taken BETWEEN
    /* Start date. */
    (extract( YEAR FROM m.taken )||'-01-01')::date AND
     /* End date. Calculated by checking to see if the end date wraps
    into the next year. If it does, then add 1 to the current year.
     */
     (cast(extract( YEAR FROM m.taken ) + greatest( -1 *
   sign(
     (extract( YEAR FROM m.taken )||'-12-31')::date -
     (extract( YEAR FROM m.taken )||'-01-01')::date ), 0
     ) AS text)||'-12-31')::date
     GROUP BY
   extract( YEAR FROM m.taken )

 What are your thoughts?

 Thank you!



Could you provide the EXPLAIN output for that slow query?

Thom



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Matthew Wakeling

On Wed, 19 May 2010, David Jarvis wrote:

 extract( YEAR FROM m.taken ) BETWEEN 1900 AND 2009 AND


That portion of the WHERE clause cannot use an index on m.taken. Postgres 
does not look inside functions (like extract) to see if something 
indexable is present. To get an index to work, you could create an index 
on (extract(YEAR FROM m.taken)).


Matthew

--
Here we go - the Fairy Godmother redundancy proof.
   -- Computer Science Lecturer



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Tom Lane
Matthew Wakeling matt...@flymine.org writes:
 On Wed, 19 May 2010, David Jarvis wrote:
 extract( YEAR FROM m.taken ) BETWEEN 1900 AND 2009 AND

 That portion of the WHERE clause cannot use an index on m.taken. Postgres 
 does not look inside functions (like extract) to see if something 
 indexable is present. To get an index to work, you could create an index 
 on (extract(YEAR FROM m.taken)).

What you really need to do is not do date arithmetic using text-string
operations.  The planner has no intelligence about that whatsoever.
Convert the operations to something natural using real date or timestamp
types, and then look at what indexes you need.
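
For instance, a sketch of the year's first and last day computed with date
arithmetic instead of string concatenation:

    SELECT date_trunc('year', m.taken)::date AS year_start,
           (date_trunc('year', m.taken) + interval '1 year' - interval '1 day')::date AS year_end
    FROM climate.measurement m;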

regards, tom lane



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread David Jarvis
Hi,

I have posted an image of the user inputs here:

http://i.imgur.com/MUkuZ.png

The problem is that I am given a range of days (Dec 22 - Mar 22) over a
range of years (1900 - 2009) and the range of days can span from one year to
the next. This is not the same as saying Dec 22, 1900 to Mar 22, 2009, for
which I do not need date math.

What you really need to do is not do date arithmetic using text-string
 operations.  The planner has no intelligence about that whatsoever.
 Convert the operations to something natural using real date or timestamp
 types, and then look at what indexes you need.


Any suggestions on how to go about this?

Thanks again!

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Thom Brown
On 20 May 2010 17:36, David Jarvis thanga...@gmail.com wrote:
 Hi, Thom.

 The query is given two items:

 Range of years
 Range of days

 I need to select all data between the range of days (e.g., Dec 22 - Mar 22)
 over the range of years (e.g., 1950 - 1970), such as shown here:

 http://i.imgur.com/MUkuZ.png

 For Jun 1 to Jul 1 it would be no problem because they are in the same year. But
 for Dec 22 to Mar 22, it is difficult because Mar 22 is in the next year
 (relative to Dec 22).

 How do I do that without strings?

 Dave



Okay, get your app to convert the month-date to a day of year, so we
have year_start, year_end, day_of_year_start, day_of_year_end

and your where clause would be something like this:

WHERE extract(YEAR from m.taken) BETWEEN year1 and year2
AND (
extract(DOY from m.taken) BETWEEN day_of_year_start AND day_of_year_end
OR (
extract(DOY from m.taken) >= day_of_year_start OR extract(DOY from
m.taken) <= day_of_year_end
)
)

... substituting the placeholders where they appear.

So if we had:

year1=1941
year2=1952
day_of_year_start=244 (based on input date of 1st September)
day_of_year_end=94 (based on 4th April)

We'd have:

WHERE extract(YEAR from m.taken) BETWEEN 1941 and 1952
AND (
extract(DOY from m.taken) BETWEEN 244 AND 94
OR (
extract(DOY from m.taken) >= 244 OR extract(DOY from m.taken) <= 94
)
)

Then you could add expression indexes for the YEAR and DOY extract parts, like:

CREATE INDEX idx_taken_doy ON climate.measurement (EXTRACT(DOY from taken));
CREATE INDEX idx_taken_year ON climate.measurement (EXTRACT(YEAR from taken));

Although maybe you don't need those, depending on how the date
datatype matching works in the planner with the EXTRACT function.

Regards

Thom



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Thom Brown
On 20 May 2010 19:36, Thom Brown thombr...@gmail.com wrote:
 On 20 May 2010 17:36, David Jarvis thanga...@gmail.com wrote:
 Hi, Thom.

 The query is given two items:

 Range of years
 Range of days

 I need to select all data between the range of days (e.g., Dec 22 - Mar 22)
 over the range of years (e.g., 1950 - 1970), such as shown here:

 http://i.imgur.com/MUkuZ.png

 For Jun 1 to Jul 1 it would be no problem because they the same year. But
 for Dec 22 to Mar 22, it is difficult because Mar 22 is in the next year
 (relative to Dec 22).

 How do I do that without strings?

 Dave



 Okay, get your app to convert the month-date to a day of year, so we
 have year_start, year_end, day_of_year_start, day_of_year_end

 and your where clause would something like this:

 WHERE extract(YEAR from m.taken) BETWEEN year1 and year2
 AND (
        extract(DOY from m.taken) BETWEEN day_of_year_start AND day_of_year_end
        OR (
        extract(DOY from m.taken) >= day_of_year_start OR extract(DOY from
 m.taken) <= day_of_year_end
        )
 )

 ... substituting the placeholders where they appear.

 So if we had:

 year1=1941
 year2=1952
 day_of_year_start=244 (based on input date of 1st September)
 day_of_year_end=94 (based on 4th April)

 We'd have:

 WHERE extract(YEAR from m.taken) BETWEEN 1941 and 1952
 AND (
        extract(DOY from m.taken) BETWEEN 244 AND 94
        OR (
        extract(DOY from m.taken) >= 244 OR extract(DOY from m.taken) <= 94
        )
 )

 Then you could add expression indexes for the YEAR and DOY extract parts, 
 like:

 CREATE INDEX idx_taken_doy ON climate.measurement (EXTRACT(DOY from taken));
 CREATE INDEX idx_taken_year ON climate.measurement (EXTRACT(YEAR from taken));

 Although maybe you don't need those, depending on how the date
 datatype matching works in the planner with the EXTRACT function.

 Regards

 Thom


Actually, you could change that last bit from:

 OR (
                extract(DOY from m.taken) >= day_of_year_start OR
extract(DOY from m.taken) <= day_of_year_end
       )

to

OR extract(DOY from m.taken) NOT BETWEEN day_of_year_end AND day_of_year_start

That would be tidier and simpler :)

Thom



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Tom Lane
Thom Brown thombr...@gmail.com writes:
 On 20 May 2010 17:36, David Jarvis thanga...@gmail.com wrote:
 Okay, get your app to convert the month-date to a day of year, so we
 have year_start, year_end, day_of_year_start, day_of_year_end

 and your where clause would something like this:

 WHERE extract(YEAR from m.taken) BETWEEN year1 and year2
 AND (
   extract(DOY from m.taken) BETWEEN day_of_year_start AND day_of_year_end
   OR (
   extract(DOY from m.taken) >= day_of_year_start OR extract(DOY from
 m.taken) <= day_of_year_end
   )
 )

extract(DOY) seems a bit problematic here, because its day numbering is
going to be different between leap years and non-leap years, and David's
problem statement doesn't allow for off-by-one errors.  You could
certainly invent your own function that worked similarly but always
translated a given month/day to the same number.

The other thing that's messy here is the wraparound requirement.
Rather than trying an OR like the above (which I think doesn't quite
work anyway --- won't it select everything?), it would be better if
you can have the app distinguish wraparound from non-wraparound cases
and issue different queries in the two cases.  In the non-wrap case
(start_day < end_day) it's pretty easy, just
my_doy(m.taken) BETWEEN start_val AND end_val
The easy way to handle the wrap case is
my_doy(m.taken) >= start_val OR my_doy(m.taken) <= end_val
although I can't help feeling there should be a smarter way to do
this where you can use an AND range check on some modified expression
derived from the date.
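
One possible shape for such a helper, as a sketch: map month/day to a number
that is identical in leap and non-leap years (here simply MMDD), and index it.

    CREATE OR REPLACE FUNCTION my_doy(d date) RETURNS integer AS $$
      SELECT extract(MONTH FROM $1)::integer * 100 + extract(DAY FROM $1)::integer;
    $$ LANGUAGE sql IMMUTABLE;

    CREATE INDEX measurement_taken_mmdd_idx ON climate.measurement (my_doy(taken));

    -- non-wrap case (e.g. Jun 1 .. Jul 1):   WHERE my_doy(m.taken) BETWEEN 601 AND 701
    -- wrap case     (e.g. Dec 22 .. Mar 22): WHERE my_doy(m.taken) >= 1222 OR my_doy(m.taken) <= 322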

regards, tom lane



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Thom Brown
On 20 May 2010 20:02, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown thombr...@gmail.com writes:
 On 20 May 2010 17:36, David Jarvis thanga...@gmail.com wrote:
 Okay, get your app to convert the month-date to a day of year, so we
 have year_start, year_end, day_of_year_start, day_of_year_end

 and your where clause would something like this:

 WHERE extract(YEAR from m.taken) BETWEEN year1 and year2
 AND (
       extract(DOY from m.taken) BETWEEN day_of_year_start AND day_of_year_end
       OR (
                extract(DOY from m.taken) >= day_of_year_start OR extract(DOY from
 m.taken) <= day_of_year_end
       )
 )

 extract(DOY) seems a bit problematic here, because its day numbering is
 going to be different between leap years and non-leap years, and David's
 problem statement doesn't allow for off-by-one errors.  You could
 certainly invent your own function that worked similarly but always
 translated a given month/day to the same number.

 The other thing that's messy here is the wraparound requirement.
 Rather than trying an OR like the above (which I think doesn't quite
 work anyway --- won't it select everything?)

No.  It only would if using BETWEEN SYMMETRIC.

Like if m.taken is '2003-02-03', using a start day of year as 11th Nov
and end as 17th Feb, it would match the 2nd part of the outer OR
expression.  If you changed the end day of year to 2nd Feb, it would
yield no result as nothing is between 11th Nov and 17th Feb as it's a
negative difference, and 2nd Feb is lower than the taken date so fails
to match the first half of the inner most OR expression.

 , it would be better if
 you can have the app distinguish wraparound from non-wraparound cases
 and issue different queries in the two cases.  In the non-wrap case
  (start_day < end_day) it's pretty easy, just
         my_doy(m.taken) BETWEEN start_val AND end_val
  The easy way to handle the wrap case is
         my_doy(m.taken) >= start_val OR my_doy(m.taken) <= end_val
 although I can't help feeling there should be a smarter way to do
 this where you can use an AND range check on some modified expression
 derived from the date.

                        regards, tom lane


Yes, I guess I agree that the app can run different queries depending
on which date is higher.  I hadn't factored leap years into the
equation.  Can't think of what could be done for those cases off the
top of my head.  What is really needed is a way to match against day
and month parts instead of day, month and year without resorting
to casting to text of course.

Thom



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread David Jarvis
When using MySQL, the performance was okay (~5 seconds per query) using:

  date( concat_ws( '-', y.year, m.month, d.day ) ) between
-- Start date.
date( concat_ws( '-', y.year, $P{Month1}, $P{Day1} ) ) AND
-- End date. Calculated by checking to see if the end date wraps
-- into the next year. If it does, then add 1 to the current year.
--
date(
  concat_ws( '-',
y.year + greatest( -1 *
  sign(
datediff(
  date(
concat_ws('-', y.year, $P{Month2}, $P{Day2} )
  ),
  date(
concat_ws('-', y.year, $P{Month1}, $P{Day1} )
  )
)
  ), 0
), $P{Month2}, $P{Day2}
  )
)

This calculated the correct start days and end days, including leap years.

With MySQL, I normalized the date into three different tables: year
references, month references, and day references. The days contained only
the day (of the month) the measurement was made and the measured value. The
month references contained the month number for the measurement. The year
references had the years and station. Each table had its own index on the
year, month, or day.

When I had proposed that solution to the mailing list, I was introduced to a
more PostgreSQL-way, which was to use indexes on the date field.

In PostgreSQL, I have a single measurement table for the data (divided
into 72 child tables), which includes the date and station. I like this
because it feels clean and it is easier to understand. So far, however, it
has not been fast.

I was thinking that I could add three more columns to the measurement table:

year_taken, month_taken, day_taken

Then index those. That should allow me to avoid extracting years, months,
and days from the *m.taken* date column.

What do you think?

Thanks again!
Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Tom Lane
David Jarvis thanga...@gmail.com writes:
 I was thinking that I could add three more columns to the measurement table:
 year_taken, month_taken, day_taken
 Then index those. That should allow me to avoid extracting years, months,
 and days from the *m.taken* date column.

You could, but I don't think there's any advantage to that versus
putting indexes on extract(day from taken) etc.  The extra fields
eat more space in the table proper, and the functional index isn't
really any more expensive than a plain index.  Not to mention that
you can have bugs with changing the date and forgetting to update
the derived columns, etc etc.
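
For example, a sketch of that functional-index alternative (again, one set per
child table in the inherited layout):

    CREATE INDEX measurement_taken_month_idx ON climate.measurement ((extract(MONTH FROM taken)));
    CREATE INDEX measurement_taken_day_idx   ON climate.measurement ((extract(DAY FROM taken)));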

regards, tom lane



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread David Jarvis
What if I were to have the application pass in two sets of date ranges?

For the condition of Dec 22 to Mar 22:

Dec 22 would become:

   - Dec 22 - Dec 31

Mar 22 would become:

   - Jan 1 - Mar 22

The first range would always be for the current year; the second range would
always be for the year following the current year.

Would that allow PostgreSQL to use the index?

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Tom Lane
David Jarvis thanga...@gmail.com writes:
 What if I were to have the application pass in two sets of date ranges?
 For the condition of Dec 22 to Mar 22:
 Dec 22 would become:
- Dec 22 - Dec 31
 Mar 22 would become:
- Jan 1 - Mar 22

I think what you're essentially describing here is removing the OR from
the query in favor of issuing two queries and then combining the results
in the app.  Yeah, you could do that, but one would hope that it isn't
faster ;-)

regards, tom lane



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread David Jarvis
I was hoping to eliminate this part of the query:

(cast(extract( YEAR FROM m.taken ) + greatest( -1 *
  sign(
(extract( YEAR FROM m.taken )||'-12-31')::date -
(extract( YEAR FROM m.taken )||'-01-01')::date ), 0
) AS text)||'-12-31')::date

That uses functions to create the dates, which is definitely the problem.
I'd still have the query return all the results for both data sets. If
providing the query with two data sets won't work, what will?

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Tom Lane
David Jarvis thanga...@gmail.com writes:
 I was hoping to eliminate this part of the query:
 (cast(extract( YEAR FROM m.taken ) + greatest( -1 *
   sign(
 (extract( YEAR FROM m.taken )||'-12-31')::date -
 (extract( YEAR FROM m.taken )||'-01-01')::date ), 0
 ) AS text)||'-12-31')::date

 That uses functions to create the dates, which is definitely the problem.

Well, it's not the functions per se that's the problem, it's the lack of
a useful index on the expression.  But as somebody remarked upthread,
that expression doesn't look correct at all.  Doesn't the whole
greatest() subexpression reduce to a constant?

regards, tom lane



Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread David Jarvis
Hi,

I was still referring to the measurement table. You have an index on
 stationid, but still seem to be getting a sequential scan. Maybe the planner
 does not realise that you are selecting a small number of stations. Posting
 an EXPLAIN ANALYSE would really help here.


Here is the result from an *EXPLAIN ANALYZE*:

HashAggregate  (cost=5486752.27..5486756.27 rows=200 width=12) (actual
time=314328.657..314328.728 rows=110 loops=1)
  ->  Hash Semi Join  (cost=1045.52..5451155.11 rows=4746289 width=12)
(actual time=197.950..313605.795 rows=463926 loops=1)
Hash Cond: (m.station_id = s.id)
->  Append  (cost=0.00..5343318.08 rows=4746289 width=16) (actual
time=74.411..306533.820 rows=42737997 loops=1)
  ->  Seq Scan on measurement m  (cost=0.00..148.00 rows=1
width=20) (actual time=0.001..0.001 rows=0 loops=1)
Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))
  ->  Seq Scan on measurement_01_001 m  (cost=0.00..438102.26
rows=389080 width=16) (actual time=74.409..24800.171 rows=3503256 loops=1)
Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))
  ->  Seq Scan on measurement_02_001 m  (cost=0.00..399834.28
rows=354646 width=16) (actual time=29.217..22209.877 rows=3196631 loops=1)
Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))
  ->  Seq Scan on measurement_03_001 m  (cost=0.00..438380.23
rows=389148 width=16) (actual time=15.915..24366.766 rows=3503937 loops=1)
Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))
  ->  Seq Scan on measurement_04_001 m  (cost=0.00..432850.57
rows=384539 width=16) (actual time=15.852..24280.031 rows=3461931 loops=1)
Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -

Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread David Jarvis
The greatest() expression reduces to either the current year (year + 0) or
the next year (year + 1) by taking the sign of the difference in start/end
days. This allows me to derive an end date, such as:

Dec 22, 1900 to Mar 22, 1901

Then I check if the measured date falls between those two dates.

The expression might not be correct as I'm still quite new to PostgreSQL's
syntax.

Dave


Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Stephen Frost
* David Jarvis (thanga...@gmail.com) wrote:
 I was hoping to eliminate this part of the query:
 (cast(extract( YEAR FROM m.taken ) + greatest( -1 *
   sign(
 (extract( YEAR FROM m.taken )||'-12-31')::date -
 (extract( YEAR FROM m.taken )||'-01-01')::date ), 0
 ) AS text)||'-12-31')::date

 That uses functions to create the dates, which is definitely the problem. 
[...]
 The greatest() expression reduces to either the current year (year + 0) or
 the next year (year + 1) by taking the sign of the difference in start/end
 days. This allows me to derive an end date, such as:
 
 Dec 22, 1900 to Mar 22, 1901

Something in here really smells fishy to me.  Those extract's above are
working on values which are from the table..  Why aren't you using these
functions to figure out how to construct the actual dates based on the
values provided by the *user*..?

Looking at your screenshot, I think you need to take those two date
values that the user provides, make them into actual dates (maybe you
need a CASE statement or something similar, that shouldn't be that hard,
and PG should just run that whole bit once, since to PG's point of view,
it's all constants), and then use those dates to query the tables.
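
For instance, a sketch with the user's values hard-coded as Dec 22 .. Mar 22
over 1900-2009, assuming the application has already decided that the range
wraps into the following year:

    SELECT yr,
           (yr::text || '-12-22')::date       AS range_start,
           ((yr + 1)::text || '-03-22')::date AS range_end
    FROM generate_series(1900, 2009) AS yr;

Joining that list against the measurements with m.taken BETWEEN range_start
AND range_end keeps the comparisons on plain date constants, so an ordinary
index on taken (and constraint exclusion) can kick in.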

Also, you're trying to do constraint_exclusion, but have you made sure
that it's turned on?  And have you made sure that those constraints are
really the right ones and that they make sense?  You're using a bunch of
extract()'s there too, why not just specify a CHECK constraint on the
date ranges which are allowed in the table..?

Maybe I've misunderstood the whole point here, but I don't think so.

Thanks,

Stephen




Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Yeb Havinga

Tom Lane wrote:

David Jarvis thanga...@gmail.com writes:
  

I was hoping to eliminate this part of the query:
(cast(extract( YEAR FROM m.taken ) + greatest( -1 *
  sign(
(extract( YEAR FROM m.taken )||'-12-31')::date -
(extract( YEAR FROM m.taken )||'-01-01')::date ), 0
) AS text)||'-12-31')::date

That uses functions to create the dates, which is definitely the problem.



Well, it's not the functions per se that's the problem, it's the lack of
a useful index on the expression.  But as somebody remarked upthread,
that expression doesn't look correct at all.  Doesn't the whole
greatest() subexpression reduce to a constant?
  
That somebody was probably me. I still think the whole BETWEEN 
expression is a tautology. A small test did not provide a 
counterexample. In the select below everything but the select was 
copy/pasted.


create table m (taken timestamptz);
insert into m values (now());
insert into m values ('1900-12-31');
insert into m values ('2000-04-06');
select m.taken BETWEEN
   /* Start date. */
 (extract( YEAR FROM m.taken )||'-01-01')::date AND
   /* End date. Calculated by checking to see if the end date wraps
 into the next year. If it does, then add 1 to the current year.
   */
   (cast(extract( YEAR FROM m.taken ) + greatest( -1 *
 sign(
   (extract( YEAR FROM m.taken )||'-12-31')::date -
   (extract( YEAR FROM m.taken )||'-01-01')::date ), 0
   ) AS text)||'-12-31')::date from m;
 ?column?
--
t
t
t
(3 rows)

Another thing is that IF the climate measurements table is partitioned on time
(e.g. each year?), then a function-based index on the year part of
m.taken is useless, pardon my French. I'm not sure if it is partitioned
that way, but it is an interesting thing to inspect, and perhaps rewrite
the query to use constraint exclusion.
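
A sketch of what that could look like, assuming the month-per-child layout
described earlier: give each child a plain range constraint on taken, make
sure exclusion is enabled, and filter on taken directly so the planner can
skip the irrelevant children.

    ALTER TABLE climate.measurement_12_013
      ADD CONSTRAINT measurement_12_013_taken_range_check
      CHECK (taken >= DATE '1913-12-01' AND taken <= DATE '1913-12-31');

    SET constraint_exclusion = on;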


regards,
Yeb Havinga




Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?

2010-05-20 Thread Stephen Frost
* David Jarvis (thanga...@gmail.com) wrote:
 There are 72 child tables, each having a year index and a station index,
 which are defined as follows:

So, my thoughts:

Partition by something that makes sense...  Typically, I'd say that you
would do it by the category id and when the measurement was taken.  Then
set up the appropriate check constraints on that so that PG can use
constraint_exclusion to identify what table it needs to actually go look
in.  How much data are we talking about, by the way? (# of rows)  If
you're not in the millions, partitioning at all is probably overkill and
might be part of the problem here..

create table climate.measurement_12_013 (
id bigint not null DEFAULT 
nextval('climate.measurement_id_seq'::regclass),
station_id integer not null,
taken date not null,
amount numeric(8,2) not null,
category_id integer not null,
flag varchar(1) not null default ' ',
check (category_id = 7),
check (taken >= '1913-12-01' and taken <= '1913-12-31')
)
inherits (climate.measurement);

CREATE INDEX measurement_12_013_s_idx
  ON climate.measurement_12_013
  USING btree
  (station_id);

CREATE INDEX measurement_12_013_d_idx
  ON climate.measurement_12_013
  USING btree
  (taken);

SELECT
  count(1) AS measurements,
  avg(m.amount) AS amount
FROM
  climate.measurement m
WHERE
  m.station_id IN (
SELECT
  s.id
FROM
  climate.station s,
  climate.city c
WHERE
/* For one city... */
c.id = 5182 AND

/* Where stations are within an elevation range... */
s.elevation BETWEEN 0 AND 3000 AND

/* and within a specific radius... */
-- Seriously, you should be using PostGIS here, that can
-- then use a GIST index to do this a lot faster with a
-- bounding box...
6371.009 * SQRT(
  POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) +
(COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) *
  POW(RADIANS(c.longitude_decimal - s.longitude_decimal),
2))
    ) <= 50
) AND

  /* Data before 1900 is shaky; insufficient after 2009. */
  -- I have no idea why this is here..  Aren't you forcing
  -- this already in your application code that's checking
  -- user input values?  Also, do you actually *have* any
  -- data outside this range?  If so, just pull out the
  -- tables with that data from the inheritance
  -- m.taken >= '1900-01-01' AND m.taken <= '2009-12-31'
  -- extract( YEAR FROM m.taken ) BETWEEN 1900 AND 2009 AND

  /* Whittled down by category... */
  m.category_id = 1 AND

  /* Between the selected days and years... */
   CASE
 WHEN (user_start_year || user_start_day <= user_stop_year || user_stop) THEN
 m.taken BETWEEN user_start_year || user_start_day AND user_stop_year || user_stop
 WHEN (user_start_year || user_start_day > user_stop_year || user_stop) THEN
 m.taken BETWEEN (user_start_year || user_start_day)::date AND
 ((user_stop_year || user_stop)::date + '1 year'::interval)::date
-- I don't think you need/want this..?
-- GROUP BY
--  extract( YEAR FROM m.taken )

Enjoy,

Stephen



