Re: [PERFORM] Fast insert, but slow join and updates for table with 4 billion rows

2016-10-25 Thread Lars Aksel Opsahl

Hi

I have now tested both the inserts and the updates, and it works extremely well 
without doing any partitioning of the big table (about 4.3 billion rows) in this 
case, when we use a common index as Tom suggested. 

We are able to insert 172.000 rows per second.

The number of rows is computed from the total time from when we start to read the 
csv files until the last file is done. We use GNU parallel and run 5 
threads. The number of inserts is actually 172.000 * 2, because first we copy 
the rows into a temp table where we prepare the data, and then we insert them 
into the common big main table. There are no errors in the log. 
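The two-stage load can be sketched roughly as follows (a sketch only: the 
staging table, the csv file name, and the data-preparation step are hypothetical; 
only the main table name comes from the thread):

```sql
-- One worker's share of the load; GNU parallel runs several of these at once.
-- Stage 1: copy the raw csv rows into a session-local staging table.
CREATE TEMP TABLE staging (LIKE met_vaer_wisline.nora_bc25_observation);
\copy staging FROM 'chunk_0001.csv' WITH (FORMAT csv)

-- Stage 2: prepare the data and insert it into the big main table.
-- Each row is written twice, which is why the effective insert
-- rate is 172.000 * 2 rows per second.
INSERT INTO met_vaer_wisline.nora_bc25_observation
SELECT * FROM staging;
```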

We are able to update 98.000 rows per second.

Since each update also means one insert (a new row version), we are close to 
200.000 inserts and updates per second. For the update we give a value to a 
column that is null. That is done for all 4.3 billion rows. We run 5 threads 
in parallel here also, and there are no errors and no deadlocks. 

The problem with duplicated indexes is solvable in this project, because first 
we add the data and then we do the analyses; this means that we can have 
different indexes while adding data than while using it.

In this project we are going to add about 25 billion geo-located observations, 
which will be used for analyses. I suppose that at some level we will have to 
do partitioning, but so far Postgres has worked extremely well, even though 
it's based on MVCC. 
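As a rough sanity check (the arithmetic is mine, not from the thread), the 
measured insert rate puts a lower bound on the pure load time for the planned 
25 billion rows:

```python
# Back-of-the-envelope estimate from the rates reported above.
insert_rate = 172_000          # rows per second into the main table (5 threads)
total_rows = 25_000_000_000    # planned number of geo-located observations

hours = total_rows / insert_rate / 3600
print(f"~{hours:.0f} hours of pure insert time")  # → ~40 hours
```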

The Postgres/PostGIS software and communities are for sure really fun to work 
with, and the Postgres/PostGIS open source software holds a very high quality. 

Thanks.

Lars


From: pgsql-performance-ow...@postgresql.org 
<pgsql-performance-ow...@postgresql.org> on behalf of Tom Lane 
<t...@sss.pgh.pa.us>
Sent: 24 October 2016 14:52
To: Lars Aksel Opsahl
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Fast insert, but slow join and updates for table with 4 
billion rows

Lars Aksel Opsahl <lars.ops...@nibio.no> writes:
> In this example I have two tables one with 4 billion rows and another with 
> 5 rows and then I try to do a standard simple join between this two 
> tables and this takes 397391  ms. with this SQL (the query plan is added is 
> further down)

This particular query would work a lot better if you had an index on
nora_bc25_observation (point_uid_ref, epoch), ie both join columns
in one index.  I get the impression that that ought to be the primary
key of the table, which would be an even stronger reason to have a
unique index on it.
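Tom's suggestion, spelled out (a sketch; the index name is invented, and 
whether it can be UNIQUE depends on the data actually identifying rows by this 
pair):

```sql
-- One composite index covering both join columns.
-- If (point_uid_ref, epoch) really identifies a row, a unique index
-- (or the primary key) on the pair is an even stronger choice:
CREATE UNIQUE INDEX nora_bc25_observation_point_epoch_idx
    ON met_vaer_wisline.nora_bc25_observation (point_uid_ref, epoch);
```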

regards, tom lane


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Fast insert, but slow join and updates for table with 4 billion rows

2016-10-24 Thread Lars Aksel Opsahl
Hi

Yes, this makes both the update and the selects much faster. We are now down to 
3000 ms for the select, but then I get a problem with another SQL statement 
where I only use epoch in the query. 

SELECT count(o.*) FROM  met_vaer_wisline.nora_bc25_observation o WHERE o.epoch 
= 128844;
 count 
---
 97831
(1 row)
Time: 92763.389 ms

To get the SQL above to run fast it seems we also need a single-column index on 
the epoch column. This means two indexes over the same column, and that eats 
memory when we have more than 4 billion rows.

Is there any way to avoid having two indexes on the epoch column?
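The two-index setup being described would look roughly like this (a sketch; 
the index names are invented, only the table and columns come from the thread). 
The third statement shows one possible alternative: a btree can be scanned on 
its leading column alone, so an index with epoch first could serve the 
epoch-only query too, though whether it stays fast for the join would have to 
be tested:

```sql
-- Composite index that serves the two-column join:
CREATE INDEX obs_point_epoch_idx
    ON met_vaer_wisline.nora_bc25_observation (point_uid_ref, epoch);

-- Extra single-column index needed for epoch-only queries:
CREATE INDEX obs_epoch_idx
    ON met_vaer_wisline.nora_bc25_observation (epoch);

-- Possible alternative (untested here): epoch leading, usable for
-- epoch-only scans and, less efficiently, for the join.
CREATE INDEX obs_epoch_point_idx
    ON met_vaer_wisline.nora_bc25_observation (epoch, point_uid_ref);
```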

Thanks.

Lars

EXPLAIN  analyze SELECT count(o.*) FROM  met_vaer_wisline.nora_bc25_observation 
o WHERE o.epoch = 128844;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=44016888.13..44016888.14 rows=1 width=42) (actual time=91307.470..91307.471 rows=1 loops=1)
   ->  Index Scan using idx_met_vaer_wisline_nora_bc25_observation_test on nora_bc25_observation o  (cost=0.58..44016649.38 rows=95500 width=42) (actual time=1.942..91287.495 rows=97831 loops=1)
         Index Cond: (epoch = 128844)
 Total runtime: 91307.534 ms


EXPLAIN  analyze
SELECT count(o.*)
FROM 
met_vaer_wisline.nora_bc25_observation o,
met_vaer_wisline.new_data n
WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=131857.71..131857.72 rows=1 width=42) (actual time=182.459..182.459 rows=1 loops=1)
   ->  Nested Loop  (cost=0.58..131727.00 rows=52283 width=42) (actual time=0.114..177.420 rows=5 loops=1)
         ->  Seq Scan on new_data n  (cost=0.00..1136.00 rows=5 width=8) (actual time=0.050..7.873 rows=5 loops=1)
         ->  Index Scan using idx_met_vaer_wisline_nora_bc25_observation_test on nora_bc25_observation o  (cost=0.58..2.60 rows=1 width=50) (actual time=0.003..0.003 rows=1 loops=5)
               Index Cond: ((point_uid_ref = n.id_point) AND (epoch = n.epoch))
 Total runtime: 182.536 ms

Time: 3095.618 ms


Lars
 


From: pgsql-performance-ow...@postgresql.org 
<pgsql-performance-ow...@postgresql.org> on behalf of Tom Lane 
<t...@sss.pgh.pa.us>
Sent: 24 October 2016 14:52
To: Lars Aksel Opsahl
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Fast insert, but slow join and updates for table with 4 
billion rows

Lars Aksel Opsahl <lars.ops...@nibio.no> writes:
> In this example I have two tables one with 4 billion rows and another with 
> 5 rows and then I try to do a standard simple join between this two 
> tables and this takes 397391  ms. with this SQL (the query plan is added is 
> further down)

This particular query would work a lot better if you had an index on
nora_bc25_observation (point_uid_ref, epoch), ie both join columns
in one index.  I get the impression that that ought to be the primary
key of the tab

[PERFORM] Fast insert, but slow join and updates for table with 4 billion rows

2016-10-24 Thread Lars Aksel Opsahl
 | integer | not null
 windspeed_10m      | real    |
 air_temperature_2m | real    |
Indexes:
    "idx_met_vaer_wisline_nora_bc25_observation_epoch" btree (epoch)
    "idx_met_vaer_wisline_nora_bc25_observation_point_uid_ref" btree (point_uid_ref)



\d met_vaer_wisline.new_data ;
     Unlogged table "met_vaer_wisline.new_data"
       Column       |       Type        | Modifiers
--------------------+-------------------+-----------
 windspeed_10m      | real              |
 air_temperature_2m | real              |
 lon                | character varying | not null
 lat                | character varying | not null
 epoch              | integer           |
 epoch_as_numeric   | numeric           | not null
 rest               | character varying |
 id_point           | integer           |
Indexes:
    "idx_met_vaer_wisline_new_data_epoch" btree (epoch)
    "idx_met_vaer_wisline_new_data_id_point" btree (id_point)


vacuum analyze met_vaer_wisline.nora_bc25_observation;
vacuum analyze met_vaer_wisline.new_data;

SELECT count(*) from met_vaer_wisline.new_data;
 count
-------
 5
(1 row)

SELECT count(*) from met_vaer_wisline.nora_bc25_observation ;
   count
------------
 4263866304


Thanks.


Lars




Re: [PERFORM] synchronous_commit off

2011-08-02 Thread lars hofhansl
No: The commit has the same guarantees as a synchronous commit w.r.t. data 
consistency. The commit can only fail (as a whole) due to hardware problems or 
postgres backend crashes. 


And yes: The client commit returns, but the server can fail later and not 
persist the transaction and it will be lost (again as a whole).

Your application should be able to tolerate losing the latest committed 
transactions if you use this.

The difference from fsync=off is that a server crash will leave the database in 
a consistent state, with just the latest transactions lost.
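The setting under discussion, as a minimal sketch (the GUC itself is standard 
PostgreSQL; the comments are my own summary of the guarantees described above):

```sql
-- Per session: commits return before the WAL is flushed to disk.
-- A crash may lose the last few committed transactions, each as a whole,
-- but the database stays consistent.
SET synchronous_commit = off;

-- It can also be relaxed per transaction, for low-value writes only:
BEGIN;
SET LOCAL synchronous_commit = off;
-- ... writes that the application can tolerate losing on a crash ...
COMMIT;
```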




From: Anibal David Acosta a...@devshock.com
To: pgsql-performance@postgresql.org
Sent: Monday, August 1, 2011 6:29 AM
Subject: [PERFORM] synchronous_commit off


Can a transaction committed asynchronously report an error, a duplicate key or 
something like that, leaving the client with an OK transaction but the server 
with a FAILED transaction?
 
Thanks

Re: [PERFORM] Which Join is better

2011-08-02 Thread lars hofhansl
Unless you use the explicit join syntax:

select p.* from A p join B q on (p.id = q.id)

and also set join_collapse_limit = 1, the order of the joins is determined by 
the planner.


Also explain is your friend :)
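For example (a sketch using the table names from the question below):

```sql
-- Force the planner to join in the order the query is written,
-- then inspect the plan it produced.
SET join_collapse_limit = 1;
EXPLAIN SELECT p.* FROM A p JOIN B q ON (p.id = q.id);
```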


From: Adarsh Sharma adarsh.sha...@orkash.com
To: pgsql-performance@postgresql.org
Sent: Monday, August 1, 2011 11:42 PM
Subject: [PERFORM] Which Join is better

Dear all,

Just want to know which join is better for querying data faster.

I have 2 tables, A ( 70 GB ) and B ( 7 MB ).

A has 10 columns and B has 3 columns. Indexes exist on both tables' ids.

select p.* from table A p, B q where p.id=q.id

or

select p.* from table B q , A p where q.id=p.id


Thanks

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-16 Thread lars

On 07/14/2011 08:47 AM, Tom Lane wrote:


The implementation I was imagining was to define another bit in the info
parameter for XLogInsert, say XLOG_NON_TRANSACTIONAL.  This could be a
high-order bit that would not go to disk.  Anytime it was *not* set,
XLogInsert would set a global boolean that would remember that the
current transaction wrote a transactional WAL record.  This is the
right default since the vast majority of call sites are writing records
that we would want to have flushed at commit.  There are just a couple
of places that would need to be changed to add this flag to their calls.

regards, tom lane



If you have a patch in mind I'm happy to test it on my setup and report 
back.


-- Lars


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Statistics and Multi-Column indexes

2011-07-15 Thread lars

On 07/10/2011 02:31 PM, Samuel Gendler wrote:
What about partitioning tables by tenant id and then maintaining 
indexes on each partition independent of tenant id, since constraint 
exclusion should handle filtering by tenant id for you.  That seems 
like a potentially more tolerable variant of #5 How many tenants are 
we talking about?  I gather partitioning starts to become problematic 
when the number of partitions gets large.



I thought I had replied... Apparently I didn't.

The database can grow in two dimensions: The number of tenants and the 
number of rows per tenant.
We have many tenants with relatively little data and a few with a lot of 
data. So the number of tenants

is known ahead of time and might be 1000's.

-- Lars


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-15 Thread lars
) = 594968576 0.23
1310775406.843459 lseek(12, 0, SEEK_END) = 382787584 0.22
1310775406.860266 brk(0x249)= 0x249 0.46
1310775406.860968 brk(0x24d)= 0x24d 0.95
1310775406.862449 brk(0x255)= 0x255 0.000112
1310775406.865095 semop(229383, {{2, -1, 0}}, 1) = 0 0.111698
...
1310775407.027235 semop(229383, {{2, -1, 0}}, 1) = 0 0.39
1310775407.027503 semop(229383, {{2, -1, 0}}, 1) = 0 2.215137
1310775409.243291 semop(229383, {{1, 1, 0}}, 1) = 0 0.29
...
1310775409.246963 semop(229383, {{2, -1, 0}}, 1) = 0 0.24
1310775409.252029 brk(0x2452000)= 0x2452000 0.000168
1310775409.252288 gettimeofday({1310775409, 252307}, NULL) = 0 0.21
1310775409.252393 sendto(5, 
\2\0\0\0\350\0\0\0\1@\0\0\2\0\0\0\2\0\0\0\0\0\0\0\r\201\0\0\0\2\0..., 
232, 0, NULL, 0) = 232 0.78
1310775409.252557 sendto(6, 
T\0\0\0\36\0\1count\0\0\0\0\0\0\0\0\0\0\24\0\10\377\377\377\377\0\0D..., 
66, 0, NULL, 0) = 66 0.000201


No WAL, but check out that one expensive semop: 2 s!

-- Lars


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-13 Thread lars
 rows)



There definitely seems to be something funky going on. Since created_by 
is indexed, it shouldn't do any HOT logic.

Is there any other information that I can provide? I'm happy to 
recompile with a patch applied, etc.


Thanks.

-- Lars


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-13 Thread lars

On 07/13/2011 07:46 AM, Kevin Grittner wrote:


I've mentioned this in a hand-wavy general sense, but I should have
mentioned specifics ages ago: for a database where the active
portion of the database is fully cached, it is best to set
seq_page_cost and random_page_cost to the same value, somewhere in
the 0.1 to 0.05 range.  (In your case I would use 0.05.)  In highly
cached databases I have sometimes also found it necessary to
increase cpu_tuple_cost.  (In your case I might try 0.02.)
  
I've been doing that for other tests already (I didn't want to add too 
many variations here).
The Bitmap Heap Scans through the table are only useful for spinning 
media, not for the cache (just to state the obvious).

As an aside: I found that queries in a cold database take almost twice 
as long when I make that change, so for spinning media this is very 
important.


Which raises an interesting question -- what happens to
the timings if your SELECTs are done with synchronous_commit = off?


Just tried that...
In that case the WAL is still written (as seen via iostat), but not 
synchronously by the transaction (as seen by strace).


-- Lars


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-13 Thread lars

On 07/13/2011 11:42 AM, Kevin Grittner wrote:

So transactions without an XID *are* sensitive to
synchronous_commit.  That's likely a useful clue.

How much did it help the run time of the SELECT which followed the
UPDATE?


It has surprisingly little impact on the SELECT side:

= set synchronous_commit = on;
= update test set created_by = '001' where tenant = 
'001';

UPDATE 3712
Time: 384.702 ms
lars= select count(*) from test where tenant = '001' and 
created_date = '2011-6-30';

 count
---
  3712
(1 row)

Time: 36.571 ms
= select count(*) from test where tenant = '001' and 
created_date = '2011-6-30';

 count
---
  3712
(1 row)

Time: 5.702 ms
= select count(*) from test where tenant = '001' and 
created_date = '2011-6-30';

 count
---
  3712
(1 row)

Time: 5.822 ms
= set synchronous_commit = off;
SET
Time: 0.145 ms
= update test set created_by = '001' where tenant = 
'001';

UPDATE 3712
Time: 96.227 ms
= select count(*) from test where tenant = '001' and 
created_date = '2011-6-30';

 count
---
  3712
(1 row)

Time: 32.422 ms
= select count(*) from test where tenant = '001' and 
created_date = '2011-6-30';

 count
---
  3712
(1 row)

Time: 6.080 ms

I tried it multiple times, and while the numbers change by 5-10ms the 
relationship is the same.


The same results show when I use my JDBC code to run updates/selects as 
fast as possible. When synchronous_commit is

off for the SELECTing process it seems to be slightly faster.

-- Lars


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-12 Thread lars

On 07/12/2011 08:13 AM, Ivan Voras wrote:

On 12/07/2011 02:09, lars wrote:


Oh, and iowait hovers around 20% when SELECTs are slow:

avg-cpu: %user %nice %system %iowait %steal %idle
1.54 0.00 0.98 18.49 0.07 78.92

When SELECTs are fast it looks like this:
avg-cpu: %user %nice %system %iowait %steal %idle
8.72 0.00 0.26 0.00 0.00 91.01

Note that this is a 12 core VM. So one core at 100% would show as 8.33%
CPU.


Now only if you could do an iostat -x and show the output in both 
cases...




Sure (sorry for missing details):

iostat -x during selects when all's fine:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   8.250.000.000.000.00   91.75

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s 
avgrq-sz avgqu-sz   await  svctm  %util
xvdap10.00 1.000.002.00 0.0024.00
12.00 0.000.00   0.00   0.00
xvdf  0.00 0.000.000.00 0.00 0.00 
0.00 0.000.00   0.00   0.00
xvdg  0.00 0.000.000.00 0.00 0.00 
0.00 0.000.00   0.00   0.00


xvdap1 is the OS volume.
xvdf holds the database files.
xvdg holds the WAL.

No IO on database/WAL volumes, one core is pegged close to 100% CPU.



iostat -x during update:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   1.050.000.584.000.00   94.37

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s 
avgrq-sz avgqu-sz   await  svctm  %util
xvdap10.00 0.000.000.00 0.00 0.00 
0.00 0.000.00   0.00   0.00
xvdf  0.00 0.007.000.00   128.00 0.00
18.29 0.000.00   0.00   0.00
xvdg  0.00  7352.000.00  804.00 0.00 62368.00
77.5766.07   68.83   0.86  69.20


Just updating the WAL.

---

and while it's checkpointing:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   0.640.000.328.880.00   90.16

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s 
avgrq-sz avgqu-sz   await  svctm  %util
xvdap10.00 0.000.000.00 0.00 0.00 
0.00 0.000.00   0.00   0.00
xvdf  0.00  2548.002.00 1658.0032.00 33408.00
20.14   144.18   86.69   0.60 100.00
xvdg  0.00  5428.000.00  778.00 0.00 58480.00
75.1777.44  100.22   1.21  94.00


Updating the WAL, and database volume due to checkpointing.

--

iostat -x after I stopped the update process and checkpointing is done:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   0.000.000.000.000.00  100.00

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s 
avgrq-sz avgqu-sz   await  svctm  %util
xvdap10.00 0.000.000.00 0.00 0.00 
0.00 0.000.00   0.00   0.00
xvdf  0.00 0.000.000.00 0.00 0.00 
0.00 0.000.00   0.00   0.00
xvdg  0.00 0.000.000.00 0.00 0.00 
0.00 0.000.00   0.00   0.00


No activity at all.

-

iostat -x after I started the select queries after the updates:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   2.090.001.49   12.150.00   84.26

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s 
avgrq-sz avgqu-sz   await  svctm  %util
xvdap10.00 0.000.000.00 0.00 0.00 
0.00 0.000.00   0.00   0.00
xvdf  0.00 8.000.002.00 0.0080.00
40.00 0.002.00   2.00   0.40
xvdg  0.00  7844.001.00 1098.00 8.00 82336.00
74.9358.27   59.39   0.70  77.20


Heavy writes to the WAL volume.

select * from pg_stat_activity;
 datid | datname | procpid | usesysid | usename  | application_name | 
client_addr | client_hostname | client_port | 
backend_start |  xact_start   |  qu

ery_start  | waiting | current_query
---+-+-+--+--+--+-+-+-+---+---+
---+-+---
 16385 | lars|2654 |16384 | lars |  | 
127.0.0.1   | |   44972 | 2011-07-12 
18:44:09.479581+00 | 2011-07-12 18:50:32.629412+00 | 2011-07-12
18:50:32.629473+00 | f   | select count(*) from test where tenant = 
$1 and created_date = $2
 16385 | lars|2658 |   10 | postgres | psql 
| | |  -1 | 2011-07-12 
18:49:02.675436+00 | 2011-07-12 18:50:32.631013+00 | 2011-07-12

18:50:32.631013+00

Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-12 Thread lars

On 07/12/2011 12:08 PM, Kevin Grittner wrote:

larslhofha...@yahoo.com  wrote:


select count(*) from test where tenant = $1 and created_date = $2


Ah, that might be a major clue -- prepared statements.

What sort of a plan do you get for that as a prepared statement?
(Note, it is very likely *not* to be the same plan as you get if you
run with literal values!)  It is not at all unlikely that it could
resort to a table scan if you have one tenant which is five or ten
percent of the table, which would likely trigger the pruning as it
passed over the modified pages.

-Kevin
So a read of a row *will* trigger dead tuple pruning, and that requires 
WAL logging, and this is known/expected?
This is actually the only answer I am looking for. :) I have not seen 
this documented anywhere.


I know that Postgres will generate generic plans for prepared statements 
(how could it do otherwise?), and I also know that it sometimes chooses a 
sequential scan. This can always be tweaked to touch fewer rows and/or use 
a different plan. That's not my objective, though!


The fact that a select (maybe a big analytical query we'll run) touching 
many rows will update the WAL and wait
(apparently) for that IO to complete is making a fully cached database 
far less useful.

I just artificially created this scenario.

... Just dropped the table to test something so I can't get the plan 
right now. Will send an update as soon as I get

it setup again.

Thanks again.

-- Lars


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-12 Thread lars

On 07/12/2011 01:04 PM, lars wrote:

On 07/12/2011 12:08 PM, Kevin Grittner wrote:

larslhofha...@yahoo.com  wrote:


select count(*) from test where tenant = $1 and created_date = $2


Ah, that might be a major clue -- prepared statements.

What sort of a plan do you get for that as a prepared statement?
(Note, it is very likely *not* to be the same plan as you get if you
run with literal values!)  It is not at all unlikely that it could
resort to a table scan if you have one tenant which is five or ten
percent of the table, which would likely trigger the pruning as it
passed over the modified pages.

-Kevin
So a read of a row *will* trigger dead tuple pruning, and that 
requires WAL logging, and this is known/expected?
This is actually the only answer I am looking for. :) I have not seen 
this documented anywhere.


I know that Postgres will generate general plans for prepared 
statements (how could it do otherwise?),

I also know that it sometimes chooses a sequential scan.
This can always be tweaked to touch fewer rows and/or use a different 
plan. That's not my objective, though!


The fact that a select (maybe a big analytical query we'll run) 
touching many rows will update the WAL and wait
(apparently) for that IO to complete is making a fully cached database 
far less useful.

I just artificially created this scenario.

... Just dropped the table to test something so I can't get the plan 
right now. Will send an update as soon as I get

it setup again.

Thanks again.

-- Lars



Ok... I slightly changed the indexes:
\d test
Table lars.test
Column| Type  | Modifiers
--+---+---
 tenant   | character(15) |
 created_by   | character(15) |
 created_date | date  |
Indexes:
i1 btree (tenant)

So just just a simple index on tenant.

prepare x as select count(*) from test where tenant = $1 and 
created_date = $2;

PREPARE
explain execute x('001','2011-6-30');
                                 QUERY PLAN
---------------------------------------------------------------------------
 Aggregate  (cost=263301.40..263301.41 rows=1 width=0)
   ->  Bitmap Heap Scan on test  (cost=3895.99..263299.28 rows=847 width=0)
         Recheck Cond: (tenant = $1)
         Filter: (created_date = $2)
         ->  Bitmap Index Scan on i1  (cost=0.00..3895.77 rows=169372 width=0)
               Index Cond: (tenant = $1)
(6 rows)

-- this is when the WAL rows are written:
explain (analyze on, buffers on) execute x('001','2011-6-30');
                                 QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=263301.40..263301.41 rows=1 width=0) (actual time=191.150..191.151 rows=1 loops=1)
   Buffers: shared hit=3716
   ->  Bitmap Heap Scan on test  (cost=3895.99..263299.28 rows=847 width=0) (actual time=1.966..188.221 rows=3712 loops=1)
         Recheck Cond: (tenant = $1)
         Filter: (created_date = $2)
         Buffers: shared hit=3716
         ->  Bitmap Index Scan on i1  (cost=0.00..3895.77 rows=169372 width=0) (actual time=1.265..1.265 rows=3712 loops=1)
               Index Cond: (tenant = $1)
               Buffers: shared hit=20
 Total runtime: 191.243 ms
(10 rows)

-- this is when no WAL is written:
explain (analyze on, buffers on) execute x('001','2011-6-30');
                                 QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=263301.40..263301.41 rows=1 width=0) (actual time=11.529..11.530 rows=1 loops=1)
   Buffers: shared hit=3715
   ->  Bitmap Heap Scan on test  (cost=3895.99..263299.28 rows=847 width=0) (actual time=1.341..9.187 rows=3712 loops=1)
         Recheck Cond: (tenant = $1)
         Filter: (created_date = $2)
         Buffers: shared hit=3715
         ->  Bitmap Index Scan on i1  (cost=0.00..3895.77 rows=169372 width=0) (actual time=0.756..0.756 rows=3712 loops=1)
               Index Cond: (tenant = $1)
               Buffers: shared hit=19
 Total runtime: 11.580 ms
(10 rows)

If you wanted to recreate this scenario I created a simple script to 
create the table:


create table test(tenant char(15), created_by char(15), created_date date);
insert into test values('x', 'y','2011-6-30');
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert

Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-12 Thread lars

On 07/12/2011 02:51 PM, Kevin Grittner wrote:

I ran x a bunch of times to get a baseline, then y once, then x a
bunch more times.  The results were a bit surprising:

cir=  \timing
Timing is on.
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 9.823 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 8.481 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 14.054 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 10.169 ms
cir=  execute y('001', '001','2011-6-30');
UPDATE 3456
Time: 404.244 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 128.643 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 2.657 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 5.883 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 2.645 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 2.753 ms
cir=  execute x('001','2011-6-30');
  count
---
   3456
(1 row)

Time: 2.253 ms

Interesting. When you did your test, did you also find WAL write activity 
when running x the first time after y?
(It's very hard to catch in only a single query, though.)


Running the update made the next SELECT slow, then it was much
*faster*.  My best guess is that the data landed in a more
concentrated set of pages after the update, and once autovacuum
kicked in and cleaned things up it was able to get to that set of
data faster.


   autovacuum   | off

Well, certainly not while under modification without running
autovacuum.  That's disabling an integral part of what keeps
performance up.
Oh, it's just switched off for testing, so that I can control when vacuum 
runs and make sure that it's not skewing the results while I am measuring 
something.
In a real database I would probably err on the side of vacuuming more 
rather than less.

For a fully cached database I would probably want to switch off HOT 
pruning and compaction (which from what we see is done synchronously with 
the select) and leave that up to the asynchronous autovacuum. But maybe I 
am still not quite understanding the performance implications.


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-12 Thread lars

On 07/12/2011 02:38 PM, Merlin Moncure wrote:


Something is not adding up here.  Perhaps there is an alternate route
to WAL logged activity from selects I'm not thinking of.  Right now
I'm thinking to run the selects on table 'a' and the inserts
concurrently on table 'b' and seeing how that behaves.  Another way to
get to the bottom is to oprofile the selecting-during-load backend to
see where the time is getting spent.   Alternative way to do this is
to strace attach to the selecting-during-load backend to see if it's
really writing to the WAL (I'm really curious about this).

Another interesting test would be to try and reproduce the results on
native machine. It should be possible to do this on your workstation
with a more modestly sized scaling factor.

merlin


Just tried with two of my test tables.
Updates on 'a' have no (measurable) effect on select from 'b'.

Back to the first case, here's an strace from the backend doing the 
select right after the updates.


Q\0\0\0`select count(*) from test1 ..., 8192, 0, NULL, NULL) = 97
gettimeofday({1310512219, 723762}, NULL) = 0
open(base/16385/33032, O_RDWR)= 8
lseek(8, 0, SEEK_END)   = 1073741824
open(base/16385/33032.1, O_RDWR|O_CREAT, 0600) = 9
lseek(9, 0, SEEK_END)   = 1073741824
open(base/16385/33032.2, O_RDWR|O_CREAT, 0600) = 10
lseek(10, 0, SEEK_END)  = 191348736
open(base/16385/33035, O_RDWR)= 11
lseek(11, 0, SEEK_END)  = 1073741824
open(base/16385/33035.1, O_RDWR|O_CREAT, 0600) = 12
lseek(12, 0, SEEK_END)  = 3571712
lseek(10, 0, SEEK_END)  = 191348736
brk(0x28ad000)  = 0x28ad000
mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
0) = 0x7f5f28ca
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
0) = 0x7f5f28c5f000

munmap(0x7f5f28c5f000, 266240)  = 0
munmap(0x7f5f28ca, 135168)  = 0
open(pg_xlog/000100BB0012, O_RDWR) = 13
lseek(13, 1564672, SEEK_SET)= 1564672
write(13, 
f\320\1\0\1\0\0\0\273\0\0\0\0\340\27\22`\32\0\0002833!000..., 
2400256) = 2400256

fdatasync(13)   = 0
semop(229383, {{9, 1, 0}}, 1)   = 0
gettimeofday({1310512219, 885287}, NULL) = 0
sendto(5, 
\2\0\0\0\300\3\0\0\1@\0\0\t\0\0\0\1\0\0\0\0\0\0\0\353\4\0\0@\0\2\0..., 
960, 0, NULL, 0) = 960
sendto(5, 
\2\0\0\0\300\3\0\0\1@\0\0\t\0\0\0\0\0\0\0\0\0\0\0009\n\0\0@\0\2\0..., 
960, 0, NULL, 0) = 960
sendto(5, 
\2\0\0\0\300\3\0\0\1@\0\0\t\0\0\0\0\0\0\0\0\0\0\0v\n\0\0@\0\2\0..., 
960, 0, NULL, 0) = 960
sendto(5, 
\2\0\0\0\270\1\0\0\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0\275\4\0\0\377\177\0\0..., 
440, 0, NULL, 0) = 440
sendto(6, 
T\0\0\0\36\0\1count\0\0\0\0\0\0\0\0\0\0\24\0\10\377\377\377\377\0\0D..., 
66, 0, NULL, 0) = 66


So the backend definitely writing to the WAL, directly and synchronously.

Selecting the same set of rows again:
Q\0\0\0`select count(*) from test1 ..., 8192, 0, NULL, NULL) = 97
gettimeofday({1310512344, 823728}, NULL) = 0
lseek(10, 0, SEEK_END)  = 191348736
lseek(12, 0, SEEK_END)  = 3571712
lseek(10, 0, SEEK_END)  = 191348736
brk(0x28d5000)  = 0x28d5000
brk(0x2915000)  = 0x2915000
brk(0x2897000)  = 0x2897000
gettimeofday({1310512344, 831043}, NULL) = 0
sendto(5, 
\2\0\0\0\350\0\0\0\1@\0\0\2\0\0\0\1\0\0\0\0\0\0\0\10\201\0\0?\0\2\0..., 232, 
0, NULL, 0) = 232
sendto(6, 
T\0\0\0\36\0\1count\0\0\0\0\0\0\0\0\0\0\24\0\10\377\377\377\377\0\0D..., 
66, 0, NULL, 0) = 66


No writing to the WAL.

-- Lars


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-12 Thread Lars
shared_buffers is big enough to hold the entire database, and there is plenty 
of extra space (verified with pg_buffercache), 
so I don't think that is the reason. 


Tom Lane t...@sss.pgh.pa.us wrote:

Jeff Janes jeff.ja...@gmail.com writes:
 On 7/12/11, lars lhofha...@yahoo.com wrote:
 The fact that a select (maybe a big analytical query we'll run) touching
 many rows will update the WAL and wait
 (apparently) for that IO to complete is making a fully cached database
 far less useful.
 I just artificially created this scenario.

 I can't think of any reason that that WAL would have to be flushed
 synchronously.

Maybe he's running low on shared_buffers?  We would have to flush WAL
before writing a dirty buffer out, so maybe excessive pressure on
available buffers is part of the issue here.

   regards, tom lane



Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-11 Thread lars

On 07/11/2011 10:33 AM, Kevin Grittner wrote:

lars hofhansl lhofha...@yahoo.com wrote:


Yep, I am not seeing the SELECTs slow down (measurably) during
checkpoints (i.e. when dirty pages are flushed to disk), but only
during writing of the WAL files.


How about if you do a whole slew of the UPDATEs and then stop those
and run a bunch of SELECTs?  (I don't think I've seen anything
mentioned so far which rules out hint bit rewrites as an issue.)

I see you have tweaked things to balance the writes -- you might
want to try further adjustments to reduce backend writes and see
what happens.

-Kevin


Hmm... You are right. Stopping the UPDATEs, waiting for any CHECKPOINTs 
to finish, and then running the SELECTs indeed shows a similar slowdown.

Interestingly I see very heavy WAL traffic while executing the SELECTs.
(So I was confused as to what caused the WAL traffic).

Why do changes to the hint bits need to be logged to the WAL? If we 
lose them we can always get that information back from the commit log.
Maybe the backend does not know why the page is dirty and writes it 
to the WAL anyway(?)

If that is the case there seems to be room to optimize that.

-- Lars




Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-11 Thread lars

On 07/11/2011 02:43 PM, Merlin Moncure wrote:

On Mon, Jul 11, 2011 at 2:16 PM, Kevin Grittner
kevin.gritt...@wicourts.gov  wrote:

lars lhofha...@yahoo.com wrote:


Stopping the UPDATEs, waiting for any CHECKPOINTs to finish,
and then running the SELECTs indeed shows a similar slowdown.

Interestingly I see very heavy WAL traffic while executing the
SELECTs. (So I was confused as to what caused the WAL traffic).

Hint bit changes aren't logged, so if it was that you would be
seeing writes to the heap, but not to the WAL.  Clean-up of dead
tuples is logged -- this is probably the result of pruning dead
tuples.  You could probably reduce the impact on your SELECT
statements at least a little by making autovacuum more aggressive.
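The autovacuum tuning Kevin suggests can be expressed per table via storage parameters (available since 8.4); a sketch, with a hypothetical table name and thresholds that are only a starting point:

```sql
-- Vacuum this hot table after ~2% of its rows are dead
-- instead of the default 20% (autovacuum_vacuum_scale_factor = 0.2).
ALTER TABLE test1 SET (
    autovacuum_vacuum_scale_factor = 0.02,
    autovacuum_vacuum_threshold    = 1000
);
```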

yeah.  In fact, I'd like to disable autovacuum completely just to
confirm this.  In particular I'd like to know if that removes wal
traffic when only selects are going on.  Another way to check is to
throw some queries at pg_stat_activity during your select period and
see if there is any non-select activity (like an autovacuum vacuum) going on.
Basically I'm suspicious there is more to this story.
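The pg_stat_activity check Merlin describes might look like this on a 9.1-era server (note that in 9.2+ the columns procpid and current_query were renamed to pid and query):

```sql
-- Look for anything other than the SELECTs themselves (e.g. autovacuum).
SELECT procpid, usename, waiting, current_query
FROM pg_stat_activity
WHERE current_query NOT ILIKE 'select%'
  AND current_query <> '<IDLE>';
```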

hint bit flushing causing i/o during SELECT is a typical complaint
(especially on systems with underperformant i/o), but I'm suspicious
if that's really the problem here. Since you are on a virtualized
platform, I can't help but wonder if you are running into some
bottleneck that you wouldn't see on native hardware.

What's iowait during the slow period?

merlin

Thanks Kevin and Merlin this is extremely helpful...

Ok, that makes much more sense (WALing hint bits did not make sense).

I disabled auto-vacuum and did four tests:
1. Run a bunch of updates, stop that process, wait until checkpointing 
is finished, and run the selects (as before).

2. run VACUUM manually, then run the SELECTs
3. Have the UPDATEs and SELECTs touch mutually exclusive, random sets 
of rows (still in sets of 1).

So the SELECTs are guaranteed not to select rows that were updated.
4. Lastly, change the UPDATEs to update a non-indexed column, to rule 
out index maintenance. Still distinct sets of rows.


In the first case I see the same slowdown (from ~14ms to ~400-500ms). 
pg_stat_activity shows no other load during that

time. I also see write activity only on the WAL volume.

In the 2nd case after VACUUM is finished the time is back to 14ms. As an 
aside: If I run the SELECTs while VACUUM is
running the slowdown is about the same as in the first case until 
(apparently) VACUUM has cleaned up most of the table,

at which point the SELECTs become faster again (~50ms).

In the 3rd case I see exactly the same behavior, which is interesting. 
Both before VACUUM is run and after.
There's no guarantee obviously that distinct rows do not share the same 
page of course especially since the index is

updated as part of this (but see the 4th case).

In case 4 I still see the same issue. Again both before and after VACUUM.

In all cases I see from pg_stat_bgwriter that no backend writes buffers 
directly (but I think that only pertains to dirty buffers, and not the WAL).


So I think I have a partial answer to my initial question.

However, that brings me to some other questions:
Why do SELECTs cause dead tuples to be pruned (as Kevin suggests)?
That even happens when the updates did not touch the selected rows(?)
And why does that slow down the SELECTs? (checkpointing activity on the 
EBS volume holding the database for example
does not slow down SELECTs at all, only WAL activity does). Does the 
selecting backend do that work itself?


Lastly, is this documented somewhere? (I apologize if it is and I missed 
it). If not I'd be happy to write a wiki entry for this.





Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-11 Thread lars


Oh, and iowait hovers around 20% when SELECTs are slow:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.54    0.00    0.98   18.49    0.07   78.92

When SELECTs are fast it looks like this:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.72    0.00    0.26    0.00    0.00   91.01

Note that this is a 12 core VM. So one core at 100% would show as 8.33% CPU.




Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-11 Thread lars

On 07/11/2011 08:26 AM, Robert Klemme wrote:

On Mon, Jul 11, 2011 at 3:13 PM, k...@rice.edu wrote:

I do not know if this makes sense in PostgreSQL, given that readers
do not block writers and writers do not block readers. Are your
UPDATEs to individual rows, each in a separate transaction, or
do you UPDATE multiple rows in the same transaction? If you
perform multiple updates in a single transaction, you are
synchronizing the changes to that set of rows and that constraint
is causing other readers that need to get the correct values post-
transaction to wait until the COMMIT completes. This means that
the WAL write must be completed.

What readers should that be?  Docs explicitly state that readers are
never blocked by writers:
http://www.postgresql.org/docs/9.0/interactive/mvcc-intro.html
http://www.postgresql.org/docs/9.0/interactive/mvcc.html

From what I understand about this issue the observed effect must be
caused by the implementation and not by a conceptual issue with
transactions.


Have you tried disabling synchronous_commit? If this scenario
holds, you should be able to reduce the slowdown by un-batching
your UPDATEs, as counter-intuitive as that is. This seems to
be similar to a problem that I have been looking at with using
PostgreSQL as the backend to a Bayesian engine. I am following
this thread with interest.

I don't think this will help (see above).  Also, I would be very
cautious to do this because although the client might get a faster
acknowledge the DB still has to do the same work as without
synchronous_commit (i.e. WAL, checkpointing etc.) but it still has to
do significantly more transactions than in the batched version.

Typically there is an optimum batch size: if batch size is too small
(say, one row) the ratio of TX overhead to work is too bad.  If
batch size is too large (say, millions of rows) you hit resource
limitations (memory) which inevitable force the RDBMS to do additional
disk IO.

Kind regards

robert


Thanks Ken and Robert,

What I am observing is definitely not readers blocked by writers by 
means of row-level locking.


This seems to be some implementation detail in Postgres about how dirty 
pages (or dead versions of tuples) are
flushed to the disk (see the other parts of this thread) when they are 
accessed by a SELECT query.


The batch size in this case is one SELECT statement accessing 1 rows 
via an aggregate (such as COUNT) and

an UPDATE updating 1 rows in a single statement.

I am not trying to optimize this particular use case, but rather to 
understand what Postgres is doing, and why SELECT
queries are affected negatively (sometimes severely) by concurrent (or 
even preceding) UPDATEs at all when the

database resides in the cache completely.

-- Lars




[PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-10 Thread lars
I am doing some research that will hopefully lead to replacing a big 
Oracle installation with a set of PostgreSQL servers.


The current Oracle installation consists of multiple RAC clusters 
with 8 RAC nodes each. Each RAC node has 256gb of

memory (to be doubled soon).
The nature of our service is such that over a reasonable time (a day or 
so) the database *is* the working set.


So I am looking at Postgres in a context where (almost) all of the data 
is cached and disk IO is only required for persistence.


Setup:
PostgreSQL 9.1beta2 on a high memory (~68gb, 12 cores) EC2 Linux 
instance (kernel 2.6.35) with the database and
WAL residing on the same EBS volume with EXT4 (data=ordered, barriers=1) 
- yes that is not an ideal setup
(WAL should be on a separate drive, EBS is slow to begin with, etc), but I am 
mostly interested in read performance for a fully cached database.


shared_buffers: varied between 1gb and 20gb
checkpoint_segments/timeout: varied accordingly between 16-256 and 
5-10m, resp.
bgwriter tweaked to get a good distribution of checkpoints, bg-writes, 
and backend writes.

wal_sync_method: tried fdatasync and open_datasync.

I read PostgreSQL 9.0 high performance, and have spent some 
significant amount of time on this already.


PostgreSQL holds up extremely well, once things like storing 
hint-bits, checkpoints vs bgwriter vs backend_writes, etc
are understood. I installed pg_buffercache and pgfincore to monitor how 
and where the database is stored.
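A residency check with pg_buffercache might be sketched as follows (the join via pg_relation_filenode() assumes 9.0+, and the contrib module must be installed):

```sql
-- Which relations in the current database occupy the most shared buffers?
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database())
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```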


There is one observation that I wasn't able to explain:
A SELECT only client is severely slowed down by a concurrent client 
performing UPDATES on the same table the other
client selects from, even when the database resides 100% in the cache (I 
tried with shared_buffers large enough to hold
the database, and also with a smaller setting relying on the OS cache, 
the behavior is the same).


As long as only the reader is running I get great performance (20-30ms, 
query reading a random set of about 1 rows
out of 100m row table in a single SELECT). The backend is close to 100% 
cpu, which is what I want in a cached database.


Once the writer starts the read performance drops almost immediately to 
200ms.
The reading backend's cpu drops to 10%, and it is mostly waiting (D 
state in top).
The UPDATE touches a random set of also about 1 rows (in one update 
statement, one of the columns touched is

indexed - and that is the same index used for the SELECTs).

What I would have expected is that the SELECTs would just continue to 
read from the cached buffers (whether dirtied
or not) and not be affected by concurrent updates. I could not find 
anything explaining this.


The most interesting part:
this does not happen with an exact clone of that relation that is 
UNLOGGED. The same amount of buffers get dirty,
the same amount of checkpointing, bgwriting, vacuuming. The only difference 
is WAL maintenance as far as I can tell.


Is there some (intentional or not) synchronization between backends when 
the WAL is maintained? Are there times when a
read-only query needs to compete for disk IO when everything is cached? Or 
are there any other explanations?


I am happy to provide more information. Although I am mainly looking for 
a qualitative answer, which could explain this behavior.


Thanks.

-- Lars




Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-10 Thread lars
I have since moved the WAL to its own EBS volume (ext4, data=writeback) 
to make it easier to monitor IO.
The times where the SELECTs slow down coincide with heavy write traffic 
to the WAL volume.


Maybe this has to do with WALInsertLock or WALWriteLock (or some other 
lock).
Since the slowdown was less severe with WAL on its own volume, it seems 
some exclusive lock on the pages in
shared_buffers is held while WAL IO is in progress(?) - that would be 
frustrating. (wal_buffers defaults to 16mb in my setup)


Next I am going to have a look at the code. I would be thankful for any 
further insights, though :)


Thanks.

-- Lars







[PERFORM] Statistics and Multi-Column indexes

2011-07-10 Thread lars

I know this has been discussed various times...

We are maintaining a large multi tenant database where *all* tables have 
a tenant-id and all indexes and PKs lead with the tenant-id.
Statistics and counts for the all other columns are only really 
meaningful within the context of the tenant they belong to.


There appear to be five options for me:
1. Use single column indexes on all interesting columns and rely on 
PostgreSQL's bitmap index scans to combine them (which are pretty cool).
2. Use multi column indexes and accept that sometimes Postgres pick the 
wrong index (because a non-tenant-id
column might seem highly selective over the table, but it is not for a 
particular tenant - or vice versa).
3. Use a functional index that combines multiple columns and only query 
via these, that causes statistics

gathering for the expression.
I.e. create index i on t((tenantid||column1)) and SELECT ... FROM t 
WHERE tenantid||column1 = '...'
4. Play with n_distinct and/or set the statistics for the inner columns 
to some fixed values that lead to the plans that we want.

5. Have a completely different schema and maybe a database per tenant.
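Option #3 spelled out as a hedged sketch (the separator is an addition here, to avoid ambiguous concatenations; ANALYZE gathers statistics on the indexed expression, and queries must repeat it verbatim):

```sql
-- Expression index: ANALYZE collects stats for the combined expression.
CREATE INDEX i_tenant_col1 ON t ((tenantid || '|' || column1));

-- Only queries using the exact same expression can use this index:
SELECT count(*) FROM t WHERE tenantid || '|' || column1 = '42|some_value';
```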

Currently we use Oracle and query hinting, but I do not like that 
practice at all (and Postgres does not have hints anyway).

Are there any other options?

#1 would be the simplest, but I am concerned about the overhead, both 
maintaining two indexes and building the bitmap during queries - for 
every query.


I don't think #2 is actually an option. We have some tenants with many 
(sometimes 100s) millions of rows per table,

and picking the wrong index would be disastrous.

Could something like #3 be generally added to Postgres? I.e. if there is 
a multi column index keep combined statistics for
the involved columns. Of course in that case is it no longer possible to 
query the index by prefix.
#3 also seems expensive as the expression needs to be evaluated for each 
changed row.


Still trying #4. I guess it involves setting the stat target for the 
inner columns to 0 and then inserting my own records into
pg_statistic. Probably only setting n_distinct, i.e. set it low if the 
inner column is not selective within the context of a tenant and high 
otherwise.
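For option #4, PostgreSQL 9.0+ offers a declarative n_distinct override, which avoids writing to pg_statistic directly (a sketch; table and column names hypothetical):

```sql
-- Positive values are an absolute distinct count;
-- negative values (-1 .. 0) mean a fraction of the row count.
ALTER TABLE t ALTER COLUMN column1 SET (n_distinct = 100);
ANALYZE t;  -- the override takes effect at the next ANALYZE
```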


For various reasons #5 is also not an option.

And of course the same set of questions comes up with joins.

Thanks.

-- Lars




Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-10 Thread lars hofhansl
Thanks Craig.

Yep, I am not seeing the SELECTs slow down (measurably) during checkpoints 
(i.e. when dirty pages are flushed to disk), but only during writing of the 
WAL files. The shared buffers and OS cache are big enough to hold the entire 
database, so evicting cached data should not be necessary. (The database 
including indexes can fit into 16 or so GB, and I have 68GB on that machine.)
Interestingly I initially thought there might be a correlation between 
checkpointing and slower SELECTs, but it turns out that checkpointing just 
slowed down IO to the WAL - until I moved it to its own drive, which then 
increased the effect I was seeing.

I'll do more research and try to provide more useful details.

Thanks for the pg_catalog link, I'll have a look at it.

-- Lars



- Original Message 
From: Craig Ringer cr...@postnewspapers.com.au
To: pgsql-performance@postgresql.org
Sent: Sun, July 10, 2011 4:11:39 PM
Subject: Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

On 11/07/2011 4:34 AM, lars wrote:
 I have since moved the WAL to its own EBS volume (ext4, data=writeback)
 to make it easier to monitor IO.
 The times where the SELECTs slow down coincide with heavy write traffic
 to the WAL volume.

In theory, UPDATEs shouldn't be blocking or slowing SELECTs. Whether that holds 
up to the light of reality, real-world hardware, and software implementation 
detail, I really don't know. I avoided responding to your first mail because I 
generally work with smaller and less performance critical databases so I 
haven't 
accumulated much experience with fine-tuning.

If your SELECTs were slower *after* your UPDATEs I'd be wondering if your 
SELECTs are setting hint bits on the pages touched by the UPDATEs. See: 
http://wiki.postgresql.org/wiki/Hint_Bits . It doesn't sound like that's the 
case if the SELECTs are slowed down *during* a big UPDATE that hasn't yet 
committed, though.

Could it just be cache pressure - either on shm, or operating system disk 
cache? 
All the dirty buffers that have to be flushed to WAL and to the heap may be 
evicting cached data your SELECTs were benefitting from. Unfortunately, 
diagnostics in this area are ... limited ... though some of the pg_catalog 
views 
(http://www.postgresql.org/docs/9.0/static/catalogs.html) may offer some 
information.

--
Craig Ringer

POST Newspapers
276 Onslow Rd, Shenton Park
Ph: 08 9381 3088 Fax: 08 9388 2258
ABN: 50 008 917 717
http://www.postnewspapers.com.au/





Re: [PERFORM] very long updates very small tables

2011-04-04 Thread Lars Feistner



On 03/30/2011 06:54 PM, Kevin Grittner wrote:

Lars Feistnerfeist...@uni-heidelberg.de  wrote:

On 03/29/2011 09:28 PM, Kevin Grittner wrote:

Lars Feistnerfeist...@uni-heidelberg.de   wrote:


The log tells me that certain update statements take sometimes
about 3-10 minutes. But we are talking about updates on tables
with 1000 to 1 rows and updates that are supposed to update
1 row.


The top possibilities that come to my mind are:



[all eliminated as possibilities]


If you haven't already done so, you should probably turn on
checkpoint logging to see if this corresponds to checkpoint
activity.  If it does, you can try cranking up how aggressive your
background writer is, and perhaps limiting your shared_buffers to
something around the size of your RAID controller's BBU cache.  (I
hope you have a RAID controller with BBU cache configured for
write-back, anyway.)
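The checkpoint logging mentioned above is a postgresql.conf setting (a sketch; both parameters exist in 8.4):

```ini
# postgresql.conf
log_checkpoints    = on    # log each checkpoint with buffer counts and timings
checkpoint_warning = 30s   # warn when checkpoints occur more often than this
```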

-Kevin



Hello Kevin,

I am sorry to disappoint you here. As I said in my first e-mail, we don't 
have much traffic and the database fits easily into memory. The traffic 
might increase; at least it was increasing over the last 12 months. The 
database will always fit into memory.
No, we don't have a RAID and thus we don't have a BBU. Actually we 
started off with a big SAN that our data centre offered. But sometimes 
this SAN was a bit slow, and when we first encountered the very long 
updates I thought there was a connection between the long-running 
updates and the slowness of the SAN, so I started to use the local 
disk (we are talking about one disk, not disks) for the database. I am 
still seeing the long-running inserts and updates. I am still following 
the autovacuum trail; it does still not run frequently enough. Thanks a 
lot for the replies so far. I will keep you guys informed about my next 
steps and the results.


Thanx a lot
Lars

--
~~~
Lars Feistner

Kompetenzzentrum für Prüfungen in der Medizin
Medizinische Fakultät Heidelberg,
Im Neuenheimer Feld 346, Raum 013
69120 Heidelberg

E-Mail: feist...@uni-heidelberg.de
Fon:   +49-6221-56-8269
Fax:   +49-6221-56-7175

WWW:   http://www.ims-m.de
   http://www.kompmed.de
~~~



Re: [PERFORM] very long updates very small tables

2011-03-30 Thread Lars Feistner

Hello Kevin,


On 03/29/2011 09:28 PM, Kevin Grittner wrote:

Lars Feistnerfeist...@uni-heidelberg.de  wrote:


The log tells me that certain update statements take sometimes
about 3-10 minutes. But we are talking about updates on tables
with 1000 to 1 rows and updates that are supposed to update 1
row.


The top possibilities that come to my mind are:

(1)  The tables are horribly bloated.  If autovacuum is off or not
aggressive enough, things can degenerate to this level.

Some tables are autovacuumed regularly, others are not. The specific 
table extjs_recentlist was never autovacuumed. So I would think that 
updates on this table should always be very slow, but they are not - only 
every 4th or 5th day for maybe half an hour, and then everything is fine 
again. And ;-) there is no antivirus installed.

(2)  Memory is over-committed and your machine is thrashing.

We can rule this out. There is enough memory installed and the database 
is less than 500MB.

(3)  There are explicit LOCK commands in the software which is
contributing to the blocking.
We use the JDBC driver. The JDBC driver might do some locking, but we 
don't.


(4)  There is some external delay within the transaction, such as
waiting for user input while the transaction is open.


No, no user interaction within a transaction.

Maybe there's a combination of the above at play.  Can you rule any
of these out?

-Kevin

So, I will try to get autovacuum to be more aggressive and will 
report again if nothing changes.


Thanks a lot.
Lars

--
~~~
Lars Feistner

Kompetenzzentrum für Prüfungen in der Medizin
Medizinische Fakultät Heidelberg,
Im Neuenheimer Feld 346, Raum 013
69120 Heidelberg

E-Mail: feist...@uni-heidelberg.de
Fon:   +49-6221-56-8269
Fax:   +49-6221-56-7175

WWW:   http://www.ims-m.de
   http://www.kompmed.de
~~~



[PERFORM] very long updates very small tables

2011-03-29 Thread Lars Feistner

Dear list,

we have got a web application, and when people log in some information is 
written to the user tables. We have around 500 active users, but at most 
5 people are logged in at the same time. From time to time people 
log in and then the application is not responsive any more.


What we see in the postgres server logs is that processes are waiting 
for other transactions to finish though not because of a deadlock.


The log tells me that certain update statements sometimes take about 
3-10 minutes. But we are talking about updates on tables with 1000 to 
1 rows, and updates that are supposed to update 1 row.


We are running under Windows 2008 and Postgres 8.4.7. (Sorry for the 
Windows; it was not MY first choice.)


My only explanation at the moment would be that there must be some kind 
of Windows process that stops all other processes until it is finished, 
or something like that. (Could it even be autovacuum?) Is there a 
way to find out how long autovacuum took? Has anyone seen anything 
similar? Or could it really be that we need a bigger machine with more 
I/O? But the one disk in the system still does not seem very busy, and 
response times in the Windows resource monitor are no higher than 28 ms.


Following is an excerpt of our server log.

LOG:  process 1660 acquired ShareLock on transaction 74652 after 
533354.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  process 4984 acquired ShareLock on transaction 74652 after 
1523530.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  process 956 acquired ExclusiveLock on tuple (4,188) of relation 
16412 of database 16384 after 383055.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  process 4312 acquired ExclusiveLock on tuple (9,112) of relation 
16412 of database 16384 after 1422677.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  duration: 1523567.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '1362'
LOG:  duration: 533391.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '31'
LOG:  process 5504 acquired ExclusiveLock on tuple (9,112) of relation 
16412 of database 16384 after 183216.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  process 1524 acquired ExclusiveLock on tuple (4,188) of relation 
16412 of database 16384 after 376370.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  duration: 1422688.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '1362'
LOG:  duration: 383067.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 'f', $2 = '31'
LOG:  process 4532 acquired ExclusiveLock on tuple (9,112) of relation 
16412 of database 16384 after 118851.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  process 4448 acquired ExclusiveLock on tuple (4,188) of relation 
16412 of database 16384 after 366304.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  duration: 183241.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '1362'
LOG:  duration: 376395.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '31'
LOG:  process 4204 acquired ExclusiveLock on tuple (4,188) of relation 
16412 of database 16384 after 339893.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  duration: 366342.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '31'
LOG:  process 4760 acquired ExclusiveLock on tuple (4,188) of relation 
16412 of database 16384 after 205943.000 ms

STATEMENT:  UPDATE extjs_recentlist SET visible=$1 WHERE recentlist_id=$2
LOG:  duration: 339923.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '31'
LOG:  duration: 205963.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '31'
LOG:  duration: 124654.000 ms  execute unnamed: UPDATE 
extjs_recentlist SET visible=$1 WHERE recentlist_id=$2

DETAIL:  parameters: $1 = 't', $2 = '1362'
LOG:  process 3844 still waiting for ShareLock on transaction 74839 
after 8000.000 ms
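
As an aside, the wait times scattered through an excerpt like this can be totaled per lock type with a few lines of script. This is a hypothetical helper; it assumes the "acquired ... after N ms" wording above, with each log line unwrapped onto a single line:

```python
import re

# Matches e.g. "... acquired ShareLock on transaction 74652 after 533354.000 ms"
WAIT_RE = re.compile(r"acquired (\w+Lock) .+ after ([\d.]+) ms")

def total_wait_ms(log_lines):
    """Sum the reported lock-wait durations, grouped by lock type."""
    totals = {}
    for line in log_lines:
        m = WAIT_RE.search(line)
        if m:
            lock, ms = m.group(1), float(m.group(2))
            totals[lock] = totals.get(lock, 0.0) + ms
    return totals
```

Feeding it the two ShareLock lines above, for example, reports roughly 34 minutes of cumulative waiting on transaction 74652.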


Thanx in advance.

Lars

Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-19 Thread Lars
 No idea what mysql thinks a shard is, but in PG we have read-only hot 
 standbys.
I used sharding as an expression for partitioning data into several databases.
Each user in the system is unaware of any other user. The user never accesses 
the private data of another user. Each user could in theory be assigned their 
own database server. This makes it easy to split the 4 users over a number 
of database servers. There is some shared data that is stored in a special 
shared database.
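
For illustration, that routing rule (each user pinned to one shard, shared data on its own server) can be sketched in a few lines. All names here are hypothetical, since the thread does not show the actual mapping:

```python
from typing import Optional

# Hypothetical server names; the real deployment is not described in the thread.
SHARDS = ["shard0.example", "shard1.example", "shard2.example", "shard3.example"]
SHARED = "shared.example"

def shard_for_user(user_id: int) -> str:
    """Stable user -> shard mapping: a user's private data lives on one shard only."""
    return SHARDS[user_id % len(SHARDS)]

def host_for(user_id: Optional[int] = None) -> str:
    """Queries with no owning user go to the shared database."""
    return SHARED if user_id is None else shard_for_user(user_id)
```

Adding shard servers then only means extending SHARDS and rebalancing the modulo (or switching to a lookup table, so existing users stay put).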

 The standby database is exactly the same as the master (save a bit of
 data that has not been synced yet.)  I assume you know this... but I'd
 really recommend trying out PG's hot-standby and make sure it works the
 way you need (because I bet its different than mysql's).

 Assuming the shared and the sharded databases are totally different
 (lets call them database a and c), with the PG setup you'd have database
 a on one computer, then one master with database b on it (where all
 writes go), then several hot-standby's mirroring database b (that
 support read-only queries).
As our data is easily partitioned into any number of servers we do not plan to 
use replication for load balancing. We do however plan to use it to set up a 
backup site.

  Its pretty hard to guess what your usage pattern is (70% read,
   small columns, no big blobs (like photos), etc)... and even then we'd
 still have to guess.
It's more like 40% read, 60% write.

 Not only will I not compare apples to oranges, but I really wont compare
 apples in Canada to oranges in Japan. :-)
Hehe

/Lars



Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-19 Thread Lars

 Are you going to RAID the SSD drives at all?
Yes, I was thinking four drives in RAID 10 and a (hot) spare drive...

 Of course this is based on my experience, and I have my fireproof suit since
 I mentioned the word fusionIO :)
Hehe

FusionIO has some impressive stats!
SSD in RAID10 provides redundancy in case of disc failure. How do you handle 
this with fusionIO? Two mirrored cards?

/Lars



Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-19 Thread Lars
Thanks for the reply!

MyISAM was chosen back in 2000. I'm not aware of the reasoning behind this 
choice...

Dell claims both the Samsung and the Pliant are safe to use.
Below is a quote from the Pliant datasheet:
No Write Cache: Pliant EFDs deliver outstanding write performance
without any dependence on write cache and thus does not use
battery/supercap.

 As others have mentioned, how are you going to be doing your shards?
Hmm... shards might not have been a good word to describe it. I'll paste what I 
wrote in another reply:
I used sharding as an expression for partitioning data into several databases.
Each user in the system is unaware of any other user. The user never accesses 
the private data of another user. Each user could in theory be assigned their 
own database server. This makes it easy to split the 4 users over a number 
of database servers. There is some shared data that is stored in a special 
shared database.

/Lars

-Original Message-
From: mark [mailto:dvlh...@gmail.com]
Sent: 19 January 2011 05:10
To: Lars
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Migrating to Postgresql and new hardware

Comments in line, take em for what you paid for em.



 -Original Message-
 From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
 ow...@postgresql.org] On Behalf Of Lars
 Sent: Tuesday, January 18, 2011 3:57 AM
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] Migrating to Postgresql and new hardware

 Hi,

 We are in the process of moving a web based application from a MySql to
 Postgresql database.
 Our main reason for moving to Postgresql is problems with MySql
 (MyISAM) table locking.

I would never try and talk someone out of switching but MyISAM? What
version of MySQL and did you pick MyISAM for a good reason or just happened
to end up there?



 We will buy a new set of servers to run the Postgresql databases.

 The current setup is five Dell PowerEdge 2950 with 2 *  XEON E5410, 4GB
 RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare).

 One server is used for shared data.
 Four servers are used for sharded data. A user in the system only has
 data in one of the shards.
 There is another server to which all data is replicated but I'll leave
 that one out of this discussion.
 These are dedicated database servers. There are more or less no stored
 procedures. The shared database size is about 20GB and each shard
 database is about 40GB (total of 20 + 40 * 4 = 180GB). I would expect
 the size will grow 10%-15% this year. Server load might increase with
 15%-30% this year. This setup is disk I/O bound. The overwhelming
 majority of sql statements are fast (typically single row selects,
 updates, inserts and deletes on primary key) but there are some slow
 long running (10min) queries.

 As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb
 RAM, H700 512MB NV Cache.

One would think you should notice a nice speed improvement, ceteris paribus,
since the X5650 will have -significantly- more memory bandwidth than the
5410s you are used to, and you are going to have a heck of a lot more RAM
for things to cache in. I think the H700 is a step up in RAID cards as well,
but with only 4 disks you're probably not maxing out there.



 Dell has offered two alternative SSDs:
 Samsung model SS805 (100GB Solid State Disk SATA 2.5).
 (http://www.samsung.com/global/business/semiconductor/products/SSD/Products_Enterprise_SSD.html)
 Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5).
 (http://www.plianttechnology.com/lightning_lb.php)

The Samsung ones seem to indicate that they have protection in the event of
a power failure, and the Pliant does not mention it.

Granted I haven't done or seen any pull the plug under max load tests on
either family, so I got nothing beyond that it is the first thing I have
looked at with every SSD that crosses my path.




 Both are SLC drives. The price of the Pliant is about 2.3 times the
 price of the Samsung (does it have twice the performance?).

 One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung
 drives (4 in RAID 10 + 1 spare).
 Another alternative would be 3 servers (1 shared and 2 shards) with 5
 Pliant drives (4 in RAID 10 + 1 spare). This would be slightly more
 expensive than the first alternative but would be easier to upgrade
 with two new shard servers when it's needed.

As others have mentioned, how are you going to be doing your shards?




 Anyone have experience using the Samsung or the Pliant SSD? Any
 information about degraded performance over time?
 Any comments on the setups? How would an alternative with 15K disks (6
 RAID 10 + 1 spare, or even 10 RAID10 + 1 spare) compare?


You may still find that breaking xlog out onto its own logical drive (2 drives
in RAID 1) gives a speed improvement overall. YMMV, so tinker and
find out before you go deploying.

 How would these alternatives compare in I/O performance

[PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread Lars
Hi,

We are in the process of moving a web based application from a MySql to 
Postgresql database.
Our main reason for moving to Postgresql is problems with MySql (MyISAM) table 
locking.
We will buy a new set of servers to run the Postgresql databases.

The current setup is five Dell PowerEdge 2950 with 2 *  XEON E5410, 4GB RAM. 
PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare).

One server is used for shared data.
Four servers are used for sharded data. A user in the system only has data in 
one of the shards.
There is another server to which all data is replicated but I'll leave that one 
out of this discussion.
These are dedicated database servers. There are more or less no stored 
procedures. The shared database size is about 20GB and each shard database is 
about 40GB (total of 20 + 40 * 4 = 180GB). I would expect the size to grow 
10%-15% this year. Server load might increase by 15%-30% this year. This 
setup is disk I/O bound. The overwhelming majority of SQL statements are fast 
(typically single row selects, updates, inserts and deletes on primary key) but 
there are some slow long running (10min) queries.

As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb RAM, H700 
512MB NV Cache.
Dell has offered two alternative SSDs:
Samsung model SS805 (100GB Solid State Disk SATA 2.5).
(http://www.samsung.com/global/business/semiconductor/products/SSD/Products_Enterprise_SSD.html)
Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5). 
(http://www.plianttechnology.com/lightning_lb.php)

Both are SLC drives. The price of the Pliant is about 2.3 times the price of 
the Samsung (does it have twice the performance?).

One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung drives (4 
in RAID 10 + 1 spare).
Another alternative would be 3 servers (1 shared and 2 shards) with 5 Pliant 
drives (4 in RAID 10 + 1 spare). This would be slightly more expensive than the 
first alternative but would be easier to upgrade with two new shard servers 
when it's needed.

Anyone have experience using the Samsung or the Pliant SSD? Any information 
about degraded performance over time?
Any comments on the setups? How would an alternative with 15K disks (6 RAID 10 
+ 1 spare, or even 10 RAID10 + 1 spare) compare?
How would these alternatives compare in I/O performance compared to the old 
setup?
Anyone care to guess how the two alternatives would compare in performance 
running Postgresql?
How would the hardware usage of Postgresql compare to MySql's?



Regards
/Lars



Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS

2007-01-02 Thread Lars Heidieker



On 2 Jan 2007, at 14:54, Ragnar wrote:


On þri, 2007-01-02 at 09:04 -0500, Geoffrey wrote:

Alvaro Herrera wrote:


Actually it has been suggested that a combination of ext2 (for WAL) and
ext3 (for data, with data journalling disabled) is a good performer.
AFAIK you don't want the overhead of journalling for the WAL partition.


I'm curious as to why ext3 for data with journalling disabled?  Would
that not be the same as ext2?


I believe Alvaro was referring to ext3 with journalling enabled
for meta-data, but not for data.
I also believe this is the standard ext3 configuration, but I
could be wrong on that.

gnari



It doesn't really belong here, but ext3 has three journaling modes:
data=journal (both data and metadata journaled),
data=ordered (metadata journaled, but data written before metadata; the default), and
data=writeback (metadata-only journaling).

The performance difference between the ordered and metadata-only
journaling modes should be very small anyway.

--

Best regards,
Lars Heidieker

[EMAIL PROTECTED]
http://paradoxon.info

--

Mystical explanations.
Mystical explanations are considered deep;
the truth is that they are not even superficial.
 -- Friedrich Nietzsche




---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate