Re: [PERFORM] RAID Controller (HP P400) beat by SW-RAID?

2011-09-13 Thread Arjen van der Meijden


On 12-9-2011 0:44 Anthony Presley wrote:

A few weeks back, we purchased two refurb'd HP DL360's G5's, and were
hoping to set them up with PG 9.0.2, running replicated.  These machines
have (2) 5410 Xeon's, 36GB of RAM, (6) 10k SAS drives, and are using the
HP SA P400i with 512MB of BBWC.  PG is running on an ext4 (noatime)
partition, and the drives configured as RAID 1+0 (it seems with this
controller, I cannot do JBOD).


If you really want a JBOD-setup, you can try a RAID0 for each available 
disk, i.e. in your case 6 separate RAID0's. That's how we configured our 
Dell H700 - which doesn't offer JBOD either - for ZFS.


Best regards,

Arjen

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] PG 9.x prefers slower Hash Joins?

2011-09-13 Thread Anthony Presley
In relation to my previous thread (about SW RAID vs. HW RAID on a P400), I
was able to narrow down the filesystem speed and in general, our new system
(running PG 9.1) is about 3x - 5x faster on the IO.

In looking at the query plans in more depth, it appears that PG 9.0 and 9.1
are both preferring to do hash joins, which seem to take the same time regardless of caching and
are slower than PG 8.4 doing an index scan.

For example, on PG 9.x:
  http://explain.depesz.com/s/qji - This takes 307ms, all the time.  Doesn't
matter if it's cached, or fresh from a reboot.

Same query on PG 8.4:
  http://explain.depesz.com/s/8Pd - This can take 2-3s the first time, but
then takes 42ms once it's cached.

Both of these servers have the same indexes, similar postgresql.conf, and
almost identical data.  However, the old server is doing some different
planning than the new server.  I've run analyze on both of these databases.
 Some relevant PG parameters:

  max_connections = 150
  shared_buffers = 6400MB (have tried as high as 20GB)
  work_mem = 20MB (have tried as high as 100MB)
  effective_io_concurrency = 6
  fsync = on
  synchronous_commit = off
  wal_buffers = 16MB
  checkpoint_segments = 30  (have tried 200 when I was loading the db)
  random_page_cost = 2.5
  effective_cache_size = 10240MB  (have tried as high as 16GB)

If I disable the hashjoin, I get massive improvements on PG 9.x ... as fast
(or faster) than our PG 8.4 instance.
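
For anyone wanting to reproduce this, the hash-join preference can be toggled
per session rather than in postgresql.conf; the sketch below uses placeholder
table and column names, not the actual query from the plans above:

  BEGIN;
  SET LOCAL enable_hashjoin = off;   -- planner switch, for testing only
  EXPLAIN ANALYZE
  SELECT *
  FROM orders o
  JOIN order_items i ON i.order_id = o.id
  WHERE o.customer_id = 42;
  ROLLBACK;                          -- SET LOCAL ends with the transaction

Comparing EXPLAIN ANALYZE output with and without the SET LOCAL shows how the
two plans actually compare on the same data.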


-- 
Anthony Presley


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Marti Raudsepp
On Tue, Sep 13, 2011 at 00:26, Robert Klemme shortcut...@googlemail.com wrote:
 In the case of PG this particular example will work:
 1. TX inserts new PK row
 2. TX tries to insert same PK row => blocks
 1. TX commits
 2. TX fails with PK violation
 2. TX does the update (if the error is caught)

That goes against the point I was making in my earlier comment. In
order to implement this error-catching logic, you'll have to allocate
a new subtransaction (transaction ID) for EVERY ROW you insert. If
you're going to be loading billions of rows this way, you will invoke
the wrath of the vacuum freeze process, which will seq-scan all
older tables and re-write every row that it hasn't touched yet. You'll
survive it if your database is a few GB in size, but in the terabyte
land that's unacceptable. Transaction IDs are a scarce resource there.

In addition, such blocking will limit the parallelism you will get
from multiple inserters.

Regards,
Marti

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Robert Klemme
On Tue, Sep 13, 2011 at 5:13 PM, Marti Raudsepp ma...@juffo.org wrote:
 On Tue, Sep 13, 2011 at 00:26, Robert Klemme shortcut...@googlemail.com 
 wrote:
 In the case of PG this particular example will work:
 1. TX inserts new PK row
 2. TX tries to insert same PK row => blocks
 1. TX commits
 2. TX fails with PK violation
 2. TX does the update (if the error is caught)

 That goes against the point I was making in my earlier comment. In
 order to implement this error-catching logic, you'll have to allocate
 a new subtransaction (transaction ID) for EVERY ROW you insert.

I don't think so.  You only need to catch the error (see attachment).
Or does this create a sub transaction?
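
(The ins.sql attachment isn't reproduced here; presumably it is the usual
exception-catching insert, roughly like the sketch below, where the function,
table and column names are guesses:)

  create or replace function insert_or_update(a_id integer, a_dat integer)
  returns void language plpgsql as $$
  begin
    insert into t (id, dat) values (a_id, a_dat);
  exception
    when unique_violation then
      -- row already there: fall back to an update
      update t set dat = a_dat where id = a_id;
  end;
  $$;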

 If
 you're going to be loading billions of rows this way, you will invoke
 the wrath of the vacuum freeze process, which will seq-scan all
 older tables and re-write every row that it hasn't touched yet. You'll
 survive it if your database is a few GB in size, but in the terabyte
 land that's unacceptable. Transaction IDs are a scarce resource there.

Certainly.  But it's not needed as far as I can see.

 In addition, such blocking will limit the parallelism you will get
 from multiple inserters.

Yes, I mentioned the speed issue.  But regardless of the solution for
MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned, you
will have the locking problem anyhow if you plan to insert
concurrently into the same table and be robust.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


ins.sql
Description: Binary data

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Marc Mamin
Hi,

 (see attachment)

Under high concurrency you may expect that your data is already in.
In such a case you'd better do nothing at all:

begin

  select dat = a_dat from t where id = a_id into test;

  if test is null then

    begin
      insert into t (id, dat) values (a_id, a_dat);
    exception
      when unique_violation then
        update t set dat = a_dat where id = a_id and dat <> a_dat;
        return 0;
    end;

  elsif not test then

    update t set dat = a_dat where id = a_id;
    return 0;

  end if;

  return 1;

end;

best regards,

Marc Mamin

-Original Message-
From: pgsql-performance-ow...@postgresql.org on behalf of Robert Klemme
Sent: Tue 9/13/2011 6:34
To: Marti Raudsepp
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
 
On Tue, Sep 13, 2011 at 5:13 PM, Marti Raudsepp ma...@juffo.org wrote:
 On Tue, Sep 13, 2011 at 00:26, Robert Klemme shortcut...@googlemail.com 
 wrote:
 In the case of PG this particular example will work:
 1. TX inserts new PK row
 2. TX tries to insert same PK row => blocks
 1. TX commits
 2. TX fails with PK violation
 2. TX does the update (if the error is caught)

 That goes against the point I was making in my earlier comment. In
 order to implement this error-catching logic, you'll have to allocate
 a new subtransaction (transaction ID) for EVERY ROW you insert.

I don't think so.  You only need to catch the error (see attachment).
Or does this create a sub transaction?

 If
 you're going to be loading billions of rows this way, you will invoke
 the wrath of the vacuum freeze process, which will seq-scan all
 older tables and re-write every row that it hasn't touched yet. You'll
 survive it if your database is a few GB in size, but in the terabyte
 land that's unacceptable. Transaction IDs are a scarce resource there.

Certainly.  But it's not needed as far as I can see.

 In addition, such blocking will limit the parallelism you will get
 from multiple inserters.

Yes, I mentioned the speed issue.  But regardless of the solution for
MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you
will have the locking problem anyhow if you plan to insert
concurrently into the same table and be robust.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Marti Raudsepp
On Tue, Sep 13, 2011 at 19:34, Robert Klemme shortcut...@googlemail.com wrote:
 I don't think so.  You only need to catch the error (see attachment).
 Or does this create a sub transaction?

Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
SAVEPOINT it can roll back to in case of an error.
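
In plain SQL the same per-row logic would be written with an explicit
savepoint, which is what the PL/pgSQL block does under the hood (illustrative
sketch only; the table name is made up):

  BEGIN;
  SAVEPOINT row1;                           -- allocates a subtransaction
  INSERT INTO t (id, dat) VALUES (1, 42);
  -- only if the INSERT raised unique_violation would the client issue:
  ROLLBACK TO SAVEPOINT row1;
  UPDATE t SET dat = 42 WHERE id = 1;
  RELEASE SAVEPOINT row1;
  COMMIT;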

 Yes, I mentioned the speed issue.  But regardless of the solution for
 MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you
 will have the locking problem anyhow if you plan to insert
 concurrently into the same table and be robust.

In a mass-loading application you can often divide the work between
threads in a manner that doesn't cause conflicts.

For example, if the unique key is foobar_id and you have 4 threads,
thread 0 will handle rows where (foobar_id%4)=0, thread 1 takes
(foobar_id%4)=1 etc. Or potentially hash foobar_id before dividing the
work.
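
As a sketch (the staging/target table names are made up; only foobar_id comes
from the example above), each worker would pull just its own slice:

  -- worker 1 of 4
  INSERT INTO foobar_target
  SELECT *
  FROM foobar_staging
  WHERE foobar_id % 4 = 1;

  -- for non-uniform or non-integer keys, hash before taking the modulus:
  -- WHERE abs(hashtext(foobar_id::text)) % 4 = 1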

I already suggested this in my original post.

Regards,
Marti

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Igor Chudov
I do not need to do insert updates from many threads. I want to do it from
one thread.

My current MySQL architecture is that I have a table with same layout as the
main one, to hold new and updated objects.

When there are enough objects, I begin a big INSERT SELECT ... ON DUPLICATE
KEY UPDATE and stuff that into the master table.
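
(For the PostgreSQL side of that approach, which has no ON DUPLICATE KEY
UPDATE, the set-based equivalent is typically two statements in one
transaction; the table and column names below are illustrative, and it assumes
a single loader session as described above:)

  BEGIN;
  -- apply changes for keys that already exist in the master table
  UPDATE master m
  SET    dat = s.dat
  FROM   staging s
  WHERE  m.id = s.id;

  -- then add the keys that are not there yet
  INSERT INTO master (id, dat)
  SELECT s.id, s.dat
  FROM   staging s
  WHERE  NOT EXISTS (SELECT 1 FROM master m WHERE m.id = s.id);

  TRUNCATE staging;
  COMMIT;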

i

On Tue, Sep 13, 2011 at 1:11 PM, Marti Raudsepp ma...@juffo.org wrote:

 On Tue, Sep 13, 2011 at 19:34, Robert Klemme shortcut...@googlemail.com
 wrote:
  I don't think so.  You only need to catch the error (see attachment).
  Or does this create a sub transaction?

 Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
 SAVEPOINT it can roll back to in case of an error.

  Yes, I mentioned the speed issue.  But regardless of the solution for
  MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you
  will have the locking problem anyhow if you plan to insert
  concurrently into the same table and be robust.

 In a mass-loading application you can often divide the work between
 threads in a manner that doesn't cause conflicts.

 For example, if the unique key is foobar_id and you have 4 threads,
 thread 0 will handle rows where (foobar_id%4)=0, thread 1 takes
 (foobar_id%4)=1 etc. Or potentially hash foobar_id before dividing the
 work.

 I already suggested this in my original post.

 Regards,
 Marti

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

2011-09-13 Thread Carlo Stonebanks

Hi Kevin,

(sorry for late reply, PG forums seem to have problems with my e-mail client,
now trying web mail)

First, thanks for taking the time. I wish I could write back with quick, terse
questions to your detailed reply - but I'm sorry, this is still going to be a
wordy post.

 max_connections = 300
 Too high.  Both throughput and latency should improve with correct use
 of a connection pooler.

Even for 300 stateful applications that can remain connected for up to a week,
continuously distilling data (imports)? The 300 is overkill; a sys admin raised
it from 100 when multiple large projects were loaded and the server refused the
additional connections. We can take large imports and break them into multiple
smaller ones, which the operators are doing to try to improve import
performance. It does result in some improvement, but I think they have gone
over the top and the answer is to improve DB and OS performance. Perhaps I
don't understand how connection pooling will work with stateful apps that are
continuously reading and writing (the apps are DB I/O bound).

 you want the controller configured for write-back (with automatic
 switch to write-through on low or failed battery, if possible).

For performance or safety reasons? Since the sys admin thinks there's no
performance benefit from this, I would like to be clear on why we should do
this.

 Can our particular setup benefit from changing the bgwriter values?
 Probably not.  If you find that your interactive users have periods
 where queries seem to freeze for a few minutes at a time and then
 return to normal levels of performance, you might need to make this
 more aggressive.

We actually experience this. Once again, remember the overwhelming use of the
system is long-running import threads with continuous connections. Every now
and then the imports behave as if they are suddenly taking a deep breath,
slowing down. Sometimes so much that we cancel the import and restart (the
imports pick up where they left off).

What would the bgwriter settings be in this case?

Thanks again for your time,

Carlo

 
 Date: Fri, 9 Sep 2011 13:16:28 -0500
 From: kevin.gritt...@wicourts.gov
 To: pgsql-performance@postgresql.org; stonec.regis...@sympatico.ca
 Subject: Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config   
  (re-post)
 
 Carlo Stonebanks stonec.regis...@sympatico.ca wrote:
  
  this is a full-time ETL system, with only a handful of actual
  *users* and automated processes over 300 connections running
  *import* programs 24/7
  
  Intel Xeon Processor X5560 (8M Cache, 2.80 GHz, 6.40
  GT/s Intel QPI) x 2, dual quad core, 48 GB RAM
  RAID 10, 6 x 600 GB 15krpm SAS
  
 So, eight cores and six spindles.  You are probably going to see
 *much* better throughput if you route those 300 workers through
 about 22 connections.  Use a connection pooler which limits active
 transactions to that and queues up requests to start a transaction.
  
  Sys admin says that battery-backup RAID controller and 
  consequent write settings should have no impact on performance.
  
 With only six drives, I assume your OS, WAL files, indexes, and heap files
 are all in the same RAID?  If so, your sys admin is wrong -- you
 want the controller configured for write-back (with automatic switch
 to write-through on low or failed battery, if possible).
  
  max_connections = 300
  
 Too high.  Both throughput and latency should improve with correct
 use of a connection pooler.
  
  shared_buffers = 500MB   # At 48GB of RAM, could we go to 2GB
  
 You might benefit from as much as 8GB, but only testing with your
 actual load will show for sure.
  
  effective_cache_size = 2457MB   # Sys admin says assume 25% of 48GB
  
 Add together the shared_buffers setting and whatever the OS tells
 you is used for cache under your normal load.  It's usually 75% of
 RAM or higher.  (NOTE: This doesn't cause any allocation of RAM; it's
 a hint to the cost calculations.)
  
  work_mem = 512MB   # Complex reads are called many times a second
  
 Maybe, if you use the connection pooler as described above.  Each
 connection can allocate this multiple times.  So with 300
 connections you could very easily start using 150GB of RAM in
 addition to your shared buffers; causing a swap storm followed by
 OOM crashes.  If you stay with 300 connections this *must* be
 reduced by at least an order of magnitude.
  
  # from each connection, so what should this be?
  maintenance_work_mem = 256MB   # Should this be bigger - 1GB at least?
  
 I'd go to 1 or 2 GB.
  
  checkpoint_segments = 128   # There is lots of write activity; this is high
  
 OK
  
  # but could it be higher?
  
 IMO, there's unlikely to be much benefit beyond that.
  
  #checkpoint_completion_target not set; 
  #  Recommendation appears to be .9 for our 128 checkpoint segments
  
 0.9 is probably a good idea.
  
  default_statistics_target = 200  # 

Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Stefan Keller
Interesting debate.

2011/9/13 Marti Raudsepp ma...@juffo.org:
 Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
 SAVEPOINT it can roll back to in case of an error.

Are you sure? In theory I always understood that there are no
subtransactions.

In fact when looking at the docs there is chapter 39.6.6 saying "By
default, any error occurring in a PL/pgSQL function aborts execution
of the function, and indeed of the surrounding transaction as well.
You can trap errors and recover from them by using a BEGIN block with
an EXCEPTION clause."
(http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html
)

So the doc isn't totally explicit about this. But whatever: What would
be the function of a subtransaction? To give the possibility to
recover and continue within the surrounding transaction?

Stefan

2011/9/13 Marti Raudsepp ma...@juffo.org:
 On Tue, Sep 13, 2011 at 19:34, Robert Klemme shortcut...@googlemail.com 
 wrote:
 I don't think so.  You only need to catch the error (see attachment).
 Or does this create a sub transaction?

 Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
 SAVEPOINT it can roll back to in case of an error.

 Yes, I mentioned the speed issue.  But regardless of the solution for
 MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you
 will have the locking problem anyhow if you plan to insert
 concurrently into the same table and be robust.

 In a mass-loading application you can often divide the work between
 threads in a manner that doesn't cause conflicts.

 For example, if the unique key is foobar_id and you have 4 threads,
 thread 0 will handle rows where (foobar_id%4)=0, thread 1 takes
 (foobar_id%4)=1 etc. Or potentially hash foobar_id before dividing the
 work.

 I already suggested this in my original post.

 Regards,
 Marti

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] raid array seek performance

2011-09-13 Thread Samuel Gendler
I'm just beginning the process of benchmarking and tuning a new server.
 Something I really haven't done before.  I'm using Greg's book as a guide.
 I started with bonnie++ (1.96) and immediately got anomalous results (I
think).

Hardware is as follows:

2x quad core xeon 5504 2.0Ghz, 2x4MB cache
192GB DDR3 1066 RAM
24x600GB 15K rpm SAS drives
adaptec 52445 controller

The default config, being tested at the moment, has 2 volumes, one 100GB and
one 3.2TB, both are built from a stripe across all 24 disks, rather than
splitting some spindles out for one volume and another set for the other
volume.  At the moment, I'm only testing against the single 3.2TB volume.

The smaller volume is partitioned into /boot (ext2 and tiny) and / (ext4 and
91GB).  The larger volume is mounted as xfs with the following options
(cribbed from an email to the list earlier this week, I
think): logbufs=8,noatime,nodiratime,nobarrier,inode64,allocsize=16m

Bonnie++ delivered the expected huge throughput for sequential read and
write.  It seems in line with other benchmarks I found online.  However, we
are only seeing 180 seeks/sec, which seems quite low.  I'm hoping someone
might be able to confirm that and, hopefully, make some suggestions for
tracking down the problem if there is one.

Results are as follows:

1.96,1.96,newbox,1,1315935572,379G,,1561,99,552277,46,363872,34,3005,90,981924,49,179.1,56,16,19107,69,+,+++,20006,69,19571,72,+,+++,20336,63,7111us,10666ms,14067ms,65528us,592ms,170ms,949us,107us,160us,383us,31us,130us


Version  1.96       --Sequential Output--           --Sequential Input-  --Random-
                    -Per Chr- --Block--  -Rewrite-  -Per Chr- --Block--  --Seeks--
Machine        Size K/sec %CP K/sec  %CP K/sec  %CP K/sec %CP K/sec  %CP  /sec %CP
newzonedb.z1.p 379G  1561  99 552277  46 363872  34  3005  90 981924  49 179.1  56
Latency              7111us   10666ms   14067ms     65528us     592ms      170ms
                    --Sequential Create--           ------Random Create------
                    -Create-- --Read--- -Delete--   -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP    /sec %CP  /sec %CP  /sec %CP
newbox           16 19107  69 +++++ +++ 20006  69   19571  72 +++++ +++ 20336  63
Latency               949us     107us     160us       383us      31us     130us

Also, my inclination is to default to the following volume layout:

2 disks in RAID 1 for system
4 disks in RAID 10 for WAL (xfs)
18 disks in RAID 10 for data (xfs)

Use case is minimal OLTP traffic, plus a fair amount of data warehouse style
traffic - low connection count, queries over sizeable fact tables (100s of
millions of rows) partitioned over time, insert-only data loading, via COPY,
plus some tables are populated via aggregation queries over other tables.
 Basically, based on performance of our current hardware, I'm not concerned
about being able to handle the data-loading load, with the 4 drive raid 10
volume, so emphasis is on warehouse query speed.  I'm not best pleased by
the 2 Ghz CPUs, in that context, but I wasn't given a choice on the
hardware.

Any comments on that proposal are welcome.  I've got only a week to settle
on a config and ready the box for production, so the number of iterations I
can go through is limited.


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Scott Marlowe
On Tue, Sep 13, 2011 at 12:57 PM, Stefan Keller sfkel...@gmail.com wrote:
 Are you sure? In theory I always understood that there are no
 subtransactions.

subtransaction is just another way of saying save points / rollback.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Robert Klemme

On 13.09.2011 20:11, Marti Raudsepp wrote:

On Tue, Sep 13, 2011 at 19:34, Robert Klemmeshortcut...@googlemail.com  wrote:

I don't think so.  You only need to catch the error (see attachment).
Or does this create a sub transaction?


Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
SAVEPOINT it can roll back to in case of an error.


Ouch!  Learn something new every day.  Thanks for the update!

http://www.postgresql.org/docs/8.4/interactive/plpgsql-structure.html

Side note: it seems that Oracle handles this differently (i.e. no 
subtransaction but the INSERT would be rolled back) making the pattern 
pretty usable for this particular situation.  Also, I have never heard 
that TX ids are such a scarce resource over there.


Would anybody think it a good idea to optionally have a BEGIN EXCEPTION 
block without the current TX semantics?  In the absence of that, what would 
be a better pattern to do it (other than UPDATE and INSERT if not found)?



Yes, I mentioned the speed issue.  But regardless of the solution for
MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you
will have the locking problem anyhow if you plan to insert
concurrently into the same table and be robust.


In a mass-loading application you can often divide the work between
threads in a manner that doesn't cause conflicts.


Yeah, but concurrency might not be the only reason to optionally update. 
If the data is there you might rather want to overwrite it instead of 
failing.


Kind regards

robert


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Robert Klemme

On 13.09.2011 20:57, Stefan Keller wrote:

Interesting debate.


Indeed.


2011/9/13 Marti Raudseppma...@juffo.org:

Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
SAVEPOINT it can roll back to in case of an error.


Are you sure? In theory I always understood that there are no
subtransactions.


What theory are you referring to?


In fact when looking at the docs there is chapter 39.6.6. saying By
default, any error occurring in a PL/pgSQL function aborts execution
of the function, and indeed of the surrounding transaction as well.
You can trap errors and recover from them by using a BEGIN block with
an EXCEPTION clause.
(http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html
)

So the doc isn't totally explicit about this. But whatever: What would
be the the function of a subtransaction? To give the possibility to
recover and continue within the surrounding transaction?


I find this pretty explicit:

It is important not to confuse the use of BEGIN/END for grouping 
statements in PL/pgSQL with the similarly-named SQL commands for 
transaction control. PL/pgSQL's BEGIN/END are only for grouping; they do 
not start or end a transaction. Functions and trigger procedures are 
always executed within a transaction established by an outer query — 
they cannot start or commit that transaction, since there would be no 
context for them to execute in. However, a block containing an EXCEPTION 
clause effectively forms a subtransaction that can be rolled back 
without affecting the outer transaction. For more about that see Section 
38.6.5.


http://www.postgresql.org/docs/8.4/interactive/plpgsql-structure.html

Cheers

robert




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Kevin Grittner
Robert Klemme  wrote:
 On 12.09.2011 19:22, Andy Colson wrote:
 
 There are transaction isolation levels, but they are like playing
 with fire. (in my opinion).
 
 You make them sound like witchcraft. But they are clearly defined
 - even standardized.
 
Yeah, for decades.  Developing concurrency control from scratch at
the application level over and over again is more like playing with
fire, in my book.
 
 Granted, different RDBMS might implement them in different ways -
 here's PG's view of TX isolation:


 http://www.postgresql.org/docs/8.4/interactive/transaction-iso.html
 
Oh, that link is *so* day-before-yesterday!  Try this one:
 
http://www.postgresql.org/docs/9.1/interactive/transaction-iso.html


 In my opinion anybody working with RDBMS should make himself
 familiar with this concept - at least know about it - because it
 is one of the fundamental features of RDBMS and certainly needs
 consideration in applications with highly concurrent DB activity.
 
+1
 
Understanding what levels of transaction isolation are available,
and what the implications of each are, is fundamental.  Just as
there are cases where a foreign key constraint doesn't exactly work
for what you need to enforce, there are cases where serializable
transactions don't fit.  But where they do fit, developing the
equivalent from scratch all over again is not as safe or productive
as using the built-in feature.
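
For the insert-or-update case discussed in this thread, using the built-in
feature looks roughly like this (sketch with made-up table and values; under
9.1's serializable isolation the client simply retries the whole transaction
when it fails):

  BEGIN ISOLATION LEVEL SERIALIZABLE;
  UPDATE t SET dat = 42 WHERE id = 1;
  INSERT INTO t (id, dat)
  SELECT 1, 42
  WHERE NOT EXISTS (SELECT 1 FROM t WHERE id = 1);
  COMMIT;  -- may fail with serialization_failure (40001) or unique_violation;
           -- either way the client retries the transaction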
 
-Kevin



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Gianni Ciolli
On Mon, Sep 12, 2011 at 11:26:10PM +0200, Robert Klemme wrote:
 You make them sound like witchcraft.  But they are clearly defined -
 even standardized.  Granted, different RDBMS might implement them in
 different ways - here's PG's view of TX isolation:
 
 http://www.postgresql.org/docs/8.4/interactive/transaction-iso.html

Even better: PostgreSQL 9.1 (Released yesterday! Fresher than milk...)
ships an improved algorithm for serializable transaction isolation
level:

  http://www.postgresql.org/docs/9.1/interactive/transaction-iso.html

More info:

  http://wiki.postgresql.org/wiki/Serializable

Cheers,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

2011-09-13 Thread Kevin Grittner
Carlo Stonebanks  wrote:
 
 max_connections = 300
 Too high. Both throughput and latency should improve with correct
 use of a connection pooler.
 
 Even for 300 stateful applications that can remain connected for
 up to a week, continuously distilling data (imports)?
 
Absolutely.
 
A good connection pooler will be able to hold those 300 *client*
connections, and maintain a much smaller set of connections to the
database.  It will notice when a client connection is requesting the
start of a database transaction.  If there is an idle database
connection it will route the requests there; otherwise it will put
that client connection in a queue.  When a database transaction is
committed, a waiting client connection (if any) will be assigned to
its database connection.
 
Every benchmark I've seen shows that this will improve both
throughput and latency over the approach of releasing a thundering
herd of requests against the server.  Picture a meat counter with
four butchers behind it, and a few spinning devices to slice meat.
If customers queue up, and the butchers call on people as they are
ready, things go better than if each butcher tries to take on one-
fourth of the customers at a time and constantly switch between one
order and another to try to make incremental progress on all of
them.
 
 a sys admin raised it from 100 when multiple large projects were
 loaded and the server refused the additional connections.
 
Whoever is making these decisions needs more training.  I suggest
Greg Smith's book:
 
http://www.postgresql.org/docs/books/
 
(Full disclosure, I was a technical reviewer of the book and got a
free copy.)
 
 you want the controller configured for write-back (with automatic
 switch to write-through on low or failed battery, if possible).
 
For performance or safety reasons?
 
You get better performance with write-back.  If you can't rely on
the battery, then write-back is not safe and you need to use write-
through.
 
 Since the sys admin thinks there's no performance benefit from
 this, I would like to be clear on why we should do this.
 
If you can get him to change it back and forth for performance
testing, it is easy enough to prove.  Write a client application
which inserts one row per database transaction.  A nice, simple,
short row -- like containing one integer column with no indexes.
Have the external application create the table and do a million
inserts.  Try this with both cache settings.  It's best not to
issue a BEGIN and COMMIT at all.  Don't loop in a function or a DO
block, because that creates an implicit transaction.
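
A minimal version of that test (table name hypothetical, not from this thread)
can be driven with a one-line pgbench custom script, so each INSERT commits as
its own transaction:

  -- setup, run once:
  CREATE TABLE wb_test (n integer);

  -- contents of insert_test.sql, used as the pgbench script;
  -- with no explicit BEGIN/COMMIT each execution is one transaction:
  INSERT INTO wb_test (n) VALUES (1);

Something like pgbench -n -c 1 -t 100000 -f insert_test.sql, run once with
write-back and once with write-through, should make the difference obvious in
the reported TPS.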
 
 Every now and then the imports behave as if they are suddenly
 taking a deep breath, slowing down. Sometimes, so much we cancel
 the import and restart (the imports pick up where they left off).
 
 What would the bg_writer settings be in this case?
 
I'm not sure what that is based on information so far, so it's
unclear whether background writer settings would help; but on the
face of it my bet would be that it's a context switching storm or
swapping, and the connection pool would be the better solution.
Those poor butchers are just overwhelmed
 
-Kevin



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-13 Thread Stefan Keller
The doc at http://www.postgresql.org/docs/current/interactive/indexes-types.html
says: Caution: Hash index operations are not presently WAL-logged, so
hash indexes might need to be rebuilt with REINDEX after a database
crash. They are also not replicated over streaming or file-based
replication. For these reasons, hash index use is presently
discouraged.

I found a thread here
http://archives.postgresql.org/pgsql-general/2005-05/msg00370.php
about Hash index vs. b-tree index (PostgreSQL 8.0) mentioning
some issues, like they
* are not faster than B-trees even for = comparisons
* aren't WAL safe
* have poor concurrency (require coarser locks),
* are significantly slower than creating a b+-tree index.

In fact these statements seem to rely on the docs back in version 7.2
(see http://www.postgresql.org/docs/7.2/static/indexes-types.html )

Has this been verified on a recent release? I can't believe that hash
performs so badly on all these points. Theory tells me otherwise and
http://en.wikipedia.org/wiki/Hash_table seems to be a success.
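
Re-testing is cheap, for what it's worth; a minimal comparison (table, data and
lookup key are all made up) is just:

  CREATE TABLE lookup (k text, v integer);
  INSERT INTO lookup
  SELECT md5(i::text), i FROM generate_series(1, 1000000) AS i;

  CREATE INDEX lookup_btree ON lookup USING btree (k);
  CREATE INDEX lookup_hash  ON lookup USING hash  (k);

  EXPLAIN ANALYZE SELECT v FROM lookup WHERE k = md5('12345');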

Are there any plans to give hash index another chance (or to bury it
with a reason)?

Stefan

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] raid array seek performance

2011-09-13 Thread Samuel Gendler
On Tue, Sep 13, 2011 at 12:13 PM, Samuel Gendler
sgend...@ideasculptor.comwrote:

 I'm just beginning the process of benchmarking and tuning a new server.
  Something I really haven't done before.  I'm using Greg's book as a guide.
  I started with bonnie++ (1.96) and immediately got anomalous results (I
 think).

 Hardware is as follows:

 2x quad core xeon 5504 2.0Ghz, 2x4MB cache
 192GB DDR3 1066 RAM
 24x600GB 15K rpm SAS drives
 adaptec 52445 controller

 The default config, being tested at the moment, has 2 volumes, one 100GB
 and one 3.2TB, both are built from a stripe across all 24 disks, rather than
 splitting some spindles out for one volume and another set for the other
 volume.  At the moment, I'm only testing against the single 3.2TB volume.

 The smaller volume is partitioned into /boot (ext2 and tiny) and / (ext4
 and 91GB).  The larger volume is mounted as xfs with the following options
 (cribbed from an email to the list earlier this week, I
 think): logbufs=8,noatime,nodiratime,nobarrier,inode64,allocsize=16m

 Bonnie++ delivered the expected huge throughput for sequential read and
 write.  It seems in line with other benchmarks I found online.  However, we
 are only seeing 180 seeks/sec, but seems quite low.  I'm hoping someone
 might be able to confirm that and. hopefully, make some suggestions for
 tracking down the problem if there is one.

 Results are as follows:


 1.96,1.96,newbox,1,1315935572,379G,,1561,99,552277,46,363872,34,3005,90,981924,49,179.1,56,16,19107,69,+,+++,20006,69,19571,72,+,+++,20336,63,7111us,10666ms,14067ms,65528us,592ms,170ms,949us,107us,160us,383us,31us,130us


 Version  1.96   --Sequential Output-- --Sequential Input-
 --Random-
 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block--
 --Seeks--
 MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec
 %CP
 newzonedb.z1.p 379G  1561  99 552277  46 363872  34  3005  90 981924  49
 179.1  56
 Latency  7111us   10666ms   14067ms   65528us 592ms
 170ms
 --Sequential Create-- Random
 Create
 -Create-- --Read--- -Delete-- -Create-- --Read---
 -Delete--
 files:max:min/sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec
 %CP
 newbox16 19107  69 + +++ 20006  69 19571  72 + +++
 20336  63
 Latency   949us 107us 160us 383us  31us
 130us


My seeks/sec figure increases when I reduce the size of the file, which isn't
surprising, since once everything fits into cache, seeks aren't dependent on
mechanical movement.  However, I am seeing lots of bonnie++ results in
google which appear to be for a file size that is 2x RAM which show numbers
closer to 1000 seeks/sec (compared to my 180).  Usually, I am seeing 16GB
file for 8GB hosts.  So what is an acceptable random seeks/sec number for a
file that is 2x memory?  And does file size make a difference independent of
available RAM such that the enormous 379GB file that is created on my host
is skewing the results to the low end?


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-13 Thread Peter Geoghegan
On 14 September 2011 00:04, Stefan Keller sfkel...@gmail.com wrote:
 Has this been verified on a recent release? I can't believe that hash
 performs so bad over all these points. Theory tells me otherwise and
 http://en.wikipedia.org/wiki/Hash_table seems to be a success.

Hash indexes have been improved since 2005 - their performance was
improved quite a bit in 9.0. Here's a more recent analysis:

http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-13 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 On 14 September 2011 00:04, Stefan Keller sfkel...@gmail.com wrote:
 Has this been verified on a recent release? I can't believe that hash
 performs so bad over all these points. Theory tells me otherwise and
 http://en.wikipedia.org/wiki/Hash_table seems to be a success.

 Hash indexes have been improved since 2005 - their performance was
 improved quite a bit in 9.0. Here's a more recent analysis:

 http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/

Yeah, looking into the git logs shows several separate major changes
committed during 2008, including storing only the hash code not the
whole indexed value (big win on wide values, and lets you index values
larger than one index page, which doesn't work in btree).  I think that
the current state of affairs is still what depesz said, namely that
there might be cases where they'd be a win to use, except the lack of
WAL support is a killer.  I imagine somebody will step up and do that
eventually.

The big picture though is that we're not going to remove hash indexes,
even if they're nearly useless in themselves, because hash index
opclasses provide the foundation for the system's knowledge of how to
do the datatype-specific hashing needed for hash joins and hash
aggregation.  And those things *are* big wins, even if hash indexes
themselves never become so.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] RAID Controller (HP P400) beat by SW-RAID?

2011-09-13 Thread Anthony Presley
On Tue, Sep 13, 2011 at 1:22 AM, Arjen van der Meijden 
acmmail...@tweakers.net wrote:


 On 12-9-2011 0:44 Anthony Presley wrote:

 A few weeks back, we purchased two refurb'd HP DL360's G5's, and were
 hoping to set them up with PG 9.0.2, running replicated.  These machines
 have (2) 5410 Xeon's, 36GB of RAM, (6) 10k SAS drives, and are using the
 HP SA P400i with 512MB of BBWC.  PG is running on an ext4 (noatime)
 partition, and the drives configured as RAID 1+0 (it seems with this
 controller, I cannot do JBOD).


 If you really want a JBOD-setup, you can try a RAID0 for each available
 disk, i.e. in your case 6 separate RAID0's. That's how we configured our
 Dell H700 - which doesn't offer JBOD either - for ZFS.


That's a pretty good idea ... I'll try that on our second server today.  In
the meantime, after tweaking it a bit, we were able to get (with iozone):



                Old       New
Initial write   75.85     220.68
Rewrite         63.95     253.07
Read            45.04     171.35
Re-read         45        2405.23
Random read     27.56     1733.46
Random write    50.7      239.47

Not as fast as I'd like, but faster than the old disks, for sure.

--
Anthony


Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

2011-09-13 Thread Carlo Stonebanks


 
Ok, connection pooler it is. As I understand it, even if there are no idle 
connections available we'll get the benefit of putting a turnstile on the 
butcher's door.

I also ordered the book as soon as you mentioned it - the title alone was 
enough to sell me on it! The book won't be for the errant sys admin who 
increased the connections, it's for me - I'll use it to whack the sys admin on 
the head. Thanks for the tip, the author owes you a beer - as do I. Will the 
book recommend any particular connection pooler product, or is it inappropriate 
to ask for a recommendation on the forum?

Carlo

 Date: Tue, 13 Sep 2011 16:13:00 -0500
 From: kevin.gritt...@wicourts.gov
 To: pgsql-performance@postgresql.org; stonec.regis...@sympatico.ca
 Subject: RE: [PERFORM] Migrated from 8.3 to 9.0 - need to update config   
  (re-post)
 
 Carlo Stonebanks  wrote:
  
  max_connections = 300
  Too high. Both throughput and latency should improve with correct
  use of a connection pooler.
  
  Even for 300 stateful applications that can remain connected for
  up to a week, continuously distilling data (imports)?
  
 Absolutely.
  
 A good connection pooler will be able to hold those 300 *client*
 connections, and maintain a much smaller set of connections to the
 database.  It will notice when a client connection is requesting the
 start of a database transaction.  If there is an idle database
 connection it will route the requests there; otherwise it will put
 that client connection in a queue.  When a database transaction is
 committed, a waiting client connection (if any) will be assigned to
 its database connection.
  
 Every benchmark I've seen shows that this will improve both
 throughput and latency over the approach of releasing a thundering
 herd of requests against the server.  Picture a meat counter with
 four butchers behind it, and a few spinning devices to slice meat.
 If customers queue up, and the butchers call on people as they are
 ready, things go better than if each butcher tries to take on one-
 fourth of the customers at a time and constantly switch between one
 order and another to try to make incremental progress on all of
 them.
  
  a sys admin raised it from 100 when multiple large projects were
  loaded and the server refused the additional connections.
  
 Whoever is making these decisions needs more training.  I suggest
 Greg Smith's book:
  
 http://www.postgresql.org/docs/books/
  
 (Full disclosure, I was a technical reviewer of the book and got a
 free copy.)
  
  you want the controller configured for write-back (with automatic
  switch to write-through on low or failed battery, if possible).
  
 For performance or safety reasons?
  
 You get better performance with write-back.  If you can't rely on
 the battery, then write-back is not safe and you need to use write-
 through.
  
  Since the sys admin thinks there's no performance benefit from
  this, I would like to be clear on why we should do this.
  
 If you can get him to change it back and forth for performance
 testing, it is easy enough to prove.  Write a client application
 which inserts one row per database transaction.  A nice, simple,
 short row -- like containing one integer column with no indexes.
 Have the external application create the table and do a million
 inserts.  Try this with both cache settings.  It's best not to
 issue a BEGIN and COMMIT at all.  Don't loop in a function or a DO
 block, because that creates an implicit transaction.
  
  Every now and then the imports behave as if they are suddenly
  taking a deep breath, slowing down. Sometimes, so much we cancel
  the import and restart (the imports pick up where they left off).
  
  What would the bg_writer settings be in this case?
  
 I'm not sure what that is based on information so far, so it's
 unclear whether background writer settings would help; but on the
 face of it my bet would be that it's a context switching storm or
 swapping, and the connection pool would be the better solution.
 Those poor butchers are just overwhelmed
  
 -Kevin
 
 

  

Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

2011-09-13 Thread Craig Ringer

On 09/14/2011 02:56 AM, Carlo Stonebanks wrote:


Even for 300 stateful applications that can remain connected for up to a
week, continuously distilling data (imports)?


If they're all doing active work all that time you can still benefit 
from a pooler.


Say your server can service 50 connections at optimum speed, and any 
more result in reduced overall throughput. You have 300 apps with 
statements they want to run. Your pooler will basically queue them, so 
at any one time 50 are doing work and 250 are waiting for database 
access. This should _improve_ database throughput by reducing contention 
if 50 worker connections is your sweet spot. However, it will also 
increase latency for service for those workers because they may have to 
wait a while before their transaction runs, even though their 
transaction will complete much faster.


You'd probably want to pool at the transaction level, so once a client 
gets a connection it keeps it for the lifetime of that transaction and 
the connection is handed back to the pool when the transaction commits 
or rolls back.



 you want the controller configured for write-back (with automatic
 switch to write-through on low or failed battery, if possible).


For performance or safety reasons? Since the sys admin thinks there's no
performance benefit from this, I would like to be clear on why we should
do this.


fsync!

If your workload is read-only, it won't help you much. If your workload 
is write-heavy or fairly balanced it'll make a HUGE difference, because 
fsync() on commit won't have to wait for disk I/O, only I/O to the RAID 
controller's cache.


You can also play with commit_delay and synchronous_commit to trade 
guarantees of data persistence off against performance. Don't mind 
losing up to 5 mins of commits if you lose power? These options are for you.
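
For example, the durability trade-off can be made per transaction instead of
globally (standard settings; the table and insert are hypothetical):

  BEGIN;
  SET LOCAL synchronous_commit = off;   -- commit returns before WAL hits disk
  INSERT INTO import_log (msg) VALUES ('batch 42 done');
  COMMIT;

Only the most recent commits are at risk on a crash; already-committed data
stays consistent.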


Whatever you do, do NOT set fsync=off. It should be called "Eat my data 
if anything goes even slightly wrong"=on; it does have legitimate uses, 
but they're not yours.



 Can our particular setup benefit from changing the bgwriter values?
 Probably not. If you find that your interactive users have periods
 where queries seem to freeze for a few minutes at a time and then
 return to normal levels of performance, you might need to make this
 more aggressive.


We actually experience this. Once again, remember the overwhelming use
of the system is long-running import threads with continuous
connections. Every now and then the imports behave as if they are
suddenly taking a deep breath, slowing down. Sometimes, so much we
cancel the import and restart (the imports pick up where they left off).


This could definitely be checkpointing issues. Enable checkpoint logging.
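
(log_checkpoints = on in postgresql.conf, plus a reload, gives per-checkpoint
timings in the server log; a quick cross-check is the cumulative counter view,
standard since 8.3:)

  SELECT checkpoints_timed, checkpoints_req,
         buffers_checkpoint, buffers_backend, buffers_clean
  FROM   pg_stat_bgwriter;

If the stalls line up with spikes in buffers_checkpoint, the checkpoint and
bgwriter settings are the right place to look.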


What would the bg_writer settings be in this case?


You need to tune it for your workload I'm afraid. See the manual and 
mailing list discussions.


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

2011-09-13 Thread Kevin Grittner
Craig Ringer  wrote:
 
I agreed with almost your entire post, but there is one sentence
with which I take issue.
 
 However, it will also increase latency for service for those
 workers because they may have to wait a while before their
 transaction runs, even though their transaction will complete much
 faster.
 
My benchmarks have shown that latency also improves.  See these
posts for my reasoning on why that is:
 
http://archives.postgresql.org/pgsql-performance/2009-03/msg00138.php
http://archives.postgresql.org/pgsql-performance/2010-01/msg00107.php
 
So even though there is greater latency from the attempt to *start*
the transaction until it is underway, the total latency from the
attempt to start the transaction until *completion* is less on
average, in spite of the time in queue.  Perhaps that's what you
were getting at, but it sounded to me like you're saying you
sacrifice latency to achieve the throughput, and that isn't what
I've seen.
 
-Kevin





-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance