Re: [PERFORM] pg_stat_all_indexes understand

2015-09-09 Thread Bruce Momjian
On Thu, Jul  9, 2015 at 09:45:25AM -0700, Jeff Janes wrote:
> 
> On Thu, Jul 9, 2015 at 5:20 AM, Nicolas Paris  wrote:
> 
> Hello,
> 
> I am trying to understand how pg_stat_all_indexes works.
> 
> When I run an EXPLAIN, some indexes are not used, but
> pg_stat_all_indexes.idx_scan is incremented for those indexes.
> 
> 
> When the planner considers using a merge join on an indexed column, it uses
> an index to check the endpoints of the column (the min and the max) to make
> sure it has the latest values and gets the most accurate estimate.  This
> causes the usage counts to get incremented, even when the planner doesn't
> end up using the merge join.

And it will be documented in 9.5:

commit 7e9ed623d9988fcb1497a2a8ca7f676a5bfa136f
Author: Bruce Momjian 
Date:   Thu Mar 19 22:38:12 2015 -0400

docs:  mention the optimizer can increase the index usage count

Report by Marko Tiikkaja

+The optimizer also accesses indexes to check for supplied constants
+whose values are outside the recorded range of the optimizer statistics
+because the optimizer statistics might be stale.
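
A quick way to observe the effect, as a sketch with hypothetical table and
index names (statistics are reported with a small collector delay, so allow
a moment, or use a new session, between the two checks):

    SELECT idx_scan FROM pg_stat_all_indexes WHERE indexrelname = 't_a_idx';
    -- Planning this join may probe t_a_idx for its min/max endpoints,
    -- even if the final plan uses no index at all:
    EXPLAIN SELECT * FROM t JOIN u USING (a);
    SELECT idx_scan FROM pg_stat_all_indexes WHERE indexrelname = 't_a_idx';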



Re: [PERFORM] intel s3500 -- hot stuff

2014-12-09 Thread Bruce Momjian
On Mon, Dec  8, 2014 at 03:40:43PM -0600, Merlin Moncure wrote:
> >> Did not see consistent measurable gains above an
> >> effective_io_concurrency of 256.  Interesting that a setting of '2'
> >> (the lowest possible setting with the feature actually working) is
> >> pessimal.
> >
> > Very interesting.  When we added a per-tablespace random_page_cost,
> > there was a suggestion that we might want to add per-tablespace
> > effective_io_concurrency someday:
> 
> What I'd really like to see is to have effective_io_concurrency work
> on other types of scans.  It's clearly a barn burner on fast storage,
> and perhaps the default should be something other than '1'.  Spinning
> storage is clearly dead, and SSDs seem to really benefit from the
> POSIX readahead API.

Well, the real question is knowing which blocks to request before
actually needing them.  With a bitmap scan, that is easy --- I am
unclear how to do it for other scans.  We already have kernel read-ahead
for sequential scans, and any index scan that hits multiple rows will
probably already be using a bitmap heap scan.
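
For reference, a sketch of the one plan shape where the prefetching does
apply today -- effective_io_concurrency drives posix_fadvise() hints during
bitmap heap scans, whose block lists are known up front (table, index, and
values are hypothetical, assuming an index on event_type):

    SET effective_io_concurrency = 32;  -- can be set per session for testing
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM events WHERE event_type IN (3, 7, 11);
    -- The prefetching happens in the Bitmap Heap Scan node.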



Re: [PERFORM] intel s3500 -- hot stuff

2014-12-06 Thread Bruce Momjian
On Wed, Nov  5, 2014 at 12:09:16PM -0600, Merlin Moncure wrote:
> effective_io_concurrency 1: 46.3 sec, ~ 170 mb/sec peak via iostat
> effective_io_concurrency 2:  49.3 sec, ~ 158 mb/sec peak via iostat
> effective_io_concurrency 4:  29.1 sec, ~ 291 mb/sec peak via iostat
> effective_io_concurrency 8:  23.2 sec, ~ 385 mb/sec peak via iostat
> effective_io_concurrency 16:  22.1 sec, ~ 409 mb/sec peak via iostat
> effective_io_concurrency 32:  20.7 sec, ~ 447 mb/sec peak via iostat
> effective_io_concurrency 64:  20.0 sec, ~ 468 mb/sec peak via iostat
> effective_io_concurrency 128:  19.3 sec, ~ 488 mb/sec peak via iostat
> effective_io_concurrency 256:  19.2 sec, ~ 494 mb/sec peak via iostat
> 
> Did not see consistent measurable gains above an
> effective_io_concurrency of 256.  Interesting that a setting of '2'
> (the lowest possible setting with the feature actually working) is
> pessimal.

Very interesting.  When we added a per-tablespace random_page_cost,
there was a suggestion that we might want to add per-tablespace
effective_io_concurrency someday:

commit d86d51a95810caebcea587498068ff32fe28293e
Author: Robert Haas 
Date:   Tue Jan 5 21:54:00 2010 +

Support ALTER TABLESPACE name SET/RESET ( tablespace_options ).

This patch only supports seq_page_cost and random_page_cost as parameters,
but it provides the infrastructure to scalably support many more.
In particular, we may want to add support for effective_io_concurrency,
but I'm leaving that as future work for now.

Thanks to Tom Lane for design help and Alvaro Herrera for the review.

It seems that time has come.
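
The syntax from that commit already covers the page-cost parameters;
presumably a per-tablespace effective_io_concurrency would look the same.
The second statement below is hypothetical -- it did not exist as of this
thread -- and the tablespace name is made up:

    -- Supported since the commit above:
    ALTER TABLESPACE fast_ssd SET (random_page_cost = 1.1);
    -- The hypothetical setting being discussed here:
    ALTER TABLESPACE fast_ssd SET (effective_io_concurrency = 128);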



Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Bruce Momjian
On Thu, Aug 21, 2014 at 02:17:13PM -0700, Josh Berkus wrote:
> >> Actually, I don't know that anyone has posted the benefits of HT.  Link?
> >>  I want to compare results so that we can figure out what's different
> >> between my case and theirs.  Also, it makes a big difference if there is
> >> an advantage to turning HT on for some workloads.
> > 
> > I had Greg Smith test my system when it was installed, and he
> > recommended hyper-threading.  The system is Debian Squeeze
> > (2.6.32-5-amd64), CPUs are dual Xeon E5620, 8 cores, 16 virtual cores.
> 
> Can you post some numerical results?
> 
> I'm serious.  It's obviously easier for our users if we can blanket
> recommend turning HT off; that's a LOT easier for them than "you might
> want to turn HT off if these conditions ...".  So I want to establish
> that HT is a benefit sometimes if it is.
> 
> I personally have never seen HT be a benefit.  I've seen it be harmless
> (most of the time) but never beneficial.

I know that when hyperthreading was introduced it was mostly a
negative, but then this was improved, and it might have gotten bad
again.  I am afraid the results depend on the type of CPU, so I am not
sure we can know a general answer.

I know I asked Greg Smith, and I assume he would know.



Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Bruce Momjian
On Thu, Aug 21, 2014 at 02:02:26PM -0700, Josh Berkus wrote:
> On 08/20/2014 07:40 PM, Bruce Momjian wrote:
> > On Wed, Aug 20, 2014 at 12:13:50PM -0700, Josh Berkus wrote:
> >> On a read-write test, it's 10% faster with HT off as well.
> >>
> >> Further, from their production machine we've seen that having HT on
> >> causes the machine to slow down by 5X whenever you get more than 40
> >> cores (as in 100% of real cores or 50% of HT cores) worth of activity.
> >>
> >> So we're definitely back to "If you're using PostgreSQL, turn off
> >> Hyperthreading".
> > 
> > Not sure how you can make such a blanket statement when so many people
> > have tested and shown the benefits of hyper-threading.  
> 
> Actually, I don't know that anyone has posted the benefits of HT.  Link?
>  I want to compare results so that we can figure out what's different
> between my case and theirs.  Also, it makes a big difference if there is
> an advantage to turning HT on for some workloads.

I had Greg Smith test my system when it was installed, and he
recommended hyper-threading.  The system is Debian Squeeze
(2.6.32-5-amd64), CPUs are dual Xeon E5620, 8 cores, 16 virtual cores.



Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-20 Thread Bruce Momjian
On Wed, Aug 20, 2014 at 12:13:50PM -0700, Josh Berkus wrote:
> On a read-write test, it's 10% faster with HT off as well.
> 
> Further, from their production machine we've seen that having HT on
> causes the machine to slow down by 5X whenever you get more than 40
> cores (as in 100% of real cores or 50% of HT cores) worth of activity.
> 
> So we're definitely back to "If you're using PostgreSQL, turn off
> Hyperthreading".

Not sure how you can make such a blanket statement when so many people
have tested and shown the benefits of hyper-threading.  I am also
unclear exactly what you tested, as I didn't see it mentioned in the
email --- CPU type, CPU count, and operating system would be the minimal
information required.



Re: [PERFORM] SSI slows down over time

2014-04-09 Thread Bruce Momjian
On Mon, Apr  7, 2014 at 10:38:52AM -0400, Ryan Johnson wrote:
> The two * entries were produced by runs under SI, and confirm that
> the rest of the system has not been slowing down nearly as much as
> SSI. SI throughput dropped by 5% as the database quadrupled in size.
> SSI throughput dropped by 23% during the same interval. And this was
> actually one of the better sets of runs; I had a few last week that
> dipped below 1ktps.
> 
> I'm not sure what to make of this, thoughts?

Seems it is time to ask on hackers.



Re: [PERFORM] PGSQL, checkpoints, and file system syncs

2014-04-09 Thread Bruce Momjian
On Thu, Apr 3, 2014 at 09:01:08PM +0300, Heikki Linnakangas wrote:
> >Is there something I can set in the PGSQL parameters or in the file
> >system parameters to force a steady flow of writes to disk rather
> >than waiting for a sync system call? Mounting with "commit=1" did not
> >make a difference.
>
> Try setting the vm.dirty_bytes sysctl. Something like 256MB might be a
> good starting point.
>
> This comes up fairly often, see e.g.:
> http://www.postgresql.org/message-id/flat/27C32FD4-0142-44FE-8488-9F366dc75...@mr-paradox.net

Uh, should he set vm.dirty_bytes or vm.dirty_background_bytes?  It is my
understanding that vm.dirty_background_bytes starts the I/O while still
accepting writes, while vm.dirty_bytes stops accepting writes during the
I/O, which isn't optimal.  See:

https://www.kernel.org/doc/Documentation/sysctl/vm.txt

dirty_bytes

Contains the amount of dirty memory at which a process generating disk
writes will itself start writeback.

dirty_background_bytes

Contains the amount of dirty memory at which the background kernel
flusher threads will start writeback.

I think we want the flusher to be active, not necessarily the writing
process.
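
A sketch of that arrangement, using Heikki's 256MB starting point for the
background threshold; the 1GB hard limit is an illustrative choice, not a
recommendation:

    # Flusher threads start writeback early; writers block only at the
    # much higher hard limit.
    sysctl -w vm.dirty_background_bytes=268435456   # 256MB
    sysctl -w vm.dirty_bytes=1073741824             # 1GB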



Re: [PERFORM] PGSQL, checkpoints, and file system syncs

2014-04-09 Thread Bruce Momjian
On Tue, Apr 8, 2014 at 02:00:05PM +, Shaun Thomas wrote:
> So if you are on a 1GB RAID card, set it to 1GB. Once you have 1GB
> of dirty memory (from a checkpoint or whatever), Linux will begin
> flushing.
>
> This is a pretty well-known issue on Linux systems with large amounts
> of RAM. Most VM servers fit that profile, so I'm not surprised it's
> hurting you.

Agreed.  The Linux kernel dirty defaults are too high for systems with
large amounts of memory.  See sysctl -a | grep dirty for a list.



Re: [PERFORM] Why shared_buffers max is 8GB?

2014-04-09 Thread Bruce Momjian
On Wed, Apr  2, 2014 at 11:38:57AM +0200, Alexey Klyukin wrote:
> In most cases 8GB should be enough even for servers with hundreds of GB of
> data, since the FS uses the rest of the memory as a cache (make sure you
> give the planner a hint about how much memory is left for this with
> effective_cache_size), but the exact answer is a matter of performance
> testing.
> 
> Now, the last question would be what was the initial justification for the
> 8GB barrier. I've heard that there was lock congestion when dealing with a
> huge pool of buffers, but I think that was fixed even in the pre-9.0 era.

The issue in earlier releases was the overhead of managing more than 1
million 8k buffers.  I have not seen any recent tests to confirm that
overhead is still significant.

A larger issue is that going over 8GB doesn't help unless you are
accessing more than 8GB of data in a short period of time.  Add to that
the problem of potentially dirtying all the buffers and flushing them to a
now-smaller kernel buffer cache, and you can see why the 8GB limit is
recommended.

I do think this merits more testing against the current Postgres source
code.



Re: [PERFORM] Explain analyze time overhead

2013-12-10 Thread Bruce Momjian
On Tue, Dec 10, 2013 at 01:53:54PM -0800, salah jubeh wrote:
> Thanks for the link, I find it very useful.  Unfortunately I am using
> version 9.1.11.
> 
> After digging a little bit, I found out that gettimeofday is indeed a
> little slower on this particular machine than on other machines, but the
> difference is not that significant. The query I am running is not
> optimized, and for some reason the materialize operator is the one that
> causes most of the overhead. The whole issue is due to cross-column
> statistics and highly correlated predicates; the number of estimated
> records is much lower than the actual number. Still, I did not completely
> understand why the materialize operator consumes about 9 minutes when I
> run EXPLAIN ANALYZE, i.e., how many times do we call gettimeofday for the
> materialize operator? (I need to calculate this.) Finally, for testing
> purposes, I disabled materialization and the query execution time dropped
> from 1 minute to 12 seconds.

The executable is not tied to any particular Postgres version, so you
could get the 9.3 binary and just use that.



Re: [PERFORM] Explain analyze time overhead

2013-12-10 Thread Bruce Momjian
On Thu, Dec  5, 2013 at 09:22:14AM -0500, Tom Lane wrote:
> salah jubeh  writes:
> > When I execute a query, the execution time is about 1 minute; however,
> > when I execute the query with EXPLAIN ANALYZE the execution time jumps
> > to 10 minutes.
> 
> This isn't exactly unheard of, although it sounds like you have a
> particularly bad case.  Cheap commodity PCs tend to have clock hardware
> that takes multiple microseconds to read ... which was fine thirty years
> ago when that hardware design was set, but with modern CPUs that's
> painfully slow.
> 
> Short of getting a better machine, you might look into whether you can run
> a 64-bit instead of 32-bit operating system.  In some cases that allows
> a clock reading to happen without a context switch to the kernel.
> 
> > This is a little bit strange for me; did anyone experience something
> > like this? Can I trust the generated plans?
> 
> The numbers are fine as far as they go, but you should realize that the
> relative cost of the cheaper plan nodes is being overstated, since the
> added instrumentation cost is the same per node call regardless of how
> much work happens within the node.

The original poster might also want to run pg_test_timing to get
hardware timing overhead:

http://www.postgresql.org/docs/9.3/static/pgtesttiming.html
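
Running it is a one-liner; as a rough rule of thumb, per-loop overhead in
the tens of nanoseconds indicates good clock hardware, while readings near
a microsecond point to the slow-clock problem Tom describes:

    $ pg_test_timing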



Re: [PERFORM] Performance bug in prepared statement binding in 9.2?

2013-09-10 Thread Bruce Momjian
On Tue, Sep 10, 2013 at 05:43:16AM +, Mel Llaguno wrote:
> Bruce,
>
> First of all, I'd like to thank you for taking some interest in this
> issue. We'd love to migrate to the latest PG version, but this issue
> is currently preventing us from doing so.
>
> Regardless of the DB used (base application schema _or_ restored DB
> with additional app data + base application schema), seed information
> is present in all tests. I guess my question is this - why would
> having existing data change the bind behavior at all? Is it possible
> that the way indexes are created has changed between 8.4 -> 9.x?

I don't know, as you have not shown us exactly what is slower between
versions --- you only said that the bug appears or not under certain
circumstances.



Re: [PERFORM] Performance bug in prepared statement binding in 9.2?

2013-09-09 Thread Bruce Momjian
On Tue, Sep 10, 2013 at 02:06:08AM +, Mel Llaguno wrote:
> We're currently using an embedded PG 8.4.17 for our application. Our
> PG 9.x tests consists of one of the following :
>
> - Take a 8.4.17 DB which contains only our application schema and
> required seed data and use pg_upgrade to create a 9.x database
> directory, run the analyze_new_cluster.sh script and fire up the
> application. Under this set of conditions, the bind connection issue
> is present during our test.
>
> - Start with a fresh PG 9.x DB (using createdb) and use psql
> to recreate our application schema and required seed data. When the
> application is started and our test executed, the bind connection
> issue is still present.
>
> In both of the above cases, a base application schema is used.
>
> If I upgrade an 8.4.17 DB that contains additional application data
> (generated by interaction with our application) to 9.x, the bind
> connection issue is no longer present. Restoring this upgraded 9.x DB
> into any PG instance in the previously described scenarios also seems
> to fix the bind connection issue.
>
> Please let me know if this clarifies my earlier post.

Yes, that is clear.  So it is the seed data that is causing the problem?
That is the only difference I see between #2 and #3.



Re: [PERFORM] Performance bug in prepared statement binding in 9.2?

2013-09-09 Thread Bruce Momjian
On Tue, Sep 10, 2013 at 01:36:27AM +, Mel Llaguno wrote:
> Let me clarify further - when we reconstruct our schema (without the
> upgrade step) via a sql script, the problem still persists. Restoring
> an upgraded DB which contains existing data into exactly the same
> instance will correct the behavior.

I do not understand what you are saying above.



Re: [PERFORM] Performance bug in prepared statement binding in 9.2?

2013-09-09 Thread Bruce Momjian
On Mon, Sep  9, 2013 at 08:38:09PM -0400, Andrew Dunstan wrote:
> 
> On 08/01/2013 03:20 PM, Jeff Janes wrote:
> >On Thu, Aug 1, 2013 at 10:58 AM, Josh Berkus  wrote:
> >>Amit, All:
> >>
> >>So we just retested this on 9.3b2.  The performance is the same as 9.1
> >>and 9.2; that is, progressively worse as the test cycles go on, and
> >>unacceptably slow compared to 8.4.
> >>
> >>Some issue introduced in 9.1 is causing BINDs to get progressively
> >>slower as the PARSEs BINDs get run repeatedly.  Per earlier on this
> >>thread, that can bloat to 200X time required for a BIND, and it's
> >>definitely PostgreSQL-side.
> >>
> >>I'm trying to produce a test case which doesn't involve the user's
> >>application.  However, hints on other things to analyze would be keen.
> >Does it seem to be all CPU time (it is hard to imagine what else it
> >would be, but...)
> >
> >Could you use oprofile or perf or gprof to get a profile of the
> >backend during a run?  That should quickly narrow it down to which C
> >function has the problem.
> >
> >Did you test 9.0 as well?
> 
> 
> This has been tested back to 9.0. What we have found is that the
> problem disappears if the database has come in via dump/restore, but
> is present if it is the result of pg_upgrade. There are some
> long-running transactions also running alongside this - we are
> currently planning a test where those are not present. We're also
> looking at constructing a self-contained test case.
> 
> Here is some perf output from the bad case:
> 
>+  14.67%  postgres   [.] heap_hot_search_buffer

OK, certainly looks like a HOT chain issue.  I think there are two
possibilities:

1)  the heap or index file is different after a dump/restore vs.
pg_upgrade
2)  some other file is missing or changed between the two

My guess is that the dump/restore removes all the HOT chains as it just
dumps the most recent value of the chain.  Could it be HOT chain
overhead that you are seeing, rather than a pg_upgrade issue/bug?
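
One way to check from SQL whether HOT updates are even in play on the
affected tables -- pg_stat_user_tables tracks HOT updates separately (a
sketch, run in the problem database):

    SELECT relname, n_tup_upd, n_tup_hot_upd
    FROM pg_stat_user_tables
    ORDER BY n_tup_hot_upd DESC
    LIMIT 10;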



Re: [PERFORM] Postgres upgrade, security release, where?

2013-04-01 Thread Bruce Momjian
On Tue, Apr  2, 2013 at 09:40:07AM +0900, Ian Lawrence Barwick wrote:
> > Due to the security nature of the release, the source and binaries will
> > only be publicly available on April 4 --- there are no pre-release
> > versions available.
> 
> The PostgreSQL homepage has a big announcement saying
> "PostgreSQL minor versions released!", including a mention of a
> "security issue";
> unfortunately it's not obvious that this is for the prior 9.2.3 release and as
> the announcement of the upcoming security release
> ( http://www.postgresql.org/about/news/1454/ ) does not mention the
> new release number, methinks there is plenty of room for confusion :(
> 
> It might be an idea to update the "splash box" with details of the upcoming
> release.

I agree updating the "splash box" would make sense.



Re: [PERFORM] Postgres upgrade, security release, where?

2013-04-01 Thread Bruce Momjian
On Mon, Apr  1, 2013 at 05:10:22PM -0700, Tory M Blue wrote:
> So It was announced that there would be a security patch for all versions
> released on the 4th. I see it's been announced/released on the website, but 
> the
> versions available show Feb dates.
> 
> Should the source be current? Or does it take a while for source and other to
> be made available?
> 
> Figured if the site says released, it should be available.
> 
> Thanks
> Tory
> 
> postgresql-9.2.3.tar.bz2      2013-02-07 10:25:10   15.6 MB
> postgresql-9.2.3.tar.bz2.md5  2013-02-07 10:25:10   59 bytes
> postgresql-9.2.3.tar.gz       2013-02-07 10:25:12   20.5 MB
> postgresql-9.2.3.tar.gz.md5   2013-02-07 10:25:13   58 bytes

Due to the security nature of the release, the source and binaries will
only be publicly available on April 4 --- there are no pre-release
versions available.



Re: [PERFORM] New server setup

2013-03-15 Thread Bruce Momjian
On Fri, Mar 15, 2013 at 06:06:02PM +, Rick Otten wrote:
> >I don't think any drive that corrupts on power-off is suitable for a
> >database, but for non-db uses, sure, I guess they are OK, though you
> >have to be pretty money-constrained to like that tradeoff.
>
> Wouldn't mission critical databases normally be configured in a high
> availability cluster - presumably with replicas running on different
> power sources?
>
> If you lose power to a member of the cluster (or even the master), you
> would have new data coming in and stuff to do long before it could
> come back online - corrupted disk or not.
>
> I find it hard to imagine configuring something that is too critical
> to be able to be restored from periodic backup to NOT be in a
> (synchronous) cluster.  I'm not sure what all the fuss over whether an
> SSD might come back after a hard server failure is really about.  You
> should architect the solution so you can lose the server and throw
> it away and never bring it back online again.  Native streaming
> replication is fairly straightforward to configure.  Asynchronous
> multimaster (albeit with some synchronization latency) is also fairly
> easy to configure using third party tools such as SymmetricDS.
>
> Agreed that adding a supercap doesn't sound like a hard thing for
> a hardware manufacturer to do, but I don't think it should
> necessarily be a showstopper for being able to take advantage of some
> awesome I/O performance opportunities.

Do you want to recreate the server if it loses power over an extra $100
per drive?



Re: [PERFORM] New server setup

2013-03-14 Thread Bruce Momjian
On Fri, Mar 15, 2013 at 10:37:55AM +1300, Mark Kirkwood wrote:
> On 15/03/13 07:54, Bruce Momjian wrote:
> >Only use SSDs with a BBU cache, and don't set SSD caches to
> >write-through because an SSD needs to cache the write to avoid wearing
> >out the chips early, see:
> >
> > http://momjian.us/main/blogs/pgblog/2012.html#August_3_2012
> >
> 
> I'm not convinced about the need for a BBU with SSDs - you *can* use them
> without one, you just need to make sure of suitable longevity and
> also the presence of (proven) power-off protection (as discussed
> previously). It is worth noting that using an unproven SSD, or one known
> to be lacking power-off protection, with a BBU will *not* save you from
> massive corruption (or device failure) upon unexpected power loss.

I don't think any drive that corrupts on power-off is suitable for a
database, but for non-db uses, sure, I guess they are OK, though you
have to be pretty money-constrained to like that tradeoff.



Re: [PERFORM] New server setup

2013-03-14 Thread Bruce Momjian
On Tue, Mar 12, 2013 at 09:41:08PM +, Gregg Jaskiewicz wrote:
> On 10 March 2013 15:58, Greg Smith  wrote:
> 
> On 3/1/13 6:43 AM, Niels Kristian Schjødt wrote:
> 
> Hi, I'm going to setup a new server for my postgresql database, and I
> am considering one of these: http://www.hetzner.de/hosting/
> produkte_rootserver/poweredge-r720 with four SAS drives in a RAID 10
> array. Has any of you any particular comments/pitfalls/etc. to mention
> on the setup? My application is very write heavy.
> 
> 
> The Dell PERC H710 (actually a LSI controller) works fine for write-heavy
> workloads on a RAID 10, as long as you order it with a battery backup unit
> module.  Someone must install the controller management utility and do
> three things however:
> 
> 
> We're going to go with either HP or IBM (customer's preference, etc). 
> 
>  
> 
> 1) Make sure the battery-backup unit is working.
> 
> 2) Configure the controller so that the *disk* write cache is off.

Only use SSDs with a BBU cache, and don't set SSD caches to
write-through because an SSD needs to cache the write to avoid wearing
out the chips early, see:

http://momjian.us/main/blogs/pgblog/2012.html#August_3_2012



Re: [PERFORM] temp tablespaces and SSDs, etc..

2013-02-15 Thread Bruce Momjian
On Fri, Feb  8, 2013 at 02:52:37PM -0600, Jon Nelson wrote:
> I was wondering if somebody could clear up how tablespaces are used.
> Let's say I have three classes of storage:
> - ramdisk (tmpfs)
> - SSD
> - spinning rust
> 
> Furthermore, let's say I'd like to be able to tell postgresql to
> prefer them - in that order - until they each get full. IE, use tmpfs
> until it reports ENOSPC and then fall back to SSD, finally falling
> back to spinning rust.  Is there a way to do this?

Nope.



Re: [PERFORM] Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

2012-12-05 Thread Bruce Momjian
On Wed, Dec  5, 2012 at 04:25:28PM -0600, Shaun Thomas wrote:
> On 12/05/2012 04:19 PM, Daniel Farina wrote:
> 
> >Is 3.2 a significant regression from previous releases, or is 3.4 just
> >faster?  Your wording only indicates that "older kernel is slow," but
> >your tone would suggest that you feel this is a regression, cf.
> 
> It's definitely a regression. I'm trying to pin it down, but the
> 3.2.0-24 kernel didn't do the CPU drain down to single-digits on
> that client load test. I'm working on 3.2.0-30 and going down to
> figure out which patch might have done it.
> 
> Older kernels performed better. And by older, I mean 2.6. Still not
> 3.4 levels, but that's expected. I haven't checked 3.0, but other
> threads I've read suggest it had less problems. Sorry if I wasn't
> clear.

Ah, that is interesting about 2.6.  I had wondered how Debian stable
would have performed, 2.6.32-5.  This relates to a recent discussion
about the appropriateness of Ubuntu for database servers:

    http://archives.postgresql.org/pgsql-performance/2012-11/msg00358.php

Thanks.



Re: [PERFORM] shared_buffers on ubuntu precise

2012-11-30 Thread Bruce Momjian
On Fri, Nov 30, 2012 at 02:01:45PM -0600, Shaun Thomas wrote:
> On 11/30/2012 01:57 PM, Ben Chobot wrote:
> 
> >Hm, this sounds like something we should look into. Before we start
> >digging do you have more to share, or did you leave it with the "huh,
> >that's weird; this seems to fix it" solution?
> 
> We're still testing. We're still on the -31 kernel. We tried the -33
> kernel which *might* fix it, but then this happened:
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1084264
> 
> So now we're testing -34 which is currently proposed. Either way,
> it's pretty clear that Ubuntu's choice of patches to backport is
> rather eclectic and a little wonky, or that nailing down load
> calculations went awry since the NOHZ stuff started, or both. At
> this point, I wish we'd stayed on CentOS.

Or Debian.  I am not sure what would justify use of Ubuntu as a server,
except wanting to have the exact same OS as one's personal computer.



Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-23 Thread Bruce Momjian
On Tue, Nov 20, 2012 at 02:24:01PM -0600, Merlin Moncure wrote:
> On Tue, Nov 20, 2012 at 1:53 PM, Jon Nelson  wrote:
> > As can be seen by the current conversation, not everyone is convinced
> > that CTEs ought to be an explicit optimization barrier
> 
> On Tue, Nov 20, 2012 at 1:26 PM, Claudio Freire  
> wrote:
> > It *could* just be a lack of imagination on my part. But if it were
> > not, then it'd be nice for it to be done automatically (since this
> > particular CTE behavior bites enough people already).
> 
> Sure.  I just find it personally hard to find a good demarcation line
> between A: "queries where pushing quals through are universally
> beneficial and wanted" and B: "queries where we are inserting an
> explicit materialization step to avoid planner issues", particularly
> where there is substantial overlap between A and C: "queries that
> are written with a CTE and arguably shouldn't be".
> 
> Put another way, I find CTE to express: 'this then that' where joins
> express 'this with that'.  So current behavior is not surprising at
> all. All that said, there could be a narrow class of low hanging cases
> (such as the OP's) that could be sniped...I'm just skeptical.

Is thi


Re: [PERFORM] Thousands databases or schemas

2012-11-15 Thread Bruce Momjian
On Fri, Nov  9, 2012 at 02:15:45PM +0800, Craig Ringer wrote:
> On 11/08/2012 09:29 PM, Denis wrote:
> > Ok guys, it was not my intention to hurt anyone's feelings by mentioning
> > MySQL. Sorry about that.
> It's pretty silly to be upset by someone mentioning another DB product.
> I wouldn't worry.
> > There simply was a project with a similar
> > architecture built using MySQL. When we started the current project, I have
> > made a decision to give PostgreSQL a try.
> It's certainly interesting that MySQL currently scales to much larger
> table counts better than PostgreSQL appears to.
> 
> I'd like to see if this can be improved down the track. Various people
> are doing work on PostgreSQL scaling and performance, so with luck huge
> table counts will come into play there. If nothing else, supporting
> large table counts is important when dealing with very large amounts of
> data in partitioned tables.
> 
> I think I saw mention of better performance with higher table counts in
> 9.3 in -hackers, too.

Yes, 9.3 does much better dumping/restoring databases with a large
number of tables.  I was testing this as part of pg_upgrade performance
improvements for large tables.  We have a few other things we might try
to improve for 9.3 related to caching, but that might not help in this
case.



Re: [PERFORM] How to upgrade from 9.1 to 9.2 with replication?

2012-11-07 Thread Bruce Momjian
On Wed, Nov  7, 2012 at 03:44:13PM -0300, Claudio Freire wrote:
> On Wed, Nov 7, 2012 at 3:36 PM, Bruce Momjian  wrote:
> >> Bring both down.
> >> pg_upgrade master
> >> Bring master up
> >> pg_upgrade slave
> >
> > Is there any reason to upgrade the slave when you are going to do rsync
> > anyway?  Of course you need to install the new binaries and libs, but it
> > seems running pg_upgrade on the standby is unnecessary.
> 
> Just to speed up the rsync

pg_upgrade mostly modifies the system tables --- I am not sure that is
faster than just having rsync copy those.  The file modification times
would be different after pg_upgrade, so rsync might copy the files anyway
if you run pg_upgrade.  It would be good for you to test whether it really
is a win --- I would be surprised if pg_upgrade were a win in this case on
the standby.



Re: [PERFORM] How to upgrade from 9.1 to 9.2 with replication?

2012-11-07 Thread Bruce Momjian
On Fri, Oct 19, 2012 at 12:02:49PM -0300, Claudio Freire wrote:
> > This unfortunately does require a new data copy to be pulled across to the
> > slave.  For the local copies this isn't so bad as wire speed is fast enough
> > to make it reasonable; for the actual backup units at a remove it takes a
> > while as the copy has to go across a WAN link.  I cheat on that by using a
> > SSH tunnel with compression turned on (which, incidentally, it would be
> > really nice if Postgres supported internally, and it could quite easily --
> > I've considered working up a patch set for this and submitting it.)
> >
> > For really BIG databases (as opposed to moderately-big) this could be a
> > much-more material problem than it is for me.
> 
> Did you try?
> 
> Bring both down.
> pg_upgrade master
> Bring master up
> pg_upgrade slave

Is there any reason to upgrade the slave when you are going to do rsync
anyway?  Of course you need to install the new binaries and libs, but it
seems running pg_upgrade on the standby is unnecessary.

> rsync master->slave (differential update, much faster than basebackup)
> Bring slave up

Good ideas.  I have applied the attached doc patch to pg_upgrade head
and 9.2 docs to suggest using rsync as part of base backup.
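
For reference, the rsync step in that recipe might look something like
this (host and paths are hypothetical; the standby must be shut down, and
a checksum-free differential run relies on matching file sizes and mtimes):

    rsync -a --delete /var/lib/pgsql/9.2/data/ \
        standby:/var/lib/pgsql/9.2/data/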

diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml
new file mode 100644
index 301222c..998cb2f
*** a/doc/src/sgml/pgupgrade.sgml
--- b/doc/src/sgml/pgupgrade.sgml
*** psql --username postgres --file script.s
*** 529,535 

 A Log-Shipping Standby Server () cannot
 be upgraded because the server must allow writes.  The simplest way
!is to upgrade the primary and use rsync to rebuild the standbys.

  

--- 529,538 

 A Log-Shipping Standby Server () cannot
 be upgraded because the server must allow writes.  The simplest way
!is to upgrade the primary and use rsync to rebuild the
!standbys.  You can run rsync while the primary is down,
!or as part of a base backup ()
!which overwrites the old standby cluster.

  




Re: [PERFORM] Have: Seq Scan - Want: Index Scan - what am I doing wrong?

2012-10-16 Thread Bruce Momjian
On Tue, Oct 16, 2012 at 08:19:43PM -0400, Chris Ruprecht wrote:
> 
> On Oct 16, 2012, at 20:01 , Evgeny Shishkin  wrote:
> 
> > Selecting 5 years of data is not selective at all, so postgres decides it
> > is cheaper to do a seqscan.
> > 
> > Do you have an index on patient.dnsortpersonnumber? Can you post a result 
> > from 
> > select count(*) from patient where dnsortpersonnumber = '347450'; ?
> > 
> 
> Yes, there is an index:
> 
> "Aggregate  (cost=6427.06..6427.07 rows=1 width=0)"
> "  ->  Index Scan using patient_pracsortpatientnumber on patient  
> (cost=0.00..6427.06 rows=1 width=0)"
> "Index Cond: (dnsortpersonnumber = '347450'::text)"
> 
> 
> In fact, all the other criteria are picked using an index. I fear that
> the >= and <= on the timestamp are causing the issue. If I do a "=" on
> just one of them, I get an index scan. But I need to scan the entire
> range. I get queries like "give me everything that was entered into the
> system for this patient between these two dates". A single date wouldn't
> work.

Have you read our FAQ on this matter?


http://wiki.postgresql.org/wiki/FAQ#Why_are_my_queries_slow.3F_Why_don.27t_they_use_my_indexes.3F



Re: [PERFORM] shared_buffers/effective_cache_size on 96GB server

2012-10-10 Thread Bruce Momjian
On Wed, Oct 10, 2012 at 10:12:51PM +0200, Strahinja Kustudić wrote:
> @Bruce Thanks for your articles, after reading them all I don't think 
> disabling
> swap is a good idea now. Also you said to see the effective_cache_size I 
> should
> check it with free. My question is should I use the value that free is showing
> as cached, or a little lower one, since not everything in the cache is because
> of Postgres.

Well, you are right that some of that might not be Postgres, so yeah,
you can lower it somewhat.

> @Claudio So you are basically saying that if I have set
> effective_cache_size to 10GB and I have 10 concurrent processes which are
> using 10 different indices which are, for example, 2GB each, it would be
> better to set effective_cache_size to 1GB? Since if I leave it at 10GB,
> each running process's query planner will think the whole index is in
> cache, and that won't be true? Did I get that right?

Well, the real question is whether, while traversing the index, some
of the pages are going to be removed from the cache by other processes'
cache usage.  effective_cache_size does not assume the cache will remain
intact between queries.



Re: [PERFORM] shared_buffers/effective_cache_size on 96GB server

2012-10-10 Thread Bruce Momjian
On Wed, Oct 10, 2012 at 02:05:20PM -0300, Claudio Freire wrote:
> On Wed, Oct 10, 2012 at 1:10 PM, Bruce Momjian  wrote:
> >> >shared_buffers = 10GB
> >>
> >> Generally going over 4GB for shared_buffers doesn't help.. some of
> >> the overhead of bgwriter and checkpoints is more or less linear in
> >> the size of shared_buffers ..
> >>
> >> >effective_cache_size = 90GB
> >>
> >> effective_cache_size should be ~75% of the RAM (if it's a dedicated server)
> >
> > Why guess?  Use 'free' to tell you the kernel cache size:
> >
> > http://momjian.us/main/blogs/pgblog/2012.html#May_4_2012
> 
> Why does nobody ever mention that concurrent access has to be taken
> into account?
> 
> Ie: if I expect concurrent access to 10 really big indices, I'll set
> effective_cache_size = free ram / 10

It is true that the estimate assumes a single session is using all the
cache, but I think that is based on the assumption that there is a
major overlap between the cache needs of multiple sessions.



Re: [PERFORM] Hyperthreading (was: Two identical systems, radically different performance)

2012-10-10 Thread Bruce Momjian
On Wed, Oct 10, 2012 at 11:44:50AM -0300, Claudio Freire wrote:
> On Wed, Oct 10, 2012 at 9:52 AM, Shaun Thomas  
> wrote:
> > On 10/09/2012 06:30 PM, Craig James wrote:
> >
> >>                 ra:8192 walb:1M    ra:256 walb:1M     ra:256 walb:256kB
> >> -c  -t    Run1  Run2  Run3   Run4  Run5  Run6   Run7  Run8  Run9
> >> 40  2500  4261  3722  4243   9286  9240  5712   9310  8530  8872
> >> 50  2000  4138  4399  3865   9213  9351  9578   8011  7651  8362
> >
> >
> > I think I speak for more than a few people here when I say: wat.
> >
> > About the only thing I can ask, is: did you make these tests fair? And by
> > fair, I mean:
> >
> > echo 3 > /proc/sys/vm/drop_caches
> > pg_ctl -D /your/pg/dir restart
> 
> Yes, I was thinking the same. Especially if you check the tendency to
> perform better in higher-numbered runs. But, as you said, that doesn't
> explain that jump to twice the TPS. I was thinking, and I'm not
> pgbench expert, could it be that the database grows from run to run,
> changing performance characteristics?
> 
> > My head hurts.
> 
> I'm just confused. No headache yet.
> 
> But really interesting numbers in any case. If these results are on
> the level, then maybe the kernel's read-ahead algorithm isn't as
> fool-proof as we thought? Gotta read the source. BRB

Well, I have exactly the same setup here:

new: 2x4-core Intel Xeon E5620 2.40 GHz

Let me know if you want any tests run, on SSDs or magnetic disk.  I do
have hyperthreading enabled, and Greg Smith benchmarked my server and
said it was good.



Re: [PERFORM] shared_buffers/effective_cache_size on 96GB server

2012-10-10 Thread Bruce Momjian
On Wed, Oct 10, 2012 at 10:11:30AM +0200, Julien Cigar wrote:
> On 10/10/2012 09:12, Strahinja Kustudić wrote:
> >Hi everyone,
> 
> Hello,
> 
> >
> >I have a Postgresql 9.1 dedicated server with 16 cores, 96GB RAM
> >and RAID10 15K SCSI drives which is runing Centos 6.2 x64. This
> >server is mainly used for inserting/updating large amounts of data
> >via copy/insert/update commands, and seldom for running select
> >queries.
> >
> >Here are the relevant configuration parameters I changed:
> >
> >shared_buffers = 10GB
> 
> Generally going over 4GB for shared_buffers doesn't help.. some of
> the overhead of bgwriter and checkpoints is more or less linear in
> the size of shared_buffers ..
> 
> >effective_cache_size = 90GB
> 
> effective_cache_size should be ~75% of the RAM (if it's a dedicated server)

Why guess?  Use 'free' to tell you the kernel cache size:

http://momjian.us/main/blogs/pgblog/2012.html#May_4_2012



Re: [PERFORM] shared_buffers/effective_cache_size on 96GB server

2012-10-10 Thread Bruce Momjian
On Wed, Oct 10, 2012 at 09:12:20AM +0200, Strahinja Kustudić wrote:
> Hi everyone,
> 
> I have a Postgresql 9.1 dedicated server with 16 cores, 96GB RAM and RAID10 
> 15K
> SCSI drives which is running Centos 6.2 x64. This server is mainly used for
> inserting/updating large amounts of data via copy/insert/update commands, and
> seldom for running select queries.
> 
> Here are the relevant configuration parameters I changed:
> 
> shared_buffers = 10GB
> effective_cache_size = 90GB
> work_mem = 32MB
> maintenance_work_mem = 512MB
> checkpoint_segments = 64
> checkpoint_completion_target = 0.8
> 
> My biggest concerns are shared_buffers and effective_cache_size; should I
> increase shared_buffers and decrease effective_cache_size? I read that
> values above 10GB for shared_buffers give lower performance than smaller
> amounts?
> 
> free is currently reporting (during the loading of data):
> 
> $ free -m
>              total    used    free  shared  buffers  cached
> Mem:         96730   96418     311       0       71   93120
> -/+ buffers/cache:    3227   93502
> Swap:        21000      51   20949
> 
> So it did a little swapping, but only minor; still, I should probably
> decrease shared_buffers so there is no swapping at all.

You might want to read my blog entry about swap space:

http://momjian.us/main/blogs/pgblog/2012.html#July_25_2012

It is probably swapping unused memory _out_ to make more use of RAM for
cache.



Re: [PERFORM] hardware advice - opinions about HP?

2012-10-02 Thread Bruce Momjian
On Tue, Oct  2, 2012 at 10:51:46AM -0400, Franklin, Dan (FEN) wrote:
> > Look around and find another vendor, even if your company has to pay
> 
> > more for you to have that blame avoidance.
> 
> We're currently using Dell and have had enough problems to think about
> switching.
> 
> What about HP?

If you need a big vendor, I think HP is a good choice.



Re: [PERFORM] Bad query plan when the wrong data type is used

2012-09-01 Thread Bruce Momjian
On Mon, Feb 28, 2011 at 02:04:53PM -0500, Robert Haas wrote:
> On Sun, Feb 27, 2011 at 1:39 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Tue, Feb 8, 2011 at 5:04 PM, Josh Berkus  wrote:
> >>> I'm not saying that PostgreSQL couldn't do better on this kind of case,
> >>> but that doing better is a major project, not a minor one.
> >
> >> Specifically, the problem is that x = 4.0, where x is an integer, is
> >> defined to mean x::numeric = 4.0, not x = 4.0::integer.  If it meant
> >> the latter, then testing x = 3.5 would throw an error, whereas what
> >> actually happens is it just returns false.
> >
> >> We could fix this by adding some special case logic that understands
> >> properties of integers and numeric values and optimizes x =
> >> 4.0::numeric to x = 4::int and x = 3.5::numeric to constant false.
> >> That would be cool, in a way, but I'm not sure it's really worth the
> >> code it would take, unless it falls naturally out of some larger
> >> project in that area.
> >
> > I think that most of the practical problems around this case could be
> > solved without such a hack.  What we should do instead is invent
> > cross-type operators "int = numeric" etc and make them members of both
> > the integer and numeric index opclasses.  There are reasons why that
> > wouldn't work for integer versus float (read the last section of
> > src/backend/access/nbtree/README) but right offhand it seems like it
> > ought to be safe enough for numeric.  Now, it wouldn't be quite as fast
> > as if we somehow downconverted numeric to integer beforehand, but at
> > least you'd only be talking about a slow comparison operator and not a
> > fundamentally stupider plan.  That's close enough for me, for what is
> > in the end a stupidly written query.
> >
> > Of course, the above is still not exactly a small project, since you'd
> > be talking about something like 36 new operators to cover all of int2,
> > int4, int8.  But it's a straightforward extension.
> 
> Interesting.  Worth a TODO?

Since we are discussing int2 casting, I wanted to bring up this other
casting issue from 2011, in case it helped the discussion.
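
For anyone finding this thread later, the effect is easy to reproduce (a
sketch with a hypothetical table):

    CREATE TABLE t (x integer PRIMARY KEY);
    -- The numeric literal forces x to be compared as numeric, so the
    -- integer btree index cannot be used and a seq scan results:
    EXPLAIN SELECT * FROM t WHERE x = 4.0;
    -- The integer literal compares int to int and can use the index:
    EXPLAIN SELECT * FROM t WHERE x = 4;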



Re: [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Bruce Momjian
On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
> >> Yeah, Jeff's experiments indicated that the remaining bottleneck is lock
> >> management in the server.  What I fixed so far on the pg_dump side
> >> should be enough to let partial dumps run at reasonable speed even if
> >> the whole database contains many tables.  But if psql is taking
> >> AccessShareLock on lots of tables, there's still a problem.
> > 
> > Ok, I modified the part of pg_dump where a tremendous number of LOCK
> > TABLE statements are issued. I replaced them with a single LOCK TABLE
> > with multiple tables. With 100k tables the LOCK statements took 13
> > minutes in total; now it only takes 3 seconds. Comments?
> 
> Shall I commit to master and all supported branches?

Was this applied?
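
For context, the consolidation Tatsuo describes replaces one LOCK statement
per table with a single list-form statement, roughly like this (table names
hypothetical):

    LOCK TABLE s1.t1, s1.t2, s2.t3 IN ACCESS SHARE MODE;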



Re: [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:51:56PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
> >>> Ok, I modified the part of pg_dump where a tremendous number of LOCK
> >>> TABLE statements are issued. I replaced them with a single LOCK TABLE
> >>> with multiple tables. With 100k tables the LOCK statements took 13
> >>> minutes in total; now it only takes 3 seconds. Comments?
> 
> >> Shall I commit to master and all supported branches?
> 
> > Was this applied?
> 
> No, we fixed the server side instead.

Again, thanks.  I knew we fixed the server, but it wasn't clear to me that
that made the client changes unnecessary; I think I now do remember the
discussion about that.



Re: [PERFORM] High Disk write and space taken by PostgreSQL

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 06:07:26PM +0200, anara...@anarazel.de wrote:
> 
> 
Bruce Momjian  wrote:
> 
> >On Thu, Aug 16, 2012 at 03:48:57PM +1000, Ondrej Ivanič wrote:
> >> Hi,
> >> 
> >> On 16 August 2012 15:40, J Ramesh Kumar wrote:
> >> > As you said, MySQL with MyISAM is a better choice for my app, because
> >> > I don't need transactions/backups. Maybe I'll try InnoDB and find the
> >> > disk write/space difference. Is there any similar method available in
> >> > postgresql like the MyISAM engine?
> >> 
> >> You can try unlogged tables:
> >> http://www.postgresql.org/docs/9.1/static/sql-createtable.html
> >> 
> >> If specified, the table is created as an unlogged table. Data written
> >> to unlogged tables is not written to the write-ahead log (see Chapter
> >> 29), which makes them considerably faster than ordinary tables.
> >> However, they are not crash-safe: an unlogged table is automatically
> >> truncated after a crash or unclean shutdown. The contents of an
> >> unlogged table are also not replicated to standby servers. Any indexes
> >> created on an unlogged table are automatically unlogged as well;
> >> however, unlogged GiST indexes are currently not supported and cannot
> >> be created on an unlogged table.
> >
> >I would set full_page_writes = off too.
> Why? There shouldn't be any such writes on unlogged tables.

True.  I was thinking more of the logged tables, and the system tables.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [PERFORM] High Disk write and space taken by PostgreSQL

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 10:53:21AM -0400, Bruce Momjian wrote:
> On Thu, Aug 16, 2012 at 03:48:57PM +1000, Ondrej Ivanič wrote:
> > Hi,
> > 
> > On 16 August 2012 15:40, J Ramesh Kumar  wrote:
> > > As you said, MySQL with MyISAM is a better choice for my app because I
> > > don't need transactions/backups. Maybe I'll try InnoDB and find the disk
> > > write/space difference. Is there any similar method available in
> > > postgresql like the MyISAM engine?
> > 
> > You can try unlogged tables:
> > http://www.postgresql.org/docs/9.1/static/sql-createtable.html
> > 
> > If specified, the table is created as an unlogged table. Data written
> > to unlogged tables is not written to the write-ahead log (see Chapter
> > 29), which makes them considerably faster than ordinary tables.
> > However, they are not crash-safe: an unlogged table is automatically
> > truncated after a crash or unclean shutdown. The contents of an
> > unlogged table are also not replicated to standby servers. Any indexes
> > created on an unlogged table are automatically unlogged as well;
> > however, unlogged GiST indexes are currently not supported and cannot
> > be created on an unlogged table.
> 
> I would set full_page_writes = off too.

Better yet, read our documentation about non-durable settings:

http://www.postgresql.org/docs/9.1/static/non-durability.html
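
For convenience, the non-durable settings discussed there boil down to a
postgresql.conf sketch like the following; every line trades crash safety
for speed, so it is only appropriate for data you can afford to lose:

    fsync = off                 # do not force WAL to disk
    synchronous_commit = off    # do not wait for WAL flush at commit
    full_page_writes = off      # skip torn-page protection
    # plus unlogged tables for the data itself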

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [PERFORM] High Disk write and space taken by PostgreSQL

2012-08-16 Thread Bruce Momjian
On Thu, Aug 16, 2012 at 03:48:57PM +1000, Ondrej Ivanič wrote:
> Hi,
> 
> On 16 August 2012 15:40, J Ramesh Kumar  wrote:
> > As you said, MySQL with MyISAM is a better choice for my app because I don't
> > need transactions/backups. Maybe I'll try InnoDB and find the disk
> > write/space difference. Is there any similar method available in postgresql
> > like the MyISAM engine?
> 
> You can try unlogged tables:
> http://www.postgresql.org/docs/9.1/static/sql-createtable.html
> 
> If specified, the table is created as an unlogged table. Data written
> to unlogged tables is not written to the write-ahead log (see Chapter
> 29), which makes them considerably faster than ordinary tables.
> However, they are not crash-safe: an unlogged table is automatically
> truncated after a crash or unclean shutdown. The contents of an
> unlogged table are also not replicated to standby servers. Any indexes
> created on an unlogged table are automatically unlogged as well;
> however, unlogged GiST indexes are currently not supported and cannot
> be created on an unlogged table.

I would set full_page_writes = off too.
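
As a concrete example of the unlogged-table suggestion quoted above,
creating one takes a single extra keyword; the table definition here is
made up:

    CREATE UNLOGGED TABLE sensor_log (
        id        bigserial PRIMARY KEY,
        reading   double precision,
        logged_at timestamptz DEFAULT now()
    );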

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [PERFORM] Linux memory zone reclaim

2012-07-30 Thread Bruce Momjian
> node distances:
> node   0   1   2   3 
>   0:  10  11  11  11 
>   1:  11  10  11  11 
>   2:  11  11  10  11 
>   3:  11  11  11  10 
>  
> When considering a hardware purchase, it might be wise to pay close
> attention to how "far" a core may need to go to get to the most
> "distant" RAM.

Yikes, my server is certainly asymmetric:

node distances:
node   0   1
  0:  10  21
  1:  21  10

and my Debian Squeeze certainly knows that:

$ cat /proc/sys/vm/zone_reclaim_mode
1

Server specs:

http://momjian.us/main/blogs/pgblog/2012.html#January_20_2012

I have 12 2GB DDR3 DIMMs.

Of course, my home server is ridiculously idle too.  :-)
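
For anyone checking their own hardware, the node distance tables quoted
in this thread are the "node distances" section of numactl --hardware
output:

    $ numactl --hardware
    ...
    node distances:
    node   0   1
      0:  10  21
      1:  21  10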

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Linux memory zone reclaim

2012-07-26 Thread Bruce Momjian
On Tue, Jul 17, 2012 at 09:52:11PM -0400, Greg Smith wrote:
> I've taken to disabling /proc/sys/vm/zone_reclaim_mode on any Linux
> system where it's turned on now.  I'm still working through whether
> it also makes sense in all cases to use the more complicated memory
> interleaving suggestions that MySQL users have implemented,
> something most people would need to push into their PostgreSQL
> server started up scripts in /etc/init.d  (That will be a fun
> rpm/deb packaging issue to deal with if this becomes more
> wide-spread)  Suggestions on whether that is necessary, or if just
> disabling zone_reclaim is enough, are welcome from anyone who wants
> to try and benchmark it.

Should I be turning it off on my server too?  It is enabled on my
system.
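
For reference, checking and disabling it on a stock Linux box looks like
this; the sysctl.conf line makes the change survive reboots:

    # check the current setting (1 = enabled)
    cat /proc/sys/vm/zone_reclaim_mode
    # disable until the next reboot
    echo 0 > /proc/sys/vm/zone_reclaim_mode
    # make it permanent
    echo 'vm.zone_reclaim_mode = 0' >> /etc/sysctl.conf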


-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Drop statistics?

2012-07-03 Thread Bruce Momjian
On Fri, Jun 22, 2012 at 11:04:36AM -0700, David Kerr wrote:
> On Fri, Jun 22, 2012 at 01:27:51PM -0400, Tom Lane wrote:
> - David Kerr  writes:
> - > I'm trying to work through a root cause on a performance problem. I'd like
> - > to be able to "show" that a problem was fixed by analyzing the table.
> - 
> - > what i've done is
> - > set default_statistics_target=1
> - > analyze 
> - 
> - > That gets rid of most of the rows in pg_stats, but I'm still getting
> - > decent performance.
> - 
> - I usually do something like
> - 
> - DELETE FROM pg_statistic WHERE starelid = 'foo'::regclass;
> - 
> - (you need to be superuser to be allowed to do this).
> - 
> - You may need to keep an eye on whether auto-analyze is coming along and
> - undoing what you did, too.
> - 
> - regards, tom lane
> - 
> 
> Awesome, thanks!

One cool trick I have seen is to do the DELETE FROM pg_statistic in a
multi-statement transaction, run your query, and then roll it back.  This
preserves the statistics, while only your query sees empty pg_statistic
values for the table.
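
A minimal sketch of that trick, assuming a table named foo and superuser
access; the table, column, and query are placeholders for whatever is
being tested:

    BEGIN;
    DELETE FROM pg_statistic WHERE starelid = 'foo'::regclass;
    EXPLAIN SELECT * FROM foo WHERE id = 1;   -- planned with empty statistics
    ROLLBACK;                                 -- the statistics come back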

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Bruce Momjian
On Thu, May 31, 2012 at 11:04:12AM -0400, Robert Haas wrote:
> On Thu, May 31, 2012 at 10:50 AM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Thu, May 31, 2012 at 10:31 AM, Tom Lane  wrote:
> >>> I'm not; Jeff Janes is.  But you shouldn't be holding your breath
> >>> anyway, since it's 9.3 material at this point.
> >
> >> I agree we can't back-patch that change, but then I think we ought to
> >> consider back-patching some variant of Tatsuo's patch.  Maybe it's not
> >> reasonable to thunk an arbitrary number of relation names in there on
> >> one line, but how about 1000 relations per LOCK statement or so?  I
> >> guess we'd need to see how much that erodes the benefit, but we've
> >> certainly done back-branch rearrangements in pg_dump in the past to
> >> fix various kinds of issues, and this is pretty non-invasive.
> >
> > I am not convinced either that this patch will still be useful after
> > Jeff's fix goes in, ...
> 
> But people on older branches are not going to GET Jeff's fix.

FYI, if it got into Postgres 9.2, everyone upgrading to Postgres 9.2
would benefit because pg_upgrade uses the new cluster's pg_dumpall.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Bruce Momjian
On Thu, May 31, 2012 at 10:50:51AM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > On Thu, May 31, 2012 at 10:31 AM, Tom Lane  wrote:
> >> I'm not; Jeff Janes is.  But you shouldn't be holding your breath
> >> anyway, since it's 9.3 material at this point.
> 
> > I agree we can't back-patch that change, but then I think we ought to
> > consider back-patching some variant of Tatsuo's patch.  Maybe it's not
> > reasonable to thunk an arbitrary number of relation names in there on
> > one line, but how about 1000 relations per LOCK statement or so?  I
> > guess we'd need to see how much that erodes the benefit, but we've
> > certainly done back-branch rearrangements in pg_dump in the past to
> > fix various kinds of issues, and this is pretty non-invasive.
> 
> I am not convinced either that this patch will still be useful after
> Jeff's fix goes in, or that it provides any meaningful savings when
> you consider a complete pg_dump run.  Yeah, it will make the lock
> acquisition phase faster, but that's not a big part of the runtime
> except in very limited scenarios (--schema-only, perhaps).

FYI, that is the pg_upgrade use-case, and pg_dump/restore time is
reportedly taking the majority of time in many cases.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_dump and thousands of schemas

2012-05-25 Thread Bruce Momjian
On Fri, May 25, 2012 at 10:41:23AM -0400, Tom Lane wrote:
> "Hugo "  writes:
> > If anyone has more suggestions, I would like to hear them. Thank you!
> 
> Provide a test case?
> 
> We recently fixed a couple of O(N^2) loops in pg_dump, but those covered
> extremely specific cases that might or might not have anything to do
> with what you're seeing.  The complainant was extremely helpful about
> tracking down the problems:
> http://archives.postgresql.org/pgsql-general/2012-03/msg00957.php
> http://archives.postgresql.org/pgsql-committers/2012-03/msg00225.php
> http://archives.postgresql.org/pgsql-committers/2012-03/msg00230.php

Yes, please help us improve this!  At this point pg_upgrade is limited
by the time to dump/restore the database schema, but I can't get users
to give me any way to debug the speed problems.

Someone reported pg_upgrade took 45 minutes because of pg_dumpall
--schema-only, which is quite long.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_dump and thousands of schemas

2012-05-24 Thread Bruce Momjian
On Thu, May 24, 2012 at 08:20:34PM -0700, Jeff Janes wrote:
> On Thu, May 24, 2012 at 8:21 AM, Craig James  wrote:
> >
> >
> > On Thu, May 24, 2012 at 12:06 AM, Hugo   wrote:
> >>
> >> Hi everyone,
> >>
> >> We have a production database (postgresql 9.0) with more than 20,000
> >> schemas
> >> and 40Gb size. In the past we had all that information in just one schema
> >> and pg_dump used to work just fine (2-3 hours to dump everything). Then we
> >> decided to split the database into schemas, which makes a lot of sense for
> >> the kind of information we store and the plans we have for the future. The
> >> problem now is that pg_dump takes forever to finish (more than 24 hours)
> >> and
> >> we just can't have consistent daily backups like we had in the past. When
> >> I
> >> try to dump just one schema with almost nothing in it, it takes 12
> >> minutes.
> 
> Sorry, your original did not show up here, so I'm piggy-backing on
> Craig's reply.
> 
> Is dumping just one schema out of thousands an actual use case, or is
> it just an attempt to find a faster way to dump all the schemata
> through a back door?
> 
> pg_dump itself seems to have a lot of quadratic portions (plus another
> one on the server which it hits pretty heavily), and it is hard to know
> where to start addressing them.  It seems like addressing the overall
> quadratic nature might be a globally better option, but addressing
> just the problem with dumping one schema might be easier to kluge
> together.

Postgres 9.2 will have some speedups for pg_dump scanning large
databases --- that might help.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-08 Thread Bruce Momjian
Tory M Blue wrote:
> From my last report I had success but it was successful due to lots of
> manual steps. I figured it may be safer to just create a new rpm,
> installing to pgsql9 specific directories and a new data directory.
> 
> This allows pg_upgrade to complete successfully (so it says). However
> my new data directory is empty and the old data directory now has what
> appears to be 8.4 data and the 9.1 data.
> 
> /data is the olddatadir (the original data dir)
> 
> [root@devqdb03 queue]# ll /data/queue
> total 12
> drwx-- 2 postgres dba 4096 2011-12-07 09:44 16384
> drwx-- 3 postgres dba 4096 2011-12-07 11:34 PG_9.1_201105231
> -rw--- 1 postgres dba4 2011-12-07 09:44 PG_VERSION

That sure looks like a tablespace to me, not a data directory.

> 
> /data1 is the new 9.1 installed location.
> [root@devqdb03 queue]# ll /data1/queue/
> total 0
> 
> Do I have to manually move the new PG_9.1 data to /data1, or ...?  I'm
> just confused by what I'm looking at here.
> 
> If I don't move anything and start up the DB , I get this
> 
> psql (8.4.4, server 9.1.1)
> WARNING: psql version 8.4, server version 9.1.
>  Some psql features might not work.
> Type "help" for help.
> 
> Sorry my upgrade process has been an ugly mess :)

You are using an 8.4.4 psql to connect to a 9.1.1 server.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-05 Thread Bruce Momjian
Tory M Blue wrote:
> On Mon, Dec 5, 2011 at 10:31 AM, Tory M Blue  wrote:
> > On Mon, Dec 5, 2011 at 10:22 AM, Bruce Momjian  wrote:
> >>> But my initial response to all this is: umm, we have not really made a
> >>> dump/restore unnecessary with the latest releases of Postgres then, as
> >>> I would have to think that there is a high percentage of users who
> >>> use tablespaces.
> >>
> >> Yes, but they don't change tablespace locations during the upgrade.  In
> >> fact, we have had surprisingly few (zero) requests for moving
> >> tablespaces, and now we are trying to implement this for Postgres 9.2.
> >> The normal API will be to have the user move the tablespace before the
> >> upgrade, but as I said before, it isn't easy to do now in Postgres.
> >
> > Okay, I think this is where I'm confused. "They don't change tablespace" ---
> > okay, how are they doing the upgrade?  Do they leave the olddatadir in
> > the default location and create a new one elsewhere, vs. where I'm kind
> > of doing the opposite?
> 
> Okay right
> 
> So I changed the symlink in pg_tblspc, and changed the path inside
> the db, and it appears to have worked. These were either the "doh
> pieces" or the missing components that you helped point me to. Thank
> you!

See my other email --- this might not be necessary.

---


> 
> Tory
> 
> -bash-4.0$ /logs-all/temp/pg_upgrade  --old-datadir "/data1/db"
> --new-datadir "/data/db" --old-bindir "/ipix/pgsql8/bin" --new-bindir
> "/ipix/pgsql/bin"
> Performing Consistency Checks
> -
> Checking current, bin, and data directories ok
> Checking cluster versions   ok
> Checking database user is a superuser   ok
> Checking for prepared transactions  ok
> Checking for reg* system oid user data typesok
> Checking for contrib/isn with bigint-passing mismatch   ok
> Checking for large objects  ok
> Creating catalog dump   ok
> Checking for prepared transactions  ok
> Checking for presence of required libraries ok
> 
> | If pg_upgrade fails after this point, you must
> | re-initdb the new cluster before continuing.
> | You will also need to remove the ".old" suffix
> | from /data1/db/global/pg_control.old.
> 
> Performing Upgrade
> --
> Adding ".old" suffix to old global/pg_control   ok
> Analyzing all rows in the new cluster   ok
> Freezing all rows on the new clusterok
> Deleting new commit clogs   ok
> Copying old commit clogs to new server  ok
> Setting next transaction id for new cluster ok
> Resetting WAL archives  ok
> Setting frozenxid counters in new cluster   ok
> Creating databases in the new cluster   ok
> Adding support functions to new cluster ok
> Restoring database schema to new clusterok
> Removing support functions from new cluster ok
> Restoring user relation files
> ok
> Setting next oid for new clusterok
> Creating script to delete old cluster   ok
> Checking for large objects  ok
> 
> Upgrade complete
> 
> | Optimizer statistics are not transferred by pg_upgrade
> | so consider running:
> | vacuumdb --all --analyze-only
> | on the newly-upgraded cluster.
> 
> | Running this script will delete the old cluster's data files:
> | /data/pgsql/delete_old_cluster.sh
> -bash-4.0$

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-05 Thread Bruce Momjian
Tory M Blue wrote:
> On Mon, Dec 5, 2011 at 10:22 AM, Bruce Momjian  wrote:
> >> But my initial response to all this is: umm, we have not really made a
> >> dump/restore unnecessary with the latest releases of Postgres then, as
> >> I would have to think that there is a high percentage of users who
> >> use tablespaces.
> >
> > Yes, but they don't change tablespace locations during the upgrade.  In
> > fact, we have had surprisingly few (zero) requests for moving
> > tablespaces, and now we are trying to implement this for Postgres 9.2.
> > The normal API will be to have the user move the tablespace before the
> > upgrade, but as I said before, it isn't easy to do now in Postgres.
> 
> Okay, I think this is where I'm confused. "They don't change tablespace" ---
> okay, how are they doing the upgrade?  Do they leave the olddatadir in
> the default location and create a new one elsewhere, vs. where I'm kind
> of doing the opposite?

If you look in a 9.0+ tablespace directory, you will see that each
cluster has its own subdirectory:

test=> create tablespace tb1 location '/u/pg/tb1';
CREATE TABLESPACE
test=> \q
$ lf /u/pg/tb1
PG_9.2_20231/

That means if I upgrade to 9.3, there will be another subdirectory for
9.3, _inside_ the same tablespace location.  This change was added in
Postgres 9.0 to allow for upgrades without having to move tablespaces. 

Now, since you are upgrading from 8.4, and don't have a subdirectory,
the 9.1 cluster will be created inside the tablespace directory, so it
will look like:

323234/ 423411/ 932323/ PG_9.1_201105231/


I realize that is kind of confusing, but it works just fine, and
pg_upgrade will provide you with a script to delete the old cluster, and
its subdirectories, when you are ready.

I hope this helps clarify things.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-05 Thread Bruce Momjian
Tory M Blue wrote:
> Bruce is right, I didn't move tablespaces (I didn't know, to be honest,
> that I had to, but it makes sense). I simply moved the location of the data
> files, from /data to /data1. But I did "not" change any symlinks or

I was unclear if you moved the data directory or the tablespace.  Your
example showed you moving something that didn't look like data
directories:

> So I move olddata to /data1/db
> oldbinary to /pgsql8/bin
>
> new 9.1.1 db goes to /data/db
> newbinary installs at /pgsql/
>
> So when I run pg_upgrade (check validates the config), however trying
> to the upgrade nets;
> Restoring user relation files
>   /data/queue/16384/16406
> error while copying queue.adm_version (/data/queue/16384/16406 to
> /data/queue/PG_9.1_201105231/16407/16406): No such file or directory
> Failure, exiting

/data/db and /data/queue are not data locations, or at least they are
not ones we create during the install.  Were the real data directory and
the tablespaces all under /data1?  Did you define these tablespace
locations using relative paths?

> do any other pre-steps, other than install the new binary, make sure
> that there was a new and old data location as well as a new and old
> binary location.

You can definitely move data directories around. 

> Since my build processes installs data files at /data and binary at
> /pgsql/, I simply moved the old Data and binaries, before installing
> my new build. So /pgsql/ became /pgsql8/ and /data/ became /data1/

I think you can do that but your error messages don't say that.
 
> I do understand what you are all saying in regards to the tablespace
> links and tablespace locations. It made total sense when Bruce pointed
> it out initially. However, I'm not sure if I should have known that
> pg_upgrade doesn't handle this, and/or whether this would be a concern.
> pg_upgrade asks for old and new locations, so one would think that
> this information would be used for the upgrade process, including
> potentially changing tablespace paths during the migration step,
> but this is above my pay grade.

There is no Postgres support for moving tablespaces, so it isn't
surprising that pg_upgrade doesn't handle it.

> But my initial response to all this is: umm, we have not really made a
> dump/restore unnecessary with the latest releases of Postgres then, as
> I would have to think that there is a high percentage of users who
> use tablespaces.

Yes, but they don't change tablespace locations during the upgrade.  In
fact, we have had surprisingly few (zero) requests for moving
tablespaces, and now we are trying to implement this for Postgres 9.2. 
The normal API will be to have the user move the tablespace before the
upgrade, but as I said before, it isn't easy to do now in Postgres.
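
For the record, the manual move described above amounts to roughly the
following on a pre-9.2 cluster; the paths, tablespace name, and OID are
illustrative, and pg_tablespace.spclocation only exists through 9.1:

    # with the old server stopped
    mv /data/tbspc /data1/tbspc
    ln -sfn /data1/tbspc $PGDATA/pg_tblspc/16385  # symlink is named by tablespace OID

    # then, with the server running, as a superuser:
    #   UPDATE pg_tablespace SET spclocation = '/data1/tbspc'
    #       WHERE spcname = 'tbspc';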

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-05 Thread Bruce Momjian
Nicholson, Brad (Toronto, ON, CA) wrote:
> > -Original Message-
> > From: Bruce Momjian [mailto:br...@momjian.us]
> > Sent: Monday, December 05, 2011 10:24 AM
> > To: Nicholson, Brad (Toronto, ON, CA)
> > Cc: Tory M Blue; pgsql-performance@postgresql.org; Magnus Hagander
> > Subject: Re: [PERFORM] pg_upgrade
> >
> > Nicholson, Brad (Toronto, ON, CA) wrote:
> > > > You mean moving tablespaces?  That isn't something pg_upgrade deals
> > > > with.  If we need docs to move tablespaces, it is a missing piece
> > of
> > > > our
> > > > main docs, not something pg_upgrade would ever mention.
> > >
> > > If I'm reading the issue correctly, and pg_upgrade gets part way
> > through
> > > an upgrade then fails if it hits a tablespace - it seems to me like
> > > the pg_upgrade should check for such a condition at the initial
> > > validation stage not proceed if found.
> >
> > Checking for all such cases would make pg_upgrade huge and unusable.
> > If you messed up your configuration, pg_upgrade can't check for every
> > such case.  There are thousands of ways people can mess up their
> > configuration.
> 
> Based on the OP this does not seem like a messed up configuration.  It
> sounds like the OP used a fully supported core feature of Postgres
> (tablespaces) and pg_upgrade failed as a result.  I think having our
> upgrade utility fail under such circumstances is a bad thing.

The OP has not indicated exactly what he did to move the tablespaces, so
I have to assume he changed the SQL location but not the symbolic link
location, or some other misconfiguration.  Can someone provide the steps
that caused the failure?

pg_upgrade works fine for tablespaces so there must be something
different about his configuration.  Unless I hear details, I have to
assume the tablespace move was done incorrectly.  This is the first
tablespace failure like this I have ever gotten, and I do test
tablespaces.  Perhaps something is wrong, but I can't guess what it is.

--
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-05 Thread Bruce Momjian
Nicholson, Brad (Toronto, ON, CA) wrote:
> > You mean moving tablespaces?  That isn't something pg_upgrade deals
> > with.  If we need docs to move tablespaces, it is a missing piece of
> > our
> > main docs, not something pg_upgrade would ever mention.
> 
> If I'm reading the issue correctly, and pg_upgrade gets part way through
> an upgrade then fails if it hits a tablespace - it seems to me like
> the pg_upgrade should check for such a condition at the initial
> validation stage not proceed if found.

Checking for all such cases would make pg_upgrade huge and unusable.  If
you messed up your configuration, pg_upgrade can't check for every such
case.  There are thousands of ways people can mess up their configuration.

I think you should read up on how pg_upgrade attempts to be minimal:

http://momjian.us/main/blogs/pgblog/2011.html#June_15_2011_2

On a related note, Magnus is working on code for Postgres 9.2 that would
allow for easier moving of tablespaces.

--
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Intersect/Union X AND/OR

2011-12-05 Thread Bruce Momjian
Thiago Godoi wrote:
> Thanks for the answers.
> 
> I found one of these cases, but I'm trying to understand it. Why is the
> performance better? Is the number of tuples making the difference?
> 
> My original query :
> 
> select table1.id
> from table1, (select function(12345) id) table2
> where table1.kind = 1234
> and table1.id = table2.id
> 
> "Nested Loop  (cost=0.00..6.68 rows=1 width=12)"
> "  Join Filter: ()"
> "  ->  Seq Scan on recorte  (cost=0.00..6.39 rows=1 width=159)"
> "Filter: (id = 616)"
> "  ->  Result  (cost=0.00..0.26 rows=1 width=0)"
> 
> 
> -- function() returns a resultset
> 
> I tried with an explicit join and "IN", but the plan is the same.
> 
> When I changed the query to use intersect :
> 
> 
> (select table1.id from table1 where table1.kind = 1234)
> Intersect
> (select function(12345) id)
> 
> The new plan is :
> 
> "HashSetOp Intersect  (cost=0.00..6.67 rows=1 width=80)"
> "  ->  Append  (cost=0.00..6.67 rows=2 width=80)"
> "->  Subquery Scan on "*SELECT* 1"  (cost=0.00..6.40 rows=1
> width=159)"
> "  ->  Seq Scan on recorte  (cost=0.00..6.39 rows=1 width=159)"
> "Filter: (id = 616)"
> "->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.27 rows=1
> width=0)"
> "  ->  Result  (cost=0.00..0.26 rows=1 width=0)"
> 
> The second plan is about 10 times faster than the first one.

Well, there are usually several ways to execute a query internally;
INTERSECT is using a different, and faster, method.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-03 Thread Bruce Momjian
Bruce Momjian wrote:
> Tory M Blue wrote:
> > On Sat, Dec 3, 2011 at 6:04 AM, Bruce Momjian  wrote:
> > 
> > > Well, I am not totally clear how you are moving things around, but I do
> > > know pg_upgrade isn't happy to have the old and new cluster be very
> > > different.
> > >
> > > What I think is happening is that you didn't properly move the
> > > tablespace in the old cluster.  We don't give you a very easy way to do
> > > that.  You need to not only move the directory, but you need to update
> > > the symlinks in data/pg_tblspc/, and update the pg_tablespace system
> > > table.  Did you do all of that?  Does the 8.4 server see the tablespace
> > > properly after the move, but before pg_upgrade?
> > 
> > 
> > Simple answer is umm no..
> 
> The "no" is an answer to which question?
> 
> > "http://www.postgresql.org/docs/current/static/pgupgrade.html"; is
> > obviously lacking than :)
> > 
> > So I can take what you have told me and see if I can't attempt to
> > make those things happen and try again. Makes sense, but boy that's a
> > large piece of info missing in the document!
> 
> You mean moving tablespaces?  That isn't something pg_upgrade deals
> with.  If we need docs to move tablespaces, it is a missing piece of our
> main docs, not something pg_upgrade would ever mention.

FYI, I have asked on the docs list about getting this documented.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-03 Thread Bruce Momjian
Tory M Blue wrote:
> On Sat, Dec 3, 2011 at 6:04 AM, Bruce Momjian  wrote:
> 
> > Well, I am not totally clear how you are moving things around, but I do
> > know pg_upgrade isn't happy to have the old and new cluster be very
> > different.
> >
> > What I think is happening is that you didn't properly move the
> > tablespace in the old cluster.  We don't give you a very easy way to do
> > that.  You need to not only move the directory, but you need to update
> > the symlinks in data/pg_tblspc/, and update the pg_tablespace system
> > table.  Did you do all of that?  Does the 8.4 server see the tablespace
> > properly after the move, but before pg_upgrade?
> 
> 
> Simple answer is umm no..

The "no" is an answer to which question?

> "http://www.postgresql.org/docs/current/static/pgupgrade.html"; is
> obviously lacking than :)
> 
> So I can take what you have told me and see if I can't attempt to
> make those things happen and try again. Makes sense, but boy that's a
> large piece of info missing in the document!

You mean moving tablespaces?  That isn't something pg_upgrade deals
with.  If we need docs to move tablespaces, it is a missing piece of our
main docs, not something pg_upgrade would ever mention.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] pg_upgrade

2011-12-03 Thread Bruce Momjian
Tory M Blue wrote:
> So we are making progress on our performance issues: we are splitting
> the data, changing the index value, etc. So far we are having some
> success, but we also want to test out some of the options and changes in
> the 9 branch, and trying to dump and restore 750GB of data is not all
> that fun, so I'm trying to avoid that.
> 
> So we upgraded from 8.4.4 64-bit to 9.1.1 64-bit.
> 
> If we upgrade a database that just uses the public tablespace there
> are no issues; it works fine. However, when we try to upgrade a db that
> has tablespaces defined, it errors out trying to load the data into the
> new db.
> 
> The tablespaces are hardcoded with a path, so that seems to cause issues.
> 
> Steps I'm taking
> 
> Standard location of data /data/db
> Standard binary location /pgsql/bin
> 
> I'm moving the standard location to /data1/db and moving the binaries
> to /pgsql8/bin
> 
> WHY: because my build scripts put my binaries and data in these
> locations, so without recreating my build process, I have to move the
> current data and binary locations before I install 9.1.1
> 
> So I move olddata to /data1/db
> oldbinary to /pgsql8/bin
> 
> new 9.1.1 db goes to /data/db
> newbinary installs at /pgsql/
> 
> So when I run pg_upgrade (check validates the config), however trying
> to the upgrade nets;
> Restoring user relation files
>   /data/queue/16384/16406
> error while copying queue.adm_version (/data/queue/16384/16406 to
> /data/queue/PG_9.1_201105231/16407/16406): No such file or directory
> Failure, exiting
> 
> As you can see, it's sticking with its original path and not
> realizing that I'm now trying to install into /data from /data1.
> 
> What is the flaw here? Do I have to rebuild my build process to
> install in a different location? I'm not sure what my choices are here.
> I mean, I'm telling the upgrade process where new and old are located;
> I believe it should be overriding something and not allowing the
> error shown above.
> 
> Slaps and/or pointers are welcome

Well, I am not totally clear how you are moving things around, but I do
know pg_upgrade isn't happy to have the old and new cluster be very
different.

What I think is happening is that you didn't properly move the
tablespace in the old cluster.  We don't give you a very easy way to do
that.  You need to not only move the directory, but you need to update
the symlinks in data/pg_tblspc/, and update the pg_tablespace system
table.  Did you do all of that?  Does the 8.4 server see the tablespace
properly after the move, but before pg_upgrade?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Intersect/Union X AND/OR

2011-12-02 Thread Bruce Momjian
Thiago Godoi wrote:
> Hi all,
> 
> I found this presentation from B. Momjian:
> 
> http://momjian.us/main/writings/pgsql/performance.pdf
> 
> I'm interested in what he said about "Intersect/Union X AND/OR". Can I
> find a transcription or a video of this presentation? Can anyone explain
> it to me?

Well, there is a recording of the webcast on the EnterpriseDB web site,
but I am afraid they only allow viewing of 3+ hour webcasts by
EnterpriseDB customers.

The idea is that a query that uses an OR can be rewritten as two SELECTs
with a UNION between them.  I have seen rare cases where this is a win,
so I mentioned it in that talk.  Intersection is similarly possible for
AND in WHERE clauses.
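
A minimal illustration of the OR-to-UNION rewrite (the table and columns
are made up; note that UNION also removes duplicate rows, which the OR
form does not):

    -- original
    SELECT * FROM t WHERE a = 1 OR b = 2;

    -- rewritten so each arm can use its own index
    SELECT * FROM t WHERE a = 1
    UNION
    SELECT * FROM t WHERE b = 2;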

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] SSD options, small database, ZFS

2011-11-22 Thread Bruce Momjian
Amitabh Kant wrote:
> > The whole memory speed topic is also much more complicated than any
> > simple explanation can cover.  How many banks of RAM you can use
> > effectively changes based on the number of CPUs and associated chipset too.
> >  Someone just sent me an explanation recently of why I was seeing some
> > strange things on my stream-scaling benchmark program.  That dove into a
> > bunch of trivia around how the RAM is actually accessed on the motherboard.
> >  One of the reasons I keep so many samples on that program's page is to
> > help people navigate this whole maze, and have some data points to set
> > expectations against.  See https://github.com/gregs1104/stream-scaling
> > for the code and the samples.
> >
> > --
> > Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
> > PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
> >
> >
> >
> Greg
> 
> On a slightly unrelated note, you had once (
> http://archives.postgresql.org/pgsql-general/2011-08/msg00944.php) said to
> limit shared_buffers max to 8 GB on Linux and leave the rest for OS
> caching. Does the same advice hold on FreeBSD systems too?

Hard to say.  We don't know why this is happening but we are guessing it
is the overhead of managing over one million shared buffers.  Please
test and let us know.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] SSD options, small database, ZFS

2011-11-22 Thread Bruce Momjian
Greg Smith wrote:
> The whole memory speed topic is also much more complicated than any
> simple explanation can cover.  How many banks of RAM you can use 
> effectively changes based on the number of CPUs and associated chipset 
> too.  Someone just sent me an explanation recently of why I was seeing 
> some strange things on my stream-scaling benchmark program.  That dove 
> into a bunch of trivia around how the RAM is actually accessed on the 
> motherboard.  One of the reasons I keep so many samples on that 
> program's page is to help people navigate this whole maze, and have some 
> data points to set expectations against.  See 
> https://github.com/gregs1104/stream-scaling for the code and the samples.

I can confirm that a Xeon E5620 CPU wants memory to be in multiples of
3, and a dual-CPU 5620 system needs memory in multiples of 6.  (I
installed 12 2GB modules.)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Allow sorts to use more available memory

2011-10-05 Thread Bruce Momjian
Stephen Frost wrote:
-- Start of PGP signed section.
> * Robert Schnabel (schnab...@missouri.edu) wrote:
> > And getting back to the to-do list entry and reading the related
> > posts, it appears that even if you could set work_mem that high it
> > would only use 2GB anyway.  I guess that was the second part of my
> > question.  Is that true?
> 
> Errr, and to get back to the to-do (which I've been considering doing
> something about...), it's to allow the *actual* memory usage for things
> like sorts to use more than 2GB, but as others have pointed out, you can
> do that by putting pgsql_tmp on a memory filesystem and letting the
> sorts spill to the memory-based FS.

It would be nice if the tmpfs would allow us to control total temp
memory usage, except it causes a failure rather than spilling to real
disk.
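
A sketch of the memory-filesystem approach mentioned above; the mount
point, size, and tablespace name are illustrative, and temp_tablespaces
then directs sorts there:

    mkdir -p /mnt/pgtmp
    mount -t tmpfs -o size=16G tmpfs /mnt/pgtmp
    chown postgres:postgres /mnt/pgtmp
    # then, in psql:
    #   CREATE TABLESPACE tmp_space LOCATION '/mnt/pgtmp';
    #   SET temp_tablespaces = 'tmp_space';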

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU still needed with SSD?

2011-07-18 Thread Bruce Momjian
Andy wrote:
> 
> 
> --- On Mon, 7/18/11, David Rees  wrote:
> 
> > >> In this case is BBU still needed? If I put 2 SSDs in software RAID 1,
> > >> would that be any slower than 2 SSDs in HW RAID 1 with BBU? What are
> > >> the pros and cons?
> >
> > What will perform better will vary greatly depending on the exact
> > SSDs, rotating disks, RAID BBU controller and application.  But
> > certainly a couple of Intel 320s in RAID 1 seem to be an inexpensive
> > way of getting very good performance while maintaining reliability.
> 
> I'm not comparing SSD in SW RAID with rotating disks in HW RAID with
> BBU though. I'm just comparing SSDs with or without BBU. I'm going to
> get a couple of Intel 320s, just want to know if BBU makes sense for
> them.

Yes, it certainly does, even if you have a RAID BBU.

--
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Why we don't want hints Was: Slow count(*) again...

2011-02-16 Thread Bruce Momjian
Kevin Grittner wrote:
> Shaun Thomas  wrote:
>  
> > how difficult would it be to add that syntax to the JOIN
> > statement, for example?
>  
> Something like this syntax?:
>  
> JOIN WITH (correlation_factor=0.3)
>  
> Where 1.0 might mean that for each value on the left there was only
> one distinct value on the right, and 0.0 would mean that they were
> entirely independent?  (Just as an off-the-cuff example -- I'm not
> at all sure that this makes sense, let alone is the best thing to
> specify.  I'm trying to get at *syntax* here, not particular knobs.)

I am not excited about the idea of putting these correlations in
queries.  What would be more interesting would be for analyze to build a
correlation coefficient matrix showing how columns are correlated:

     a    b    c
a    1   .4    0
b   .1    1  -.3
c   .2   .3    1

and those correlations could be used to weigh how the single-column
statistics should be combined.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-04 Thread Bruce Momjian
Mladen Gogala wrote:
> Actually, it is not unlike a religious dogma, only stating that "hints 
> are bad". It even says so in the wiki. The arguments are
> 1) Refusal to implement hints is motivated by distrust toward users, 
> citing that some people may mess things up.
> Yes, they can, with and without hints.
> 2) All other databases have them. This is a major feature and if I were 
> in the MySQL camp, I would use it as an
>argument. Asking me for some "proof" is missing the point. All other 
> databases have hints precisely because
>they are useful. Assertion that only Postgres is so smart that can 
> operate without hints doesn't match the
>reality. As a matter of fact, Oracle RDBMS on the same machine will 
> regularly beat PgSQL in performance.
>That has been my experience so far.   I even posted counting query 
> results.
> 3) Hints are "make it or break it" feature. They're absolutely needed in 
> the fire extinguishing situations.
> 
> I see no arguments to say otherwise and until that ridiculous "we don't 
> want hints" dogma is on wiki, this is precisely what it is:  a dogma. 

Uh, that is kind of funny considering that text is on a 'wiki', meaning
everything there is open to change if the group agrees.

> Dogmas do not change and I am sorry that you don't see it that way. 
> However, this discussion
> did convince me that I need to take another look at MySQL and tone down 
> my engagement with PostgreSQL community. This is my last post on the 
> subject because posts are becoming increasingly personal. This level of 
> irritation is also
> characteristic of a religious community chastising a sinner. Let me 
> remind you again: all other major databases have that possibility: 
> Oracle, MySQL, DB2, SQL Server and Informix. Requiring burden of proof 
> about hints is equivalent to saying that all these databases are 
> developed by idiots and have a crappy optimizer.

You need to state the case for hints independent of what other databases
do, and independent of fixing the problems where the optimizer doesn't
match reality.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-04 Thread Bruce Momjian
Mladen Gogala wrote:
> Chris Browne wrote:
> > Well, the community declines to add hints until there is actual
> > consensus on a good way to add hints.
> >   
> OK. That's another matter entirely.   Who should make that decision? Is 
> there a committee or a person who would be capable of making that decision?
> 
> > Nobody has ever proposed a way to add hints where consensus was arrived
> > at that the way was good, so...
> >   
> 
> So, I will have to go back on my decision to use Postgres and 
> re-consider MySQL? I will rather throw away the effort invested in 

You want to reconsider using MySQL because Postgres doesn't have hints. 
Hard to see how that logic works.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-04 Thread Bruce Momjian
Greg Smith wrote:
> Check out 
> http://www.indeed.com/jobtrends?q=postgres%2C+mysql%2C+oracle&relative=1&relative=1
>  
> if you want to see the real story here.  Oracle has a large installed 
> base, but it's considered a troublesome legacy product being replaced 

+1 for Oracle being a "troublesome legacy product".

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-03 Thread Bruce Momjian
Mladen Gogala wrote:
> Hints are not even that complicated to program. The SQL parser should 
> compile the list of hints into a table and optimizer should check 
> whether any of the applicable access methods exist in the table. If it 
> does - use it. If not, ignore it. This looks to me like a philosophical 
> issue, not a programming issue. Basically, the current Postgres 
> philosophy can be described like this: if the database was a gas stove, 
> it would occasionally catch fire. However, bundling a fire extinguisher 
> with the stove is somehow seen as bad. When the stove catches fire, 
> users are expected to report the issue and wait for a better stove to be
> developed. It is a very rough analogy, but rather accurate one, too.

That might be true.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-03 Thread Bruce Momjian
Mladen Gogala wrote:
> Greg Smith wrote:
> > Mladen Gogala wrote:
> >   
> >> The techies at big companies are the guys who will or will not make it 
> >> happen. And these guys are not beginners.  Appeasing them may actually 
> >> go a long way.
> >> 
> >
> > The PostgreSQL community isn't real big on appeasing people if it's at 
> > the expense of robustness or correctness, and this issue falls into that 
> > category.  
>
> With all due respect, I don't see how the issue of hints falls into
> this category. As I explained, the mechanisms are already there; they're
> just not elegant enough. The verb "appease" doesn't convey the meaning
> that I had in mind quite correctly. The phrase "target population" would
> have described what I wanted to say in a much better way.

The settings are currently there to better model the real world
(random_page_cost), or for testing (enable_seqscan).  They are not there
to force certain plans.  They can be used for that, but that is not
their purpose and they would not have been added if that was their
purpose.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-02 Thread Bruce Momjian
Andrew Dunstan wrote:
> 
> 
> On 02/02/2011 07:17 PM, Greg Smith wrote:
> > I direct anyone who thought Mladen was making a serious comment to 
> > http://www.nydailynews.com/news/politics/2009/01/08/2009-01-08_misunderestimate_tops_list_of_notable_bu-3.html
> >  
> > if you want to get his little joke there.  I plan to start using 
> > "misunderestimate" more in the future when talking about planner 
> > errors.  Might even try to slip it into the docs at some point in the 
> > future and see if anybody catches it.
> 
> My wings take dream ...

I think this humorous video really nails it:

http://www.youtube.com/watch?v=Km26gMI847Y
Presidential Speechalist 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-02 Thread Bruce Momjian
Greg Smith wrote:
> Mladen Gogala wrote:
> > Greg, how many questions about queries not using an index have you 
> > seen? There is a reason why people keep asking that. The sheer number 
> > of questions like that on this group should tell you that there is a 
> > problem there. There must be a relatively simple way of influencing 
> > optimizer decisions. 
> 
> I think that's not quite the right question.  For every person like 
> yourself who is making an informed "the optimizer is really picking the 
> wrong index" request, I think there are more who are asking for that but 
> are not actually right that it will help.  I think you would agree that 
> this area is hard to understand, and easy to make mistakes about, yes?  
> So the right question is "how many questions about queries not using an 
> index would have actually benefitted from the behavior they asked for?"  
> That's a much fuzzier and harder to answer question.

Agreed.  I created an FAQ entry years ago to explain this point and tell
people how to test it:


http://wiki.postgresql.org/wiki/FAQ#Why_are_my_queries_slow.3F_Why_don.27t_they_use_my_indexes.3F

Once I added that FAQ we had many fewer email questions about index
choice.
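
The test that FAQ entry describes boils down to comparing the planner's
choice against a forced plan; the query here is just a placeholder:

    EXPLAIN ANALYZE SELECT * FROM tab WHERE col = 42;  -- planner's choice
    SET enable_seqscan = off;
    EXPLAIN ANALYZE SELECT * FROM tab WHERE col = 42;  -- pushed toward the index
    RESET enable_seqscan;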

> > With all due respect, I consider myself smarter than the optimizer.  
> > I'm 6'4", 235LBS so telling me that you disagree and that I am more 
> > stupid than a computer program,  would not be a smart thing to do. 
> > Please, do not misunderestimate me.
> 
> I remember when I used to only weigh that much.  You are lucky to be 
> such a slim little guy!
> 
> Oh, I guess I should add, :)

Oh, wow, what a great retort.  :-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-02 Thread Bruce Momjian
Mladen Gogala wrote:
> > 2) The sort of random I/O done by index lookups can be as much as 50X as 
> > expensive on standard hard drives as sequential, if every block goes to 
> > physical hardware.
> >   
> 
> Greg, how many questions about queries not using an index have you seen? 
> There is a reason why people keep asking that. The sheer number of 
> questions like that on this group should tell you that there is a 
> problem there. 

Very few of those reports found that an index scan was indeed faster ---
they just assumed so but when they actually tested it, they understood.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Questions on query planner, join types, and work_mem

2011-02-01 Thread Bruce Momjian
Bruce Momjian wrote:
> Robert Haas wrote:
> > On Wed, Aug 11, 2010 at 9:42 PM, Bruce Momjian  wrote:
> > > This confused me.  If we are assuming the data is in
> > > effective_cache_size, why are we adding sequential/random page cost to
> > > the query cost routines?
> > 
> > See the comments for index_pages_fetched().  We basically assume that
> > all data starts uncached at the beginning of each query - in fact,
> > each plan node.  effective_cache_size only measures the chances that
> > if we hit the same block again later in the execution of something
> > like a nested-loop-with-inner-indexscan, it'll still be in cache.
> > 
> > It's an extremely weak knob, and unless you have tables or indices
> > that are larger than RAM, the only mistake you can make is setting it
> > too low.
> 
> The attached patch documents that there is no assumption that data
> remains in the disk cache between queries.  I thought this information
> might be helpful.

Applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Questions on query planner, join types, and work_mem

2011-01-31 Thread Bruce Momjian
Robert Haas wrote:
> On Wed, Aug 11, 2010 at 9:42 PM, Bruce Momjian  wrote:
> > This confused me.  If we are assuming the data is in
> > effective_cache_size, why are we adding sequential/random page cost to
> > the query cost routines?
> 
> See the comments for index_pages_fetched().  We basically assume that
> all data starts uncached at the beginning of each query - in fact,
> each plan node.  effective_cache_size only measures the chances that
> if we hit the same block again later in the execution of something
> like a nested-loop-with-inner-indexscan, it'll still be in cache.
> 
> It's an extremely weak knob, and unless you have tables or indices
> that are larger than RAM, the only mistake you can make is setting it
> too low.

The attached patch documents that there is no assumption that data
remains in the disk cache between queries.  I thought this information
might be helpful.
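
To see how weak the knob is in practice, it can be changed per session
and the plans compared; a sketch (tables are hypothetical):

    SET effective_cache_size = '128MB';
    EXPLAIN SELECT * FROM big_t b JOIN small_t s ON s.id = b.small_id;

    SET effective_cache_size = '16GB';
    EXPLAIN SELECT * FROM big_t b JOIN small_t s ON s.id = b.small_id;

No memory is allocated either way; only the estimated cost of repeated
index page fetches within one query should move.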

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77cacdd..520170b 100644
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** SET ENABLE_SEQSCAN TO OFF;
*** 2424,2430 
  space.  This parameter has no effect on the size of shared
  memory allocated by PostgreSQL, nor
  does it reserve kernel disk cache; it is used only for estimation
! purposes.  The default is 128 megabytes (128MB).
 

   
--- 2424,2432 
  space.  This parameter has no effect on the size of shared
  memory allocated by PostgreSQL, nor
  does it reserve kernel disk cache; it is used only for estimation
! purposes.  The system also does not assume data remains in
! the disk cache between queries.  The default is 128 megabytes
! (128MB).
 

   



Re: [PERFORM] Running PostgreSQL as fast as possible no matter the consequences

2011-01-27 Thread Bruce Momjian
Jeff Janes wrote:
> On Tue, Jan 25, 2011 at 5:32 PM, Bruce Momjian  wrote:
> > Robert Haas wrote:
> >> On Wed, Jan 19, 2011 at 12:07 PM, Bruce Momjian  wrote:
> 
> >> >     http://developer.postgresql.org/pgdocs/postgres/non-durability.html
> >>
> >> This sentence looks to me like it should be removed, or perhaps clarified:
> >>
> >>     This does affect database crash transaction durability.
> >
> > Uh, doesn't it affect database crash transaction durability?  I have
> > applied the attached patch to clarify things.  Thanks.
> 
> I think the point that was trying to be made there was that the other
> parameters only lose and corrupt data when the machine crashes.
> Synchronous commit turned off will lose data on a mere postgresql
> server crash, it doesn't require a machine-level crash to cause data
> loss.
> 
> Indeed, the currently committed doc is quite misleading.
> 
> " The following are configuration changes you can make
> to improve performance in such cases;  they do not invalidate
> commit guarantees related to database crashes, only abrupt operating
> system stoppage, except as mentioned below"
> 
> We've now removed the thing being mentioned below, but did not remove
> the promise we would be mentioning those things.

Excellent point.  The old wording was just too clever and even I forgot
why I was making that point.  I have updated the docs to clearly state
why this setting is different from the ones above.  Thanks for spotting
this.

Applied patch attached.
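
For reference, synchronous_commit is also the one durability setting
that can be relaxed for only the transactions that can tolerate loss; a
sketch (the table is hypothetical):

    BEGIN;
    SET LOCAL synchronous_commit = off;  -- this commit may be lost in a server crash
    INSERT INTO audit_log (msg) VALUES ('low-value event');
    COMMIT;

fsync and full_page_writes, by contrast, are server-wide settings.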

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 0a10457..1bec5b1 100644
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
*** SELECT * FROM x, y, a, b, c WHERE someth
*** 1157,1165 
  
   

!Turn off <xref linkend="guc-synchronous-commit">;  there is no
 need to write the WAL to disk on every
!commit.

   
  
--- 1157,1166 
  
   

!Turn off <xref linkend="guc-synchronous-commit">;  there might be no
 need to write the WAL to disk on every
!commit.  This does enable possible transaction loss in case of
!a database crash.

   
  



Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-26 Thread Bruce Momjian
Robert Haas wrote:
> On Thu, Jan 20, 2011 at 4:17 AM, Cédric Villemain
>  wrote:
> >>> I think his point is that we already have a proven formula
> >>> (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air.
> >>> The problem is to figure out what numbers to apply the M-L formula to.
> >>>
> >>> I've been thinking that we ought to try to use it in the context of the
> >>> query as a whole rather than for individual table scans; the current
> >>> usage already has some of that flavor but we haven't taken it to the
> >>> logical conclusion.
> >>
> >> Is there a TODO here?
> >
> > it looks like, yes.
> 
> "Modify the planner to better estimate caching effects"?

Added to TODO.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Running PostgreSQL as fast as possible no matter the consequences

2011-01-25 Thread Bruce Momjian
Robert Haas wrote:
> On Wed, Jan 19, 2011 at 12:07 PM, Bruce Momjian  wrote:
> > Chris Browne wrote:
> >> gentosa...@gmail.com (A B) writes:
> >> > If you just wanted PostgreSQL to go as fast as possible WITHOUT any
> >> > care for your data (you accept 100% dataloss and datacorruption if any
> >> > error should occur), what settings should you use then?
> >>
> >> Use /dev/null.  It is web scale, and there are good tutorials.
> >>
> >> But seriously, there *are* cases where "blind speed" is of use.  When
> >> loading data into a fresh database is a good time for this; if things
> >> fall over, it may be pretty acceptable to start "from scratch" with
> >> mkfs/initdb.
> >>
> >> I'd:
> >> - turn off fsync
> >> - turn off synchronous commit
> >> - put as much as possible onto Ramdisk/tmpfs/similar as possible
> >
> > FYI, we do have a documentation section about how to configure Postgres
> > for improved performance if you don't care about durability:
> >
> >        http://developer.postgresql.org/pgdocs/postgres/non-durability.html
> 
> This sentence looks to me like it should be removed, or perhaps clarified:
> 
> This does affect database crash transaction durability.

Uh, doesn't it affect database crash transaction durability?  I have
applied the attached patch to clarify things.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 2699828..fb55598 100644
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
*** SELECT * FROM x, y, a, b, c WHERE someth
*** 1159,1165 

 Turn off <xref linkend="guc-synchronous-commit">;  there might be no
 need to write the WAL to disk on every
!commit.  This does affect database crash transaction durability.

   
  
--- 1159,1165 

 Turn off <xref linkend="guc-synchronous-commit">;  there might be no
 need to write the WAL to disk on every
!commit.  This can cause transaction loss after a server crash.

   
  



Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-19 Thread Bruce Momjian
Tom Lane wrote:
> Robert Haas  writes:
> > On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain
> >  wrote:
> >>> I'm wondering if we could do something with a formula like 3 *
> >>> amount_of_data_to_read / (3 * amount_of_data_to_read +
> >>> effective_cache_size) = percentage NOT cached.  That is, if we're
> >>> reading an amount of data equal to effective_cache_size, we assume 25%
> >>> caching, and plot a smooth curve through that point.  In the examples
> >>> above, we would assume that a 150MB read is 87% cached, a 1GB read is
> >>> 50% cached, and a 3GB read is 25% cached.
> 
> >> But isn't it already the behavior of effective_cache_size usage ?
> 
> > No.
> 
> I think his point is that we already have a proven formula
> (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air.
> The problem is to figure out what numbers to apply the M-L formula to.
> 
> I've been thinking that we ought to try to use it in the context of the
> query as a whole rather than for individual table scans; the current
> usage already has some of that flavor but we haven't taken it to the
> logical conclusion.

Is there a TODO here?
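
For what it's worth, the quoted figures fall out of that formula if one
assumes effective_cache_size = 3GB, e.g.:

    SELECT 3 * 0.15 / (3 * 0.15 + 3) AS mb150_not_cached,  -- ~0.13 -> ~87% cached
           3 * 1.0  / (3 * 1.0  + 3) AS gb1_not_cached,    --  0.50 ->  50% cached
           3 * 3.0  / (3 * 3.0  + 3) AS gb3_not_cached;    --  0.75 ->  25% cached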

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-19 Thread Bruce Momjian
Robert Haas wrote:
> On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> Yeah.  For Kevin's case, it seems like we want the caching percentage
> >> to vary not so much based on which table we're hitting at the moment
> >> but on how much of it we're actually reading.
> >
> > Well, we could certainly take the expected number of pages to read and
> > compare that to effective_cache_size.  The thing that's missing in that
> > equation is how much other stuff is competing for cache space.  I've
> > tried to avoid having the planner need to know the total size of the
> > database cluster, but it's kind of hard to avoid that if you want to
> > model this honestly.
> 
> I'm not sure I agree with that.  I mean, you could easily have a
> database that is much larger than effective_cache_size, but only that
> much of it is hot.  Or, the hot portion could move around over time.
> And for reasons of both technical complexity and plan stability, I
> don't think we want to try to model that.  It seems perfectly
> reasonable to say that reading 25% of effective_cache_size will be
> more expensive *per-page* than reading 5% of effective_cache_size,
> independently of what the total cluster size is.

Late reply, but one idea is to have the executor store hit counts for
later use by the optimizer.  Only the executor knows how many pages it
had to request from the kernel for a query.  Perhaps getrusage could
tell us how often we hit the disk.
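
Until something like that exists, the closest visible numbers are the
shared-buffer hit counters the statistics collector already keeps (these
count PostgreSQL buffer hits, not kernel cache hits):

    SELECT relname, heap_blks_read, heap_blks_hit,
           round(heap_blks_hit * 100.0
                 / nullif(heap_blks_hit + heap_blks_read, 0), 1) AS hit_pct
      FROM pg_statio_user_tables
     ORDER BY heap_blks_read DESC
     LIMIT 5;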

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-19 Thread Bruce Momjian
Tom Lane wrote:
> Mladen Gogala  writes:
> > Again, having an optimizer which will choose the plan completely 
> > accurately is, at least in my opinion, less important than having a 
> > possibility of manual control, the aforementioned "knobs and buttons" 
> > and produce the same plan for the same statement.
> 
> More knobs and buttons is the Oracle way, and the end result of that
> process is that you have something as hard to use as Oracle.  That's
> generally not thought of as desirable in this community.

Late reply, but Mladen, you might want to look at my blog entry
explaining why knobs are often not useful because they are only used by
a small percentage of users (and confuse the rest):

http://momjian.us/main/blogs/pgblog/2009.html#January_10_2009

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Running PostgreSQL as fast as possible no matter the consequences

2011-01-19 Thread Bruce Momjian
Chris Browne wrote:
> gentosa...@gmail.com (A B) writes:
> > If you just wanted PostgreSQL to go as fast as possible WITHOUT any
> > care for your data (you accept 100% dataloss and datacorruption if any
> > error should occur), what settings should you use then?
> 
> Use /dev/null.  It is web scale, and there are good tutorials.
> 
> But seriously, there *are* cases where "blind speed" is of use.  When
> loading data into a fresh database is a good time for this; if things
> fall over, it may be pretty acceptable to start "from scratch" with
> mkfs/initdb.
> 
> I'd:
> - turn off fsync
> - turn off synchronous commit
> - put as much as possible onto Ramdisk/tmpfs/similar as possible

FYI, we do have a documentation section about how to configure Postgres
for improved performance if you don't care about durability:

    http://developer.postgresql.org/pgdocs/postgres/non-durability.html
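
Among other things, that page covers three settings; only the first can
be changed per session, the others live in postgresql.conf:

    SET synchronous_commit = off;  -- commits may be lost on a crash, but no corruption
    -- fsync = off             (postgresql.conf; unsafe, a crash can corrupt the cluster)
    -- full_page_writes = off  (postgresql.conf; risks torn pages after an OS crash)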

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-12-22 Thread Bruce Momjian
Bruce Momjian wrote:
> Greg Smith wrote:
> > Kevin Grittner wrote:
> > > I assume that we send a full
> > > 8K to the OS cache, and the file system writes disk sectors
> > > according to its own algorithm.  With either platters or BBU cache,
> > > the data is persisted on fsync; why do you see a risk with one but
> > > not the other
> > 
> > I'd like a 10 minute argument please.  I started to write something to 
> > refute this, only to clarify in my head the sequence of events that 
> > leads to the most questionable result, where I feel a bit less certain 
> > than I did before of the safety here.  Here is the worst case I believe 
> > you're describing:
> > 
> > 1) Transaction is written to the WAL and sync'd; client receives 
> > COMMIT.  Since full_page_writes is off, the data in the WAL consists 
> > only of the delta of what changed on the page.
> > 2) 8K database page is written to OS cache
> > 3) PG calls fsync to force the database block out
> > 4) OS writes first 4K block of the change to the BBU write cache.  Worst 
> > case, this fills the cache, and it takes a moment for some random writes 
> > to process before it has space to buffer again (makes this more likely 
> > to happen, but it's not required to see the failure case here)
> > 5) Sudden power interruption, second half of the page write is lost
> > 6) Server restarts
> > 7) That 4K write is now replayed from the battery's cache
> > 
> > At this point, you now have a torn 8K page, with 1/2 old and 1/2 new 
> 
> Based on this report, I think we need to update our documentation and
> backpatch removal of text that says that BBU users can safely turn off
> full-page writes.  Patch attached.
> 
> I think we have fallen into a trap I remember from the late 1990's where
> I was assuming that an 8k-block based file system would write to the
> disk atomically in 8k segments, which of course it cannot.  My bet is
> that even if you write to the kernel in 8k pages, and have an 8k file
> system, the disk is still accessed via 512-byte blocks, even with a BBU.

Doc patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index a2724fa..1e67bbd 100644
*** /tmp/pgrevert.14281/7sLqTb_wal.sgml	Tue Nov 30 21:57:17 2010
--- doc/src/sgml/wal.sgml	Tue Nov 30 21:56:49 2010
***
*** 164,173 
 PostgreSQL periodically writes full page images to
 permanent WAL storage before modifying the actual page on
 disk. By doing this, during crash recovery PostgreSQL can
!restore partially-written pages.  If you have a battery-backed disk
!controller or file-system software that prevents partial page writes
!(e.g., ZFS),  you can turn off this page imaging by turning off the
! <xref linkend="guc-full-page-writes"> parameter.

   
  
--- 164,175 
 PostgreSQL periodically writes full page images to
 permanent WAL storage before modifying the actual page on
 disk. By doing this, during crash recovery PostgreSQL can
!restore partially-written pages.  If you have file-system software
!that prevents partial page writes (e.g., ZFS),  you can turn off
!this page imaging by turning off the <xref linkend="guc-full-page-writes"> parameter. Battery-Backed unit
!(BBU) disk controllers do not prevent partial page writes unless
!they guarantee that data is written to the BBU as full (8kB) pages.

   
  



Re: [PERFORM] BBU Cache vs. spindles

2010-12-01 Thread Bruce Momjian
Pierre C wrote:
> 
> > Is that true?  I have no idea.  I thought everything was done at the
> > 512-byte block level.
> 
> Newer disks (2TB and up) can have 4k sectors, but this still means a page  
> spans several sectors.

Yes, I had heard about that.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-11-30 Thread Bruce Momjian
Greg Smith wrote:
> Tom Lane wrote:
> > You've got entirely too simplistic a view of what the "delta" might be,
> > I fear.  In particular there are various sorts of changes that involve
> > inserting the data carried in the WAL record and shifting pre-existing
> > data around to make room, or removing an item and moving remaining data
> > around.  If you try to replay that type of action against a torn page,
> > you'll get corrupted results.
> >   
> 
> I wasn't sure exactly how those were encoded, thanks for the 
> clarification.  Given that, it seems to me there are only two situations 
> where full_page_writes is safe to turn off:
> 
> 1) The operating system block size is exactly the same database block 
> size, and all writes are guaranteed to be atomic to that block size. 

Is that true?  I have no idea.  I thought everything was done at the
512-byte block level.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-11-30 Thread Bruce Momjian
Kevin Grittner wrote:
> Greg Smith  wrote:
>  
> > I think Kevin's point here may be that if your fsync isn't
> > reliable, you're always in trouble.  But if your fsync is good,
> > even torn pages should be repairable by the deltas written to the
> > WAL
>  
> I was actually just arguing that a BBU doesn't eliminate a risk
> here; if there is a risk with production-quality disk drives, there
> is a risk with a controller with a BBU cache.  The BBU cache just
> tends to reduce the window of time in which corruption can occur.  I
> wasn't too sure of *why* there was a risk, but Tom's post cleared
> that up.
>  
> I wonder why we need to expose this GUC at all -- perhaps it should
> be off when fsync is off and on otherwise?  Leaving it on without
> fsync is just harming performance for not much benefit, and turning
> it off with fsync seems to be saying that you are willing to
> tolerate a known risk of database corruption, just not quite so much
> as you have without fsync.  In reality it seems most likely to be a
> mistake, either way.

According to our docs, and my submitted patch, if you are using ZFS then
you can turn off full-page writes, so full-page writes are still useful.
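
That is, in the documented ZFS case the change is made in
postgresql.conf and takes effect on reload; to confirm:

    SHOW full_page_writes;  -- reports 'off' once set and reloaded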

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-11-30 Thread Bruce Momjian
Greg Smith wrote:
> Kevin Grittner wrote:
> > I assume that we send a full
> > 8K to the OS cache, and the file system writes disk sectors
> > according to its own algorithm.  With either platters or BBU cache,
> > the data is persisted on fsync; why do you see a risk with one but
> > not the other
> 
> I'd like a 10 minute argument please.  I started to write something to 
> refute this, only to clarify in my head the sequence of events that 
> leads to the most questionable result, where I feel a bit less certain 
> than I did before of the safety here.  Here is the worst case I believe 
> you're describing:
> 
> 1) Transaction is written to the WAL and sync'd; client receives 
> COMMIT.  Since full_page_writes is off, the data in the WAL consists 
> only of the delta of what changed on the page.
> 2) 8K database page is written to OS cache
> 3) PG calls fsync to force the database block out
> 4) OS writes first 4K block of the change to the BBU write cache.  Worst 
> case, this fills the cache, and it takes a moment for some random writes 
> to process before it has space to buffer again (makes this more likely 
> to happen, but it's not required to see the failure case here)
> 5) Sudden power interruption, second half of the page write is lost
> 6) Server restarts
> 7) That 4K write is now replayed from the battery's cache
> 
> At this point, you now have a torn 8K page, with 1/2 old and 1/2 new 

Based on this report, I think we need to update our documentation and
backpatch removal of text that says that BBU users can safely turn off
full-page writes.  Patch attached.

I think we have fallen into a trap I remember from the late 1990's where
I was assuming that an 8k-block based file system would write to the
disk atomically in 8k segments, which of course it cannot.  My bet is
that even if you write to the kernel in 8k pages, and have an 8k file
system, the disk is still accessed via 512-byte blocks, even with a BBU.
 
-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index a2724fa..1e67bbd 100644
*** /tmp/pgrevert.14281/7sLqTb_wal.sgml	Tue Nov 30 21:57:17 2010
--- doc/src/sgml/wal.sgml	Tue Nov 30 21:56:49 2010
***
*** 164,173 
 PostgreSQL periodically writes full page images to
 permanent WAL storage before modifying the actual page on
 disk. By doing this, during crash recovery PostgreSQL can
!restore partially-written pages.  If you have a battery-backed disk
!controller or file-system software that prevents partial page writes
!(e.g., ZFS),  you can turn off this page imaging by turning off the
! <xref linkend="guc-full-page-writes"> parameter.

   
  
--- 164,175 
 PostgreSQL periodically writes full page images to
 permanent WAL storage before modifying the actual page on
 disk. By doing this, during crash recovery PostgreSQL can
!restore partially-written pages.  If you have file-system software
!that prevents partial page writes (e.g., ZFS),  you can turn off
!this page imaging by turning off the <xref linkend="guc-full-page-writes"> parameter. Battery-Backed unit
!(BBU) disk controllers do not prevent partial page writes unless
!they guarantee that data is written to the BBU as full (8kB) pages.

   
  

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] BBU Cache vs. spindles

2010-10-21 Thread Bruce Momjian
Kevin Grittner wrote:
> Bruce Momjian  wrote:
>  
> > I assume we send a full 8k to the controller, and a failure during
> > that write is not registered as a write.
>  
> On what do you base that assumption?  I assume that we send a full
> 8K to the OS cache, and the file system writes disk sectors
> according to its own algorithm.  With either platters or BBU cache,
> the data is persisted on fsync; why do you see a risk with one but
> not the other?

Now that is an interesting question.  We write 8k to the kernel, but the
kernel doesn't have to honor those write sizes, so while we probably
can't get a partial 512-byte block written to disk with a BBU (one that
isn't cleaned up by the BBU on reboot), we could get some 512-byte
blocks of an 8k written and others not.

I agree you are right and a BBU does not mean you can safely turn off
full_page_writes.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-10-21 Thread Bruce Momjian
Kevin Grittner wrote:
> Bruce Momjian  wrote:
>  
> > If the write fails to the controller, the page is not flushed and
> > PG does not continue.  If the write fails, the fsync never
> > happens, and hence PG stops.
>  
> PG stops?  This case at issue is when the OS crashes or the plug is
> pulled in the middle of writing a page.  I don't think PG will
> normally have the option of a graceful stop after that.  To quote
> the Fine Manual:

If the OS crashes during a write or fsync, we have not committed the
transaction.

>  
> http://www.postgresql.org/docs/current/interactive/runtime-config-wal.html#GUC-FULL-PAGE-WRITES
>  
> | a page write that is in process during an operating system crash
> | might be only partially completed, leading to an on-disk page that
> | contains a mix of old and new data. The row-level change data
> | normally stored in WAL will not be enough to completely restore
> | such a page during post-crash recovery. Storing the full page
> | image guarantees that the page can be correctly restored
>  
> Like I said, the only difference between the page being written to
> platters and to a BBU cache that I can see is the average size of
> the window of time in which you're vulnerable, not whether there is
> a window.  I don't think you've really addressed that concern.

I assume we send a full 8k to the controller, and a failure during that
write is not registered as a write.  A disk drive is modifying permanent
storage so there is always the possibility of that failing.  I assume
the BBU just rewrites that after startup.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-10-21 Thread Bruce Momjian
Kevin Grittner wrote:
> Greg Smith  wrote: 
> > Kevin Grittner wrote:
>  
> >> So you're confident that an 8kB write to the controller will not
> >> be done as a series of smaller atomic writes by the OS file
> >> system?
> > 
> > Sure, that happens.  But if the BBU has gotten an fsync call after
> > the 8K write, it shouldn't return success until after all 8K are
> > in its cache.
>  
> I'm not concerned about an fsync after the controller has it; I'm
> concerned about a system crash in the middle of writing an 8K page
> to the controller.  Other than the expected *size* of the window of
> time during which you're vulnerable, what does a BBU caching
> controller buy you in this regard?  Can't the OS rearrange the
> writes of disk sectors after the 8K page is written to the OS cache
> so that the window might occasionally be rather large?

If the write to the controller fails, the page is not flushed and PG
does not continue.  Without a completed write, the fsync never happens,
and hence PG stops.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] How does PG know if data is in memory?

2010-10-21 Thread Bruce Momjian
Kevin Grittner wrote:
>  wrote:
>  
> > -- None of that is anything for amateurs to play with.
> > 
> > Not to jam a stick in anybody's eye, but shouldn't database pros not
> > be amateurs?
>  
> While many PostgreSQL installations are managed by professional
> DBAs, or programmers or consultants with a deep enough grasp of the
> issues to tune a knob like that appropriately, PostgreSQL is also
> used in environments without such staff.  In fact, there is pressure
> to make PostgreSQL easier to configure for exactly that reason.  If
> we add more knobs which are this hard to tune correctly, we would
> risk inundation with complaints from people to tried to use it and
> made things worse.

Agreed.  Here is a blog entry that explains some of the tradeoffs of
adding knobs:

http://momjian.us/main/blogs/pgblog/2009.html#January_10_2009

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-10-21 Thread Bruce Momjian
Kevin Grittner wrote:
> Bruce Momjian  wrote:
>  
> > With a BBU you can turn off full_page_writes
>  
> My understanding is that that is not without risk.  What happens if
> the WAL is written, there is a commit, but the data page has not yet
> been written to the controller?  Don't we still have a torn page?

I don't see how full_page_writes affects pages that have not been
written to the controller.

full_page_writes is designed to guard against a partial write to a
device.  I don't think the raid cache can be partially written to, and
the cache will not be cleared until the drive has fully written the data
to disk.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-10-21 Thread Bruce Momjian
Scott Marlowe wrote:
> On Wed, Oct 20, 2010 at 8:25 PM, Joshua D. Drake  
> wrote:
> > On Wed, 2010-10-20 at 22:13 -0400, Bruce Momjian wrote:
> >> Ben Chobot wrote:
> >> > On Oct 7, 2010, at 4:38 PM, Steve Crawford wrote:
> >> >
> >> > > I'm weighing options for a new server. In addition to PostgreSQL, this 
> >> > > machine will handle some modest Samba and Rsync load.
> >> > >
> >> > > I will have enough RAM so the virtually all disk-read activity will be 
> >> > > cached. The average PostgreSQL read activity will be modest - a mix of 
> >> > > single-record and fairly large (reporting) result-sets. Writes will be 
> >> > > modest as well but will come in brief (1-5 second) bursts of 
> >> > > individual inserts. The rate of insert requests will hit 
> >> > > 100-200/second for those brief bursts.
> >> > >
> >> > > So...
> >> > >
> >> > > Am I likely to be better off putting $$$ toward battery-backup on the 
> >> > > RAID or toward adding a second RAID-set and splitting off the WAL 
> >> > > traffic? Or something else?
> >> >
> >> > A BBU is, what, $100 or so? Adding one seems a no-brainer to me.
> >> > Dedicated WAL spindles are nice and all, but they're still spinning
> >> > media. Raid card cache is way faster, and while it's best at bursty
> >> > writes, it sounds like bursty writes are precisely what you have.
> >>
> >> Totally agree!
> >
> > BBU first, more spindles second.
> 
> Agreed.  Note that while you can get incredible burst performance from
> a battery backed cache, due to both caching and writing out of order,
> once the throughput begins to saturate at the speed of the disk array,
> the bbu cache is now only re-ordering really, as it will eventually
> fill up faster than the disks can take the writes, and you'll settle
> in at some percentage of your max tps you get for a short benchmark
> run.  It's vitally important that once you put a BBU cache in place,
> you run a very long running transactional test (pgbench is a simple
> one to start with) that floods the io subsystem so you see what your
> average throughput is with the WAL and data store getting flooded.  I
> know on my system pgbench runs of a few minutes can be 3 or 4 times
> faster than runs that last for the better part of an hour.

With a BBU you can turn off full_page_writes, which should decrease the
WAL traffic.

However, I don't see this mentioned in our documentation.  Should I add
it?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] New wiki page on write reliability

2010-10-21 Thread Bruce Momjian
Greg Smith wrote:
> Now that some of my recent writing has gone from NDA protected to public 
> sample, I've added a new page to the PostgreSQL wiki that provides a 
> good starting set of resources to learn about an ever popular topic 
> here, how write cache problems can lead to database corruption:  
> http://wiki.postgresql.org/wiki/Reliable_Writes
> 
> Bruce also has a presentation he's been working on that adds pictures 
> showing the flow of data through the various cache levels, to help 
> people visualize the whole thing, that should get added into there once 
> he's finished tweaking it.

My presentation is done and is now on the wiki too:

http://momjian.us/main/writings/pgsql/hw_selection.pdf

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Slow count(*) again...

2010-10-20 Thread Bruce Momjian
bricklen wrote:
> On Sat, Oct 9, 2010 at 4:26 PM, Neil Whelchel  wrote:
> > Maybe an
> > estimate(*) that works like count but gives an answer from the index without
> > checking visibility? I am sure that this would be good enough to make a page
> > list, it is really no big deal if it errors on the positive side, maybe the
> > list of pages has an extra page off the end. I can live with that. What I
> > can't live with is taking 13 seconds to get a page of results from 850,000
> > rows in a table.
> > -Neil-
> >
> 
> FWIW, Michael Fuhr wrote a small function to parse the EXPLAIN plan a
> few years ago and it works pretty well assuming your stats are up to
> date.
> 
> http://markmail.org/message/gknqthlwry2eoqey

What I recommend is to execute the query with EXPLAIN, and look at the
estimated rows and costs.  If the row number is large, just round it to
the nearest thousand and return it to the application as a count ---
this is what Google does for searches (just try it).

If the row count/cost are low, run the query and return an exact count.
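
A minimal version of that idea as a function, in the same spirit as the
Fuhr function linked above (a sketch, not production-hardened; the table
in the final SELECT is hypothetical):

    CREATE FUNCTION count_estimate(query text) RETURNS integer AS $$
    DECLARE
        rec  record;
        rows integer;
    BEGIN
        -- scan EXPLAIN output for the planner's first row estimate
        FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
            rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
            EXIT WHEN rows IS NOT NULL;
        END LOOP;
        RETURN rows;
    END;
    $$ LANGUAGE plpgsql STRICT;

    SELECT count_estimate('SELECT 1 FROM bigtable');

The application can then round a large estimate to the nearest thousand
and fall back to a real count(*) under some threshold.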

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] BBU Cache vs. spindles

2010-10-20 Thread Bruce Momjian
Ben Chobot wrote:
> On Oct 7, 2010, at 4:38 PM, Steve Crawford wrote:
> 
> > I'm weighing options for a new server. In addition to PostgreSQL, this 
> > machine will handle some modest Samba and Rsync load.
> >
> > I will have enough RAM so the virtually all disk-read activity will be 
> > cached. The average PostgreSQL read activity will be modest - a mix of 
> > single-record and fairly large (reporting) result-sets. Writes will be 
> > modest as well but will come in brief (1-5 second) bursts of individual 
> > inserts. The rate of insert requests will hit 100-200/second for those 
> > brief bursts.
> >
> > So...
> >
> > Am I likely to be better off putting $$$ toward battery-backup on the RAID 
> > or toward adding a second RAID-set and splitting off the WAL traffic? Or 
> > something else?
> 
> A BBU is, what, $100 or so? Adding one seems a no-brainer to me.
> Dedicated WAL spindles are nice and all, but they're still spinning
> media. Raid card cache is way faster, and while it's best at bursty
> writes, it sounds like bursty writes are precisely what you have.

Totally agree!

--
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Advice configuring ServeRAID 8k for performance

2010-08-16 Thread Bruce Momjian
Scott Carey wrote:
> Don't ever have WAL and data on the same OS volume as ext3.
> 
> If data=writeback, performance will be fine, data integrity will be ok
> for WAL, but data integrity will not be sufficient for the data
> partition.  If data=ordered, performance will be very bad, but data
> integrity will be OK.
> 
> This is because an fsync on ext3 flushes _all dirty pages in the file
> system_ to disk, not just those for the file being fsync'd.
> 
> One partition for WAL, one for data.  If using ext3 this is essentially
> a performance requirement no matter how your array is set up underneath.

Do we need to document this?

--
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Advice configuring ServeRAID 8k for performance

2010-08-13 Thread Bruce Momjian
Greg Smith wrote:
> > 2) Should I configure the ext3 file system with noatime and/or 
> > data=writeback or data=ordered?  My controller has a battery, the 
> > logical drive has write cache enabled (write-back), and the physical 
> > devices have write cache disabled (write-through).
> 
> data=ordered is the ext3 default and usually a reasonable choice.  Using 
> writeback instead can be dangerous, I wouldn't advise starting there.  
> noatime is certainly a good thing, but the speedup is pretty minor if 
> you have a battery-backed write cache.

We recommend 'data=writeback' for ext3 in our docs:

http://www.postgresql.org/docs/9.0/static/wal-intro.html

Tip:  Because WAL restores database file contents after a crash,
journaled file systems are not necessary for reliable storage of the
data files or WAL files. In fact, journaling overhead can reduce
performance, especially if journaling causes file system data  to be
flushed to disk. Fortunately, data flushing during journaling can often
be disabled with a file system mount option, e.g. data=writeback on a
Linux ext3 file system. Journaled file systems do improve boot speed
    after a crash. 

Should this be changed?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Questions on query planner, join types, and work_mem

2010-08-11 Thread Bruce Momjian
Tom Lane wrote:
> Hannu Krosing  writes:
> > Of course there are more variables than just *_page_cost, so if you nail
> > down any other one, you may end with less than 1 for both page costs.
> 
> > I have always used seq_page_cost = 1 in my thinking and adjusted others
> > relative to it.
> 
> Right, seq_page_cost = 1 is sort of the traditional reference point,
> but you don't have to do it that way.  The main point here is that for
> an all-in-RAM database, the standard page access costs are too high
> relative to the CPU effort costs:
> 
> regression=# select name, setting from pg_settings where name like '%cost';
>          name         | setting 
> ----------------------+---------
>  cpu_index_tuple_cost | 0.005
>  cpu_operator_cost    | 0.0025
>  cpu_tuple_cost       | 0.01
>  random_page_cost     | 4
>  seq_page_cost        | 1
> (5 rows)
> 
> To model an all-in-RAM database, you can either dial down both
> random_page_cost and seq_page_cost to 0.1 or so, or set random_page_cost
> to 1 and increase all the CPU costs.  The former is less effort ;-)
> 
> It should be noted also that there's not all that much evidence backing
> up the default values of the cpu_xxx_cost variables.  In the past those
> didn't matter much because I/O costs always swamped CPU costs anyway.
> But I can foresee us having to twiddle those defaults and maybe refine
> the CPU cost model more, as all-in-RAM cases get more common.

This confused me.  If we are assuing the data is in
effective_cache_size, why are we adding sequential/random page cost to
the query cost routines?
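
Tom's first option is easy to try per session on a test box; a sketch
(query and values are illustrative):

    SET seq_page_cost = 0.1;
    SET random_page_cost = 0.1;
    EXPLAIN SELECT * FROM accounts WHERE abalance > 1000;  -- compare with default costs
    RESET ALL;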

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Testing Sandforce SSD

2010-08-11 Thread Bruce Momjian
Greg Smith wrote:
> > * How to test for power failure?
> 
> I've had good results using one of the early programs used to 
> investigate this class of problems:  
> http://brad.livejournal.com/2116715.html?page=2

FYI, this tool is mentioned in the Postgres documentation:

http://www.postgresql.org/docs/9.0/static/wal-reliability.html

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [PERFORM] Need help in performance tuning.

2010-07-12 Thread Bruce Momjian
Craig Ringer wrote:
> It'll need to separate "running queries" from "running processes", or
> start threading backends, so that one way or the other a single query
> can benefit from the capabilities of multiple CPUs. The same separation,
> or a move to async I/O, might be needed to get one query to concurrently
> read multiple partitions of a table, or otherwise get maximum benefit
> from high-capacity I/O subsystems when running just a few big, expensive
> queries.
> 
> Otherwise I'm wondering if PostgreSQL will begin really suffering in
> performance on workloads where queries are big and expensive but there
> are relatively few of them running at a time.

Agreed.  We certainly are going to have to go in that direction someday.
We have TODO items for these.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +


