Re: [PERFORM] Table partitioning problem

2011-03-14 Thread Conor Walsh
On Mon, Mar 14, 2011 at 12:40 PM, Robert Haas robertmh...@gmail.com wrote:
> Generally, table partitioning is not a good idea unless you are
> dealing with really large tables, and nearly all of your queries apply
> only to a single partition.  Most likely you are better off not using
> table inheritance in the first place if you need this feature.

I don't know if my tables count as 'large' or not, but I've gotten
some good mileage in the past out of time-based partitioning and
setting higher compression levels on old tables.  Also the ability to
drop-and-reload a day is sometimes useful, but I grant that it would
be better to never need to do that.
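
For reference, here is a minimal sketch of the kind of setup I mean,
using 9.0-era inheritance partitioning (the table and column names are
made up, not from any real schema):

CREATE TABLE events (
    event_time  timestamptz NOT NULL,
    payload     text
);

-- One child table per day, with a CHECK constraint the planner can use.
CREATE TABLE events_2011_03_14 (
    CHECK (event_time >= '2011-03-14' AND event_time < '2011-03-15')
) INHERITS (events);

-- With constraint exclusion enabled, a query against the parent that
-- filters on event_time only scans the matching children.
SET constraint_exclusion = partition;
SELECT count(*) FROM events
WHERE event_time >= '2011-03-14' AND event_time < '2011-03-15';

-- "Drop and reload a day" is just dropping that child table and
-- recreating/reloading it.
DROP TABLE events_2011_03_14;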

-C.



Re: [PERFORM] Slow join on partitioned table

2011-03-09 Thread Conor Walsh
On Fri, Mar 4, 2011 at 8:47 AM, Mark Thornton mthorn...@optrak.co.uk wrote:
> It is a temporary table and thus I hadn't thought to analyze it. How should
> such tables be treated? Should I analyze it immediately after creation (i.e.
> when it is empty), after filling it or ... ? The expected usage is such that
> the temporary table will have less than 100 or so rows.

When in doubt, analyze.

If you're engaged in OLTP, or some other workload pattern that
involves writing many small batches, then analyzing all the time is
bad.  If you're engaged in a workload that writes rarely - a bulk
insert here, a create-table-as-select there, etc. - always analyze.

The cost of analyzing when you shouldn't is low and O(1) per analysis,
and the cost of not analyzing when you should have can easily be
O(n^2) or worse w/r/t data size.

The cost of analyzing is especially low on a temp table owned only by
your current session, because nobody else can be disturbed by the
table lock you hold if you get context-switched out before it finishes.
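
To make that concrete, a minimal sketch (it assumes some existing
big_table to join against; all the names here are made up):

CREATE TEMP TABLE wanted_ids (id integer);

INSERT INTO wanted_ids
SELECT id FROM big_table WHERE flagged;

-- Analyze after filling, so the planner sees the real row count (and
-- value distribution) instead of a default guess when planning the join.
ANALYZE wanted_ids;

SELECT b.*
FROM big_table b
JOIN wanted_ids w USING (id);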

-Conor



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-03 Thread Conor Walsh
> I can't remember
> anyone ever complaining ANALYZE took too long to run.  I only
> remember complaints of the form "I had to remember to manually run it
> and I wish it had just happened by itself."

Robert,

This sounds like an argument in favor of an implicit ANALYZE after all
COPY statements, and/or an implicit autoanalyze check after all
INSERT/UPDATE statements.

-Conor



Re: [HACKERS] [PERFORM] Slow count(*) again...

2011-02-03 Thread Conor Walsh
On Thu, Feb 3, 2011 at 6:33 PM, Joshua D. Drake j...@commandprompt.com wrote:
> Well that already happens...

My understanding is that auto-analyze will fire only after my
transaction completes, because it is a separate daemon.  If I do
something like this:

BEGIN;
COPY ...;
-- Dangerously un-analyzed
SELECT complicated-stuff ...;
END;

Auto-analyze does not benefit me, or might not because it won't fire
often enough.  I agree that analyze is very fast, and it often seems
to me like the cost/benefit ratio suggests making auto-analyze even
more aggressive.
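
In the meantime the workaround is just to run ANALYZE by hand inside
the transaction; a sketch (the table name and file path are made up):

BEGIN;
COPY staging FROM '/tmp/batch.csv' CSV;
ANALYZE staging;               -- allowed inside a transaction, unlike VACUUM
SELECT count(*) FROM staging;  -- stands in for the complicated stuff
END;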

Disclaimer/disclosure: I deal exclusively with very large data sets
these days, so analyzing all the time acts as a highly effective
worst-case amortization.  I understand that constant analyzing is not
so great in, say, an OLTP setting.  But if the check is cheap, making
auto-analyze more integrated and less daemon-driven might be a net
win.  I'm not sure.

-Conor



Re: [PERFORM] Array interface

2010-11-02 Thread Conor Walsh
On Tue, Nov 2, 2010 at 2:32 PM, Mladen Gogala mladen.gog...@vmsinfo.com wrote:
> I was surprised because I expected array bind to produce better
> results over the network than the row-by-row operations, yet it
> didn't.  Can anybody elaborate a bit?

While all of the bulk-execute functions are likely to have
implementations, those implementations are not necessarily
efficient ones.

I ran into this with DBD::ODBC a while back because DBD::ODBC
implements execute_array() as execute($_) foreach(@_).  DBD::Pg
doesn't appear to implement execute_array() at all, so perhaps it's
falling back on a similar default implementation in the superclass.

I generally suspect this is a Perl problem rather than a Postgres
problem, but can't say more without code.  Maybe try pastebin if
you're having email censorship issues.

-Conor
