[GENERAL] Improve replication usability

2011-12-10 Thread Andrew Armstrong
Hello,

First off, thanks for a great product.

I've been looking at setting up replication on Windows between two
servers using pgsql 9.1.

I'm going to give up for now, though, because I'm finding it difficult
to get it working correctly; after copying the \data directory as per
the guide at http://wiki.postgresql.org/wiki/Streaming_Replication, the
server won't start anymore and no errors are logged.

It seems to be the same problem described at
http://archives.postgresql.org/pgsql-admin/2010-11/msg00258.php, but
even after making sure the postgres user owns all the data files, it
still won't start.

I'm a software developer and set up MySQL/PostgreSQL when needed, but I
find that both of these products are too complicated when it comes to
something like HA/replication.

Can I please suggest the following improvements for a future release?

1) One-line setup of replication. Allow me to issue a 'SELECT
pg_replication_connect(remote host)' that will:
a) Perform a base backup / download the current database from scratch,
without needing me to copy database files by hand
b) Come online and enter a standby state

2) Allow me to easily test failover (issue a SELECT, or just via the
trigger file?)

3) Options like archive_command that need to be typed out manually
should become unnecessary; let the database server sync itself without
needing a manual file copy. (For contrast, a rough sketch of today's
manual procedure follows below.)

Basically, I'm asking you to please reduce the administrator overhead of
setting up replication. I appreciate the complexity of such a system,
but it would be fantastic to make it _super trivial_ to get replication
working and online without the possibility of these problems.

Thanks,
Andrew


Re: [GENERAL] Hope for a new PostgreSQL era?

2011-12-10 Thread Greg Smith

On 12/10/2011 09:28 PM, Craig Ringer wrote:
One thing I think would be interesting for this would be to identify
slow queries (without doing detailed plan timing) and flag them for
more detailed timing if they're run again within some window of time. I
suspect this would only be practical with parameterised prepared
statements where the query string remained the same, but that'd still
be interesting.


There are actually two patches sitting in the current PostgreSQL 
CommitFest that allow normalizing query strings in a way that they could 
be handled like this even if not prepared, as part of 
pg_stat_statements.  What you're asking for is basically a hybrid of 
that and auto_explain, with something smarter deciding when the explain 
is triggered.  Interesting idea, I hadn't thought of that heuristic 
before.  It won't be hard to do if the query normalization stuff commits.
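
To be concrete about the two existing pieces such a hybrid would sit on
top of, the setup today looks roughly like this in postgresql.conf (the
thresholds are made-up examples):

shared_preload_libraries = 'pg_stat_statements,auto_explain'

# pg_stat_statements: per-query-string execution statistics
pg_stat_statements.max = 1000
pg_stat_statements.track = all

# auto_explain: log the plan of anything slower than the threshold
auto_explain.log_min_duration = '500ms'
auto_explain.log_analyze = on   # the expensive part a smarter trigger would gate

The missing piece is exactly what you describe: something that decides
per query, based on accumulated statistics, when it's worth paying the
log_analyze cost.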


Personally I'd choose good performance monitoring over user/query
priorities any day. With good perf monitoring that I can script from the
outside, I have a lot more control, can send alerts, etc etc.


Luckily for you it's hard to do it in any other order.  When I think 
about how we'd have to validate whether query prioritization code was 
operating as expected or not, I imagine some extra monitoring tools 
really need to get built first.  Might as well expose those for people 
like yourself too, once they're built for that purpose.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: [GENERAL] Hope for a new PostgreSQL era?

2011-12-10 Thread Scott Marlowe
On Sat, Dec 10, 2011 at 7:28 PM, Craig Ringer  wrote:
> The main issue would be exempting queries that're expected to take longer
> than the slow query threshold, like reporting queries, where you wouldn't
> want to pay that overhead.

One trick you can use for this is to assign the reporting application
a different user and then alter user yada yada to turn off logging of
slow queries etc for that user.  Since it's often useful to increase
work_mem and / or change random_page_cost and so on for a reporting
user, it's pretty common to do this anyway.
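
Spelled out, the per-role settings look something like this (role name
and values are just examples):

CREATE ROLE reporting LOGIN PASSWORD 'secret';

-- never log this role's statements as "slow", however long they run
ALTER ROLE reporting SET log_min_duration_statement = -1;

-- typical reporting-friendly tweaks
ALTER ROLE reporting SET work_mem = '256MB';
ALTER ROLE reporting SET random_page_cost = 2.0;

You need to be a superuser to change log_min_duration_statement per
role, but that's usually who's doing this kind of setup anyway.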



Re: [GENERAL] Hope for a new PostgreSQL era?

2011-12-10 Thread Craig Ringer

On 12/10/2011 09:54 AM, Greg Smith wrote:
I'm planning to put that instrumentation into the database directly,
which is what people with an Oracle background are asking for.


FWIW, even for folks like me who've come from a general OSS DB
background with a smattering of old Sybase and other primitive stuff,
it's still a concern. I know enough about Linux's guts, I/O and memory
behaviour to be able to track down many issues, but it takes *time*,
time I can't spend on the numerous other things I also have to be
doing. Right now Pg performance fine-tuning is more of an expert skill
(as you know!), and for those with a less strong background in
Linux/UNIX systems and tuning it's a bit of a barrier.


Thankfully my workload is so light I don't need to care; manual EXPLAIN 
ANALYSE and auto_explain along with vmstat/iotop is enough for me.


- How can the overhead of collecting the timing data be kept down?
It's really high in some places.  This is being worked out right now
on pgsql-hackers; see "Timing overhead and Linux clock sources".


One thing I think would be interesting for this would be to identify
slow queries (without doing detailed plan timing) and flag them for more
detailed timing if they're run again within some window of time. I
suspect this would only be practical with parameterised prepared
statements where the query string remained the same, but that'd still be
interesting - essentially automatically upgrading the log level for
problem queries from slow query logging to auto_explain with explain
analyse.


The main issue would be exempting queries that're expected to take 
longer than the slow query threshold, like reporting queries, where you 
wouldn't want to pay that overhead. That should be handled by forgetting 
about slow queries that aren't run again too soon, so they get flagged 
for EXPLAIN ANALYZE next run but forgotten about before they're next run.


I don't actually need this myself, it's just something I've been 
thinking about as a way to reduce the admin load of identifying and 
tuning problem queries.


I feel this will increasingly be the top blocker for performance
sensitive deployments in the coming year; people used to having these
tools in Oracle cannot imagine how they would operate without them.


Yep, especially since there's nothing in Pg to manage user/query 
priorities for I/O or CPU, so the ability to actively manage performance 
problems from the outside is particularly crucial. You'll always want to 
do that of course, and it's probably _better_ than relying on work 
priorities, especially since it sounds from recent comments like even on 
Oracle those priority features aren't what you'd call friendly.


Personally I'd choose good performance monitoring over user/query
priorities any day. With good perf monitoring that I can script from the
outside, I have a lot more control, can send alerts, etc etc.


--
Craig Ringer



Re: [GENERAL] Why does aggregate query allow select of non-group by or aggregate values?

2011-12-10 Thread Jack Christensen

On 12/9/2011 4:57 PM, David Johnston wrote:

Functions are evaluated once for each row that is generated by the
surrounding query.  This is particularly useful if the function in
question takes an aggregate as an input:

SELECT col1, array_processing_function( ARRAY_AGG( col2 ) )
FROM some_table
GROUP BY col1;

Without this particular behavior you would need to sub-query.
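
That sub-query version would look something like this (same hypothetical
function as above):

SELECT col1, array_processing_function(col2s)
FROM (
    SELECT col1, ARRAY_AGG(col2) AS col2s
    FROM some_table
    GROUP BY col1
) AS grouped;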

From a layman's perspective, the reason why you cannot use
non-aggregates outside of GROUP BY is that it is ambiguous as to what
value to output; with an uncorrelated function call that is not the
case.

David J.




Thanks. This makes sense now. I also went back to the original query
that provoked this question. It had a correlated subquery in the select
statement, and I thought that this could yield ambiguous results. But
when I examined it closely, all the correlated fields were included in
the group by of the outer query, and when I tried to use a non-grouped
column from the outer query I correctly got an ERROR: subquery uses
ungrouped column "foo" from outer query.


Thanks again.

--
Jack Christensen
ja...@hylesanderson.edu




Re: [GENERAL] Statistics mismatch between n_live_tup and actual row count

2011-12-10 Thread Tom Lane
Andreas Brandl  writes:
>> The planner doesn't use n_live_tup;

> I'm just curious: where does the planner take the (approximate) row-count 
> from?

It uses the tuple density estimated by the last vacuum or analyze (viz,
reltuples/relpages) and multiplies that by the current relation size.
There are various reasons for not using n_live_tup, some historical and
some still pretty relevant.
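
If you want to see what that works out to for a particular table,
something along these lines exposes the inputs (a rough approximation,
not the planner's exact code path; substitute your own table name):

SELECT c.relname,
       c.reltuples,    -- rows estimated at the last vacuum/analyze
       c.relpages,     -- pages at the last vacuum/analyze
       pg_relation_size(c.oid::regclass)
           / current_setting('block_size')::int AS current_pages,
       round(c.reltuples / greatest(c.relpages, 1)
             * (pg_relation_size(c.oid::regclass)
                / current_setting('block_size')::int)) AS approx_row_estimate
FROM pg_class c
WHERE c.relname = 'your_table';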

> Might there be a link between n_live_tup drifting and doing unnecessary 
> (blind) updates, which do not change any information of a row?

Possibly.  It's premature to speculate with no test case, but I'm
wondering if HOT updates confuse that arithmetic.  No-op updates
would follow the HOT path as long as there was room on the page...
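
If someone wants to poke at that, a crude way to watch it would be
something like this (table and column names made up):

-- a "blind" update that changes nothing
UPDATE some_table SET some_col = some_col WHERE id < 1000;

-- then compare the HOT-update counter and the live-tuple estimate
-- before and after
SELECT n_tup_upd, n_tup_hot_upd, n_live_tup
FROM pg_stat_user_tables
WHERE relname = 'some_table';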

regards, tom lane



Re: [GENERAL] Statistics mismatch between n_live_tup and actual row count

2011-12-10 Thread Andreas Brandl
Hi,

> Andreas Brandl  writes:
> > we're currently investigating a statistics issue on postgres. We
> > have some tables which frequently show up with strange values for
> > n_live_tup. If you compare those values with a count on that
> > particular table, there is a mismatch of a factor of 10-30. This
> > causes the planner to come up with very bad plans (we also have this
> > issue on bigger tables like the one below).
> 
> The planner doesn't use n_live_tup; the only thing that that's used
> for is decisions about when to autovacuum/autoanalyze. So you have two
> problems here, not one.

So, you're saying that having a mismatch between n_live_tup and the
actual row count is not that much of a problem (apart from influencing
when to auto-vacuum/analyze), right?
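
For reference, the kind of mismatch I mean shows up with something like
the following (the table name is a placeholder):

SELECT s.n_live_tup,
       (SELECT count(*) FROM some_table) AS actual_rows
FROM pg_stat_user_tables s
WHERE s.relname = 'some_table';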

I'm just curious: where does the planner take the (approximate) row-count from?

> Can you provide a test case for the n_live_tup drift? That is,
> something that when done over and over causes n_live_tup to get
> further and further from reality?

I'll try to implement a minimal piece of code showing this, although I'm not 
sure if this will work.

Might there be a link between n_live_tup drifting and doing unnecessary (blind) 
updates, which do not change any information of a row?

Thank you!

Best regards
Andreas



Re: [GENERAL]

2011-12-10 Thread frank
This message has been digitally signed by the sender.



-
Hi-Tech Gears Ltd, Gurgaon, India




Re: [GENERAL]

2011-12-10 Thread Frank Lanitz
On Thu, 08 Dec 2011 23:12:36 +
Raymond O'Donnell  wrote:

> Just wondering, and without intending to cast any aspersions on the
> poster - is this spam or legit? I didn't take the risk of actually
> clicking it...
> 
> There have been a few posts like this recently - links without any
> commentary or explanation.

Take it as spam and don't give it a second thought.

Cheers, 
Frank
-- 
Frank Lanitz 




Re: [GENERAL] How could I find the last modified procedure in the database?

2011-12-10 Thread Jasen Betts
On 2011-11-16, Dhimant Patel  wrote:

> I have postgres (PostgreSQL) 9.0.3 running.
> I also created several procedures/functions and now I don't remember the
> last procedure I worked on! - I thought I could always get this from
> metadata.
>
> Now I'm stuck - couldn't find these details anywhere in the catalog tables!
>
> Is there any way I can get this information?

If you generated any exceptions during your work on the functions, you
may find evidence in the database log.

-- 
⚂⚃ 100% natural

