Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-15 Thread Heikki Linnakangas
Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 So you'd prefer a message that is sometimes flat-out wrong over a
 message that is correct but less informative in the common case?  I
 guess that could be right call, but it's not what I'd pick.
 
 Well, as I said, I think the only way to really improve this message
 is to use a different wording for the REJECT case.  I'm unconvinced
 that the problem justifies that, but if you're sufficiently hot about
 it, that is the direction to go in; not making the the message less
 useful for the 99% case.

How about a hint?

FATAL:  connection not authorized for host [local], user foo,
database postgres
HINT:  Make sure that you have a matching accept line in pg_hba.conf

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and a disk full in primary

2010-04-15 Thread Heikki Linnakangas
Robert Haas wrote:
 I've realized another problem with this patch.  standby_keep_segments
 only controls the number of segments that we keep around for purposes
 of streaming: it doesn't affect archiving at all.  And of course, a
 standby server based on archiving is every bit as much of a standby
 server as one that uses streaming replication.  So at a minimum, the
 name of this GUC is very confusing.

Hmm, I guess streaming_keep_segments would be more accurate. Somehow
doesn't feel as good otherwise, though. Any other suggestions?

  We should also probably think a
 little bit about why we feel like it's OK to throw away data that is
 needed for SR to work, but we don't feel like we ever want to throw
 away WAL segments that we can't manage to archive.

Failure to archive is considered more serious, because your continuous
archiving backup becomes invalid if we delete a segment before it's
archived. And a streaming standby server can catch up using the archive
if it falls behind too much. Plus the primary doesn't know how many
standby servers there is, so it doesn't know which segments are still
needed for SR.

 In the department of minor nits, I also don't like the fact that the
 GUC is called standby_keep_segments and the variable is called
 StandbySegments.  If we really have to capitalize them differently, we
 should at least make it StandbyKeepSegments, but personally I think we
 should use standby_keep_segments in both places so that it doesn't
 take quite so many greps to find all the references.

Well, it's consistent with checkpoint_segments/CheckPointSegments. There
is no consistent style on naming the global variables behind GUCs. If
you feel like changing it though, I won't object.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rogue TODO list created

2010-04-15 Thread Simon Riggs
On Wed, 2010-04-14 at 23:28 -0400, Bruce Momjian wrote:
 Greg Smith wrote:
  Bruce Momjian wrote:
   What is Prioritised Todo?  It looks like a copy of the TODO list that
   was created on March 23, 2010, and only you and Simon have modified it:
  
 
   http://wiki.postgresql.org/index.php?title=Prioritised_Todoaction=history
 
  
  Well, the updates I made to that one were strictly an accident; I didn't 
  notice I was editing the forked version.  I have put everything I did in 
  that session back onto the right one.  The Prioritised Todo wasn't 
  linked to anywhere that you'd find it except via a bit of bad late night 
  searching like I did. 
  
  I'm not sure what Simon was tinkering with there, but having fallen 
  victim to it myself I agree having it there with that name is not a 
  great choice.  I moved that bit of work in progress he was doing to 
  http://wiki.postgresql.org/wiki/User:Simon and deleted the one with the 
  confusing name.  Sorry about propagating my own confusion to others.
 
 Well, unless Simon wants to keep it for some reason, it should be
 removed, and if kept, renamed.  Simon?

Happy for it to be deleted.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-15 Thread Simon Riggs
On Thu, 2010-04-15 at 00:24 -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  So you'd prefer a message that is sometimes flat-out wrong over a
  message that is correct but less informative in the common case?  I
  guess that could be right call, but it's not what I'd pick.
 
 Well, as I said, I think the only way to really improve this message
 is to use a different wording for the REJECT case.  I'm unconvinced
 that the problem justifies that, but if you're sufficiently hot about
 it, that is the direction to go in; not making the the message less
 useful for the 99% case.

I think that would solve my original gripe, if I understood the idea.

So instead of the typical reject instruction we also add a
rejectverbose instruction that has a more verbose message. Docs would
describe it as an additional instruction to assist with debugging a
complex pg_hba.conf, with warning that if used it may assist the bad
guys also.

pg_hba.conf rejects entry for host...

Patch for that would be simple and clear; I can add that if we agree.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walreceiver is uninterruptible on win32

2010-04-15 Thread Magnus Hagander
On Thu, Apr 15, 2010 at 4:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Looking at the call-sites, there are bugs now - if PQexec() returns
 NULL, we don't deal with it. It also doesn't always free the result
 properly. I've added checks for that.

 I think you're just adding useless complexity there.  PQresultStatus
 defends itself just fine against a NULL input, and most other libpq
 functions likewise.

Yeah, I realized that after posting it. I was still stuck in ancient
times when at least some of those functions couldn't be called with
NULL pointers, so I just put that in there by default :-)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walreceiver is uninterruptible on win32

2010-04-15 Thread Magnus Hagander
On Thu, Apr 15, 2010 at 5:13 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 11:15 PM, Magnus Hagander mag...@hagander.net wrote:
 The patch also seems confused about whether it's intending to be a
 general-purpose solution or not.  You can maybe take some shortcuts
 if it's only going to be for walreceiver, but if you're going to put
 it in dblink it is *not* acceptable to take shortcuts like not
 processing errors completely.

 The patch still takes some shortcuts since we decided to postpone
 the fix for dblink to 9.1 or later.

 Given those shortcuts, can't we simplify it even further like
 attached?

 No, we need to repeat PQgetResult() at least two times. The first call
 of it reads the RowDescription, DataRow and CommandComplete messages
 and switches the state to PGASYNC_READY. The second one reads the
 ReadyForQuery message and switches the state to PGASYNC_IDLE. So if we
 don't repeat it, libpqrcv_PQexec() would end in a half-finished state.

Ah, ok. That's what I get for not testing it :-)

I still think that could be implemented in a much clearer way though.
Just calling PQgetResult() twice, and checking the return values
sequentially would be much easier to read, imho. Looking through taht
set of break statements at the end of the loop is just confusing. If
nothing else, it needs more comments.

But maybe I'm just bikeshedding ;)


 (If nothing else, your code did PQclear() on an
 uninitialized pointer - must've been pure luck that it worked)

 PQclear(NULL) might be called in my patch, but this is not a problem
 since PQclear() does nothing if the specified PGresult argument is NULL.

Ah, I missed that you initialized it to NULL.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-15 Thread Magnus Hagander
On Thu, Apr 15, 2010 at 2:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 [ back to this... ]

 Magnus Hagander mag...@hagander.net writes:
 On Wed, Apr 7, 2010 at 21:06, Tom Lane t...@sss.pgh.pa.us wrote:
 I suppose we had a reason for doing it the first way but I can't see
 what.  GMT seems a fairly English-centric way of referring to UTC
 anyhow; translators might wish to put in UTC instead, or some other
 spelling.  Shouldn't we let them?

 UTC and GMT aren't actually the same thing.

 Tell it to the zic people --- they are identical except for the zone
 abbreviation itself, according to the zic database.  There might be some
 pedantic argument for preferring the name UTC, but I'm hesitant to
 change that behavior just to satisfy pedants.

Agreed, I don't think it's worth changing. However, that also goes to
the translation of it - let's keep *one* term, that'll make it a lot
less confusing.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Win32 timezone matching

2010-04-15 Thread Magnus Hagander
On Thu, Apr 15, 2010 at 3:48 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Wed, Apr 7, 2010 at 21:01, Tom Lane t...@sss.pgh.pa.us wrote:
 ... lack either the note about defaulting to GMT or the hint.  I guess
 we should add both of those to the failure cases in the Windows version
 of identify_system_timezone.  Should we also change the WARNING errlevel
 to LOG?  I think the latter is more likely to actually get into the log.

 You are suggesting adding this after the could not find match
 message, correct? Not replacing it? Because if we replace it, we loose
 the information of what we failed to match. So basically like
 attached?

 No, I was thinking more like the attached.  This changes the Unix code
 to separate the info about the fallback timezone into errdetail, and
 then makes the Windows messages follow that style.

Yeah, that looks good.


 Also, would LOG be *more* likely to be seen than a WARNING? Why would that 
 be?

 Because that's how log levels sort for the postmaster log.  This isn't
 an interactive warning --- we will never be executing this code in a
 regular backend, only in the postmaster.

Well, when the dba looks through the log, he'll be looking a lot
harder at something that says WARNING.

And if somebody is filtering his log so hard that it doesn't even
contain WARNING's, frankly, he's ignorant ;)

But that's just me, and I've never really agreed with that soring in
the first place, so maybe I should just be ignored...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Timezone matching script (win32)

2010-04-15 Thread Magnus Hagander
On Tue, Apr 13, 2010 at 7:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
 Magnus Hagander wrote:
 This script should probably live in CVS, and be run when Microsoft
 releases new timezone data. Where should I put it - src/timezone or
 somewhere in src/tools? (it does read pgtz.c in the current directory,
 but it doesn't actually edit the file - just outputs on stdout a list
 of changes to be made to the file manually)

 I think it should live in src/tools and be mentioned in
 src/tools/RELEASE_CHANGES

 +1 for src/tools/, but the documentation note should probably be in
 src/timezone/README.  This task isn't especially coupled to release
 times (at least not *our* release times).  And it's definitely not
 likely to happen at the times anyone would be consulting
 RELEASE_CHANGES, because none of the release leadup work gets done
 on Windows machines.

I've applied the script with a small change to both the README and the
RELEASE_CHANGES files. Feel free to adjust that if you feel it's
wrong.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-15 Thread Stephen Frost
Simon,

* Simon Riggs (si...@2ndquadrant.com) wrote:
 So instead of the typical reject instruction we also add a
 rejectverbose instruction that has a more verbose message. Docs would
 describe it as an additional instruction to assist with debugging a
 complex pg_hba.conf, with warning that if used it may assist the bad
 guys also.

Erm, so we'd add an option for this?  That strikes me as pretty
excessive.  Not to be a pain, but I feel like the 'connection not
authorized' argument plus a hint makes alot more sense.

 pg_hba.conf rejects entry for host...

connection not authorized for host X user Y database Z
HINT: Make sure your pg_hba.conf has the entries needed and the user
has CONNECT privileges for the database

Or something along those lines (I added the other CONNECT issue because
it's one I've run into in the past.. :).

I do also wonder if we should consider having the error that's reported
to the log differ from that which is sent to the user..  I realize
that's a much bigger animal and might not help the novice, but it could
help with debugging complex pg_hba's without exposing info to possible
bad guys.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-15 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Simon Riggs (si...@2ndquadrant.com) wrote:
 So instead of the typical reject instruction we also add a
 rejectverbose instruction that has a more verbose message.

 Erm, so we'd add an option for this?  That strikes me as pretty
 excessive.

I think Simon's point was that we'd need a different uaReject enum
value internally in the code, so that the place where the message
gets issued would be able to distinguish explicit REJECT entry from
falling off the end of the file.  Changing what the user is expected
to put in the file would be silly.  (I don't care for rejectverbose
though.  Maybe uaImplicitReject for the end-of-file case would be
the most readable way.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] testing HS/SR - invalid magic number

2010-04-15 Thread Erik Rijkers
On Wed, April 14, 2010 08:23, Heikki Linnakangas wrote:
 Erik Rijkers wrote:
 This replication test that was working well earlier (it ran daily), stopped 
 working

 This is probably because of this change:

 date: 2010/04/12 09:52:29;  author: heikki;  state: Exp;  lines: +71 -23
 Change the logic to decide when to delete old WAL segments, so that it
 doesn't take into account how far the WAL senders are. This way a hung
 WAL sender doesn't prevent old WAL segments from being recycled/removed
 in the primary, ultimately causing the disk to fill up. Instead add
 standby_keep_segments setting to control how many old WAL segments are
 kept in the primary. This also makes it more reliable to use streaming
 replication without WAL archiving, assuming that you set
 standby_keep_segments high enough.

 If you generate enough WAL records in the master that the standby can't
 keep up, the primary will eventually delete a WAL segment that hasn't
 been streamed to the standby yet, hence the requested WAL segment
 00010032 has already been removed error.

ah, I hadn't seen that change, and setting standby_keep_segments
to some higher value (I chose 1 for now) prevents these errors.

It shouldn't
 remove the segment before it's successfully archived, though, and your
 logs show that the standby can't find that file in the archive either.
 Is archiving set up properly?

It wasn't really - I was archiving on the primary server, instead of copying
to the standby server.  I'll fix that, thanks.


Btw, typo alert:  documentation 18.5.4, subhead 'standby_keep_segments',
says: 'replciation' for 'replication'.

http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html#RUNTIME-CONFIG-REPLICATION


thanks,


Erik Rijkers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walreceiver is uninterruptible on win32

2010-04-15 Thread Robert Haas
On Wed, Apr 14, 2010 at 11:13 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 11:15 PM, Magnus Hagander mag...@hagander.net wrote:
 http://archives.postgresql.org/pgsql-hackers/2010-04/msg00077.php
 As for the code itself, don't we need a CHECK_FOR_INTERRUPTS in there
 for it to be actually useful?

 Since the backend version of select() receives any incoming signals
 on Win32, CHECK_FOR_INTERRUPTS seems not to be needed in the loop,
 at least in walreceiver. No? The patch doesn't put it in there, and
 I was able to interrupt walreceiver executing libpqrcv_PQexec() when
 I tested the patch on Win32.

 It will call the signal handler, yes. Normally, the signal handler
 just sets a flag somewhere, which needs to be checked with
 CHECK_FOR_INTERRUPTS.

 From how I read the walreceiver.c code (which I'm not that familiar
 with), the signal handlers call ProcessWalRcvInterrupts() which in
 turn has CHECK_FOR_INTERRUPTS in it, and this is where it ends up
 being called.

 Yes. While establishing replication connection (i.e., executing
 walrcv_connect function), the SIGTERM signal handler directly calls
 ProcessWalRcvInterrupts() which does CHECK_FOR_INTERRUPTS() and
 elog(FATAL).

 The patch also seems confused about whether it's intending to be a
 general-purpose solution or not.  You can maybe take some shortcuts
 if it's only going to be for walreceiver, but if you're going to put
 it in dblink it is *not* acceptable to take shortcuts like not
 processing errors completely.

 The patch still takes some shortcuts since we decided to postpone
 the fix for dblink to 9.1 or later.

 Given those shortcuts, can't we simplify it even further like
 attached?

 No, we need to repeat PQgetResult() at least two times. The first call
 of it reads the RowDescription, DataRow and CommandComplete messages
 and switches the state to PGASYNC_READY. The second one reads the
 ReadyForQuery message and switches the state to PGASYNC_IDLE. So if we
 don't repeat it, libpqrcv_PQexec() would end in a half-finished state.

 (If nothing else, your code did PQclear() on an
 uninitialized pointer - must've been pure luck that it worked)

 PQclear(NULL) might be called in my patch, but this is not a problem
 since PQclear() does nothing if the specified PGresult argument is NULL.

 Looking at the call-sites, there are bugs now - if PQexec() returns
 NULL, we don't deal with it. It also doesn't always free the result
 properly. I've added checks for that.

 As Tom pointed out in another post, we don't need to treat the
 result is NULL case as special. OTOH, as you pointed out, I
 forgot calling PQclear() when the second call of libpqrcv_PQexec()
 in libpqrcv_connect() fails. I added it to the patch. Thanks!

 Finally, I've updated some of the comments.

 Thanks a lot! I applied that improvements to the patch.

 I attached the revised patch.

I have to admit to finding this confusing.  According to the comments:

+   /*
+* Don't emulate the PQexec()'s behavior of returning the last
+* result when there are many, since walreceiver never sends a
+* query returning multiple results.
+*/

...but it looks like the code actually is implementing some sort of
loop-that-returns-the-last result.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and a disk full in primary

2010-04-15 Thread Alvaro Herrera
Robert Haas escribió:

 In the department of minor nits, I also don't like the fact that the
 GUC is called standby_keep_segments and the variable is called
 StandbySegments.  If we really have to capitalize them differently, we
 should at least make it StandbyKeepSegments, but personally I think we
 should use standby_keep_segments in both places so that it doesn't
 take quite so many greps to find all the references.

+1, using both names capitalized identically makes the code easier to navigate.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Thoughts on pg_hba.conf rejection

2010-04-15 Thread David Fetter
On Wed, Apr 14, 2010 at 08:37:18PM -0400, Robert Haas wrote:
 On Wed, Apr 14, 2010 at 8:31 PM, Bruce Momjian br...@momjian.us wrote:
  Tom Lane wrote:
  Robert Haas robertmh...@gmail.com writes:
   What's wrong with something like connection not permitted or
   connection not authorized?
 
  The case that we're trying to cater to with the existing wording
  is novice DBAs, who are likely to stare at such a message and not
  even realize that pg_hba.conf is what they need to change.
   Frankly, by the time anyone is using REJECT entries they are
  probably advanced enough to not need much help from the error
  message; but what you propose is an absolute lock to increase the
  number of newbie questions on the lists by a large factor.
 
  Agreed.  I would rather have an inaccurate error message that
  mentions pg_hba.conf than an accurate one that doesn't.
 
  Error messages should always point at a solution, if possible.
 
 OK, how about connection not authorized by pg_hba.conf?

+1.  It's clear, and if an attacker can compromise pg_hba.conf,
there's nothing PostgreSQL can do to help.

I'd like to bring up the idea of an attacker who both has that access
and doesn't know about pg_hba.conf just to dismiss it.  Such a person
might exist, but we don't need to bend things around a case so rare
that it makes being struck by lightning look like a certainty. :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] testing HS/SR - invalid magic number

2010-04-15 Thread Heikki Linnakangas
Erik Rijkers wrote:
 Btw, typo alert:  documentation 18.5.4, subhead 'standby_keep_segments',
 says: 'replciation' for 'replication'.
 
 http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html#RUNTIME-CONFIG-REPLICATION

Thanks, fixed.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Very ineffective plan with merge join

2010-04-15 Thread Oleg Bartunov

Hi there,

below is an example of interesting query and two plans - the bad plan, which 
uses merge join and big sorting, took 216 sec, and good plan with merge join disabled took 
8 sec. Sorry for odd names, they were generated by popular accounting

engine in Russia. 8.4.3 and HEAD show the same behaviour.


The query:

--set enable_mergejoin to off;

explain analyze
SELECT
_V8TblAli1_Q_000_T_001._AccountRRef AS f_3,
_V8TblAli1_Q_000_T_001._Value1_TYPE AS f_4,
_V8TblAli1_Q_000_T_001._Value1_RTRef AS f_5,
_V8TblAli1_Q_000_T_001._Value1_RRRef AS f_6,
_V8TblAli1_Q_000_T_001._Value2_TYPE AS f_7,
_V8TblAli1_Q_000_T_001._Value2_RTRef AS f_8,
_V8TblAli1_Q_000_T_001._Value2_RRRef AS f_9,
_V8TblAli1_Q_000_T_001._Value3_TYPE AS f_10,
_V8TblAli1_Q_000_T_001._Value3_RTRef AS f_11,
_V8TblAli1_Q_000_T_001._Value3_RRRef AS f_12,
0 AS f_13,
0 AS f_14,
0 AS f_15,
0 AS f_16,
SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_17,
SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_18,
SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_19,
SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_20,
SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_21,
SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_22
FROM
(
SELECT
_V8TblAli1_R._Period AS _Period,
_V8TblAli1_R._RecorderTRef AS _RecorderTRef,
_V8TblAli1_R._RecorderRRef AS _RecorderRRef,
_V8TblAli1_R._AccountRRef AS _AccountRRef,
_V8TblAli1_R._Value1_TYPE AS _Value1_TYPE,
_V8TblAli1_R._Value1_RTRef AS _Value1_RTRef,
_V8TblAli1_R._Value1_RRRef AS _Value1_RRRef,
_V8TblAli1_R._Value2_TYPE AS _Value2_TYPE,
_V8TblAli1_R._Value2_RTRef AS _Value2_RTRef,
_V8TblAli1_R._Value2_RRRef AS _Value2_RRRef,
_V8TblAli1_R._Value3_TYPE AS _Value3_TYPE,
_V8TblAli1_R._Value3_RTRef AS _Value3_RTRef,
_V8TblAli1_R._Value3_RRRef AS _Value3_RRRef,
CASE
WHEN SUM(_V8TblAli1_R._Fld7178TurnoverCt) IS NULL
THEN CAST(0 AS NUMERIC(22,2))
ELSE SUM(_V8TblAli1_R._Fld7178TurnoverCt)
END AS _Fld7178TurnoverCt,
CASE
WHEN SUM(_V8TblAli1_R._Fld7180TurnoverCt) IS NULL
THEN CAST(0 AS NUMERIC(22,3))
ELSE SUM(_V8TblAli1_R._Fld7180TurnoverCt)
END AS _Fld7180TurnoverCt
FROM
(
SELECT
_AccRg7175_R._Period AS _Period,
_AccRg7175_R._RecorderTRef AS _RecorderTRef,
_AccRg7175_R._RecorderRRef AS _RecorderRRef,
_AccRg7175_R._AccountDtRRef AS _AccountRRef,
CASE
WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE
THEN CASE
WHEN _AccRgED7200_TED1._Value_TYPE = '\\001'::bytea OR
_AccRgED7200_TED1._Value_TYPE = '\\010'::bytea AND 
_AccRgED7200_TED1._Value_RTRef IS NOT NULL AND _AccRgED7200_TED1._Value_RRRef 
IS NOT NULL
THEN _AccRgED7200_TED1._Value_TYPE
ELSE NULL
END
ELSE CASE
WHEN CAST(NULL AS BYTEA) = '\\001'::bytea OR
CAST(NULL AS BYTEA) = '\\010'::bytea AND CAST(NULL AS BYTEA) IS NOT NULL
THEN CAST(NULL AS BYTEA)
ELSE NULL
END
END AS _Value1_TYPE,
CASE
WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE
THEN CASE
WHEN _AccRgED7200_TED1._Value_TYPE = '\\010'::bytea
THEN _AccRgED7200_TED1._Value_RTRef
WHEN _AccRgED7200_TED1._Value_TYPE IS NULL OR
_AccRgED7200_TED1._Value_TYPE NOT IN ('\\001'::bytea,'\\010'::bytea)
THEN NULL
ELSE '\\000\\000\\000\\000'::bytea
END
ELSE CASE
WHEN CAST(NULL AS BYTEA) = '\\010'::bytea
THEN CAST(NULL AS BYTEA)
WHEN CAST(NULL AS BYTEA) IS NULL OR
CAST(NULL AS BYTEA) NOT IN ('\\001'::bytea,'\\010'::bytea)
THEN NULL
ELSE '\\000\\000\\000\\000'::bytea
END
END AS _Value1_RTRef,
CASE
WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE
THEN CASE
WHEN _AccRgED7200_TED1._Value_TYPE = '\\010'::bytea
THEN _AccRgED7200_TED1._Value_RRRef
WHEN _AccRgED7200_TED1._Value_TYPE IS NULL OR
_AccRgED7200_TED1._Value_TYPE NOT IN ('\\001'::bytea,'\\010'::bytea)
THEN NULL
ELSE 
'\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000'::bytea
END
ELSE CASE
WHEN CAST(NULL AS BYTEA) = '\\010'::bytea
THEN CAST(NULL AS BYTEA)
WHEN CAST(NULL AS BYTEA) IS NULL OR
CAST(NULL AS BYTEA) NOT IN ('\\001'::bytea,'\\010'::bytea)
THEN NULL
ELSE 
'\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000'::bytea
END
END AS _Value1_RRRef,
CASE
WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED2 = TRUE
THEN CASE
WHEN _AccRgED7200_TED2._Value_TYPE = '\\001'::bytea OR
_AccRgED7200_TED2._Value_TYPE = '\\010'::bytea AND 
_AccRgED7200_TED2._Value_RTRef IS NOT NULL AND _AccRgED7200_TED2._Value_RRRef 
IS NOT NULL
THEN _AccRgED7200_TED2._Value_TYPE
ELSE NULL
END
ELSE CASE
WHEN CAST(NULL AS BYTEA) = '\\001'::bytea OR
CAST(NULL AS BYTEA) = '\\010'::bytea AND CAST(NULL AS BYTEA) IS NOT NULL
THEN CAST(NULL AS BYTEA)
ELSE NULL
END
END AS _Value2_TYPE,
CASE
WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED2 = TRUE
THEN CASE
WHEN _AccRgED7200_TED2._Value_TYPE = '\\010'::bytea
THEN _AccRgED7200_TED2._Value_RTRef
WHEN _AccRgED7200_TED2._Value_TYPE IS NULL OR
_AccRgED7200_TED2._Value_TYPE NOT IN ('\\001'::bytea,'\\010'::bytea)
THEN NULL
ELSE '\\000\\000\\000\\000'::bytea
END
ELSE CASE
WHEN CAST(NULL AS BYTEA) = '\\010'::bytea
THEN CAST(NULL AS BYTEA)
WHEN CAST(NULL AS BYTEA) IS NULL OR
CAST(NULL AS BYTEA) NOT 

Re: [HACKERS] Very ineffective plan with merge join

2010-04-15 Thread Kevin Grittner
Oleg Bartunov o...@sai.msu.su wrote:
 
 Sorry for odd names, they were generated by popular accounting
 engine in Russia.
 
How much of that can you trim out and still see the problem?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Add --ordered option to pg_dump

2010-04-15 Thread Bob Lunney
I needed a way to run diffs on two
database dumps to see what data developers put in their
local databases versus the initial database load.  The
pg_dump utility with --inserts works well for this, but
since the order in which the data is returned of the server
is not guaranteed I hacked this patch to make life
simpler.  

Using --ordered will order the data by primary key or
unique index, if one exists, and use the smallest ordering
(i.e. least number of columns required for a unique
order).  

Note that --ordered could crush your database server if you
try to order very large tables, so use judiciously.

This is my first patch submission, so I hope I've followed
protocol.  If not, please be gentle!

Regards,

Bob Lunney
bob_lunney dot yahoo dot com


diff -cNr src/bin/pg_dump/pg_dump.c.orig
src/bin/pg_dump/pg_dump.c
*** src/bin/pg_dump/pg_dump.c.orig      2010-04-06
11:21:48.0 -0400                       
            
--- src/bin/pg_dump/pg_dump.c   2010-04-15
10:28:49.0 -0400                       
                    
***                             
                                       
                          
*** 111,116                            
                                       
                           
--- 111,117                            
                                       
                           
  static int    disable_dollar_quoting = 0;       
                                       
                    
  static int    dump_inserts = 0;               
                                       
                      
  static int    column_inserts = 0;             
                                       
                      
+ static int      ordered = 0;                 
                                       
                       
                                       
                                       
                               
                                       
                                       
                               
  static void help(const char *progname);           
                                       
                  
***                             
                                       
                          
*** 275,280                            
                                       
                           
--- 276,282                            
                                       
                           
                {inserts, no_argument,
dump_inserts, 1},                         
                          
                {lock-wait-timeout,
required_argument, NULL, 2},                     
                       
                {no-tablespaces, no_argument,
outputNoTablespaces, 1},                   
                  
+               {ordered, no_argument,
ordered, 1},                           
                             
                {role, required_argument, NULL,
3},                                     
                    
                {use-set-session-authorization,
no_argument, use_setsessauth, 1},             
             
                                       
                                       
                               
***                             
                                       
                          
*** 493,498                            
                                       
                           
--- 495,506                            
                                       
                           
                exit(1);                 
                                       
                             
        }                             
                                       
                                
                                       
                                       
                               
+       if (!dump_inserts  ordered)     
                                       
                            
+       {                             
                                       
                                
+               write_msg(NULL, option --ordered
cannot be used without --inserts or --column_inserts\n); 
  
+               exit(1);                 
                                       
                             
+       }                             
                                       
                                
+                                     
                                       
                                
        /* open the output file */             
                                       
                       
        if (pg_strcasecmp(format, a) == 0 ||
pg_strcasecmp(format, append) == 0)               
           
        {
***
*** 822,827 
--- 830,836 
        printf(_(  --disable-dollar-quoting   
disable dollar quoting, use SQL standard quoting\n));

Re: [HACKERS] Very ineffective plan with merge join

2010-04-15 Thread Oleg Bartunov

On Thu, 15 Apr 2010, Kevin Grittner wrote:


Oleg Bartunov o...@sai.msu.su wrote:


Sorry for odd names, they were generated by popular accounting
engine in Russia.


How much of that can you trim out and still see the problem?


It's difficult, since I don't know semantics of data. I reduced query, though.

query:

explain analyze

SELECT
_AccRg7175_R._Period AS _Period,
_AccRg7175_R._RecorderTRef AS _RecorderTRef,
_AccRg7175_R._RecorderRRef AS _RecorderRRef,
_AccRg7175_R._AccountDtRRef AS _AccountRRef
FROM
_AccRg7175 _AccRg7175_R
LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc1
ON _Acc7_ExtDim7144_TEDAcc1._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef 
AND _Acc7_ExtDim7144_TEDAcc1._LineNo = 1

LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc2
ON _Acc7_ExtDim7144_TEDAcc2._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef 
AND _Acc7_ExtDim7144_TEDAcc2._LineNo = 2

LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc3
ON _Acc7_ExtDim7144_TEDAcc3._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef 
AND _Acc7_ExtDim7144_TEDAcc3._LineNo = 3

LEFT OUTER JOIN _AccRgED7200 _AccRgED7200_TED3
ON _AccRgED7200_TED3._RecorderTRef = _AccRg7175_R._RecorderTRef 
AND _AccRgED7200_TED3._RecorderRRef = _AccRg7175_R._RecorderRRef 
AND _AccRgED7200_TED3._LineNo = _AccRg7175_R._LineNo 
AND _AccRgED7200_TED3._Period = _AccRg7175_R._Period 
AND _AccRgED7200_TED3._Correspond = 0 
AND _AccRgED7200_TED3._KindRRef = _Acc7_ExtDim7144_TEDAcc3._DimKindRRef

WHERE
_AccRg7175_R._Active = TRUE 
AND _AccRg7175_R._AccountDtRRef IN (SELECT tt2._REFFIELDRRef AS f_1 FROM tt2) 
AND _AccRg7175_R._Period = '2009-10-01 00:00:00'::timestamp 
AND _AccRg7175_R._Period = '2009-10-31 23:59:59'::timestamp

;

default plan:

---
 Hash Left Join  (cost=762035.09..822357.84 rows=224988 width=56) (actual 
time=25007.488..25022.338 rows=9189 loops=1)
   Hash Cond: (_accrg7175_r._accountdtrref = 
_acc7_extdim7144_tedacc1._acc7_idrref)
   -  Hash Left Join  (cost=762017.69..819134.13 rows=153030 width=56) (actual 
time=25007.173..25017.249 rows=9189 loops=1)
 Hash Cond: (_accrg7175_r._accountdtrref = 
_acc7_extdim7144_tedacc2._acc7_idrref)
 -  Merge Right Join  (cost=762001.76..816793.89 rows=153030 width=56) 
(actual time=25006.895..25012.218 rows=9189 loops=1)
   Merge Cond: ((_accrged7200_ted3._lineno = _accrg7175_r._lineno) 
AND (_accrged7200_ted3._recordertref = _accrg7175_r._recordertref) AND 
(_accrged7200_ted3._recorderrref = _accrg7175_r._recorderrref) AND 
(_accrged7200_ted3._period = _accrg7175_r._period) AND 
(_accrged7200_ted3._kindrref = _acc7_extdim7144_tedacc3._dimkindrref))
   -  Sort  (cost=694652.60..703399.93 rows=3498930 width=63) 
(actual time=24794.738..24794.738 rows=1 loops=1)
 Sort Key: _accrged7200_ted3._lineno, 
_accrged7200_ted3._recordertref, _accrged7200_ted3._recorderrref, 
_accrged7200_ted3._period, _accrged7200_ted3._kindrref
 Sort Method:  external merge  Disk: 230896kB
 -  Seq Scan on _accrged7200 _accrged7200_ted3  
(cost=0.00..182790.96 rows=3498930 width=63) (actual time=0.153..3018.474 
rows=3526745 loops=1)
   Filter: (_correspond = 0::numeric)
   -  Sort  (cost=67344.64..67727.22 rows=153030 width=83) (actual 
time=212.145..213.289 rows=9189 loops=1)
 Sort Key: _accrg7175_r._lineno, 
_accrg7175_r._recordertref, _accrg7175_r._recorderrref, _accrg7175_r._period, 
_acc7_extdim7144_tedacc3._dimkindrref
 Sort Method:  quicksort  Memory: 1677kB
 -  Hash Left Join  (cost=10322.30..54166.12 rows=153030 
width=83) (actual time=39.489..184.046 rows=9189 loops=1)
   Hash Cond: (_accrg7175_r._accountdtrref = 
_acc7_extdim7144_tedacc3._acc7_idrref)
   -  Hash Join  (cost=10308.08..52844.15 rows=153030 
width=63) (actual time=39.256..180.388 rows=9189 loops=1)
 Hash Cond: (_accrg7175_r._accountdtrref = 
tt2._reffieldrref)
 -  Bitmap Heap Scan on _accrg7175 
_accrg7175_r  (cost=10305.89..51535.65 rows=237384 width=63) (actual 
time=38.913..130.715 rows=235636 loops=1)
   Recheck Cond: ((_period = '2009-10-01 
00:00:00'::timestamp without time zone) AND (_period = '2009-10-31 
23:59:59'::timestamp without time zone))
   Filter: _active
   -  Bitmap Index Scan on 
_accntr7175_byperiod_trn  (cost=0.00..10246.54 rows=237384 width=0) (actual 
time=37.281..37.281 rows=235636 loops=1)
 

Re: [HACKERS] [PATCH] Add --ordered option to pg_dump

2010-04-15 Thread Peter Eisentraut
On tor, 2010-04-15 at 10:48 -0700, Bob Lunney wrote:
 I needed a way to run diffs on two
 database dumps to see what data developers put in their
 local databases versus the initial database load.

Maybe pg_comparator would help you?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pgindent and tabs in comments

2010-04-15 Thread Peter Eisentraut
Apparently, pgindent replaces multiple spaces in comments by a tab
(possibly subject to additional logic).  An example among thousands:

http://git.postgresql.org/gitweb?p=postgresql.git;a=blobdiff_plain;f=src/backend/access/gin/ginentrypage.c;h=c23415c0075b5ec52f08e8ef698f7b7ec2f97b19;hp=5cbbc7455519eba6c37be465784a02b350065716;hb=aa1e9bb51c102b239340992f2fcce138edb39d8a;hpb=03ee49a016e69e7594978352df6da4e0bbd7d04a

(or just rgrep -F '.TAB' the tree to see more).

This doesn't make any sense to me.  Could this please be fixed, and if
possible reverted sometime?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Very ineffective plan with merge join

2010-04-15 Thread Pavel Stehule
Hello

there is significant problem in statistics I think,

Regards
Pavel Stehule

2010/4/15 Oleg Bartunov o...@sai.msu.su:
 On Thu, 15 Apr 2010, Kevin Grittner wrote:

 Oleg Bartunov o...@sai.msu.su wrote:

 Sorry for odd names, they were generated by popular accounting
 engine in Russia.

 How much of that can you trim out and still see the problem?

 It's difficult, since I don't know semantics of data. I reduced query,
 though.

 query:

 explain analyze

 SELECT
 _AccRg7175_R._Period AS _Period,
 _AccRg7175_R._RecorderTRef AS _RecorderTRef,
 _AccRg7175_R._RecorderRRef AS _RecorderRRef,
 _AccRg7175_R._AccountDtRRef AS _AccountRRef
 FROM
 _AccRg7175 _AccRg7175_R
 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc1
 ON _Acc7_ExtDim7144_TEDAcc1._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND
 _Acc7_ExtDim7144_TEDAcc1._LineNo = 1
 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc2
 ON _Acc7_ExtDim7144_TEDAcc2._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND
 _Acc7_ExtDim7144_TEDAcc2._LineNo = 2
 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc3
 ON _Acc7_ExtDim7144_TEDAcc3._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND
 _Acc7_ExtDim7144_TEDAcc3._LineNo = 3
 LEFT OUTER JOIN _AccRgED7200 _AccRgED7200_TED3
 ON _AccRgED7200_TED3._RecorderTRef = _AccRg7175_R._RecorderTRef AND
 _AccRgED7200_TED3._RecorderRRef = _AccRg7175_R._RecorderRRef AND
 _AccRgED7200_TED3._LineNo = _AccRg7175_R._LineNo AND
 _AccRgED7200_TED3._Period = _AccRg7175_R._Period AND
 _AccRgED7200_TED3._Correspond = 0 AND _AccRgED7200_TED3._KindRRef =
 _Acc7_ExtDim7144_TEDAcc3._DimKindRRef
 WHERE
 _AccRg7175_R._Active = TRUE AND _AccRg7175_R._AccountDtRRef IN (SELECT
 tt2._REFFIELDRRef AS f_1 FROM tt2) AND _AccRg7175_R._Period = '2009-10-01
 00:00:00'::timestamp AND _AccRg7175_R._Period = '2009-10-31
 23:59:59'::timestamp
 ;

 default plan:

 ---
  Hash Left Join  (cost=762035.09..822357.84 rows=224988 width=56) (actual
 time=25007.488..25022.338 rows=9189 loops=1)
   Hash Cond: (_accrg7175_r._accountdtrref =
 _acc7_extdim7144_tedacc1._acc7_idrref)
   -  Hash Left Join  (cost=762017.69..819134.13 rows=153030 width=56)
 (actual time=25007.173..25017.249 rows=9189 loops=1)
         Hash Cond: (_accrg7175_r._accountdtrref =
 _acc7_extdim7144_tedacc2._acc7_idrref)
         -  Merge Right Join  (cost=762001.76..816793.89 rows=153030
 width=56) (actual time=25006.895..25012.218 rows=9189 loops=1)
               Merge Cond: ((_accrged7200_ted3._lineno =
 _accrg7175_r._lineno) AND (_accrged7200_ted3._recordertref =
 _accrg7175_r._recordertref) AND (_accrged7200_ted3._recorderrref =
 _accrg7175_r._recorderrref) AND (_accrged7200_ted3._period =
 _accrg7175_r._period) AND (_accrged7200_ted3._kindrref =
 _acc7_extdim7144_tedacc3._dimkindrref))
               -  Sort  (cost=694652.60..703399.93 rows=3498930 width=63)
 (actual time=24794.738..24794.738 rows=1 loops=1)
                     Sort Key: _accrged7200_ted3._lineno,
 _accrged7200_ted3._recordertref, _accrged7200_ted3._recorderrref,
 _accrged7200_ted3._period, _accrged7200_ted3._kindrref
                     Sort Method:  external merge  Disk: 230896kB
                     -  Seq Scan on _accrged7200 _accrged7200_ted3
  (cost=0.00..182790.96 rows=3498930 width=63) (actual time=0.153..3018.474
 rows=3526745 loops=1)
                           Filter: (_correspond = 0::numeric)
               -  Sort  (cost=67344.64..67727.22 rows=153030 width=83)
 (actual time=212.145..213.289 rows=9189 loops=1)
                     Sort Key: _accrg7175_r._lineno,
 _accrg7175_r._recordertref, _accrg7175_r._recorderrref,
 _accrg7175_r._period, _acc7_extdim7144_tedacc3._dimkindrref
                     Sort Method:  quicksort  Memory: 1677kB
                     -  Hash Left Join  (cost=10322.30..54166.12 rows=153030
 width=83) (actual time=39.489..184.046 rows=9189 loops=1)
                           Hash Cond: (_accrg7175_r._accountdtrref =
 _acc7_extdim7144_tedacc3._acc7_idrref)
                           -  Hash Join  (cost=10308.08..52844.15
 rows=153030 width=63) (actual time=39.256..180.388 rows=9189 loops=1)
                                 Hash Cond: (_accrg7175_r._accountdtrref =
 tt2._reffieldrref)
                                 -  Bitmap Heap Scan on _accrg7175
 _accrg7175_r  (cost=10305.89..51535.65 rows=237384 width=63) (actual
 time=38.913..130.715 rows=235636 loops=1)
                                       Recheck Cond: ((_period = '2009-10-01
 00:00:00'::timestamp without time zone) AND (_period = '2009-10-31
 23:59:59'::timestamp without time zone))
                                       Filter: _active
                                       -  Bitmap 

Re: [HACKERS] Very ineffective plan with merge join

2010-04-15 Thread Oleg Bartunov

On Thu, 15 Apr 2010, Pavel Stehule wrote:


Hello

there is significant problem in statistics I think,


Ah, you're right !



Regards
Pavel Stehule

2010/4/15 Oleg Bartunov o...@sai.msu.su:

On Thu, 15 Apr 2010, Kevin Grittner wrote:


Oleg Bartunov o...@sai.msu.su wrote:


Sorry for odd names, they were generated by popular accounting
engine in Russia.


How much of that can you trim out and still see the problem?


It's difficult, since I don't know semantics of data. I reduced query,
though.

query:

explain analyze

SELECT
_AccRg7175_R._Period AS _Period,
_AccRg7175_R._RecorderTRef AS _RecorderTRef,
_AccRg7175_R._RecorderRRef AS _RecorderRRef,
_AccRg7175_R._AccountDtRRef AS _AccountRRef
FROM
_AccRg7175 _AccRg7175_R
LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc1
ON _Acc7_ExtDim7144_TEDAcc1._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND
_Acc7_ExtDim7144_TEDAcc1._LineNo = 1
LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc2
ON _Acc7_ExtDim7144_TEDAcc2._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND
_Acc7_ExtDim7144_TEDAcc2._LineNo = 2
LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc3
ON _Acc7_ExtDim7144_TEDAcc3._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND
_Acc7_ExtDim7144_TEDAcc3._LineNo = 3
LEFT OUTER JOIN _AccRgED7200 _AccRgED7200_TED3
ON _AccRgED7200_TED3._RecorderTRef = _AccRg7175_R._RecorderTRef AND
_AccRgED7200_TED3._RecorderRRef = _AccRg7175_R._RecorderRRef AND
_AccRgED7200_TED3._LineNo = _AccRg7175_R._LineNo AND
_AccRgED7200_TED3._Period = _AccRg7175_R._Period AND
_AccRgED7200_TED3._Correspond = 0 AND _AccRgED7200_TED3._KindRRef =
_Acc7_ExtDim7144_TEDAcc3._DimKindRRef
WHERE
_AccRg7175_R._Active = TRUE AND _AccRg7175_R._AccountDtRRef IN (SELECT
tt2._REFFIELDRRef AS f_1 FROM tt2) AND _AccRg7175_R._Period = '2009-10-01
00:00:00'::timestamp AND _AccRg7175_R._Period = '2009-10-31
23:59:59'::timestamp
;

default plan:

---
 Hash Left Join  (cost=762035.09..822357.84 rows=224988 width=56) (actual
time=25007.488..25022.338 rows=9189 loops=1)
  Hash Cond: (_accrg7175_r._accountdtrref =
_acc7_extdim7144_tedacc1._acc7_idrref)
  -  Hash Left Join  (cost=762017.69..819134.13 rows=153030 width=56)
(actual time=25007.173..25017.249 rows=9189 loops=1)
        Hash Cond: (_accrg7175_r._accountdtrref =
_acc7_extdim7144_tedacc2._acc7_idrref)
        -  Merge Right Join  (cost=762001.76..816793.89 rows=153030
width=56) (actual time=25006.895..25012.218 rows=9189 loops=1)
              Merge Cond: ((_accrged7200_ted3._lineno =
_accrg7175_r._lineno) AND (_accrged7200_ted3._recordertref =
_accrg7175_r._recordertref) AND (_accrged7200_ted3._recorderrref =
_accrg7175_r._recorderrref) AND (_accrged7200_ted3._period =
_accrg7175_r._period) AND (_accrged7200_ted3._kindrref =
_acc7_extdim7144_tedacc3._dimkindrref))
              -  Sort  (cost=694652.60..703399.93 rows=3498930 width=63)
(actual time=24794.738..24794.738 rows=1 loops=1)
                    Sort Key: _accrged7200_ted3._lineno,
_accrged7200_ted3._recordertref, _accrged7200_ted3._recorderrref,
_accrged7200_ted3._period, _accrged7200_ted3._kindrref
                    Sort Method:  external merge  Disk: 230896kB
                    -  Seq Scan on _accrged7200 _accrged7200_ted3
 (cost=0.00..182790.96 rows=3498930 width=63) (actual time=0.153..3018.474
rows=3526745 loops=1)
                          Filter: (_correspond = 0::numeric)
              -  Sort  (cost=67344.64..67727.22 rows=153030 width=83)
(actual time=212.145..213.289 rows=9189 loops=1)
                    Sort Key: _accrg7175_r._lineno,
_accrg7175_r._recordertref, _accrg7175_r._recorderrref,
_accrg7175_r._period, _acc7_extdim7144_tedacc3._dimkindrref
                    Sort Method:  quicksort  Memory: 1677kB
                    -  Hash Left Join  (cost=10322.30..54166.12 rows=153030
width=83) (actual time=39.489..184.046 rows=9189 loops=1)
                          Hash Cond: (_accrg7175_r._accountdtrref =
_acc7_extdim7144_tedacc3._acc7_idrref)
                          -  Hash Join  (cost=10308.08..52844.15
rows=153030 width=63) (actual time=39.256..180.388 rows=9189 loops=1)
                                Hash Cond: (_accrg7175_r._accountdtrref =
tt2._reffieldrref)
                                -  Bitmap Heap Scan on _accrg7175
_accrg7175_r  (cost=10305.89..51535.65 rows=237384 width=63) (actual
time=38.913..130.715 rows=235636 loops=1)
                                      Recheck Cond: ((_period = '2009-10-01
00:00:00'::timestamp without time zone) AND (_period = '2009-10-31
23:59:59'::timestamp without time zone))
                                      Filter: _active
                                      -  Bitmap Index Scan on

Re: [HACKERS] Very ineffective plan with merge join

2010-04-15 Thread Tom Lane
Oleg Bartunov o...@sai.msu.su writes:
 below is an example of interesting query and two plans - the bad plan, which 
 uses merge join and big sorting, took 216 sec, and good plan with merge join 
 disabled took 
 8 sec.

The good plan seems to be fast mainly because of heavily cached inner
indexscans.  If that's the normal operating state for this database, you
should try reducing random_page_cost.

Also, as Pavel noted, the sub-join size estimates aren't very good, and
those overestimates are discouraging it from using inner-indexscan
nestloops.  I'm not sure how much it would help to increase the
statistics targets, but that would be worth trying.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rogue TODO list created

2010-04-15 Thread Bruce Momjian
Greg Smith wrote:
 Bruce Momjian wrote:
  http://wiki.postgresql.org/wiki/User:Simon
  Well, unless Simon wants to keep it for some reason, it should be
  removed, and if kept, renamed.  Simon?

 
 I already retitled the copy left on the personal page and deleted the 
 one that was causing the confusion.  I doubt anyone will accidentally 
 consider official a page labeled Simon's Work in Progress: Prioritised 
 Todo that's attached to User:Simon, that nothing links to, and that 
 doesn't show up on the first set of results if you search for todo either.

Thanks for cleaning this up.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Very ineffective plan with merge join

2010-04-15 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I'm not sure how much it would help to increase the statistics
 targets, but that would be worth trying.
 
I notice that the scan rowcount estimates are very accurate, there's
that one hash join result that's way off, though.
 
What's up with the sort of _accrged7200 (in the slower plan) taking
in 3.5 million rows and putting out 1 row?  There's something there
I'm not understanding.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Very ineffective plan with merge join

2010-04-15 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 What's up with the sort of _accrged7200 (in the slower plan) taking
 in 3.5 million rows and putting out 1 row?  There's something there
 I'm not understanding.

It's under a merge join, so what probably happened is that the first
row from that side had a larger key than any row from the other side.
A mergejoin will never bother to look at the remaining rows in such
a case.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and a disk full in primary

2010-04-15 Thread Robert Haas
On Thu, Apr 15, 2010 at 2:54 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Robert Haas wrote:
 I've realized another problem with this patch.  standby_keep_segments
 only controls the number of segments that we keep around for purposes
 of streaming: it doesn't affect archiving at all.  And of course, a
 standby server based on archiving is every bit as much of a standby
 server as one that uses streaming replication.  So at a minimum, the
 name of this GUC is very confusing.

 Hmm, I guess streaming_keep_segments would be more accurate. Somehow
 doesn't feel as good otherwise, though. Any other suggestions?

I sort of feel like the correct description is something like
num_extra_retained_wal_segments, but that's sort of long.  The actual
behavior is not tied to streaming, although the use case is.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add --ordered option to pg_dump

2010-04-15 Thread Mark Kirkwood

Peter Eisentraut wrote:

On tor, 2010-04-15 at 10:48 -0700, Bob Lunney wrote:
  

I needed a way to run diffs on two
database dumps to see what data developers put in their
local databases versus the initial database load.



Maybe pg_comparator would help you?


  

Or DBIx::Compare if you like perl :-)

Mark

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgindent and tabs in comments

2010-04-15 Thread Bruce Momjian
Peter Eisentraut wrote:
 Apparently, pgindent replaces multiple spaces in comments by a tab
 (possibly subject to additional logic).  An example among thousands:
 
 http://git.postgresql.org/gitweb?p=postgresql.git;a=blobdiff_plain;f=src/backend/access/gin/ginentrypage.c;h=c23415c0075b5ec52f08e8ef698f7b7ec2f97b19;hp=5cbbc7455519eba6c37be465784a02b350065716;hb=aa1e9bb51c102b239340992f2fcce138edb39d8a;hpb=03ee49a016e69e7594978352df6da4e0bbd7d04a
 
 (or just rgrep -F '.TAB' the tree to see more).
 
 This doesn't make any sense to me.  Could this please be fixed, and if
 possible reverted sometime?

Oh, that is an interesting example. What the code does is if there are
several spaces and the next word is on a tab stop, the spaces are
convered to tabs, except if we are in a string.  This conversion is done
by 'entab' which we distribute in src/tools.  I am unclear how to fix
this _except_ to remove all tabs if the line starts with '*', but that
isn't foolproof, e.g.:

*var = 12;

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum and temp tables support

2010-04-15 Thread Bruce Momjian
Robert Haas wrote:
 On Thu, Apr 8, 2010 at 3:22 PM, Oleg Bartunov o...@sai.msu.su wrote:
  On general thought I've had is that it would be nice if the first
  attempt to SELECT against a table with no statistics would trigger an
  automatic ANALYZE by the backend on which the query was executed.
  It's pretty common to populate a table using INSERT, or CTAS, or COPY
  and then try to immediately run a query against it, and I've often
  found that it's necessary to insert manual analyze statements in there
  to get decent query plans.
 
  Oracle does this. So, is't worth to add support (configurable, like
  Oracle's optimizer_dynamic_sampling) ?
 
 Well, dynamic sampling is considerably more complicated than what I
 proposed, which is just to force an ordinary ANALYZE before the first
 query against the table.  It would be a very powerful feature if we
 could use it to ameliorate, for example, the gross statistical errors
 that sometimes occur when multiple, correlated filter conditions are
 applied to the same base table; but I don't think it's in the direct
 path of solving the present complaint.

I have added this TODO:

Consider analyzing temporary tables when they are first used in a query

Autovacuum cannot analyze or vacuum temporary tables.

* 
http://archives.postgresql.org/pgsql-hackers/2010-04/msg00416.php 

I have also applied the following documentation patch to document this
behavior.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com
Index: doc/src/sgml/maintenance.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v
retrieving revision 1.102
diff -c -c -r1.102 maintenance.sgml
*** doc/src/sgml/maintenance.sgml	3 Apr 2010 07:22:55 -	1.102
--- doc/src/sgml/maintenance.sgml	16 Apr 2010 02:21:23 -
***
*** 643,648 
--- 643,654 
 /para
  
 para
+ Temporary tables cannot be accessed by autovacuum.  Therefore,
+ appropriate vacuum and analyze operations should be performed via
+ session SQL commands.
+/para
+ 
+para
  The default thresholds and scale factors are taken from
  filenamepostgresql.conf/filename, but it is possible to override them
  on a table-by-table basis; see
Index: doc/src/sgml/ref/create_table.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/create_table.sgml,v
retrieving revision 1.125
diff -c -c -r1.125 create_table.sgml
*** doc/src/sgml/ref/create_table.sgml	3 Apr 2010 07:23:00 -	1.125
--- doc/src/sgml/ref/create_table.sgml	16 Apr 2010 02:21:27 -
***
*** 145,150 
--- 145,159 
   /para
  
   para
+   The link linkend=autovacuumautovacuum daemon/link cannot
+   access and therefore cannot vacuum or analyze temporary tables.
+   For this reason, appropriate vacuum and analyze operations should be
+   performed via session SQL commands.  For example, if a temporary
+   table is going to be used in complex queries, it is wise to run
+   commandANALYZE/ on the temporary table after it is populated.
+  /para
+   
+  para
Optionally, literalGLOBAL/literal or literalLOCAL/literal
can be written before literalTEMPORARY/ or literalTEMP/.
This makes no difference in productnamePostgreSQL/, but see

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] solaris sparc 64bit binary release

2010-04-15 Thread John R Pierce




the 8.4.3 binary tarball for solaris sparc
64bit on postgresql.com was shipped with the 32bit includes and the
Makefile fragments from 8.4-community/lib/64/pgxs/src/ 

I'm specifically hitting this contradition:

 $ grep FLOAT8 include/server/pg_config.h
 #define FLOAT8PASSBYVAL false
and
 $ pg_controldata /var/postgres/8.4-community/data_64 | grep Float8
 Float8 argument passing: by value


For mostly corporate reasons, I need to get some C modules,
specifically pl/java, working with the binary release: postgresql-8.4.3-S10.sparc-64.tar.bz2
rather than have to build my own postgres and freak out the data center
operations guys.

Is there any chance I can get the correct
include/server/pg_config.h and lib/64/pgxs/src/Makefile.global from the
packager, whom I gather is Bjorn Much and has been seen on this
list? 











Re: [HACKERS] [PATCH] Add --ordered option to pg_dump

2010-04-15 Thread Bob Lunney
Thanks for the suggestion, Peter.  It looks like pg_comparator is for comparing 
the contents of two different servers.  I need to compare the contents of two 
dump files from the same server separated by time and busy developers.

Regards,

Bob Lunney

--- On Thu, 4/15/10, Peter Eisentraut pete...@gmx.net wrote:

 From: Peter Eisentraut pete...@gmx.net
 Subject: Re: [HACKERS] [PATCH] Add --ordered option to pg_dump
 To: Bob Lunney bob_lun...@yahoo.com
 Cc: pgsql-hackers@postgresql.org
 Date: Thursday, April 15, 2010, 4:36 PM
 On tor, 2010-04-15 at 10:48 -0700,
 Bob Lunney wrote:
  I needed a way to run diffs on two
  database dumps to see what data developers put in
 their
  local databases versus the initial database load.
 
 Maybe pg_comparator would help you?
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
 


  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers