Re: [HACKERS] poll: CHECK TRIGGER?

2012-03-08 Thread Albe Laurenz
Robert Haas wrote:
 Well, I guess I'm still of the opinion that the real question is
 whether the particular lint checks that Pavel's implemented are good
 and useful things.  Has anyone spent any time looking at *that*?

Actually, I did when I reviewed the patch the first time round.
I think that the checks implemented are clearly good and useful,
since any problem reported will lead to an error at runtime if
a certain code path in the function is taken.  And if the code path
is never taken, that's valuable information too.

I don't claim that no useful checks are missing, but the ones
that are there are good IMO.

Yours,
Laurenz Albe



Re: [HACKERS] pg_basebackup streaming issue from standby

2012-03-08 Thread Magnus Hagander
On Thu, Mar 8, 2012 at 02:45, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Mar 8, 2012 at 10:36 AM, Thom Brown t...@linux.com wrote:
 I've just tried using pg_basebackup to take a backup of a standby with
 -x stream but it never finishes.

 Thanks for the report! This is the same problem as I reported before.
 We are now discussing how to fix that.
 http://archives.postgresql.org/message-id/CAHGQGwFim5F61AfdLQH4PvARPr0Ace2=9qh62khygrawy4e...@mail.gmail.com

Yeah, it sounds like just that issue. And a good kick my way that I
need to get back on that thread :-)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] pg_basebackup streaming issue from standby

2012-03-08 Thread Thom Brown
On 8 March 2012 01:45, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Mar 8, 2012 at 10:36 AM, Thom Brown t...@linux.com wrote:
 I've just tried using pg_basebackup to take a backup of a standby with
 -x stream but it never finishes.

 Thanks for the report! This is the same problem as I reported before.
 We are now discussing how to fix that.
 http://archives.postgresql.org/message-id/CAHGQGwFim5F61AfdLQH4PvARPr0Ace2=9qh62khygrawy4e...@mail.gmail.com

I hadn't read your previous post before, so apologies for the noise.
I did note from your post that the base backup can be finished by
generating enough data on the source, which worked for me too.

-- 
Thom



Re: [HACKERS] poll: CHECK TRIGGER?

2012-03-08 Thread Petr Jelinek

On 03/08/2012 08:35 AM, Pavel Stehule wrote:

Here is an updated patch (with regression tests and documentation).

I removed the CHECK FUNCTION and CHECK TRIGGER statements and replaced
them with pg_check_function and pg_check_trigger, as Tom proposed.

The core of the patch is the same -- the plpgsql checker; only the user
interface was reduced.

postgres=> select pg_check_function('f2()');
             pg_check_function
------------------------------------------------------------
  error:42703:3:RETURN:column "missing_variable" does not exist
  Query: SELECT 'Hello' || missing_variable
  --                       ^
(3 rows)

postgres=> select pg_check_trigger('t','t1');
             pg_check_trigger
--------------------------------------------------------
  error:42703:3:assignment:record "new" has no field "b"
  Context: SQL statement "SELECT new.b + 1"
(2 rows)



I re-reviewed the patch and rechecked it with a few dozen real-world
functions, and it still looks good from my point of view. I made a few
adjustments to the English in the new comments, and Pavel privately sent
me a fix for proper handling of languages that don't have a checker
function. Updated patch attached.


Regards
Petr Jelinek



reduced_pl_checker_2012-03-08_3.patch.gz
Description: application/gzip



[HACKERS] check function patch

2012-03-08 Thread Alvaro Herrera
Hi guys,

sorry, I'm stuck in an unfamiliar webmail.

I checked the patch Petr just posted.
http://archives.postgresql.org/pgsql-hackers/2012-03/msg00482.php

I have two comments.  First, I notice that the documentation changes
have two places that describe the columns that a function checker
returns -- one in the plhandler page, another in the create language
page.  I think it should exist in only one of those, probably the
create language one; and the plhandler page should just say the
checker should comply with the specification at create language.  Or
something like that.  Also, the fact that the tuple description is
prose makes it hard to read; I think it should be a table -- three
columns: name, type, description.

My second comment is that the checker tuple descriptor seems to have
changed in the code.  In the patch I posted,
the FunctionCheckerDesc() function was not static; in this patch it
has been made static.  But what I intended was that the other places
that need a descriptor for anything would use this function to get
one, instead of building them by hand.  There are two such places
currently, one in CreateProceduralLanguage.  I think this should
simply walk the tupdesc->attrs array to create the arrays it needs
for the ProcedureCreate call -- should be a rather easy change.  The
other place is plpgsql's report_error().  Honestly, I don't like this
function at all, due to the way it assumes what the tupledesc looks
like.  I'm not sure how to improve it, however, but it seems wrong to
me.  One reason to do it this way (i.e. centralize knowledge of
what the tupdesc looks like) is that otherwise they get out of sync --
I notice that CreateProceduralLanguage now knows that there are 15
columns while the other places believe there are only 11.
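
For illustration, the walk I have in mind would look about like this (a
sketch only; the array names are mine, not the patch's):

/* build ProcedureCreate()'s parallel arrays from the central descriptor */
TupleDesc    desc = FunctionCheckerDesc();
Oid         *argtypes = palloc(desc->natts * sizeof(Oid));
const char **argnames = palloc(desc->natts * sizeof(char *));
int          i;

for (i = 0; i < desc->natts; i++)
{
    argtypes[i] = desc->attrs[i]->atttypid;
    argnames[i] = NameStr(desc->attrs[i]->attname);
}
/* ... then hand argtypes/argnames to ProcedureCreate() ... */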




Re: [HACKERS] Patch review for logging hooks (CF 2012-01)

2012-03-08 Thread Martin Pihlak
On Tue, Mar 6, 2012 at 10:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Applied with minor editorialization (mainly just improving the
 comments).

Thanks and kudos to all the reviewers!

regards,
Martin



Re: [HACKERS] pg_stat_statements and planning time

2012-03-08 Thread Robert Haas
On Wed, Mar 7, 2012 at 9:59 PM, Fujii Masao masao.fu...@gmail.com wrote:
 I'd like to have the planning time in a number of other places as
 well, such as EXPLAIN, and maybe statistics views.

 +1 for EXPLAIN. But which statistics views are in your mind?

I don't know.  I'm not sure if it's interesting to be able to count
planning time vs. execution time on a server-wide basis, or even a
per-database basis, but it seems like it might be, if we can do it
cheaply.  Then again, considering that gettimeofday is kinda
expensive, I suppose that would have to be optional if we were to have
it at all.  Just thinking out loud, mostly.
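
If we did add it, I'd imagine something like the following -- a sketch
only, with a made-up track_planning flag, so that both timestamp calls
are skipped when the feature is off:

#include "postgres.h"
#include "optimizer/planner.h"
#include "portability/instr_time.h"

bool        track_planning = false;     /* hypothetical switch, not a real GUC */

static PlannedStmt *
plan_with_optional_timing(Query *parse, int options, ParamListInfo params)
{
    instr_time  start,
                duration;
    PlannedStmt *result;

    if (track_planning)
        INSTR_TIME_SET_CURRENT(start);      /* first gettimeofday */

    result = planner(parse, options, params);

    if (track_planning)
    {
        INSTR_TIME_SET_CURRENT(duration);   /* second gettimeofday */
        INSTR_TIME_SUBTRACT(duration, start);
        elog(DEBUG1, "planning time: %.3f ms",
             INSTR_TIME_GET_MILLISEC(duration));
    }
    return result;
}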

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-03-08 Thread Robert Haas
On Fri, Feb 24, 2012 at 9:01 PM, Noah Misch n...@leadboat.com wrote:
 I consider these the core changes needed to reach Ready for Committer:

 - Fix crash in array_replace(arr, null, null).
 - Don't look through the domain before calling can_coerce_type().
 - Compare operator family of pfeqop and TYPECACHE_EQ_OPR at creation.
 - Move post-processing from gram.y and remove t/f magic values.

So, is someone working on making these changes?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Inline Extension

2012-03-08 Thread Robert Haas
On Thu, Jan 26, 2012 at 12:40 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 So I'm going to prepare the next version of the patch with this design:

Considering that this constitutes a major redesign and that the
updated patch hasn't been posted over a month later, it seems past
time to mark this Returned with Feedback.  So I've done that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Finer Extension dependencies

2012-03-08 Thread Robert Haas
On Tue, Feb 28, 2012 at 5:34 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
 Quickly reviewed the patch and found some issues.

 - There is some mixture of pg_extension_feature and pg_extension_features.
 - The doc says pg_extension_features has four columns, but that's not true.
 - Line 608 is bad. In the loop, provides_itself is repeatedly changed
 to true and false, and I guess that's not what you meant.
 - Line 854+, you can fold the two blocks into one.  The two blocks are
 similar, and by building the provides list with list_make1 when
 control->provides == NIL you can do it in one block.
 - s/trak/track/
 - Line 960, you missed updating classId for dependency.

 That's pretty much from me.  I just looked at the patch and have no
 idea about grand architecture.  Marking Waiting on Author.

Dimitri, are you going to post an updated patch for this CF?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Avoiding shutdown checkpoint at failover

2012-03-08 Thread Robert Haas
On Sat, Jan 28, 2012 at 8:57 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, Jan 26, 2012 at 5:27 AM, Fujii Masao masao.fu...@gmail.com wrote:

 One thing I would like to ask is that why you think walreceiver is more
 appropriate for writing XLOG_END_OF_RECOVERY record than startup
 process. I was thinking the opposite, because if we do so, we might be
 able to skip the end-of-recovery checkpoint even in file-based log-shipping
 case.

 Right now, WALReceiver has one code path/use case.

 Startup has so many, it's much harder to know whether we'll screw up one
 of them.

 If we can add it in either place then I choose the simplest, most
 relevant place. If the code is the same, we can move it around later.

 Let me write the code and then we can think some more.

Are we still considering trying to do this for 9.2?  Seems it's been
over a month without a new patch, and it's not entirely clear that we
know what the design should be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Review of pg_archivecleanup -x option patch

2012-03-08 Thread Robert Haas
On Sun, Feb 12, 2012 at 9:19 PM, Greg Smith g...@2ndquadrant.com wrote:
 The smaller step of automatically stripping the specified suffix from the
 input file name, when it matches the one we've told the program to expect,
 seems like a usability improvement unlikely to lead to bad things though.
  I've certainly made the mistake you've shown when using the patched version
 of the program myself, just didn't think about catching and correcting it
 myself before.  We can rev this to add that feature and resubmit easily
 enough, will turn that around soon.

This change seems plenty small enough to slip in even at this late
date, but I guess we're lacking a new version of the patch.
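
For reference, the stripping behavior being discussed would amount to
something like this (my sketch, not Greg's actual patch; names are
illustrative):

#include <string.h>

/* if the argument ends in the extension -x told us to expect, trim it */
static void
trim_expected_extension(char *walfile, const char *ext)
{
    size_t      flen = strlen(walfile);
    size_t      elen = strlen(ext);

    if (elen > 0 && flen > elen &&
        strcmp(walfile + flen - elen, ext) == 0)
        walfile[flen - elen] = '\0';
}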

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_stat_statements and planning time

2012-03-08 Thread Peter Geoghegan
On 8 March 2012 13:09, Robert Haas robertmh...@gmail.com wrote:
 Then again, considering that gettimeofday is kinda
 expensive, I suppose that would have to be optional if we were to have
 it at all.

+1. I'm not opposed to having such a mechanism, but it really ought to
impose exactly no overhead on the common case where we don't
particularly care about plan time.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread A.M.

On Mar 7, 2012, at 11:39 PM, Bruce Momjian wrote:

 On Thu, Mar 01, 2012 at 09:06:10PM -0500, Bruce Momjian wrote:
 OK, combining your and Robert's ideas, how about I have pg_upgrade write
 the server log to a file, and the pg_dump output to a file (with its
 stderr), and if pg_upgrade fails, I report the failure and mention those
 files.  If pg_upgrade succeeds, I remove the files?  pg_upgrade already
 creates temporary files that it removes on completion.
 
 OK, I have completed a rework of pg_upgrade logging.  pg_upgrade had 4
 logging options, -g, -G, -l, and -v, and still it wasn't possible to get
 useful logging.  :-(
 
 What I have done with this patch is to remove -g, -G, and -l, and
 unconditionally write to 4 log files in the current directory (in
 addition to the 3 SQL files I already create).
 
 If pg_upgrade succeeds, the files are removed, but if it fails (or if
 the new -r/retain option is used), the files remain.  Here is a sample
 failure when I create a plpgsql function in the old server, but truncate
 plpgsql.so in the new server:

It looks like the patch will overwrite the logs in the current working 
directory, for example, if pg_upgrade is run twice in the same place. Is that 
intentional? I had imagined that the logs would have been dumped in the /tmp 
directory so that one can compare results if the first pg_upgrade run had been 
errant.

Cheers,
M


Re: [HACKERS] Client Messages

2012-03-08 Thread Robert Haas
On Wed, Feb 29, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
  Do we have an updated patch?  Fujii?

 No. I believe that the author Jim will submit the updated version.

Jim, are you going to submit an updated version?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Finer Extension dependencies

2012-03-08 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Dimitri, are you going to post an updated patch for this CF?

Yes, I intend to do that.  Not sure about diverting from the command
trigger patch while Thom is full speed on reviewing and helping me write
the full covering test cases, though.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] pg_stat_statements and planning time

2012-03-08 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 On 8 March 2012 13:09, Robert Haas robertmh...@gmail.com wrote:
  Then again, considering that gettimeofday is kinda
 expensive, I suppose that would have to be optional if we were to have
 it at all.

 +1. I'm not opposed to having such a mechanism, but it really ought to
 impose exactly no overhead on the common case where we don't
 particularly care about plan time.

I thought the proposal was to add it to (1) pg_stat_statement and (2)
EXPLAIN, both of which are not in the normal code execution path.
pg_stat_statement is already a drag on a machine with slow gettimeofday,
but it's not clear why users of it would think that two gettimeofday's
per query are acceptable and four are not.

regards, tom lane



Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread Bruce Momjian
On Thu, Mar 08, 2012 at 08:34:53AM -0500, A.M. wrote:
 
 On Mar 7, 2012, at 11:39 PM, Bruce Momjian wrote:
 
  On Thu, Mar 01, 2012 at 09:06:10PM -0500, Bruce Momjian wrote:
  OK, combining your and Robert's ideas, how about I have pg_upgrade write
  the server log to a file, and the pg_dump output to a file (with its
  stderr), and if pg_upgrade fails, I report the failure and mention those
  files.  If pg_upgrade succeeds, I remove the files?  pg_upgrade already
  creates temporary files that it removes on completion.
  
  OK, I have completed a rework of pg_upgrade logging.  pg_upgrade had 4
  logging options, -g, -G, -l, and -v, and still it wasn't possible to get
  useful logging.  :-(
  
  What I have done with this patch is to remove -g, -G, and -l, and
  unconditionally write to 4 log files in the current directory (in
  addition to the 3 SQL files I already create).
  
  If pg_upgrade succeeds, the files are removed, but if it fails (or if
  the new -r/retain option is used), the files remain.  Here is a sample
  failure when I create a plpgsql function in the old server, but truncate
  plpgsql.so in the new server:
 
 It looks like the patch will overwrite the logs in the current working
 directory, for example, if pg_upgrade is run twice in the same place. Is
 that intentional? I had imagined that the logs would have been dumped in

Yes.  I was afraid that continually appending to a log file on every run
would be too confusing.  I could do only appends, or number the log
files, but those seemed confusing too.

 the /tmp directory so that one can compare results if the first pg_upgrade
 run had been errant.

You would have to copy the file to a new name before re-running
pg_upgrade.

The only reason I truncate them on start is that I am appending to them
in many places in the code, and it was easier to just truncate them on
start rather than to remember where I first write to them.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread A.M.

On Mar 8, 2012, at 10:00 AM, Bruce Momjian wrote:
 
 Yes.  I was afraid that continually appending to a log file on every run
 would be too confusing.  I could do only appends, or number the log
 files, but those seemed confusing too.
 
 the /tmp directory so that one can compare results if the first pg_upgrade
 run had been errant.
 
 You would have to copy the file to a new name before re-running
 pg_upgrade.
 
 The only reason I truncate them on start is that I am appending to them
 in many places in the code, and it was easier to just truncate them on
 start rather than to remember where I first write to them.
 


mktemps?
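
That is, something like mkstemps(3), giving each run its own log file and
sidestepping the truncate-versus-append question. A sketch (mkstemps is
not available on every platform pg_upgrade supports; the path is
illustrative):

#include <stdlib.h>

char    logname[] = "/tmp/pg_upgrade_server_XXXXXX.log";
int     fd = mkstemps(logname, 4);      /* 4 == strlen(".log") */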






Re: [HACKERS] psql COPY vs. ON_ERROR_ROLLBACK, multi-command strings

2012-03-08 Thread Noah Misch
On Wed, Mar 07, 2012 at 04:57:12PM -0500, Robert Haas wrote:
 On Sat, Feb 25, 2012 at 12:57 AM, Noah Misch n...@leadboat.com wrote:
  Thanks.  While testing a crashing function, I noticed that my above patch
  added some noise to psql output when the server crashes:

  [local] test=# select crashme();
  The connection to the server was lost. Attempting reset: Failed.
  The connection to the server was lost. Attempting reset: Failed.
  unexpected transaction status (4)
  Time: 6.681 ms
  !> \q

  Status 4 is PQTRANS_UNKNOWN, which is expected when the connection is not
  CONNECTION_OK.  The double message arrives because ProcessResult() now calls
  CheckConnection() at least twice, for the benefit of COPY.  (Incidentally,
  the reconnect fails because the server has not yet finished recovering;
  that part is nothing new.)

  The attached small patch has SendQuery() keep quiet about PQTRANS_UNKNOWN
  when the connection is down.  It makes ProcessResult() skip the second
  CheckConnection() when the command string had no COPY results.  This
  restores the pre-08146775acd8bfe0fcc509c71857abb928697171 psql output:

  [local] test=# select crashme();
  The connection to the server was lost. Attempting reset: Failed.
  Time: 4.798 ms
  !> \q
 
 Committed, but... why do we EVER need to call CheckConnection() at the
 bottom of ProcessResult(), after exiting that function's main loop?  I
 don't see any way to get out of that loop without first calling
 AcceptResult on every PGresult we've seen, and that function already
 calls CheckConnection() if any failure is observed.

You can reproduce a case where it helps by sending SIGKILL to a backend
serving a long-running COPY TO, such as this:

copy (select n, pg_sleep(case when n > 1000 then 1 else 0 end)
  from generate_series(1,1030) t(n)) to stdout;

The connection dies during handleCopyOut().  By the time control returns to
ProcessResult(), there's no remaining PGresult to check.  Only that last-ditch
CheckConnection() notices the lost connection.

[I notice more examples of poor error reporting cosmetics in some of these
COPY corner cases, so I anticipate another patch someday.]

 As a side note, the documentation for PQexec() is misleading about
 what will happen if COPY is present in a multi-command string.  It
 says: "Note however that the returned PGresult structure describes
 only the result of the last command executed from the string. Should
 one of the commands fail, processing of the string stops with it and
 the returned PGresult describes the error condition."  It does not
 explain that it also stops if it hits a COPY.  I had to read the
 source code for libpq to understand why this psql logic was coded the
 way it is.

Agreed; I went through a similar process.  Awhile after reading the code, I
found the behavior documented in the section "Functions Associated with the
COPY Command":

  If a COPY command is issued via PQexec in a string that could contain
  additional commands, the application must continue fetching results via
  PQgetResult after completing the COPY sequence. Only when PQgetResult
  returns NULL is it certain that the PQexec command string is done and it is
  safe to issue more commands.
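
In code, that rule comes out roughly as follows (a sketch; conn and the
table name are illustrative):

#include <libpq-fe.h>

PGresult   *res = PQexec(conn, "UPDATE t SET n = n + 1; COPY t TO STDOUT;");

if (PQresultStatus(res) == PGRES_COPY_OUT)
{
    char       *buf;

    while (PQgetCopyData(conn, &buf, 0) > 0)    /* drain the COPY stream */
        PQfreemem(buf);
}
PQclear(res);

/* only a NULL from PQgetResult() means the command string is really done */
while ((res = PQgetResult(conn)) != NULL)
    PQclear(res);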

I'm not quite sure what revision would help most here -- a cross reference,
moving that content, duplicating that content.  Offhand, I'm inclined to move
it to the PQexec() documentation with some kind of reference back from its
original location.  Thoughts?

Thanks,
nm



Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Thu, Mar 08, 2012 at 08:34:53AM -0500, A.M. wrote:
 It looks like the patch will overwrite the logs in the current working
 directory, for example, if pg_upgrade is run twice in the same place. Is
 that intentional? I had imagined that the logs would have been dumped in

 Yes.  I was afraid that continually appending to a log file on every run
 would be too confusing.  I could do only appends, or number the log
 files, but those seemed confusing too.

Use one (set of) files, and always append, but at the beginning of each
run print "\npg_upgrade starting at [timestamp]\n\n".  Should make it
reasonably clear, while not losing information.
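
A minimal sketch of that behavior (error handling and the real file
names elided):

#include <stdio.h>
#include <time.h>

static FILE *
open_run_log(const char *path)
{
    FILE       *fp = fopen(path, "a");  /* always append, never truncate */
    time_t      now = time(NULL);

    if (fp != NULL)
        fprintf(fp, "\npg_upgrade starting at %s\n", ctime(&now));
    return fp;
}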

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-03-08 Thread Noah Misch
On Fri, Mar 02, 2012 at 03:45:38PM -0500, Robert Haas wrote:
 SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t) x;
 
 On unpatched master, this takes about 416 ms (plus or minus a few).
 With the attached patch, it takes about 389 ms (plus or minus a very
 few), a speedup of about 7%.
 
 I repeated the experiment using the C locale, like this:
 
 SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t COLLATE "C") x;
 
 Here, it takes about 202 ms with the patch, and about 231 ms on
 unpatched master, a savings of about 13%.

 [oprofile report, further discussion]

Thanks for looking into this.  Your patch is also a nice demonstration of
sortsupport's ability to help with more than just fmgr overhead.
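
For the archives, the shape of such a comparator for the C locale (my
sketch, not Robert's actual patch) is a bare memcmp with a length
tiebreak, called directly by the sort instead of through fmgr:

#include "postgres.h"
#include "fmgr.h"
#include "utils/sortsupport.h"

static int
sorttext_cmp_c(Datum x, Datum y, SortSupport ssup)
{
    text       *tx = DatumGetTextPP(x);
    text       *ty = DatumGetTextPP(y);
    int         lx = VARSIZE_ANY_EXHDR(tx);
    int         ly = VARSIZE_ANY_EXHDR(ty);
    int         cmp = memcmp(VARDATA_ANY(tx), VARDATA_ANY(ty), Min(lx, ly));

    if (cmp == 0)
        cmp = (lx < ly) ? -1 : (lx > ly) ? 1 : 0;
    return cmp;
}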

 Considering all that, I
 had hoped for more like a 15-20% gain from this approach, but it
 didn't happen, I suppose because some of the instructions saved just
 resulted in more processor stalls.  All the same, I'm inclined to
 think it's still worth doing.

This is a border case, but I suggest that a 13% speedup on a narrowly-tailored
benchmark, degrading to 7% in common configurations, is too meager to justify
adopting this patch.

nm



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-08 Thread Noah Misch
On Wed, Mar 07, 2012 at 07:51:42PM -0500, Tom Lane wrote:
 Alexander Korotkov aekorot...@gmail.com writes:
  True. If (max count - min count + 1) is small, enumerating the
  frequencies is both a more compact and a more precise representation.
  Simultaneously, if (max count - min count + 1) is large, we can run out
  of statistics_target with such a representation. We could use the same
  representation of the count distribution as for a scalar column value,
  MCV and HISTOGRAM, but that would require an additional statkind and
  statistics slot. Perhaps you have better ideas?
 
 I wasn't thinking of introducing two different representations,
 but just trimming the histogram length when it's larger than necessary.
 
 On reflection my idea above is wrong; for example assume that we have a
 column with 900 arrays of length 1 and 100 arrays of length 2.  Going by
 what I said, we'd reduce the histogram to {1,2}, which might accurately
 capture the set of lengths present but fails to show that 1 is much more
 common than 2.  However, a histogram {1,1,1,1,1,1,1,1,1,2} (ten entries)
 would capture the situation perfectly in one-tenth the space that the
 current logic does.

Granted.  When the next sample finds 899/101 instead, though, the optimization
vanishes.  You save 90% of the space, perhaps 10% of the time.  If you want to
materially narrow typical statistics, Alexander's proposal looks like the way
to go.  I'd guess array columns always having DEC <= default_statistics_target
are common enough to make that representation the dominant representation, if
not the only necessary representation.



Re: [HACKERS] check function patch

2012-03-08 Thread Pavel Stehule
Hello

here is another version, updated per your last comments.  I removed the
magic constants.

This is not merged with Petr's changes; I'll do that tomorrow.  Probably
there will be some bigger changes in the header files, but that can be
the next step.

Regards

Pavel

2012/3/8 Alvaro Herrera alvhe...@alvh.no-ip.org:
 Hi guys,

 sorry, I'm stuck in an unfamiliar webmail.

 I checked the patch Petr just posted.
 http://archives.postgresql.org/pgsql-hackers/2012-03/msg00482.php

 I have two comments.  First, I notice that the documentation changes have
 two places that describe the columns that a function checker returns -- one
 in the plhandler page, another in the create language page.  I think it
 should exist in only one of those, probably the create language one; and the
 plhandler page should just say the checker should comply with the
 specification at create language.  Or something like that.  Also, the fact
 that the tuple description is prose makes it hard to read; I think it should
 be a table -- three columns: name, type, description.

 My second comment is that the checker tuple descriptor seems to have changed
 in the code.  In the patch I posted, the FunctionCheckerDesc() function was
 not static; in this patch it has been made static.  But what I intended was
 that the other places that need a descriptor for anything would use this
 function to get one, instead of building them by hand.  There are two such
 places currently, one in CreateProceduralLanguage.  I think this should
 simply walk the tupdesc->attrs array to create the arrays it needs for
 the ProcedureCreate call -- should be a rather easy change.  The other place
 is plpgsql's report_error().  Honestly, I don't like this function at all,
 due to the way it assumes what the tupledesc looks like.  I'm not sure how
 to improve it, however, but it seems wrong to me.

 One reason to do it this way (i.e. centralize knowledge of what the tupdesc
 looks like) is that otherwise they get out of sync -- I notice that
 CreateProceduralLanguage now knows that there are 15 columns while the other
 places believe there are only 11.




reduced_pl_checker_2012-03-08_3.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Inline Extension

2012-03-08 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Considering that this constitutes a major redesign and that the
 updated patch hasn't been posted over a month later, it seems past
 time to mark this Returned with Feedback.  So I've done that.

I was hoping to be able to code that while the CF is running but
obviously that won't happen given the other patches I'm playing with.

The major redesign you're mentioning is very good feedback already
and will allow me to implement a non-controversial patch for the next
release.  Thanks for dealing with the CF parts of that.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-03-08 Thread Tom Lane
Shigeru Hanada shigeru.han...@gmail.com writes:
 [ pgsql_fdw_v15.patch ]

I've been looking at this patch a little bit over the past day or so.
I'm pretty unhappy with deparse.c --- it seems like a real kluge,
inefficient and full of corner-case bugs.  After some thought I believe
that you're ultimately going to have to abandon depending on ruleutils.c
for reverse-listing services, and it would be best to bite that bullet
now and rewrite this code from scratch.  ruleutils.c is serving two
masters already (rule-dumping and EXPLAIN) and it's not going to be
practical to tweak its behavior further for this usage; yet there are
all sorts of clear problems that you are going to run into, boiling down
to the fact that names on the remote end aren't necessarily the same as
names on the local end.  For instance, ruleutils.c is not going to be
helpful at schema-qualifying function names in a way that's correct for
the foreign server environment.  Another issue is that as soon as you
try to push down join clauses for parameterized paths, you are going to
want Vars of other relations to be printed as parameters ($n, most
likely) and ruleutils is not going to know to do that.  Seeing that
semantic constraints will greatly limit the set of node types that can
ever be pushed down anyway, I think it's likely to be easiest to just
write your own node-printing code and not even try to use ruleutils.c.

There are a couple of other points that make me think we need to revisit
the PlanForeignScan API definition some more, too.  First, deparse.c is
far from cheap.  While this doesn't matter greatly as long as there's
only one possible path for a foreign table, as soon as you want to
create more than one it's going to be annoying to do all that work N
times and then throw away N-1 of the results.  I also choked on the fact
that the pushdown patch thinks it can modify baserel->baserestrictinfo.
That might accidentally fail to malfunction right now, but it's never
going to scale to multiple paths with potentially different sets of
remotely-applied constraints.  So I'm thinking we really need to let
FDWs in on the Path versus Plan distinction --- that is, a Path just
needs to be a cheap summary of a way to do things, and then at
createplan.c time you convert the selected Path into a full-fledged
Plan.  Most of the work done in deparse.c could be postponed to
createplan time and done only once, even with multiple paths.
The baserestrictinfo hack would be unnecessary too if the FDW had more
direct control over generation of the ForeignScan plan node.

Another thing I'm thinking we should let FDWs in on is the distinction
between rowcount estimation and path generation.  When we did the first
API design last year it was okay to expect a single call to do both,
but as of a couple months ago allpaths.c does those steps in two
separate passes over the baserels, and it'd be handy if FDWs would
cooperate.

So we need to break down what PlanForeignScan currently does into three
separate steps.  The first idea that comes to mind is to call them
GetForeignRelSize, GetForeignPaths, GetForeignPlan; but maybe somebody
has a better idea for names?
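
To make the division concrete, the three hooks might look about like this
(the parameter lists here are guesses, not a settled API):

/* estimate baserel->rows, nothing more */
void    GetForeignRelSize(PlannerInfo *root, RelOptInfo *baserel,
                          Oid foreigntableid);

/* add_path() one or more cheap path summaries */
void    GetForeignPaths(PlannerInfo *root, RelOptInfo *baserel,
                        Oid foreigntableid);

/* convert the chosen path into a plan; deparsing happens once, here */
ForeignScan *GetForeignPlan(PlannerInfo *root, RelOptInfo *baserel,
                            Oid foreigntableid, ForeignPath *best_path,
                            List *tlist, List *scan_clauses);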

regards, tom lane



Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread Bruce Momjian
On Thu, Mar 01, 2012 at 08:45:26AM -0500, Robert Haas wrote:
 On Wed, Feb 29, 2012 at 6:02 PM, Bruce Momjian br...@momjian.us wrote:
  Any ideas about improving the error reporting more generally, so that
  when reloading the dump fails, the user can easily see what went
  belly-up, even if they didn't use -l?
 
  The only idea I have is to write the psql log to a temporary file and
  report the last X lines from the file in case of failure.  Does that
  help?
 
 Why not just redirect stdout but not stderr?  If there are error
 messages, surely we want the user to just see those.

See my patch;  the problem with splitting stdout and stderr is that you
lose the correspondence between the SQL queries and the error messages. 
With my patch, they will all be right next to each other at the end of
the log file.
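
Roughly like so -- a sketch of the idea, not the patch itself; cmd,
bindir, and log_file are illustrative names:

char        cmd[MAXPGPATH];

/* stdout and stderr interleave in one file, so errors sit by their queries */
snprintf(cmd, sizeof(cmd),
         "\"%s/psql\" --echo-queries -f dump.sql >> \"%s\" 2>&1",
         bindir, log_file);
if (system(cmd) != 0)
    pg_log(PG_FATAL, "see %s for details\n", log_file);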

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] sortsupport for text

2012-03-08 Thread Kevin Grittner
Noah Misch n...@leadboat.com wrote:
 On Fri, Mar 02, 2012 at 03:45:38PM -0500, Robert Haas wrote:
 
 SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t) x;
 
 [13% faster with patch for C collation; 7% faster for UTF8]
 
 I had hoped for more like a 15-20% gain from this approach, but
 it didn't happen, I suppose because some of the instructions
 saved just resulted in more processor stalls.  All the same, I'm
 inclined to think it's still worth doing.
 
 This is a border case, but I suggest that a 13% speedup on a 
 narrowly-tailored benchmark, degrading to 7% in common
 configurations, is too meager to justify adopting this patch.
 
We use the C collation and have character strings in most indexes
and ORDER BY clauses.  Unless there are significant contraindications,
I'm in favor of adopting this patch.
 
-Kevin



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-08 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Wed, Mar 07, 2012 at 07:51:42PM -0500, Tom Lane wrote:
 On reflection my idea above is wrong; for example assume that we have a
 column with 900 arrays of length 1 and 100 arrays of length 2.  Going by
 what I said, we'd reduce the histogram to {1,2}, which might accurately
 capture the set of lengths present but fails to show that 1 is much more
 common than 2.  However, a histogram {1,1,1,1,1,1,1,1,1,2} (ten entries)
 would capture the situation perfectly in one-tenth the space that the
 current logic does.

 Granted.  When the next sample finds 899/101 instead, though, the optimization
 vanishes.

No, you missed my next point.  That example shows that sometimes a
smaller histogram can represent the situation with zero error, but in
all cases a smaller histogram can represent the situation with perhaps
more error than a larger one.  Since we already have a defined error
tolerance, we should try to generate a histogram that is as small as
possible while still not exceeding the error tolerance.

Now, it might be that doing that is computationally impractical, or
too complicated to be reasonable.  But it seems to me to be worth
looking into.

 If you want to materially narrow typical statistics, Alexander's
 proposal looks like the way to go.  I'd guess array columns always
 having DEC <= default_statistics_target are common enough to make that
 representation the dominant representation, if not the only necessary
 representation.

Well, I don't want to have two representations; I don't think it's worth
the complexity.  But certainly we could consider switching to a
different representation if it seems likely to usually be smaller.

regards, tom lane



Re: [HACKERS] pg_stat_statements and planning time

2012-03-08 Thread Peter Geoghegan
On 8 March 2012 14:44, Tom Lane t...@sss.pgh.pa.us wrote:
 I thought the proposal was to add it to (1) pg_stat_statement and (2)
 EXPLAIN, both of which are not in the normal code execution path.
 pg_stat_statement is already a drag on a machine with slow gettimeofday,
 but it's not clear why users of it would think that two gettimeofday's
 per query are acceptable and four are not.

To be clear, I don't see any reason not to just have it in
EXPLAIN output under all circumstances.

pg_stat_statements will slow down query execution, but I see no reason
to force users to pay for something that they may well not want by not
including an 'off' switch for this additional instrumentation, given
that it doubles the number of gettimeofdays. I'm not particularly
concerned about platforms with slow gettimeofdays. I'm concerned with
keeping the overhead of running pg_stat_statements as low as possible
generally.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



[HACKERS] [PATCH] Optimize IS DISTINCT FROM NULL => IS NOT NULL

2012-03-08 Thread Marti Raudsepp
Hi list,

This patch enables a simple optimization in
eval_const_expressions_mutator. If we know that one argument to
DistinctExpr is NULL then we can optimize it to a NullTest, which can
be an indexable expression.

For example the query:
EXPLAIN (costs off) SELECT * FROM foo WHERE j IS NOT DISTINCT FROM NULL;

Old behavior:
Seq Scan on foo
  Filter: (NOT (j IS DISTINCT FROM NULL::integer))

New behavior:
Index Scan using foo_j_idx on foo
  Index Cond: (j IS NULL)

Regards,
Marti
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index cd3da46..d9568ca 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2436,7 +2436,7 @@ eval_const_expressions_mutator(Node *node,
 ListCell   *arg;
 bool		has_null_input = false;
 bool		all_null_input = true;
-bool		has_nonconst_input = false;
+Expr	   *nonconst_expr = NULL;
 Expr	   *simple;
 DistinctExpr *newexpr;
 
@@ -2463,11 +2463,11 @@ eval_const_expressions_mutator(Node *node,
 		all_null_input &= ((Const *) lfirst(arg))->constisnull;
 	}
 	else
-		has_nonconst_input = true;
+		nonconst_expr = lfirst(arg);
 }
 
 /* all constants? then can optimize this out */
-if (!has_nonconst_input)
+if (nonconst_expr == NULL)
 {
 	/* all nulls? then not distinct */
 	if (all_null_input)
@@ -2512,6 +2512,24 @@ eval_const_expressions_mutator(Node *node,
 		return (Node *) csimple;
 	}
 }
+else if (has_null_input)
+{
+	/*
+	 * We can optimize: (expr) IS DISTINCT FROM NULL
+	 * into: (expr) IS NOT NULL
+	 */
+	NullTest   *newntest;
+
+	newntest = makeNode(NullTest);
+	newntest->nulltesttype = IS_NOT_NULL;
+	newntest->arg = (Expr *) nonconst_expr;
+
+	/* make_row_distinct_op already flattens row comparisons */
+	Assert(! type_is_rowtype(exprType((Node *) nonconst_expr)));
+	newntest->argisrow = false;
+
+	return (Node *) newntest;
+}
 
 /*
  * The expression cannot be simplified any further, so build
diff --git a/src/test/regress/expected/select_distinct.out b/src/test/regress/expected/select_distinct.out
index 38107a0..e8ddc49 100644
--- a/src/test/regress/expected/select_distinct.out
+++ b/src/test/regress/expected/select_distinct.out
@@ -195,6 +195,13 @@ SELECT null IS DISTINCT FROM null as no;
  f
 (1 row)
 
+EXPLAIN (costs off) SELECT * FROM disttable WHERE f1 IS DISTINCT FROM NULL;
+ QUERY PLAN 
+------------------------
+ Seq Scan on disttable
+   Filter: (f1 IS NOT NULL)
+(2 rows)
+
 -- negated form
 SELECT 1 IS NOT DISTINCT FROM 2 as no;
  no 
@@ -220,3 +227,10 @@ SELECT null IS NOT DISTINCT FROM null as yes;
  t
 (1 row)
 
+EXPLAIN (costs off) SELECT * FROM disttable WHERE f1 IS NOT DISTINCT FROM NULL;
+   QUERY PLAN   
+------------------------
+ Seq Scan on disttable
+   Filter: (f1 IS NULL)
+(2 rows)
+
diff --git a/src/test/regress/sql/select_distinct.sql b/src/test/regress/sql/select_distinct.sql
index 328ba51..9767eed 100644
--- a/src/test/regress/sql/select_distinct.sql
+++ b/src/test/regress/sql/select_distinct.sql
@@ -56,9 +56,11 @@ SELECT 1 IS DISTINCT FROM 2 as yes;
 SELECT 2 IS DISTINCT FROM 2 as no;
 SELECT 2 IS DISTINCT FROM null as yes;
 SELECT null IS DISTINCT FROM null as no;
+EXPLAIN (costs off) SELECT * FROM disttable WHERE f1 IS DISTINCT FROM NULL;
 
 -- negated form
 SELECT 1 IS NOT DISTINCT FROM 2 as no;
 SELECT 2 IS NOT DISTINCT FROM 2 as yes;
 SELECT 2 IS NOT DISTINCT FROM null as no;
 SELECT null IS NOT DISTINCT FROM null as yes;
+EXPLAIN (costs off) SELECT * FROM disttable WHERE f1 IS NOT DISTINCT FROM NULL;



Re: [HACKERS] [PATCH] Optimize IS DISTINCT FROM NULL => IS NOT NULL

2012-03-08 Thread Tom Lane
Marti Raudsepp ma...@juffo.org writes:
 This patch enables a simple optimization in
 eval_const_expressions_mutator. If we know that one argument to
 DistinctExpr is NULL then we can optimize it to a NullTest, which can
 be an indexable expression.

Uh ... how much do we care about that?  I can't say that I've heard many
people complain about the fact that IS [NOT] DISTINCT FROM is poorly
optimized -- which it is, in general, and this patch chips away at that
only a tiny bit, not enough to make it recommendable.  If we really
wanted to make that a first-class operation we would need far more work
than this.  Plus I don't see why anyone would write the specific case
IS [NOT] DISTINCT FROM NULL when they could write half as much.

regards, tom lane



Re: [HACKERS] [PATCH] Optimize IS DISTINCT FROM NULL => IS NOT NULL

2012-03-08 Thread Marti Raudsepp
On Thu, Mar 8, 2012 at 19:35, Tom Lane t...@sss.pgh.pa.us wrote:
 Uh ... how much do we care about that?  I can't say that I've heard many
 people complain about the fact that IS [NOT] DISTINCT FROM is poorly
 optimized -- which it is, in general, and this patch chips away at that
 only a tiny bit, not enough to make it recommendable.

Agreed, but it was very simple to code, so I figured why not.

 Plus I don't see why anyone would write the specific case
 IS [NOT] DISTINCT FROM NULL when they could write half as much.

Well I can see how it might be useful in generated queries, when
comparing a column to a parameter. If they're using IS DISTINCT FROM
then it's reasonable to expect that the parameter could be NULL
sometimes.

But I don't feel strongly about this, maybe it's not worth
complicating this big function further. :)

Regards,
Marti



[HACKERS] regress bug

2012-03-08 Thread Andrew Dunstan
I have just found, after an hour or two of frustration, that pg_regress' 
--inputdir and --outputdir options don't play nicely with the 
convert_sourcefiles routines. In effect these options are ignored as 
destinations for converted files, and the destination for converted 
files is apparently always $CWD/{sql,expected}. The upshot is that these 
options are pretty much unusable if you want to have converted files 
(which would, for example, specify the location of data files).


This seems like an outright bug. I don't recall any discussion on it. 
Maybe nobody's come across it before. ISTM the correct behaviour would 
be to put converted sql files in $inputdir/sql and converted results 
files in $outputdir/expected.


Thoughts?

cheers

andrew



[HACKERS] A minor rant: pay attention to updating the comments!

2012-03-08 Thread Tom Lane
I think I've complained about this before, but:

When you are preparing a patch, fixing the comments is JUST AS IMPORTANT
AS FIXING THE CODE.  Obsolete comments are worse than useless.  Look to
see if you've invalidated something in the head-of-file comments about
the module you're changing.  Look to see if you've invalidated something
in the particular function's header comment (in particular, if it has a
list of explanations of its arguments, for heaven's sake update that
list when you add or change an argument).  Look to see if some block
comment within the function needs to be updated.  If you are grepping to
fix all references to a function or global variable, do not overlook the
ones in comments.  If you use a grep substitute that does not search
comments, throw it away and get one that does.

I've gotten tired of fixing other people's oversights in this regard,
including major committers who ought to know better (and do not have
even the thin excuse of poor facility with English).  If we don't
maintain the comments adequately, we are hastening the decline of the
Postgres code base into an unintelligible, unfixable morass.

There, I feel better now.

regards, tom lane



Re: [HACKERS] Custom Operators Cannot be Found for Composite Type Values

2012-03-08 Thread David E. Wheeler
On Mar 7, 2012, at 8:23 PM, Tom Lane wrote:

 You have not told the system that your operator is equality for the
 datatype.  It's just a random operator that happens to be named =.
 We try to avoid depending on operator names as cues to semantics.
 
 You need to incorporate it into a default hash or btree opclass before
 the composite-type logic will accept it as the thing to use for
 comparing that column.

Ah, okay. Just need more stuff, I guess:

CREATE OR REPLACE FUNCTION json_cmp(
json,
json
) RETURNS INTEGER LANGUAGE SQL STRICT IMMUTABLE AS $$
SELECT bttextcmp($1::text, $2::text);
$$;

CREATE OR REPLACE FUNCTION json_eq(
json,
json
) RETURNS BOOLEAN LANGUAGE SQL STRICT IMMUTABLE AS $$
SELECT bttextcmp($1::text, $2::text) = 0;
$$;

CREATE OPERATOR = (
LEFTARG   = json,
RIGHTARG  = json,
PROCEDURE = json_eq
);

CREATE OPERATOR CLASS json_ops
DEFAULT FOR TYPE JSON USING btree AS
OPERATOR 3 = (json, json),
FUNCTION 1 json_cmp(json, json);

This seems to work.

Best,

David




Re: [HACKERS] Custom Operators Cannot be Found for Composite Type Values

2012-03-08 Thread Tom Lane
David E. Wheeler da...@justatheory.com writes:
 CREATE OPERATOR CLASS json_ops
 DEFAULT FOR TYPE JSON USING btree AS
  OPERATOR 3 = (json, json),
  FUNCTION 1 json_cmp(json, json);

 This seems to work.

Urk.  You really ought to provide the whole opclass (all 5 operators).
I'm not sure what will blow up if you leave it like that, but it won't
be pretty.

regards, tom lane



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-08 Thread Noah Misch
On Thu, Mar 08, 2012 at 11:30:52AM -0500, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  On Wed, Mar 07, 2012 at 07:51:42PM -0500, Tom Lane wrote:
  On reflection my idea above is wrong; for example assume that we have a
  column with 900 arrays of length 1 and 100 arrays of length 2.  Going by
  what I said, we'd reduce the histogram to {1,2}, which might accurately
  capture the set of lengths present but fails to show that 1 is much more
  common than 2.  However, a histogram {1,1,1,1,1,1,1,1,1,2} (ten entries)
  would capture the situation perfectly in one-tenth the space that the
  current logic does.
 
  Granted.  When the next sample finds 899/101 instead, though, the
  optimization vanishes.
 
 No, you missed my next point.  That example shows that sometimes a
 smaller histogram can represent the situation with zero error, but in
 all cases a smaller histogram can represent the situation with perhaps
 more error than a larger one.  Since we already have a defined error
 tolerance, we should try to generate a histogram that is as small as
 possible while still not exceeding the error tolerance.
 
 Now, it might be that doing that is computationally impractical, or
 too complicated to be reasonable.  But it seems to me to be worth
 looking into.

Yes, I did miss your point.

One characteristic favoring this approach is its equal applicability to both
STATISTIC_KIND_HISTOGRAM and STATISTIC_KIND_DECHIST.

  If you want to materially narrow typical statistics, Alexander's
  proposal looks like the way to go.  I'd guess array columns always
  having DEC = default_statistics_target are common enough to make that
  representation the dominant representation, if not the only necessary
  representation.
 
 Well, I don't want to have two representations; I don't think it's worth
 the complexity.  But certainly we could consider switching to a
 different representation if it seems likely to usually be smaller.

Perhaps some heavy array users could provide input: what are some typical
length ranges among arrays in your applications on which you use arr &&
const, arr @> const or arr <@ const searches?



Re: [HACKERS] Custom Operators Cannot be Found for Composite Type Values

2012-03-08 Thread Andrew Dunstan



On 03/08/2012 02:16 PM, Tom Lane wrote:

David E. Wheeler da...@justatheory.com writes:

 CREATE OPERATOR CLASS json_ops
 DEFAULT FOR TYPE JSON USING btree AS
 OPERATOR 3 = (json, json),
 FUNCTION 1 json_cmp(json, json);
This seems to work.

Urk.  You really ought to provide the whole opclass (all 5 operators).
I'm not sure what will blow up if you leave it like that, but it won't
be pretty.


Yeah. Note too that this is at best dubious:

CREATE OR REPLACE FUNCTION json_cmp(
json,
json
) RETURNS INTEGER LANGUAGE SQL STRICT IMMUTABLE AS $$
SELECT bttextcmp($1::text, $2::text);
$$;


Two pieces of JSON might well be textually different but semantically 
identical (e.g. by one having additional non-semantic whitespace).



cheers

andrew



Re: [HACKERS] Custom Operators Cannot be Found for Composite Type Values

2012-03-08 Thread David E. Wheeler
On Mar 8, 2012, at 11:16 AM, Tom Lane wrote:

 This seems to work.
 
 Urk.  You really ought to provide the whole opclass (all 5 operators).
 I'm not sure what will blow up if you leave it like that, but it won't
 be pretty.

Yes, I expect to have to fill in gaps as I go. These are just for unit tests, 
so I’m not too worried about it (yet).

David




Re: [HACKERS] Custom Operators Cannot be Found for Composite Type Values

2012-03-08 Thread David E. Wheeler
On Mar 8, 2012, at 11:27 AM, Andrew Dunstan wrote:

 Yeah. Note too that this is at best dubious:
 
CREATE OR REPLACE FUNCTION json_cmp(
json,
json
) RETURNS INTEGER LANGUAGE SQL STRICT IMMUTABLE AS $$
SELECT bttextcmp($1::text, $2::text);
$$;
 
 
 Two pieces of JSON might well be textually different but semantically 
 identical (e.g. by one having additional non-semantic whitespace).

Yes. This is just for unit tests, and is fine for the moment. If I end up with 
abnormalities, I will likely rewrite json_cmp() in Perl and use JSON::XS to do 
normalization. Not needed yet, though.

Thanks,

David




Re: [HACKERS] regress bug

2012-03-08 Thread David E. Wheeler
On Mar 8, 2012, at 10:22 AM, Andrew Dunstan wrote:

 This seems like an outright bug. I don't recall any discussion on it. Maybe 
 nobody's come across it before. ISTM the correct behaviour would be to put 
 converted sql files in $inputdir/sql and converted results files in 
 $outputdir/expected.

In my extension distributions, I have

tests/sql
tests/expected

And for that, --inputdir=test works just fine. I don't mess with --outputdir,
which just seems to affect where the actual output is written: a directory
named regression.out at the top of the project.

Best,

David


Re: [HACKERS] regress bug

2012-03-08 Thread Andrew Dunstan

On 03/08/2012 02:59 PM, David E. Wheeler wrote:

On Mar 8, 2012, at 10:22 AM, Andrew Dunstan wrote:


This seems like an outright bug. I don't recall any discussion on it. Maybe 
nobody's come across it before. ISTM the correct behaviour would be to put 
converted sql files in $inputdir/sql and converted results files in 
$outputdir/expected.

In my extension distributions, I have

 tests/sql
 tests/expected

And for that, --inputdir=test works just fine. I don't mess with --outputdir,
which just seems to affect where the actual output is written: a directory
named regression.out at the top of the project.



It works fine if you don't need to do any file conversions (i.e. if you 
don't have input or output directories). But file_textarray_fdw does.


Here's a patch that I think fixes the problem.

cheers

andrew


*** a/src/test/regress/pg_regress.c
--- b/src/test/regress/pg_regress.c
***
*** 407,413  replace_string(char *string, char *replace, char *replacement)
   * the given suffix.
   */
  static void
! convert_sourcefiles_in(char *source_subdir, char *dest_subdir, char *suffix)
  {
  	char		testtablespace[MAXPGPATH];
  	char		indir[MAXPGPATH];
--- 407,413 
   * the given suffix.
   */
  static void
! convert_sourcefiles_in(char *source_subdir, char *dest_dir, char *dest_subdir, char *suffix)
  {
  	char		testtablespace[MAXPGPATH];
  	char		indir[MAXPGPATH];
***
*** 475,481  convert_sourcefiles_in(char *source_subdir, char *dest_subdir, char *suffix)
  		/* build the full actual paths to open */
  		snprintf(prefix, strlen(*name) - 6, "%s", *name);
  		snprintf(srcfile, MAXPGPATH, "%s/%s", indir, *name);
! 		snprintf(destfile, MAXPGPATH, "%s/%s.%s", dest_subdir, prefix, suffix);
  
  		infile = fopen(srcfile, "r");
  		if (!infile)
--- 475,482 
  		/* build the full actual paths to open */
  		snprintf(prefix, strlen(*name) - 6, "%s", *name);
  		snprintf(srcfile, MAXPGPATH, "%s/%s", indir, *name);
! 		snprintf(destfile, MAXPGPATH, "%s/%s/%s.%s", dest_dir, dest_subdir,
! 				 prefix, suffix);
  
  		infile = fopen(srcfile, "r");
  		if (!infile)
***
*** 522,529  convert_sourcefiles_in(char *source_subdir, char *dest_subdir, char *suffix)
  static void
  convert_sourcefiles(void)
  {
! 	convert_sourcefiles_in("input", "sql", "sql");
! 	convert_sourcefiles_in("output", "expected", "out");
  }
  
  /*
--- 523,530 
  static void
  convert_sourcefiles(void)
  {
! 	convert_sourcefiles_in("input", inputdir, "sql", "sql");
! 	convert_sourcefiles_in("output", outputdir, "expected", "out");
  }
  
  /*



Re: [HACKERS] regress bug

2012-03-08 Thread David E. Wheeler
On Mar 8, 2012, at 12:20 PM, Andrew Dunstan wrote:

 It works fine if you don't need to do any file conversions (i.e. if you don't 
 have input or output directories). But file_textarray_fdw does.
 
 Here's a patch that I think fixes the problem.

While you’re there, an issue I noticed is that the MODULE_PATHNAME 
substitutions do not work if you have your SQL files in a subdirectory. My 
extensions have the .sql files in an sql/ directory. So that means when I have 
something like this in sql/plproxy.sql.in:

CREATE FUNCTION plproxy_call_handler ()
RETURNS language_handler AS 'MODULE_PATHNAME' LANGUAGE C;

What I end up with in sql/plproxy.sql is:

CREATE FUNCTION plproxy_call_handler ()
RETURNS language_handler AS 'sql/plproxy' LANGUAGE C;

Which doesn’t work at all, because the file is not installed in an `sql` 
subdirectory; it’s just laid out that way in my repository (and the distribution 
tarball). So I avoid the whole MODULE_PATHNAME business for now (and the .in 
file) and just do this instead:

CREATE FUNCTION plproxy_call_handler ()
RETURNS language_handler AS 'plproxy' LANGUAGE C;

Which is an okay workaround, but I’m not sure that MODULE_PATHNAME is actually 
working correctly.

Best,

David




[HACKERS] t_natts references in comments

2012-03-08 Thread Kevin Grittner
I wasted a few minutes today tracking down what a couple comments
really meant to say.  t_natts no longer exists as a separate field;
the equivalent value is now pulled from t_infomask2 using a macro.
 
Small patch to correct the comments is attached.
 
-Kevin

*** a/src/backend/access/common/heaptuple.c
--- b/src/backend/access/common/heaptuple.c
***
*** 871,877  heap_modifytuple(HeapTuple tuple,
   *		the inverse of heap_form_tuple.
   *
   *		Storage for the values/isnull arrays is provided by the caller;
!  *		it should be sized according to tupleDesc->natts not tuple->t_natts.
   *
   *		Note that for pass-by-reference datatypes, the pointer placed
   *		in the Datum will point into the given tuple.
--- 871,878 
   *		the inverse of heap_form_tuple.
   *
   *		Storage for the values/isnull arrays is provided by the caller;
!  *		it should be sized according to tupleDesc->natts not
!  *		HeapTupleHeaderGetNatts(tuple->t_data).
   *
   *		Note that for pass-by-reference datatypes, the pointer placed
   *		in the Datum will point into the given tuple.
***
*** 978,984  heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
   *		the inverse of heap_formtuple.
   *
   *		Storage for the values/nulls arrays is provided by the caller;
!  *		it should be sized according to tupleDesc->natts not tuple->t_natts.
   *
   *		Note that for pass-by-reference datatypes, the pointer placed
   *		in the Datum will point into the given tuple.
--- 979,986 
   *		the inverse of heap_formtuple.
   *
   *		Storage for the values/nulls arrays is provided by the caller;
!  *		it should be sized according to tupleDesc->natts not
!  *		HeapTupleHeaderGetNatts(tuple->t_data).
   *
   *		Note that for pass-by-reference datatypes, the pointer placed
   *		in the Datum will point into the given tuple.



Re: [HACKERS] Checksums, state of play

2012-03-08 Thread Peter Geoghegan
On 7 March 2012 20:56, Bruce Momjian br...@momjian.us wrote:
 Yep, good summary.  Giving ourselves a few months to think about it and
 consider other failure cases will make this a great 9.3 addition.

Recent Intel processors that support SSE 4.2, including those based on
the core microarchitecture, can calculate a CRC-32C in hardware using
a higher level instruction, similar to the AES related instructions
that Intel chips have had for some time now. Perhaps we should
consider using hardware acceleration where available.

Some interesting details are available from here:

http://lwn.net/Articles/292984/

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] regress bug

2012-03-08 Thread Tom Lane
David E. Wheeler da...@justatheory.com writes:
 While you’re there, an issue I noticed is that the MODULE_PATHNAME
 substitutions do not work if you have your SQL files in a
 subdirectory.

Huh?  MODULE_PATHNAME is not substituted by pg_regress at all (anymore
anyway).  There's still some vestigial support for it in pgxs.mk, but
the future of that code is to vanish, not get improved.  You should
not be needing it to get substituted at build time either.

regards, tom lane



Re: [HACKERS] Checksums, state of play

2012-03-08 Thread Thom Brown
On 8 March 2012 20:55, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 7 March 2012 20:56, Bruce Momjian br...@momjian.us wrote:
 Yep, good summary.  Giving ourselves a few months to think about it and
 consider other failure cases will make this a great 9.3 addition.

 Recent Intel processors that support SSE 4.2, including those based on
 the core microarchitecture, can calculate a CRC-32C in hardware using
 a higher level instruction, similar to the AES related instructions
 that Intel chips have had for some time now. Perhaps we should
 consider using hardware acceleration where available.

 Some interesting details are available from here:

 http://lwn.net/Articles/292984/

That's what Facebook did to speed up MySQL.
-- 
Thom



Re: [HACKERS] [PATCH] Optimize IS DISTINCT FROM NULL = IS NOT NULL

2012-03-08 Thread Tom Lane
Marti Raudsepp ma...@juffo.org writes:
 But I don't feel strongly about this, maybe it's not worth
 complicating this big function further. :)

Yeah, that was kind of what I felt about it.  If this patch were part of
a grand plan to make IS DISTINCT FROM smarter, that would be one thing.
But if we were to embark on that, likely as not it would involve a
redesign that would invalidate this code anyway.  So I'd just as soon
keep it simple for now.

regards, tom lane



Re: [HACKERS] regress bug

2012-03-08 Thread David E. Wheeler
On Mar 8, 2012, at 12:59 PM, Tom Lane wrote:

 Huh?  MODULE_PATHNAME is not substituted by pg_regress at all (anymore
 anyway).

Yeah, sorry, I meant `make`.

 There's still some vestigial support for it in pgxs.mk, but
 the future of that code is to vanish, not get improved.  You should
 not be needing it to get substituted at build time either.

I still see this pattern a *lot*; I removed it from PL/Proxy last week. The 
attached tarball demonstrates it:

 make
sed 's,MODULE_PATHNAME,$libdir/sql/exttest,g' sql/exttest.sql.in >sql/exttest.sql
make: *** No rule to make target `exttest.so', needed by `all'.  Stop.

So MODULE_PATHNAME is replaced with $libdir/sql/exttest rather than 
$libdir/exttest. Maybe that should not be fixed, but there are a *lot* of 
extensions out there using this approach (copied from contrib, which used it 
for years, albeit without the .sql.in files in a subdirectory).

So perhaps DATA_built is to be removed from pgxs.mk? And if so, is the idea 
then that one should just put the module name in the .sql file, rather than 
MODULE_PATHNAME in a .sql.in file?

Best,

David


exttest.tgz
Description: Binary data



Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread Peter Eisentraut
On tor, 2012-03-08 at 10:06 -0500, A.M. wrote:
  The only reason I truncate them on start is that I am appending to them
  in many places in the code, and it was easier to just truncate them on
  start rather than to remember where I first write to them.

 mktemps?

I don't want to see some tool unconditionally writing files (log or
otherwise) with unpredictable names.  That would make it impossible to
clean up in a wrapper script.




Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread A.M.

On Mar 8, 2012, at 4:37 PM, Peter Eisentraut wrote:

 On tor, 2012-03-08 at 10:06 -0500, A.M. wrote:
 The only reason I truncate them on start is that I am appending to them
 in many places in the code, and it was easier to just truncate them on
 start rather than to remember where I first write to them.
 
 mktemps?
 
 I don't want to see some tool unconditionally writing files (log or
 otherwise) with unpredictable names.  That would make it impossible to
 clean up in a wrapper script.


The point of writing temp files to the /tmp/ directory is that they don't need 
to be cleaned up.

You really prefer having log files written to your current working directory? I 
don't know of any utility that pollutes the cwd like that; it seems like an 
easy way to forget where one left the log files.

Cheers,
M






Re: [HACKERS] regress bug

2012-03-08 Thread Andrew Dunstan

On 03/08/2012 04:33 PM, David E. Wheeler wrote:

On Mar 8, 2012, at 12:59 PM, Tom Lane wrote:


Huh?  MODULE_PATHNAME is not substituted by pg_regress at all (anymore
anyway).

Yeah, sorry, I meant `make`.


There's still some vestigial support for it in pgxs.mk, but
the future of that code is to vanish, not get improved.  You should
not be needing it to get substituted at build time either.

I still see this pattern a *lot*; I removed it from PL/Proxy last week. The 
attached tarball demonstrates it:

   make
 sed 's,MODULE_PATHNAME,$libdir/sql/exttest,g' sql/exttest.sql.in >sql/exttest.sql
 make: *** No rule to make target `exttest.so', needed by `all'.  Stop.

So MODULE_PATHNAME is replaced with $libdir/sql/exttest rather than 
$libdir/exttest. Maybe that should not be fixed, but there are a *lot* of 
extensions out there using this approach (copied from contrib, which used it 
for years, albeit without the .sql.in files in a subdirectory).

So perhaps DATA_built is to be removed from pgxs.mk? And if so, is the idea 
then that one should just put the module name in the .sql file, rather than 
MODULE_PATHNAME in a .sql.in file?





Extensions (unlike non-extension modules) should not monkey with 
MODULE_PATHNAME at all.


Change the Makefile def from DATA_built to DATA and rename the file to 
exttest.sql


cheers

andrew



Re: [HACKERS] regress bug

2012-03-08 Thread David E. Wheeler
On Mar 8, 2012, at 1:45 PM, Andrew Dunstan wrote:

 So perhaps DATA_built is to be removed from pgxs.mk? And if so, is the idea 
 then that one should just put the module name in the .sql file, rather than 
 MODULE_PATHNAME in a .sql.in file?
 
 Extensions (unlike non-extension modules) should not monkey with 
 MODULE_PATHNAME at all.
 
 Change the Makefile def from DATA_built to DATA and rename the file to 
 exttest.sql

I did (and I do). But this is not documented or recommended anywhere. Something 
should be promulgated to encourage existing authors to do that.

David




Re: [HACKERS] poll: CHECK TRIGGER?

2012-03-08 Thread Peter Eisentraut
On ons, 2012-03-07 at 12:31 -0500, Robert Haas wrote:
 I might agree with you if we had more than one checker function, but
 right now we are proposing to implement this for PL/pgsql and only
 PL/pgsql.  It seems to me that we can add that when and if a second
 checker function shows up, if it still seems like a good idea.

I had mentioned upthread that I would like to use this for PL/Python.  
There are a number of code quality checkers out there for Python.  I
currently have 3 hooked into Emacs, and 2 or 3 are typically used in the
builds of projects I'm working on.  All of these are shipped separately
from Python.

This leads to the following requirements:

  * Multiple checkers per language must be supported.
  * It must be possible to add checkers to a language after it is
created.  For example, a checker could be shipped in an
extension.
  * It's not terribly important to me to be able to run checkers
separately.  If I wanted to do that, I would just disable or
remove the checker.
  * Just to make things interesting, it should be possible to
implement checkers for language X in language X.

If it would help, given an API (even if only in C at the moment), I
could probably write up one or two checker function prototypes that
could be run against the PL/Python regression test corpus.





Re: [HACKERS] poll: CHECK TRIGGER?

2012-03-08 Thread Peter Eisentraut
On tor, 2012-03-08 at 10:49 +0100, Albe Laurenz wrote:
 Actually, I did when I reviewed the patch the first time round.
 I think that the checks implemented are clearly good and useful,
 since any problem reported will lead to an error at runtime if
 a certain code path in the function is taken.

Shouldn't the validator just reject the function in those cases?




Re: [HACKERS] poll: CHECK TRIGGER?

2012-03-08 Thread Pavel Stehule
2012/3/8 Peter Eisentraut pete...@gmx.net:
 On tor, 2012-03-08 at 10:49 +0100, Albe Laurenz wrote:
 Actually, I did when I reviewed the patch the first time round.
 I think that the checks implemented are clearly good and useful,
 since any problem reported will lead to an error at runtime if
 a certain code path in the function is taken.

 Shouldn't the validator just reject the function in those cases?


The validator checks syntax only (and cannot do more, because there should
be no dependencies between functions). But it doesn't verify whether a table
exists, whether the table has the referenced columns, or whether the number
of expressions in a RAISE statement equals the number of substitution
symbols ...
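
For example, a function like this one is accepted by the validator (the
syntax is fine), but the checker reports that the referenced table does
not exist (the names here are invented for illustration):

CREATE FUNCTION f_demo() RETURNS integer LANGUAGE plpgsql AS $$
BEGIN
    RETURN (SELECT count(*) FROM table_that_does_not_exist);
END;
$$;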

Regards

Pavel



Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread Peter Eisentraut
On tor, 2012-03-08 at 16:44 -0500, A.M. wrote:
 The point of writing temp files to the /tmp/ directory is that they
 don't need to be cleaned up.

I don't think so.  If everyone just left their junk lying around
in /tmp, it would fill up just like any other partition.




Re: [HACKERS] PGXS ignores SHLIB_LINK when linking modules

2012-03-08 Thread Peter Eisentraut
On ons, 2012-03-07 at 21:39 +0200, Marti Raudsepp wrote:
 Is there any reason that pgxs ignores SHLIB_LINK when building
 MODULES?

No, it's just never been needed.




Re: [HACKERS] poll: CHECK TRIGGER?

2012-03-08 Thread Pavel Stehule
2012/3/8 Peter Eisentraut pete...@gmx.net:
 On ons, 2012-03-07 at 12:31 -0500, Robert Haas wrote:
 I might agree with you if we had more than one checker function, but
 right now we are proposing to implement this for PL/pgsql and only
 PL/pgsql.  It seems to me that we can add that when and if a second
 checker function shows up, if it still seems like a good idea.

 I had mentioned upthread that I would like to use this for PL/Python.
 There are a number of code quality checkers out there for Python.  I
 currently have 3 hooked into Emacs, and 2 or 3 are typically used in the
 builds of projects I'm working on.  All of these are shipped separately
 from Python.

 This leads to the following requirements:

      * Multiple checkers per language must be supported.
      * It must be possible to add checkers to a language after it is
        created.  For example, a checker could be shipped in an
        extension.
      * It's not terribly important to me to be able to run checkers
        separately.  If I wanted to do that, I would just disable or
        remove the checker.
      * Just to make things interesting, it should be possible to
        implement checkers for language X in language X.

But you propose something a little different from the current plpgsql
checker and the current design. You probably need a hook on CREATE
FUNCTION? That's not bad, but it is different, and it is not useful for
plpgsql - external stored procedures are different from SQL procedures,
and you will probably check different issues.

I don't think multiple checkers and external checkers are necessary -
if something can live outside, then it should live outside. The internal
checker can be one per PL language. It is parametrized, so you can
control the goals of the checking.


 If it would help, given an API (even if only in C at the moment), I
 could probably write up one or two checker function prototypes that
 could be run against the PL/Python regression test corpus.


Sure, please look at the patch and the plpgsql checker function. A checker
can be any function with the same interface. A PL/Python checker would be
good.

Regards

Pavel





Re: [HACKERS] Command Triggers, patch v11

2012-03-08 Thread Dimitri Fontaine
Hi,

Thom Brown t...@linux.com writes:
 The message returned by creating a command trigger after create index
 is still problematic:

Fixed.  I'm attaching an incremental patch here, the github branch is
updated too.

 CREATE VIEW doesn't return schema:

Fixed, and as an added bonus I fixed the CREATE SEQUENCE oddity about
that too.

 No specific triggers fire when altering a conversion:

Couldn't reproduce, works here, added tests.

 No specific triggers fire when altering the properties of a function:

Fixed.

 No specific triggers fire when altering a sequence:

Couldn't reproduce, added tests.

 No specific triggers when altering a view:

Same again.

 The object name shown in specific triggers when dropping aggregates
 shows the schema name:
 Same for collations:
 Dropping functions shows the object name as the schema name:
 Same with dropping operators:
 ...and operator family:
 … and the same for dropping text search
 configuration/dictionary/parser/template.

Fixed in the attached (all of those were located at exactly the same
place, by the way, so that's one fix).

 When dropping domains, the name of the domain includes the schema name:

I'm using format_type_be(objectId) so that int4 is integer and int8
bigint etc, but that's adding the schemaname. I didn't have time to look
for another API that wouldn't add the schemaname nor to add one myself,
will do that soon.

 I hadn't previously tested triggers against CLUSTER, REINDEX, VACUUM
 and LOAD, but have tested them now.

Cool :)

 When creating a trigger on REINDEX, I get the following message:

Fixed.

 Creating AFTER CLUSTER command triggers produces an error (as expected
 since it's not supported), but AFTER REINDEX only produces a warning.
 These should be the same, probably both an error.

Fixed.

 VACUUM doesn't fire a specific command trigger:

I thought it was better this way, but I changed my mind and completed the code.

 REINDEX on a table seems to show no schema name but an object name for
 specific triggers:

Still on the TODO.

 When REINDEXing an index rather than a table, the table's details are
 shown in the trigger.  Is this expected?:

Yeah well.  Will see about how much damage needs to be done in the
current APIs, running out of steam for tonight's batch.

 REINDEXing the whole database doesn't fire specific command triggers:

We don't want to because REINDEX DATABASE is managing transactions on
its own, same limitation as with AFTER VACUUM and all.  Will have a look
at what it takes to document that properly.

 Documentation:

Fixed.

Thom Brown t...@linux.com writes:
 I've also since found that if I issue a VACUUM, CLUSTER or REINDEX on
 a read-only standby, the BEFORE ANY COMMAND trigger fires.  I don't
 think any trigger should fire on a read-only standby.

Well I'm not sold on that myself (think pl/untrusted that would reach
out to the OS and do whatever is needed there).  You can even set the
session_replication_role GUC to replica and only have the replica
command triggers fired.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

*** a/doc/src/sgml/ref/create_command_trigger.sgml
--- b/doc/src/sgml/ref/create_command_trigger.sgml
***
*** 41,48  CREATE COMMAND TRIGGER <replaceable class="PARAMETER">name</replaceable> { BEFOR
  ALTER LANGUAGE
  ALTER OPERATOR
  ALTER OPERATOR CLASS
- ALTER OPERATOR CLASS
- ALTER OPERATOR FAMILY
  ALTER OPERATOR FAMILY
  ALTER SCHEMA
  ALTER SEQUENCE
--- 41,46 
***
*** 137,146  CREATE COMMAND TRIGGER <replaceable class="PARAMETER">name</replaceable> { BEFOR
    </para>
  
    <para>
!    Note that objects dropped by effect of <literal>DROP CASCADE</literal>
!    will not provoque firing a command trigger, only the top-level command
!    for the main object will fire a command trigger. That also applis to
!    other dependencies following, as in <literal>DROP OWNED BY</literal>.
    </para>
  
    <para>
--- 135,145 
    </para>
  
    <para>
!    Note that objects dropped by the effect of <literal>DROP
!    CASCADE</literal> will not result in a command trigger firing, only the
!    top-level command for the main object will fire a command trigger. That
!    also applies to other dependencies following, as in <literal>DROP OWNED
!    BY</literal>.
    </para>
  
    <para>
***
*** 192,198  CREATE COMMAND TRIGGER <replaceable class="PARAMETER">name</replaceable> { BEFOR
     you are connected to.
    </para>
    <para>
!    Commands that exercize their own transaction control are only
     supported in <literal>BEFORE</literal> command triggers, that's the
     case for <literal>VACUUM</literal>, <literal>CLUSTER</literal>
     <literal>CREATE INDEX CONCURRENTLY</literal>, and <literal>REINDEX
--- 191,197 
     you are connected to.
    </para>
    <para>
!    Commands that exercise their own transaction control are only
     supported in <literal>BEFORE</literal> command triggers, that's the

Re: [HACKERS] poll: CHECK TRIGGER?

2012-03-08 Thread Robert Haas
On Thu, Mar 8, 2012 at 4:54 PM, Peter Eisentraut pete...@gmx.net wrote:
      * It's not terribly important to me to be able to run checkers
        separately.  If I wanted to do that, I would just disable or
        remove the checker.

Does this requirement mean that you want to essentially associate a
set of checkers with each language and then, when asked to check a
function, run all of them serially in an undefined order?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Command Triggers, patch v11

2012-03-08 Thread Thom Brown
On 8 March 2012 22:24, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

We're getting there. :)

 Hi,

 Thom Brown t...@linux.com writes:
 The message returned by creating a command trigger after create index
 is still problematic:

 Fixed.  I'm attaching an incremental patch here, the github branch is
 updated too.

Confirmed.

 CREATE VIEW doesn't return schema:

 Fixed, and as an added bonus I fixed the CREATE SEQUENCE oddity about
 that too.

Yes, working now.

 No specific triggers fire when altering a conversion:

 Couldn't reproduce, works here, added tests.

My apologies.  It seems I neglected to set up a specific trigger for
it.  It does indeed work.

 No specific triggers fire when altering the properties of a function:

 Fixed.

Yes, tried every property available and working in every case now.

 No specific triggers fire when altering a sequence:

 Couldn't reproduce, added tests.

Again, I wrongly assumed I had set up a command trigger for this.  I
think it's because I based my specific triggers on what was listed on
the CREATE COMMAND TRIGGER documentation page, and those were only
recently added.  This is working fine.

 No specific triggers when altering a view:

 Same again.

Working correctly. (see above)

 The object name shown in specific triggers when dropping aggregates
 shows the schema name:
 Same for collations:
 Dropping functions shows the object name as the schema name:
 Same with dropping operators:
 ...and operator family:
 … and the same for dropping text search
 configuration/dictionary/parser/template.

 Fixed in the attached (all of those were located at exactly the same
 place, by the way, so that's one fix).

CREATE OPERATOR CLASS now shows objectname as 'hash' instead of its
name where it's declared as USING hash.  This isn't a problem with
ALTER/DROP OPERATOR CLASS.  Everything else above works as expected
now.

 When dropping domains, the name of the domain includes the schema name:

 I'm using format_type_be(objectId) so that int4 is integer and int8
 bigint etc, but that's adding the schemaname. I didn't have time to look
 for another API that wouldn't add the schemaname nor to add one myself,
 will do that soon.

Okay, skipping test.

 When creating a trigger on REINDEX, I get the following message:

 Fixed.

Could we change this to "REINDEX DATABASE triggers are not supported"?
This way it would be consistent with the AFTER CREATE INDEX
CONCURRENTLY warning.

 VACUUM doesn't fire a specific command trigger:

 I thought it was better this way, but I changed my mind and completed the code.

Yes, working now.

 REINDEX on a table seems to show no schema name but an object name for
 specific triggers:

 Still on the TODO.

Skipped.

 When REINDEXing an index rather than a table, the table's details are
 shown in the trigger.  Is this expected?:

 Yeah well.  Will see about how much damage needs to be done in the
 current APIs, running out of steam for tonight's batch.

Skipped.

 Documentation:

 Fixed.

ALTER CAST is still listed and needs removing, not just from the
documentation but every place it's used in your code too.  I can
currently create a trigger for it, but it's impossible for it to fire
since there's no such command.

All these corrections I mentioned previously still needs to be made:

“A command trigger's function must return void, the only it can aborts
the execution of the command is by raising an exception.”
should be:
“A command trigger's function must return void.  It can then only
abort the execution of the command by raising an exception.”

Remove:
“For a constraint trigger, this is also the name to use when modifying
the trigger's behavior using SET CONSTRAINTS.”

“that's the case for VACUUM, CLUSTER CREATE INDEX CONCURRENTLY, and
REINDEX DATABASE.”
should be:
“that's the case for VACUUM, CLUSTER, CREATE INDEX CONCURRENTLY, and
REINDEX DATABASE.”

 Thom Brown t...@linux.com writes:
 I've also since found that if I issue a VACUUM, CLUSTER or REINDEX on
 a read-only standby, the BEFORE ANY COMMAND trigger fires.  I don't
 think any trigger should fire on a read-only standby.

 Well I'm not sold on that myself (think pl/untrusted that would reach
 out to the OS and do whatever is needed there).  You can even set the
 session_replication_role GUC to replica and only have the replica
 command triggers fired.

All other command triggers don't fire on read-only standbys, and the
inconsistency doesn't seem right.  On the one hand all
CREATE/DROP/ALTER triggers aren't fired because of the "cannot execute
command in a read-only transaction" error message, but triggers do
occur before utility commands, which would otherwise display the same
message, and might not display it at all if the trigger causes an
error in its function call.  So it seems like they should either all
fire, or none of them should.  What are your thoughts?

-- 
Thom

Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread Bruce Momjian
On Thu, Mar 08, 2012 at 10:19:05AM -0500, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Thu, Mar 08, 2012 at 08:34:53AM -0500, A.M. wrote:
  It looks like the patch will overwrite the logs in the current working
  directory, for example, if pg_upgrade is run twice in the same place. Is
  that intentional? I had imagined that the logs would have been dumped in
 
  Yes.  I was afraid that continually appending to a log file on every run
  would be too confusing.  I could do only appends, or number the log
  files, but those seemed confusing.
 
 Use one (set of) files, and always append, but at the beginning of each
 run print "\npg_upgrade starting at [timestamp]\n\n".  Should make it
 reasonably clear, while not losing information.

OK, it seems people do care about keeping log files from multiple runs
so I went with Tom's idea and have:

-----------------------------------------------------------------
  pg_upgrade run on Thu Mar  8 19:30:12 2012
-----------------------------------------------------------------

Performing Consistency Checks
-

Updated patch attached.

FYI, in retain mode, these are the files left in the current directory:

delete_old_cluster.sh
pg_upgrade_dump_all.sql
pg_upgrade_dump_db.sql
pg_upgrade_dump_globals.sql
pg_upgrade_internal.log
pg_upgrade_restore.log
pg_upgrade_server.log
pg_upgrade_utility.log

I will address the idea of using /tmp in another email.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index a5f63eb..7905534
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** issue_warnings(char *sequence_script_fil
*** 165,176 
  		if (sequence_script_file_name)
  		{
  			prep_status("Adjusting sequences");
! 			exec_prog(true,
! 					  SYSTEMQUOTE "\"%s/psql\" --set ON_ERROR_STOP=on "
  					  "--no-psqlrc --port %d --username \"%s\" "
! 					  "-f \"%s\" --dbname template1 >> \"%s\"" SYSTEMQUOTE,
  					  new_cluster.bindir, new_cluster.port, os_info.user,
! 					  sequence_script_file_name, log_opts.filename2);
  			unlink(sequence_script_file_name);
  			check_ok();
  		}
--- 165,177 
  		if (sequence_script_file_name)
  		{
  			prep_status("Adjusting sequences");
! 			exec_prog(true, UTILITY_LOG_FILE,
! 					  SYSTEMQUOTE "\"%s/psql\" --echo-queries "
! 					  "--set ON_ERROR_STOP=on "
  					  "--no-psqlrc --port %d --username \"%s\" "
! 					  "-f \"%s\" --dbname template1 >> \"%s\" 2>&1" SYSTEMQUOTE,
  					  new_cluster.bindir, new_cluster.port, os_info.user,
! 					  sequence_script_file_name, UTILITY_LOG_FILE);
  			unlink(sequence_script_file_name);
  			check_ok();
  		}
diff --git a/contrib/pg_upgrade/controldata.c b/contrib/pg_upgrade/controldata.c
new file mode 100644
index 5239601..e01280d
*** a/contrib/pg_upgrade/controldata.c
--- b/contrib/pg_upgrade/controldata.c
*** get_control_data(ClusterInfo *cluster, b
*** 126,136 
  	/* we have the result of cmd in output. so parse it line by line now */
  	while (fgets(bufin, sizeof(bufin), output))
  	{
! 		if (log_opts.debug)
! 			fputs(bufin, log_opts.debug_fd);
  
  #ifdef WIN32
- 
  		/*
  		 * Due to an installer bug, LANG=C doesn't work for PG 8.3.3, but does
  		 * work 8.2.6 and 8.3.7, so check for non-ASCII output and suggest a
--- 126,134 
  	/* we have the result of cmd in output. so parse it line by line now */
  	while (fgets(bufin, sizeof(bufin), output))
  	{
! 		pg_log(PG_VERBOSE, "%s", bufin);
  
  #ifdef WIN32
  		/*
  		 * Due to an installer bug, LANG=C doesn't work for PG 8.3.3, but does
  		 * work 8.2.6 and 8.3.7, so check for non-ASCII output and suggest a
diff --git a/contrib/pg_upgrade/dump.c b/contrib/pg_upgrade/dump.c
new file mode 100644
index 772ca37..7ec94ad
*** a/contrib/pg_upgrade/dump.c
--- b/contrib/pg_upgrade/dump.c
*** generate_old_dump(void)
*** 22,31 
  	 * --binary-upgrade records the width of dropped columns in pg_class, and
  	 * restores the frozenid's for databases and relations.
  	 */
! 	exec_prog(true,
  			  SYSTEMQUOTE "\"%s/pg_dumpall\" --port %d --username \"%s\" "
! 			  "--schema-only --binary-upgrade > \"%s/" ALL_DUMP_FILE "\""
! 			  SYSTEMQUOTE, new_cluster.bindir, old_cluster.port, os_info.user, os_info.cwd);
  	check_ok();
  }
  
--- 22,32 
  	 * --binary-upgrade records the width of dropped columns in pg_class, and
  	 * restores the frozenid's for databases and relations.
  	 */
! 	exec_prog(true, UTILITY_LOG_FILE,
  			  SYSTEMQUOTE "\"%s/pg_dumpall\" --port %d --username \"%s\" "
! 			  "--schema-only --binary-upgrade > \"%s/%s\" 2>> \"%s\""
! 			  SYSTEMQUOTE, new_cluster.bindir, old_cluster.port, os_info.user,
! 			  os_info.cwd, ALL_DUMP_FILE, UTILITY_LOG_FILE);
  	

Re: [HACKERS] pg_upgrade --logfile option documentation

2012-03-08 Thread Bruce Momjian
On Thu, Mar 08, 2012 at 11:59:59PM +0200, Peter Eisentraut wrote:
 On tor, 2012-03-08 at 16:44 -0500, A.M. wrote:
  The point of writing temp files to the /tmp/ directory is that they
  don't need to be cleaned up.
 
 I don't think so.  If everyone just left their junk lying around
 in /tmp, it would fill up just like any other partition.

/tmp has a larger problem; see:

http://lwn.net/Articles/390323/

The only proper fix is to create a directory in /tmp and write into
it.

pg_upgrade has been writing into the current directory since it was
started, and no one has complained; it leaves these files in the
current directory only if it fails or --retain is used, so it will
clean up after itself.  Actually, I think it originally wrote into the
user's home directory, but that caused problems.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] RFC: Making TRUNCATE more MVCC-safe

2012-03-08 Thread Robert Haas
On Wed, Mar 7, 2012 at 5:41 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Case #2 is certainly a problem for FrozenXID as well, because anything
 that's marked with FrozenXID is going to look visible to everybody,
 including our older snapshots.  And I gather you're saying it's also a
 problem for HEAP_XMIN_COMMITTED.

 The problem there is that later subtransactions often have xids that
 are greater than xmax, so the xid shows as running when we do
 XidInMVCCSnapshot(), which must then be altered for this one weird
 case. I tried that and not happy with result.

Altering XidInMVCCSnapshot() seems like a good thing to avoid, but I
confess I don't quite follow what you're describing here otherwise.

 I had assumed that the way we were
 fixing this problem was to disable these optimizations for
 transactions that had more than one snapshot floating around.  I'm not
 sure whether the patch does that or not, but I think it probably needs
 to

 It does. I thought you already read the patch?

I glanced over it, but did not look through it in detail.  I'll do a
more careful look at your next version.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] pg_prewarm

2012-03-08 Thread Robert Haas
It's been bugging me for a while now that we don't have a prewarming
utility, for a couple of reasons, including:

1. Our customers look at me funny when I suggest that they use
pg_relation_filepath() and /bin/dd for this purpose.

2. Sometimes when I'm benchmarking stuff, I want to get all the data
cached in shared_buffers.  This is surprisingly hard to do if the size
of any relation involved is >= 1/4 of shared buffers, because the
BAS_BULKREAD stuff kicks in.  You can do it by repeatedly seq-scanning
the relation - eventually all the blocks trickle in - but it takes a
long time, and that's annoying.

So I wrote a prewarming utility.  Patch is attached.  You can prewarm
either the OS cache or PostgreSQL's cache, and there are two options
for prewarming the OS cache to meet different needs.  By passing the
correct arguments to the function, you can prewarm an entire relation
or just the blocks you choose; prewarming of blocks from alternate
relation forks is also supported, for completeness.
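
As a usage sketch (the mode and fork keywords here are my reading of the
patch's design, not confirmed names), loading a table's main fork
entirely into shared_buffers might look like:

SELECT pg_prewarm('pgbench_accounts'::regclass, 'buffer', 'main',
                  NULL, NULL);

with NULL first/last block numbers meaning the whole relation.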

Hope you like it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pg_prewarm_v1.patch
Description: Binary data



Re: [HACKERS] pg_prewarm

2012-03-08 Thread Jaime Casanova
On Thu, Mar 8, 2012 at 11:13 PM, Robert Haas robertmh...@gmail.com wrote:
 It's been bugging me for a while now that we don't have a prewarming
 utility, for a couple of reasons, including:

 1. Our customers look at me funny when I suggest that they use
 pg_relation_filepath() and /bin/dd for this purpose.


well, you can't deny that it is funny to see people making faces ;)


 So I wrote a prewarming utility.  Patch is attached.

cool!

Just a suggestion: can we relax this check? Just send a WARNING or a
NOTICE and set last_block = nblocks - 1.
Just an opinion.

+   if (PG_ARGISNULL(4))
+       last_block = nblocks - 1;
+   else
+   {
+       last_block = PG_GETARG_INT64(4);
+       if (last_block > nblocks)
+           ereport(ERROR,
+                   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                    errmsg("ending block number " INT64_FORMAT " exceeds number of blocks in relation " INT64_FORMAT,
+                           last_block, nblocks)));
+   }

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación



Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-03-08 Thread Tom Lane
I wrote:
 There are a couple of other points that make me think we need to revisit
 the PlanForeignScan API definition some more, too.  ...
 So we need to break down what PlanForeignScan currently does into three
 separate steps.  The first idea that comes to mind is to call them
 GetForeignRelSize, GetForeignPaths, GetForeignPlan; but maybe somebody
 has a better idea for names?

Attached is a draft patch for that.  While I was working on this
I realized that we were very far short of allowing FDWs to set up
expressions of their choice for execution; there was nothing for that in
setrefs.c, nor some other places that need to post-process expressions.
I had originally supposed that fdw_private could just contain some
expression trees, but that wasn't going to work without post-processing.
So this patch attempts to cover that too, by breaking what had been
fdw_private into a private part and an fdw_exprs list that will be
subject to expression post-processing.  (The alternative to this would
be to do post-processing on all of fdw_private, but that would
considerably restrict what can be in fdw_private, so it seemed better
to decree two separate fields.)

Working on this also helped me identify some other things that had been
subliminally bothering me about pgsql_fdw's qual pushdown code.  That
patch is set up with the idea of pushing entire quals (boolean
RestrictInfo expressions) across to the remote side, but I think that
is probably the wrong granularity, or at least not the only mechanism
we should have.  IMO it is more important to provide a structure similar
to index quals; that is, what you want to identify is RestrictInfo
expressions of the form
remote_variable operator local_expression
where the operator has to be one that the remote can execute with the
same semantics as we think it has, but the only real restriction on the
local_expression is that it be stable, because we'll execute it locally
and send only its result value across to the remote.  (The SQL sent to
the remote looks like "remote_variable operator $1", or some such.)
Thus, to take an example that's said to be unsafe in the existing code
comments, there's no problem at all with
remote_timestamp_col = now()
as long as we execute now() locally.
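
Concretely (the table name is invented), for a local query like

    SELECT * FROM remote_tab WHERE remote_timestamp_col = now();

now() is evaluated locally and the FDW ships something like
"SELECT ... WHERE remote_timestamp_col = $1" to the remote.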

There might be some value in pushing entire quals across too, for
clauses like remote_variable_1 = remote_variable_2, but I believe
that these are not nearly as important as variable = constant and
variable = join_variable cases.  Consider that when dealing with a
local table, only the latter two cases can be accelerated by indexes.

regards, tom lane

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 29f203c6f10d194670aa37956a6190e8e7a1..e8907709bd90a6342384dfb6f10b00e55018d65d 100644
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***
*** 26,31 
--- 26,33 
  #include "nodes/makefuncs.h"
  #include "optimizer/cost.h"
  #include "optimizer/pathnode.h"
+ #include "optimizer/planmain.h"
+ #include "optimizer/restrictinfo.h"
  #include "utils/rel.h"
  
  PG_MODULE_MAGIC;
*** struct FileFdwOption
*** 48,54 
   * Note: If you are adding new option for user mapping, you need to modify
   * fileGetOptions(), which currently doesn't bother to look at user mappings.
   */
! static struct FileFdwOption valid_options[] = {
  	/* File options */
  	{"filename", ForeignTableRelationId},
  
--- 50,56 
   * Note: If you are adding new option for user mapping, you need to modify
   * fileGetOptions(), which currently doesn't bother to look at user mappings.
   */
! static const struct FileFdwOption valid_options[] = {
  	/* File options */
  	{"filename", ForeignTableRelationId},
  
*** static struct FileFdwOption valid_option
*** 72,77 
--- 74,90 
  };
  
  /*
+  * FDW-specific information for RelOptInfo.fdw_private.
+  */
+ typedef struct FileFdwPlanState
+ {
+ 	char	   *filename;		/* file to read */
+ 	List	   *options;		/* merged COPY options, excluding filename */
+ 	BlockNumber pages;			/* estimate of file's physical size */
+ 	double		ntuples;		/* estimate of number of rows in file */
+ } FileFdwPlanState;
+ 
+ /*
   * FDW-specific information for ForeignScanState.fdw_state.
   */
  typedef struct FileFdwExecutionState
*** PG_FUNCTION_INFO_V1(file_fdw_validator);
*** 93,101 
  /*
   * FDW callback routines
   */
! static void filePlanForeignScan(Oid foreigntableid,
! 	PlannerInfo *root,
! 	RelOptInfo *baserel);
  static void fileExplainForeignScan(ForeignScanState *node, ExplainState *es);
  static void fileBeginForeignScan(ForeignScanState *node, int eflags);
  static TupleTableSlot *fileIterateForeignScan(ForeignScanState *node);
--- 106,123 
  /*
   * FDW callback routines
   */
! static void fileGetForeignRelSize(PlannerInfo *root,
!   RelOptInfo *baserel,
!   Oid foreigntableid);
! static void fileGetForeignPaths(PlannerInfo 

Re: [HACKERS] Review of pg_archivecleanup -x option patch

2012-03-08 Thread Jaime Casanova
On Thu, Mar 8, 2012 at 8:25 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Feb 12, 2012 at 9:19 PM, Greg Smith g...@2ndquadrant.com wrote:
 The smaller step of automatically stripping the specified suffix from the
 input file name, when it matches the one we've told the program to expect,
 seems like a usability improvement unlikely to lead to bad things though.
  I've certainly made the mistake you've shown when using the patched version
 of the program myself, just didn't think about catching and correcting it
 myself before.  We can rev this to add that feature and resubmit easily
 enough, will turn that around soon.

 This change seems plenty small enough to slip in even at this late
 date, but I guess we're lacking a new version of the patch.


Sorry, here's the patch rebased and with the suggestion from Alex.
Which reminds me, I never thanked him for the review (shame on me) :D

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación



Re: [HACKERS] Review of pg_archivecleanup -x option patch

2012-03-08 Thread Jaime Casanova
On Fri, Mar 9, 2012 at 12:46 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Thu, Mar 8, 2012 at 8:25 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Feb 12, 2012 at 9:19 PM, Greg Smith g...@2ndquadrant.com wrote:
 The smaller step of automatically stripping the specified suffix from the
 input file name, when it matches the one we've told the program to expect,
 seems like a usability improvement unlikely to lead to bad things though.
  I've certainly made the mistake you've shown when using the patched version
 of the program myself, just didn't think about catching and correcting it
 myself before.  We can rev this to add that feature and resubmit easily
 enough, will turn that around soon.

 This change seems plenty small enough to slip in even at this late
 date, but I guess we're lacking a new version of the patch.


 Sorry, here's the patch rebased and with the suggestion from Alex.
 Which reminds me, I never thanked him for the review (shame on me) :D


with the patch this time

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
diff --git a/contrib/pg_archivecleanup/pg_archivecleanup.c b/contrib/pg_archivecleanup/pg_archivecleanup.c
new file mode 100644
index adfc83d..8f6ca2c
*** a/contrib/pg_archivecleanup/pg_archivecleanup.c
--- b/contrib/pg_archivecleanup/pg_archivecleanup.c
*** const char *progname;
*** 37,42 
--- 37,43 
  /* Options and defaults */
  bool		debug = false;		/* are we debugging? */
  bool		dryrun = false;		/* are we performing a dry-run operation? */
+ char	   *additional_ext = NULL;	/* Extension to remove from filenames */
  
  char	   *archiveLocation;	/* where to find the archive? */
  char	   *restartWALFileName; /* the file from which we can restart restore */
*** static void
*** 94,106 
--- 95,136 
  CleanupPriorWALFiles(void)
  {
  	int			rc;
+ 	int			chop_at;
  	DIR		   *xldir;
  	struct dirent *xlde;
+ 	char		walfile[MAXPGPATH];
  
  	if ((xldir = opendir(archiveLocation)) != NULL)
  	{
  		while ((xlde = readdir(xldir)) != NULL)
  		{
+ 			strncpy(walfile, xlde->d_name, MAXPGPATH);
+ 			/*
+ 			 * Remove any specified additional extension from the filename
+ 			 * before testing it against the conditions below.
+ 			 */
+ 			if (additional_ext)
+ 			{
+ 				chop_at = strlen(walfile) - strlen(additional_ext);
+ 				/*
+ 				 * Only chop if this is long enough to be a file name and the
+ 				 * extension matches.
+ 				 */
+ 				if ((chop_at >= (XLOG_DATA_FNAME_LEN - 1)) &&
+ 					(strcmp(walfile + chop_at, additional_ext) == 0))
+ 				{
+ 					walfile[chop_at] = '\0';
+ 					/*
+ 					 * This is too chatty even for regular debug output, but
+ 					 * leaving it in for program testing.
+ 					 */
+ 					if (false)
+ 						fprintf(stderr,
+ 								"removed extension='%s' from file=%s result=%s\n",
+ 								additional_ext, xlde->d_name, walfile);
+ 				}
+ 			}
+ 
  			/*
  			 * We ignore the timeline part of the XLOG segment identifiers in
  			 * deciding whether a segment is still needed.	This ensures that
*** CleanupPriorWALFiles(void)
*** 114,123 
  			 * file. Note that this means files are not removed in the order
  			 * they were originally written, in case this worries you.
  			 */
! 			if (strlen(xlde->d_name) == XLOG_DATA_FNAME_LEN &&
! 				strspn(xlde->d_name, "0123456789ABCDEF") == XLOG_DATA_FNAME_LEN &&
! 				strcmp(xlde->d_name + 8, exclusiveCleanupFileName + 8) < 0)
  			{
  				snprintf(WALFilePath, MAXPGPATH, "%s/%s",
  						 archiveLocation, xlde->d_name);
  
--- 144,157 
  			 * file. Note that this means files are not removed in the order
  			 * they were originally written, in case this worries you.
  			 */
! 			if (strlen(walfile) == XLOG_DATA_FNAME_LEN &&
! 				strspn(walfile, "0123456789ABCDEF") == XLOG_DATA_FNAME_LEN &&
! 				strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0)
  			{
+ 				/*
+ 				 * Use the original file name again now, including any extension
+ 				 * that might have been chopped off before testing the sequence.
+ 				 */
  				snprintf(WALFilePath, MAXPGPATH, "%s/%s",
  						 archiveLocation, xlde->d_name);
  
*** SetWALFileNameForCleanup(void)
*** 167,172 
--- 201,228 
  {
  	bool		fnameOK = false;
  
+ 	/* if restartWALFileName was specified with an extension remove it first */
+ 	if (additional_ext)
+ 	{
+ 		int		max_len = 0; /* Initialize just to keep quiet the compiler */
+ 		int		chop_at = strlen(restartWALFileName) - strlen(additional_ext);
+ 
+ 		if (strlen(restartWALFileName) == (XLOG_DATA_FNAME_LEN + strlen(additional_ext))) 
+ 			max_len = XLOG_DATA_FNAME_LEN;	
+ 		else if (strlen(restartWALFileName) == (XLOG_BACKUP_FNAME_LEN + strlen(additional_ext) + 1))
+ 			max_len = XLOG_BACKUP_FNAME_LEN;	
+ 
+ 		/*
+ 		 * Only chop if this is long enough to be a file name and the
+ 		 * extension matches.
+ 		 */
+ 		if ((chop_at >= (max_len - 1)) &&
+ 			(strcmp(restartWALFileName + chop_at,

Re: [HACKERS] t_natts references in comments

2012-03-08 Thread Heikki Linnakangas

On 08.03.2012 22:33, Kevin Grittner wrote:

I wasted a few minutes today tracking down what a couple comments
really meant to say.  t_natts no longer exists as a separate field;
the equivalent value is now pulled from t_infomask2 using a macro.

Small patch to correct the comments is attached.


Thanks, applied.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Measuring relation free space

2012-03-08 Thread Jaime Casanova
On Wed, Feb 22, 2012 at 12:27 AM, Noah Misch n...@leadboat.com wrote:
 On Tue, Feb 14, 2012 at 02:04:26AM -0500, Jaime Casanova wrote:

 1) pgstattuple-gin_spgist.patch
 This first patch adds gin and spgist support to pgstattuple, also
 makes pgstattuple use a ring buffer when reading tables or indexes.

 The buffer access strategy usage bits look fine to commit.


OK, I extracted that part, which basically makes pgstattuple usable in
production (I mean, by not bloating shared buffers when using the
function).
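
So something like this is now reasonable to run against a production
table (the table name is invented), since the reads go through a small
ring buffer instead of evicting the working set from shared_buffers:

SELECT table_len, free_space, dead_tuple_percent
FROM pgstattuple('some_table');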

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
new file mode 100644
index beff1b9..9f2ec1f
*** a/contrib/pgstattuple/pgstatindex.c
--- b/contrib/pgstattuple/pgstatindex.c
*** pgstatindex(PG_FUNCTION_ARGS)
*** 95,100 
--- 95,101 
  	BlockNumber nblocks;
  	BlockNumber blkno;
  	BTIndexStat indexStat;
+  	BufferAccessStrategy bstrategy = GetAccessStrategy(BAS_BULKREAD);
  
  	if (!superuser())
  		ereport(ERROR,
*** pgstatindex(PG_FUNCTION_ARGS)
*** 122,128 
  	 * Read metapage
  	 */
  	{
! 		Buffer		buffer = ReadBuffer(rel, 0);
  		Page		page = BufferGetPage(buffer);
  		BTMetaPageData *metad = BTPageGetMeta(page);
  
--- 123,129 
  	 * Read metapage
  	 */
  	{
! 		Buffer		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, 0, RBM_NORMAL, bstrategy);
  		Page		page = BufferGetPage(buffer);
  		BTMetaPageData *metad = BTPageGetMeta(page);
  
*** pgstatindex(PG_FUNCTION_ARGS)
*** 159,165 
  		CHECK_FOR_INTERRUPTS();
  
  		/* Read and lock buffer */
! 		buffer = ReadBuffer(rel, blkno);
  		LockBuffer(buffer, BUFFER_LOCK_SHARE);
  
  		page = BufferGetPage(buffer);
--- 160,166 
  		CHECK_FOR_INTERRUPTS();
  
  		/* Read and lock buffer */
!  		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, blkno, RBM_NORMAL, bstrategy);
  		LockBuffer(buffer, BUFFER_LOCK_SHARE);
  
  		page = BufferGetPage(buffer);
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
new file mode 100644
index e5ddd87..580e24e
*** a/contrib/pgstattuple/pgstattuple.c
--- b/contrib/pgstattuple/pgstattuple.c
*** static Datum pgstat_index(Relation rel,
*** 78,83 
--- 78,90 
  static void pgstat_index_page(pgstattuple_type *stat, Page page,
    OffsetNumber minoff, OffsetNumber maxoff);
  
+ /* 
+  * Buffer access strategy for reading relations, it's simpler to keep it
+  * global because pgstat_*_page() functions read one buffer at a time.
+  * pgstat_heap() and pgstat_index() should initialize it before use.
+  */
+ BufferAccessStrategy bstrategy;
+ 
  /*
   * build_pgstattuple_type -- build a pgstattuple_type tuple
   */
*** pgstat_relation(Relation rel, FunctionCa
*** 231,236 
--- 238,246 
  			case GIN_AM_OID:
  				err = "gin index";
  				break;
+ 			case SPGIST_AM_OID:
+ 				err = "spgist index";
+ 				break;
  			default:
  				err = "unknown index";
  				break;
*** pgstat_heap(Relation rel, FunctionCallIn
*** 276,281 
--- 286,295 
  
  	nblocks = scan->rs_nblocks; /* # blocks to be scanned */
  
+ 	/* prepare access strategy for this table */
+ 	bstrategy = GetAccessStrategy(BAS_BULKREAD);
+ 	scan->rs_strategy = bstrategy;
+ 
  	/* scan the relation */
  	while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
  	{
*** pgstat_heap(Relation rel, FunctionCallIn
*** 309,315 
  		{
  			CHECK_FOR_INTERRUPTS();
  
! 			buffer = ReadBuffer(rel, block);
  			LockBuffer(buffer, BUFFER_LOCK_SHARE);
  			stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
  			UnlockReleaseBuffer(buffer);
--- 323,329 
  		{
  			CHECK_FOR_INTERRUPTS();
  
! 			buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block, RBM_NORMAL, bstrategy);
  			LockBuffer(buffer, BUFFER_LOCK_SHARE);
  			stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
  			UnlockReleaseBuffer(buffer);
*** pgstat_heap(Relation rel, FunctionCallIn
*** 322,328 
  	{
  		CHECK_FOR_INTERRUPTS();
  
! 		buffer = ReadBuffer(rel, block);
  		LockBuffer(buffer, BUFFER_LOCK_SHARE);
  		stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
  		UnlockReleaseBuffer(buffer);
--- 336,342 
  	{
  		CHECK_FOR_INTERRUPTS();
  
! 		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block, RBM_NORMAL, bstrategy);
  		LockBuffer(buffer, BUFFER_LOCK_SHARE);
  		stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
  		UnlockReleaseBuffer(buffer);
*** pgstat_btree_page(pgstattuple_type *stat
*** 345,351 
  	Buffer		buf;
  	Page		page;
  
! 	buf = ReadBuffer(rel, blkno);
  	LockBuffer(buf, BT_READ);
  	page = BufferGetPage(buf);
  
--- 359,365 
  	Buffer		buf;
  	Page		page;
  
! 	buf = ReadBufferExtended(rel, MAIN_FORKNUM, blkno, RBM_NORMAL, bstrategy);
  	LockBuffer(buf, BT_READ);
  	page =